Private LLM - Local AI Chat 12+

Local Offline Private AI Chat

Numen Technologies Limited

    • $4.99

Description

Now with DeepSeek R1 Distill Advanced Reasoning Models

Meet Private LLM: Your Secure, Offline AI Assistant for macOS

Private LLM brings advanced AI capabilities directly to your iPhone, iPad, and Mac—all while keeping your data private and offline. With a one-time purchase and no subscriptions, you get a personal AI assistant that works entirely on your device.

Key Features:

- Local AI Functionality: Interact with a sophisticated AI chatbot without needing an internet connection. Your conversations stay on your device, ensuring complete privacy.

- Wide Range of AI Models: Choose from various open-source LLM models like Llama 3.2, Llama 3.1, Google Gemma 2, Microsoft Phi-3, Mistral 7B, and StableLM 3B. Each model is optimized for iOS and macOS hardware using advanced OmniQuant quantization, which offers superior performance compared to traditional RTN quantization methods.

- Siri and Shortcuts Integration: Create AI-driven workflows without writing code. Use Siri commands and Apple Shortcuts to enhance productivity in tasks like text parsing and generation.

- No Subscriptions or Logins: Enjoy full access with a single purchase. No need for subscriptions, accounts, or API keys. Plus, with Family Sharing, up to six family members can use the app.

- AI Language Services on macOS: Utilize AI-powered tools for grammar correction, summarization, and more across various macOS applications in multiple languages.

- Superior Performance with OmniQuant: Benefit from the advanced OmniQuant quantization process, which preserves the model's weight distribution for faster and more accurate responses, outperforming apps that use standard quantization techniques.

Supported Model Families:
- DeepSeek R1 Distill Based Models
- Phi-4 14B Model
- Llama 3.3 70B
- Llama 3.2 Based Models
- Llama 3.1 Based Models
- Google Gemma 2 Based Models
- Qwen 2.5 Based Models (0.5B to 32B)
- Qwen 2.5 Coder Based Models (0.5B to 32B)
- Solar 10.7B Based Models
- Yi 34B Based Models

For a full list of supported models, including detailed specifications, please visit privatellm.app/models.

Private LLM is a better alternative to generic llama.cpp and MLX wrapper apps such as Ollama, LLM Farm, LM Studio, and RecurseChat on three fronts:
1. Private LLM uses a faster mlc-llm based inference engine.
2. All models in Private LLM are quantized using the state-of-the-art OmniQuant quantization algorithm, while competing apps use naive round-to-nearest quantization.
3. Private LLM is a fully native app built using C++, Metal, and Swift, while many competing apps are (bloated) Electron-based apps.
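To make the quantization comparison concrete, here is a minimal sketch of the naive round-to-nearest (RTN) baseline mentioned above. This is an illustration of the generic technique, not Private LLM's or OmniQuant's actual implementation; the function name and the synthetic weight tensor are invented for the example.

```python
import numpy as np

def rtn_quantize(weights, bits=4):
    """Naive round-to-nearest (RTN) quantization: snap each weight to the
    nearest level on a uniform grid spanning the tensor's min/max range,
    then map the integer codes back to floats (dequantize)."""
    levels = 2 ** bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels
    codes = np.round((weights - w_min) / scale)  # integer codes in 0..levels
    return codes * scale + w_min                 # dequantized approximation

# A single outlier stretches the min/max grid, so most of the 2^bits levels
# are wasted on empty range and the bulk of the weights lose precision.
# Learned methods (OmniQuant among them) instead optimize clipping and
# scaling parameters to better preserve the weight distribution.
rng = np.random.default_rng(0)
w = np.concatenate([rng.normal(0.0, 0.02, 1024), [1.0]])  # outlier at 1.0
err_rtn = np.abs(rtn_quantize(w, bits=4) - w).mean()
```

The per-weight error of RTN is bounded by half a grid step, so widening the range (e.g. by one outlier) directly widens the step and the error on every other weight.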

Optimized for Apple Silicon Macs with the Apple M1 chip or later, Private LLM for macOS delivers its best performance. Users on older Intel Macs without eGPUs may experience reduced performance. Please note that although the app nominally works on Intel Macs, we've stopped adding support for new models on them due to performance issues with Intel hardware.

What’s New

Version 1.9.7

* Support for downloading 7 new DeepSeek R1 Distill based models on Apple Silicon Macs. Support for individual models varies by device capabilities.
* Users with Apple Silicon Macs with 16 GB of RAM can now download the phi-4 model (previously restricted to Apple Silicon Macs with 24 GB of RAM).
* Minor bugfixes and updates.

Ratings and Reviews

4.6 out of 5
183 Ratings

OverwhelmingLatias ,

5 Stars for these Features!

Fantastic app, I like the selection of models and they run great on the M4 chip of the iPhone 16 Max and iPad Pro. System prompt is also great for some tweaks to responses. The most simple and important feature (aside from model memory) that Private LLM needs is to be able to edit the responses of the AI. I enjoy the chatbot's responses but hate that it'll pull context from previous responses and loop entire phrases regardless if I regenerate its response or edit mine. Sometimes I don't like how the model formatted its response and want to change it without clearing the convo and retrying multiple times. The second feature that would be a quality of life is to be able to save conversations within the app. At the very least, export them as a whole without needing to screenshot, grab text, paste, correct, and format it elsewhere. I learned early on that copying text blocks at a time can cause the app to reload its model and even clear the conversation. A bot refresh would also be nice to clear older message context during a longer conversation to keep the model fast and responsive, but that can coincide with model memory to more easily pull specific facts you want it to remember in the first place without having to create a queue of explanations to pick up where you left off at. Useful if you switch models for different purposes like organizing, summarizing, q&a vs. generation, editing, brainstorming, vs. chatting.

kallanreed ,

Pretty good, a few quirks

Works well. The display supports markdown which is great. The Zephyr model is a good default. A few quirks make it 4/5 stars. There’s a strange animation when selecting text out of the chat. The shortcuts are so close to being useful. I want to be able to ask a question and have the response read by Siri while I’m driving. It almost works, but for some reason it always says “something went wrong” while in CarPlay. Finally, it might be nice to be able to save conversations somehow. A history would be a good way, or maybe just a way to export the whole chat to a file.

N7nathan ,

So much potential

I've been using this app for a couple of months at this point. When it was first released, it was a neat proof of concept, but it only supported 7B models and was overall just too simple to use effectively. With a recent update, a 13B model was released for all Macs with 16GB of memory, and it makes such a huge difference! It's not quite at the same level as ChatGPT 3.5T, but it's close enough that I never use 3.5T anymore; this is my go-to. I greatly appreciate the on-device processing (hallelujah privacy!), and it doesn't even use too much power - my battery still lasts for hours and hours. The performance is also great; my base M1 Air powers right through the prompts.

I only have two complaints about the app at this point. 1) The 13B model uses about 12GB of memory by itself, which does force the use of swap on a 16GB Air. Not much the dev can do about this, but it is something to keep in mind. You'll want to close out of other programs before launching this. 2) There still is no feature that has separate conversations. If you want to start a new conversation, you need to delete the existing conversation. I'd love it if we could get separate conversations in a future update; it would make this app so much easier to use.

Overall, I love it and do not regret buying it at all. I can't wait to see what future updates bring :)

App Privacy

The developer, Numen Technologies Limited, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy.

Data Not Collected

The developer does not collect any data from this app.

Privacy practices may vary, for example, based on the features you use or your age.

Supports

  • Family Sharing

    Up to six family members can use this app with Family Sharing enabled.

You Might Also Like

MLC Chat
Productivity
Local Brain
Productivity
PocketGPT: Private AI
Productivity
YourChat
Productivity
PocketPal AI
Productivity
Pal Chat - AI Chat Client
Productivity