onLM Runs Gemma 4, Qwen on iPhone
We are all used to cloud-based AI models. If you value your privacy, you may want to explore local models. onLM is a handy app that lets you run Gemma 4, Qwen 3.5, Phi 4 Mini, and other models on your device. You don’t need to sign up for an account or rely on a server to use these models. The app lets you chat privately, transcribe audio, and summarize long recordings.

onLM supports Gemma 4, Gemma 3, Qwen 3.5, Llama 3, and Mistral 7B. It runs on your iPhone using Apple’s MLX framework. The app detects your iPhone’s hardware and recommends a model that it can run. You can also organize and rename chats.

