Gemma Chat: Gemma 4 Local Vibe Coding Agent for Macs Using Apple’s MLX Framework
Apple’s MLX framework shouldn’t need any introduction to AI enthusiasts. It is a machine learning framework built around Apple silicon’s unified memory, so the CPU and GPU share a single memory space and ML apps can run more efficiently without copying data back and forth. That means faster local inference and lower overhead when loading models. Developers are using it to do all kinds of fun things. Take Gemma Chat for instance: it is a local coding agent that runs on your Mac via Apple’s MLX, so you don’t need API keys or a cloud connection to work.
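To get a feel for that unified-memory model, here is a minimal sketch using the mlx Python package; the array sizes and the explicit CPU/GPU split are purely illustrative, not anything specific to Gemma Chat.

```python
import mlx.core as mx

# Arrays live in unified memory, so the same buffers are visible
# to both the CPU and the GPU with no copies in between.
a = mx.random.normal((2048, 2048))
b = mx.random.normal((2048, 2048))

# Run the matrix multiply on the GPU, then a follow-up reduction
# on the CPU, operating on the very same arrays.
c = mx.matmul(a, b, stream=mx.gpu)
d = mx.sum(c, stream=mx.cpu)

# MLX is lazy: the computation only runs when the result is needed.
mx.eval(d)
print(d.item())
```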

Gemma Chat is an Electron app built for vibe coding: you describe what you want in plain language, and the model writes the code. Under the hood it runs on MLX-LM, and you can switch between Gemma variants as needed. Once a model is downloaded, everything runs offline, and speech-to-text input is supported via Whisper. For Gemma 4 31B, however, you are going to need more than 32GB of RAM.
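The app wraps all of this in a chat UI, but the underlying loop is roughly what MLX-LM exposes in Python. Below is a minimal sketch of loading a quantized Gemma checkpoint and generating code locally; the Hugging Face repo name is a placeholder assumption, not necessarily the exact checkpoint Gemma Chat ships with.

```python
from mlx_lm import load, generate

# Load a quantized Gemma checkpoint from the Hugging Face Hub.
# The repo name is illustrative; pick whichever Gemma variant
# fits your Mac's RAM.
model, tokenizer = load("mlx-community/gemma-3-27b-it-4bit")

request = "Write a Python function that parses an ISO 8601 date string."

# Apply the chat template so the instruction-tuned model sees
# the prompt in the format it was trained on.
messages = [{"role": "user", "content": request}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generation runs entirely on-device; no API key or network needed.
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```

Smaller variants trade answer quality for lower memory use, which is why the choice of checkpoint matters on Macs with less RAM.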
[where to get it]