by jegly
Private on-device AI suite for Android. Fork of Google AI Edge Gallery with llama.cpp, whisper.cpp, stable-diffusion.cpp, GGUF import, voice chat, vision AI, on-device image generation, biometric lock, encrypted history, and NPU/TPU/GPU acceleration.
# Add to your Claude Code skills
git clone https://github.com/jegly/Box
If this project helped you, please ⭐ star it to help others find it.
Note: If you're using a custom ROM (LineageOS, GrapheneOS, CalyxOS), please use the custom-rom-support-v1.0.3 release instead.
Box is a security-hardened fork of Google AI Edge Gallery with on-device image generation, voice mode (speech-to-speech AI chat), voice input, document analysis, vision AI, biometric lock, encrypted chat history, llama.cpp support, and GGUF model import.
Box is an independent community fork of Google AI Edge Gallery and is not affiliated with or endorsed by Google LLC. Google branding has been replaced throughout. All credit for the underlying platform goes to Google and the original contributors; this fork simply builds on top of their work.
Built OfflineLLM first: a privacy-first Android chat app with a llama.cpp backend.
This project (Box) forks Google's AI Edge Gallery to create a feature-rich hybrid LiteRT / llama.cpp experience.
→ Try the OfflineLLM app for pure llama.cpp on-device chat.
Box is an Android app for running AI entirely on-device: chat, voice mode, image generation, speech-to-text, document analysis, and vision, all without a network connection. It inherits the full feature set of the upstream Google AI Edge Gallery and layers on top: encrypted conversations, biometric lock, hard offline mode, and three additional native inference engines (llama.cpp, stable-diffusion.cpp, whisper.cpp) alongside LiteRT.
What makes Box unique? You can sit at your desk, tap two buttons, and have a real flowing voice conversation with an AI: no wake word, no account, no server, no subscription. It listens, thinks, and speaks back sentence by sentence before it's even finished generating. Point the camera at something and ask about it out loud. The AI sees it and answers. All of it runs on the phone in your hand, completely offline, faster than you'd expect.
A separate branch and APK release, custom-rom-support, are now available for users on third-party Android operating systems, including but not limited to LineageOS, GrapheneOS, and CalyxOS. If you are using a custom ROM, please use the custom-rom-support branch/release instead of main; expect broken features if you run the main release or branch on a custom ROM. The custom-rom-support branch supports TPU/NPU acceleration on Tensor devices, but Snapdragon acceleration remains untested.

Note: The primary reason for these limitations is that third-party operating systems typically lack AICore and system-level Text-to-Speech (TTS) components. As a result, features such as voice-to-voice mode and NPU/GPU acceleration are either unavailable or significantly impaired on these ROMs.
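One quick way to tell which release you need is to check whether AICore is installed on the device. This is a minimal sketch, not part of Box itself; it assumes the AICore package name `com.google.android.aicore` and a device connected via adb:

```shell
# Reads `pm list packages` output on stdin and reports whether the
# Google AICore package (assumed name) is present. Custom ROMs
# typically ship without it.
check_aicore() {
  if grep -q 'package:com.google.android.aicore'; then
    echo "AICore present"
  else
    echo "AICore missing: use the custom-rom-support release"
  fi
}

# Usage against a real device:
#   adb shell pm list packages | check_aicore
```

On a stock Pixel this should report AICore as present; on most custom ROMs it will suggest the custom-rom-support release.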
Box is a fork of Google AI Edge Gallery. The upstream project is excellent; Box just layers on additional capabilities:
| Area | What Box adds |
|---|---|
| Inference engines | llama.cpp (GGUF LLMs), stable-diffusion.cpp (image gen), whisper.cpp (STT) alongside LiteRT |
| Model import | Import any local GGUF file, not limited to the curated download list |
| NPU / TPU | All Snapdragon / Tensor / MediaTek variants bundled in one APK (upstream ships per-SoC) |
| Voice mode / Vision mode | Free talk (continuous hands-free loop) and Vision talk (live camera + voice) |
| Image generation | On-device Stable Diffusion via GGUF |
| Speech-to-text | On-device Whisper STT |
| Document analysis | Attach text files directly in chat |
| Chat history | Persisted to a SQLCipher-encrypted Room database, resumable across sessions |
| Security | Biometric app lock, hard offline mode, prompt sanitisation, audit log |
| Agent skills | 20 built-in skills (upstream has 9) |
| Math rendering | LaTeX expressions rendered as Unicode in chat |
Multi-turn conversations with on-device LLMs. Import any GGUF model or download LiteRT models from the built-in list. Supports Thinking Mode on compatible models. Full markdown rendering with LaTeX math support: Greek letters, operators, fractions, and notation are rendered as Unicode symbols. Conversations are persisted and resumable.
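Before pushing a downloaded model to the device for import, it can help to verify the file really is GGUF: every GGUF file begins with the 4-byte ASCII magic `GGUF`. A minimal sanity-check sketch (the filenames below are hypothetical):

```shell
# Check the 4-byte GGUF magic at the start of a model file
# before importing it into Box.
check_gguf() {
  magic=$(head -c 4 "$1")
  if [ "$magic" = "GGUF" ]; then
    echo "OK: $1 looks like a GGUF model"
  else
    echo "SKIP: $1 is not a GGUF file"
  fi
}

# Example with a fake header, just to demonstrate the check:
printf 'GGUF' > /tmp/demo-model.gguf
check_gguf /tmp/demo-model.gguf
```

After the check passes, a real model can be copied to the phone (e.g. `adb push model.gguf /sdcard/Download/`) and imported from the app.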
Recommended models: We highly recommend Gemma 3n E2B or Gemma 3n E4B (LiteRT) as your primary models. They are the best-tested, support vision, voice, and documents, and run efficiently with GPU/NPU acceleration. Available to download directly in the app.