Android-MVVM-Architecture-Android-Voice-AI-SDK

Name: Android-MVVM-Architecture-Android-Voice-AI-SDK
Author: ahmedeltaher


Voice AI SDK is a reusable Android library that gives any app a full voice-driven AI conversation pipeline in minutes. Voice Assistant + Android Voice AI + SDK + MVVM + Kotlin

2,570 stars

613 forks

Kotlin

Installation

# Add to your Claude Code skills
git clone https://github.com/ahmedeltaher/Android-MVVM-Architecture-Android-Voice-AI-SDK

Security Report: Verified

Last scanned: 6/2/2026

{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-06-02T08:37:41.751Z",
  "npmAuditRan": true,
  "pipAuditRan": true
}

Frequently Asked Questions

What is Android-MVVM-Architecture-Android-Voice-AI-SDK?

Android-MVVM-Architecture-Android-Voice-AI-SDK is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by ahmedeltaher. Voice AI SDK is a reusable Android library that gives any app a full voice-driven AI conversation pipeline in minutes. Voice Assistant + Android Voice AI + SDK + MVVM + Kotlin. It has 2,570 GitHub stars.

Is Android-MVVM-Architecture-Android-Voice-AI-SDK safe to use?

Yes. Android-MVVM-Architecture-Android-Voice-AI-SDK passed SkillsLLM's automated security scan — a dependency vulnerability audit plus prompt-injection heuristics — with no high-severity issues. You can read the full report in the Security Report section on this page.

How do I install Android-MVVM-Architecture-Android-Voice-AI-SDK?

Clone the repository with "git clone https://github.com/ahmedeltaher/Android-MVVM-Architecture-Android-Voice-AI-SDK" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is Android-MVVM-Architecture-Android-Voice-AI-SDK written in?

Android-MVVM-Architecture-Android-Voice-AI-SDK is primarily written in Kotlin. It is open-source under ahmedeltaher on GitHub, so you can review or fork the full source.

Android Voice AI SDK in Model-View-ViewModel (i.e., MVVM)

flowchart LR
    Microphone --> AudioRecord --> VAD --> STT --> ClaudeAI["Claude AI"] --> TTS --> Speaker

The Android Voice AI SDK is a reusable Android library that gives any app a full voice-driven AI conversation pipeline in minutes. It captures audio from the device microphone, transcribes speech to text, sends the transcript to Anthropic Claude for an intelligent response, and speaks the reply back to the user through text-to-speech — all wired together with a single VoiceAISDK.Builder call. The SDK ships ready-to-drop-in Jetpack Compose UI components, swappable STT/TTS engine adapters, on-device emotion detection, and security utilities including PII redaction and encrypted key storage.
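The flow described above (audio in, transcript, AI reply, speech out) can be pictured as three swappable stages chained together. The following is only a minimal plain-Kotlin sketch of that idea: `Transcriber`, `Responder`, `Speaker`, and `VoiceTurnPipeline` are hypothetical names invented for illustration, not the SDK's actual interfaces, and the stages are fakes that never touch the microphone or the network.

```kotlin
// Hypothetical stage interfaces for illustration; the SDK's real signatures may differ.
fun interface Transcriber { fun transcribe(pcm: ShortArray): String }
fun interface Responder { fun reply(transcript: String): String }
fun interface Speaker { fun speak(text: String) }

/** Runs one conversational turn: audio in, transcript, AI reply, speech out. */
class VoiceTurnPipeline(
    private val stt: Transcriber,
    private val ai: Responder,
    private val tts: Speaker,
) {
    fun runTurn(pcm: ShortArray): String {
        val transcript = stt.transcribe(pcm) // speech-to-text stage
        val reply = ai.reply(transcript)     // language-model stage
        tts.speak(reply)                     // text-to-speech stage
        return reply
    }
}

fun main() {
    val spoken = mutableListOf<String>()
    val pipeline = VoiceTurnPipeline(
        stt = Transcriber { "hello" },                // fake STT: always hears "hello"
        ai = Responder { text -> "You said: $text" }, // fake AI: echoes the transcript
        tts = Speaker { text -> spoken += text },     // fake TTS: records what it would say
    )
    println(pipeline.runTurn(ShortArray(160))) // prints "You said: hello"
}
```

Because each stage is an interface, a different engine can be substituted without changing the orchestration code, which is the property the swappable STT/TTS adapters rely on.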

Features

Layer	Capability
Audio Input	Voice Activity Detection (VAD), noise handling, streaming PCM capture
Recognition	Speech-to-Text (STT), language detection, speaker diarization
Understanding	Intent extraction, entity recognition, conversation context
Action	API orchestration, workflow execution, task automation
Response	LLM answer generation (Anthropic Claude)
Voice Output	Text-to-Speech (TTS), voice style selection, audio streaming
Safety	User consent, authentication, abuse prevention
Analytics	Conversation logs, session summaries, quality metrics
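The Audio Input row lists voice activity detection over streaming PCM. The README does not say which VAD algorithm the SDK uses, so the sketch below shows one common and very simple approach, an RMS energy threshold over a frame of 16-bit samples. The function names and the default threshold of 500 are assumptions chosen for illustration.

```kotlin
import kotlin.math.sqrt

/** Root-mean-square amplitude of one frame of 16-bit PCM samples. */
fun rms(frame: ShortArray): Double {
    if (frame.isEmpty()) return 0.0
    var sumOfSquares = 0.0
    for (sample in frame) sumOfSquares += sample.toDouble() * sample.toDouble()
    return sqrt(sumOfSquares / frame.size)
}

/** Energy-threshold VAD: a frame counts as speech when its RMS exceeds the threshold. */
fun isSpeech(frame: ShortArray, threshold: Double = 500.0): Boolean =
    rms(frame) > threshold

fun main() {
    val silence = ShortArray(320) // all zeros
    val loud = ShortArray(320) { i -> (if (i % 2 == 0) 4000 else -4000).toShort() }
    println(isSpeech(silence)) // prints "false"
    println(isSpeech(loud))    // prints "true"
}
```

A production detector would typically smooth decisions over several frames and adapt the threshold to background noise; this sketch only illustrates the per-frame decision.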

Requirements

Requirement	Version
Android Studio	Meerkat or newer
Minimum SDK	24 (Android 7.0)
Kotlin	2.0+ (project uses 2.3.21)
Anthropic API key	Required — obtain at console.anthropic.com

Quick Start

Step 1 — Add the dependency and manifest permissions

In your app build.gradle.kts:

dependencies {
    implementation("com.sdk:voice-ai-sdk:1.0.0")
}

In app/src/main/AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />

Step 2 — Hilt setup

Annotate your Application class with @HiltAndroidApp and your Activity with @AndroidEntryPoint:

@HiltAndroidApp
class MyApp : Application()

@AndroidEntryPoint
class MainActivity : ComponentActivity() { /* ... */ }

Step 3 — Add your API key to local.properties

local.properties is git-ignored, so your key never ends up in source control:

ANTHROPIC_API_KEY=sk-ant-...

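Then expose it via BuildConfig in app/build.gradle.kts. Note that project.findProperty() reads gradle.properties and -P flags, not local.properties, so load the file explicitly:

// Load local.properties explicitly; Gradle does not expose it as project properties.
val localProperties = java.util.Properties().apply {
    val file = rootProject.file("local.properties")
    if (file.exists()) file.inputStream().use { load(it) }
}

defaultConfig {
    buildConfigField(
        "String",
        "ANTHROPIC_API_KEY",
        "\"${localProperties.getProperty("ANTHROPIC_API_KEY", "")}\"",
    )
}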

buildFeatures {
    buildConfig = true
}

Step 4 — Build the SDK

Provide the SDK through Hilt by creating an AppModule:

@Module
@InstallIn(SingletonComponent::class)
object AppModule {

    @Provides
    @Singleton
    fun provideVoiceAIConfig(): VoiceAIConfig =
        VoiceAIConfig(anthropicApiKey = BuildConfig.ANTHROPIC_API_KEY)

    @Provides
    @Singleton
    fun provideVoiceAISDK(
        @ApplicationContext context: Context,
        config: VoiceAIConfig,
    ): VoiceAISDK = VoiceAISDK.Builder(context)
        .anthropicApiKey(config.anthropicApiKey)
        .debugLogging(BuildConfig.DEBUG)
        .build()
}

Or construct the SDK directly without Hilt:

val sdk = VoiceAISDK.Builder(context)
    .anthropicApiKey(BuildConfig.ANTHROPIC_API_KEY)
    .debugLogging(true)
    .config { copy(systemPrompt = "You are a concise voice assistant.") }
    .build()

val session: VoiceAISession = sdk.createSession()
session.start()
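The `.config { copy(systemPrompt = ...) }` call above suggests that the builder accepts a lambda whose receiver is the current config, so that the data class `copy` function can be called unqualified. The sketch below shows how such a builder could be written in plain Kotlin. `SketchConfig` and `SketchBuilder` are hypothetical stand-ins, not the SDK's classes, and the real `VoiceAIConfig` has more fields.

```kotlin
// Hypothetical stand-in for VoiceAIConfig with only a few fields.
data class SketchConfig(
    val apiKey: String = "",
    val systemPrompt: String? = null,
    val silenceTimeoutMs: Long = 1200,
)

// Hypothetical stand-in for VoiceAISDK.Builder.
class SketchBuilder {
    private var current = SketchConfig()

    fun apiKey(key: String) = apply { current = current.copy(apiKey = key) }

    // The lambda runs with the current config as its receiver, so `copy(...)`
    // inside it resolves to the data class's generated copy function.
    fun config(block: SketchConfig.() -> SketchConfig) = apply {
        current = current.block()
    }

    fun build(): SketchConfig = current
}

fun main() {
    val cfg = SketchBuilder()
        .apiKey("sk-ant-example")
        .config { copy(systemPrompt = "You are a concise voice assistant.") }
        .build()
    println(cfg.systemPrompt)     // prints "You are a concise voice assistant."
    println(cfg.silenceTimeoutMs) // prints "1200" (default left untouched)
}
```

One benefit of this shape is that fields not mentioned in the `copy` call keep their defaults, so callers only state what they want to change.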

Step 5 — Add the VoiceScreen composable

Use VoiceSessionPermissionGate to handle the RECORD_AUDIO runtime permission automatically, then place VoiceButton and ConversationView inside:

@Composable
fun VoiceScreen(viewModel: VoiceViewModel = hiltViewModel()) {
    VoiceSessionPermissionGate(
        rationale = "Microphone access is required for voice conversations.",
    ) {
        Column(
            modifier = Modifier
                .fillMaxSize()
                .padding(16.dp),
            verticalArrangement = Arrangement.SpaceBetween,
        ) {
            ConversationView(
                messages = viewModel.messages.collectAsStateWithLifecycle().value,
                modifier = Modifier.weight(1f),
            )
            VoiceButton(
                session = viewModel.session,
                modifier = Modifier.align(Alignment.CenterHorizontally),
            )
        }
    }
}

Architecture

The SDK is organised into six layers, each with a single responsibility:

Layer	Package	Responsibility
Audio	`audio/`	Raw PCM capture via `AudioRecord`, voice activity detection (VAD), audio level metering, and PCM-to-WAV conversion
STT	`stt/`	`SpeechToTextEngine` interface with a drop-in Android built-in implementation; plug in Whisper or any other engine
AI	`ai/`	`AIEngine` interface backed by `ClaudeAIEngine`, which wraps the official Anthropic Java SDK and maintains conversation history
TTS	`tts/`	`TextToSpeechEngine` interface with a drop-in Android built-in implementation; plug in ElevenLabs for premium voices
Session	`VoiceAISession`	Orchestrates the full pipeline — audio in, transcript out, AI reply, speech out — as a single coroutine-based lifecycle
UI	`ui/`	Ready-to-use Jetpack Compose components: `VoiceButton`, `ConversationView`, `VoiceSessionPermissionGate`, `WaveformVisualizer`, `LiveCaptionBanner`, `VoiceStatusIndicator`
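The Audio layer mentions PCM-to-WAV conversion. As an illustration of what that step involves, the sketch below prepends a canonical 44-byte RIFF/WAVE header to raw 16-bit little-endian PCM. The function name `pcmToWav` and the default of 16 kHz mono are assumptions made for this example; the SDK's own helper and capture settings may differ.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

/** Wraps raw 16-bit little-endian PCM bytes in a canonical 44-byte RIFF/WAVE header. */
fun pcmToWav(pcm: ByteArray, sampleRate: Int = 16_000, channels: Int = 1): ByteArray {
    val bitsPerSample = 16
    val blockAlign = channels * bitsPerSample / 8 // bytes per sample frame
    val byteRate = sampleRate * blockAlign        // bytes per second of audio
    val header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN).apply {
        put("RIFF".toByteArray(Charsets.US_ASCII))
        putInt(36 + pcm.size)                     // size of everything after this field
        put("WAVE".toByteArray(Charsets.US_ASCII))
        put("fmt ".toByteArray(Charsets.US_ASCII))
        putInt(16)                                // fmt chunk length for linear PCM
        putShort(1.toShort())                     // audio format 1 = linear PCM
        putShort(channels.toShort())
        putInt(sampleRate)
        putInt(byteRate)
        putShort(blockAlign.toShort())
        putShort(bitsPerSample.toShort())
        put("data".toByteArray(Charsets.US_ASCII))
        putInt(pcm.size)                          // length of the sample data
    }.array()
    return header + pcm
}

fun main() {
    val wav = pcmToWav(ByteArray(3200)) // 100 ms of silence at 16 kHz mono
    println(wav.size)                   // prints "3244"
}
```

A header like this is what lets a REST-based speech-to-text engine accept the captured audio as an ordinary WAV upload.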

Available Engines

Category	Engine	Class	Notes
STT	Android built-in	`AndroidSttEngine`	Default; free; uses `android.speech.SpeechRecognizer`; requires network
STT	OpenAI Whisper	`WhisperSttEngine`	Higher accuracy; POSTs PCM/WAV to OpenAI REST API; requires OpenAI key
AI	Anthropic Claude	`ClaudeAIEngine`	Default and only AI engine; uses `com.anthropic:anthropic-java`; model is configurable
TTS	Android built-in	`AndroidTtsEngine`	Default; free; uses `android.speech.tts.TextToSpeech`
TTS	ElevenLabs	`ElevenLabsTtsEngine`	High-quality natural voices; POSTs to ElevenLabs REST API; requires ElevenLabs key
Emotion	On-device	built-in	Lightweight on-device audio feature analysis; no external key required
Emotion	Hume AI	`HumeEmotionDetector`	Cloud-based; high accuracy across 7 emotions; requires Hume API key

Configuration Reference

All options are fields on VoiceAIConfig. Pass a config { } block to VoiceAISDK.Builder to override defaults.

Field	Type	Default	Description
`anthropicApiKey`	`String`	—	Required. Your Anthropic API key. Never hardcode; read from `BuildConfig` or encrypted storage.
`aiModel`	`String`	`"claude-3-5-sonnet-20241022"`	Claude model ID used for all AI turns.
`systemPrompt`	`String?`	`"You are a helpful voice assistant…"`	System instruction prepended to every conversation.
`inputMode`	`InputMode`	`HANDS_FREE`	`HANDS_FREE` activates VAD; `PUSH_TO_TALK` records only while button is held.
`locale`	`Locale`	`Locale.getDefault()`	Locale passed to the STT engine for language hints.
`silenceTimeoutMs`	`Long`	`1200`	Milliseconds of silence after speech before the STT turn is finalised.
`maxHistoryTurns`	`Int`	`20`	Maximum number of conversation turns kept in the Claude context window.
`piiRedaction`	`Boolean`	`false`	When `true`, strips phone numbers, emails, and credit-card numbers from transcripts before sending to the AI.
`emotionDete