Executive Overview
The Jarvis paradigm — an always-available, context-aware, voice-enabled personal AI — is no longer science fiction. Claude Code, combined with its MCP server ecosystem, headless mode, and remote control capabilities, provides every building block needed to construct a personal assistant that rivals the fictional original. This report synthesizes research across six domains: remote mobile access, voice interaction, messaging bridges, home automation, daemon orchestration, and the open-source tooling landscape.
- No coding required — entire system built through natural language conversation
- Built incrementally over weeks: task tracking first, then email, then daily planning
- Key insight: "Claude Code is an AI agent with access to your computer" — not just a chatbot
- Uses MCP servers for Gmail and Calendar integration
- Manages fitness/health/relationship goals with weekly reviews
- Maintains context across multiple client projects and follows up with contacts
- Architecture: Immutable daemon (body) vs mutable instructions (brain)
- telegram_daemon.py: Telethon client + event router + socket server (92 JSON-RPC methods)
- Memory: Hot memory via memory.md + long-term storage in Obsidian vault synced with git
- Self-deploy: Agent can modify its own CLAUDE.local.md, then trigger Coolify redeploy
- Security: Access control matrix based on chat_id, not message content (prompt injection defense)
- Scheduling: CRONTAB.md parsed every 60 seconds for scheduled tasks
- Cost: Hetzner VPS ~$5/mo + Claude API usage $20-50/mo
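The access-control bullet above can be sketched as follows. This is a minimal illustration, assuming a static in-memory matrix; the `Permission` levels, the chat IDs, and the `is_allowed` helper are all hypothetical names and do not come from the original daemon. The point it shows is that the decision depends only on `chat_id`, so nothing written inside a message can raise the sender's privileges.

```python
from dataclasses import dataclass
from enum import Enum


class Permission(Enum):
    READ = "read"    # may ask questions
    TOOLS = "tools"  # may trigger tool use
    ADMIN = "admin"  # may edit instructions and redeploy


# Hypothetical matrix: chat_id -> permissions granted to that chat.
ACCESS_MATRIX: dict[int, frozenset[Permission]] = {
    111111111: frozenset({Permission.READ, Permission.TOOLS, Permission.ADMIN}),
    222222222: frozenset({Permission.READ}),
}


@dataclass(frozen=True)
class IncomingMessage:
    chat_id: int
    text: str


def is_allowed(msg: IncomingMessage, needed: Permission) -> bool:
    """Decide from the sender's chat_id only; msg.text is never inspected."""
    return needed in ACCESS_MATRIX.get(msg.chat_id, frozenset())
```

Unknown chats fall through to an empty permission set, so the default is to deny.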
Remote Access from Mobile
Five distinct pathways exist for accessing Claude Code from a phone or tablet — from Anthropic's official remote control to raw SSH tunnels. Each trades off convenience against control.
Run `/rc` to get a QR code. Scan with the Claude app (iOS/Android) for full session control. No inbound ports are opened; all traffic goes via the Anthropic API over TLS.
- Available on Pro, Max, Team, Enterprise plans
- Server mode: `claude remote-control` stays running and waits for connections
- Interactive mode: `claude --remote-control` gives a full local + remote session
- `--spawn worktree`: each session gets its own git worktree
- `--capacity <N>`: up to 32 concurrent sessions
- Auto-reconnects after network drops (up to ~10 minutes)
- Limitation: Can't START sessions from mobile, only continue existing ones
    # Start remote control server on desktop
    claude remote-control

    # Or with worktree isolation + multi-session
    claude remote-control --spawn worktree --capacity 8

    # Interactive: local + remote simultaneously
    claude --remote-control
- Setup time: approximately 20 minutes
- Use `mosh` instead of SSH for connection resilience across network hops
- Pain points: SSH drops on phone sleep/network transitions, tiny screen
- Tailscale creates a private mesh network — no port forwarding needed
- tmux/zellij keeps sessions alive when terminal disconnects
    # Install Tailscale on both devices
    curl -fsSL https://tailscale.com/install.sh | sh
    tailscale up

    # On desktop: start tmux with Claude Code
    tmux new -s jarvis
    claude

    # From phone (Termux/Blink): connect
    mosh user@desktop-tailscale-ip -- tmux attach -t jarvis
- Android-specific approach using Termux terminal emulator
- Tailscale handles NAT traversal automatically
- tmux persistence means you can switch apps and reconnect
- Lower overhead than web terminal solutions
- ttyd: Lightweight, shares terminal as web app: `ttyd bash`
- WeTTY: Node.js-based, uses xterm.js, supports authentication
- GoTTY: Go-based, single binary, easy deployment
- Claude Code Skill for WSL2 Remote Web Terminal available on MCP Market
- Best paired with reverse proxy (nginx/caddy) + TLS for security
    # Install ttyd
    brew install ttyd   # macOS
    apt install ttyd    # Ubuntu/Debian

    # Start web terminal with Claude Code
    ttyd -p 7681 -c user:pass bash -c "claude"

    # Access at http://localhost:7681
    # Use Tailscale or ngrok for remote access
- Mobile IDE for Claude Code (iOS): CloudKit sync between iPhone and Mac
- Railway template: One-click `claude-code-ssh` cloud deployment
- The Vibe Companion: Uses hidden `-sdk-url` flag, open-source web UI
- Takopi: Routes Claude Code through Telegram for mobile interaction
| Method | Setup Time | Platform | Latency | Security | Reliability |
|---|---|---|---|---|---|
| Official Remote Control | 2 min | iOS, Android, Web | Low | TLS via Anthropic | Excellent |
| SSH + Tailscale + tmux | ~20 min | Any terminal | Low-Med | WireGuard | Good |
| Termux + Tailscale | ~15 min | Android | Low | WireGuard | Good |
| Web Terminal (ttyd) | ~10 min | Any browser | Medium | Depends on proxy | Fair |
| Third-Party Apps | Varies | iOS / Web | Varies | Varies | Varies |
Voice Interaction Layer
The voice layer transforms Claude Code from a terminal tool into something that speaks. Five approaches exist, ranging from simple TTS overlays to full wake-word-activated conversational pipelines running entirely on-device.
- Processes Claude Code "Stop" hooks to capture assistant messages
- Summarizes with LLMs to create 1-3 sentence Jarvis-style updates
- Speaks through `lspeak` with first-person JARVIS personality
- Configurable per-project via TOML with mode-based behavior
- Low-friction: add to existing Claude Code workflow as a hook
    git clone https://github.com/nickpending/clarvis
    cd clarvis
    bun install
    # Configure as Claude Code stop hook
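To make the hook flow above concrete, here is a rough sketch of the first two steps: pulling the last assistant message out of a session transcript and shortening it for speech. It is not Clarvis's code. It assumes the transcript is JSONL with entries shaped like `{"type": "assistant", "message": {"content": [{"type": "text", "text": "..."}]}}`, which should be checked against a real transcript, and `spoken_update` is a crude stand-in for the LLM summarization step.

```python
import json
import re
from typing import Iterable, Optional


def last_assistant_text(jsonl_lines: Iterable[str]) -> Optional[str]:
    """Return the text of the last assistant entry, or None if there is none.

    The entry shape is an assumption (see the lead-in), not a documented schema.
    """
    latest = None
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than failing the hook
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message") or {}
        content = message.get("content") or []
        texts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        if texts:
            latest = "\n".join(texts)
    return latest


def spoken_update(text: str, max_sentences: int = 2) -> str:
    """Stand-in for the LLM summary: keep only the first few sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])
```

A real hook script would read the hook payload from stdin, open the transcript file it points to, and pass the result of `spoken_update` to a TTS command.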
- Always listening with wake word activation
- Natural conversation with automatic pause detection
- Voice commands for interrupt, skip, sleep
- Cross-platform: macOS, Linux, Windows
- Privacy-first: all processing happens locally
    pip install samantha-voice-assistant
    # Add as MCP server to Claude Code
    claude mcp add samantha -- samantha
- Wake Word: Porcupine (custom "Hey Claude", "Hey Jarvis", etc.)
- STT: Cheetah Streaming Speech-to-Text (real-time)
- TTS: Orca Streaming Text-to-Speech
- All runs locally — no cloud dependency for the voice pipeline
- Only the LLM call goes to Anthropic API
    pip install pvporcupine pvcheetah pvorca pvrecorder pvspeaker anthropic
    # Pipeline: Wake Word → STT → Claude API → TTS
    # See picovoice.ai for full implementation guide
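The Wake Word → STT → Claude API → TTS ordering can be sketched without committing to any vendor API. In the sketch below every stage is an injected callable, so the four fields are placeholders for Porcupine, Cheetah, the Anthropic client, and Orca respectively. `VoicePipeline` and its field names are illustrative, not part of the Picovoice SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    wait_for_wake_word: Callable[[], None]  # blocks until the wake word fires
    transcribe: Callable[[], str]           # records and returns one utterance
    ask_llm: Callable[[str], str]           # sends the text to the model
    speak: Callable[[str], None]            # plays the reply aloud

    def run_once(self) -> str:
        """Handle one wake-word activation and return the spoken reply."""
        self.wait_for_wake_word()
        heard = self.transcribe().strip()
        if not heard:
            return ""  # nothing intelligible was said; skip the LLM call
        reply = self.ask_llm(heard)
        self.speak(reply)
        return reply
```

Keeping the stages as plain callables means the local engines can be swapped (for example Whisper for Cheetah) without touching the loop, and the loop can be exercised with stubs before any audio hardware is involved.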
An `npx` install gives the assistant a voice. Good for quick experimentation.
- Lowest barrier to entry for voice output
- `npx` install, no complex configuration
- TTS only (no STT/wake word)
- Good starting point before graduating to Samantha or Picovoice
- Whisper MLX: Apple Silicon optimized speech-to-text
- Kokoro TTS: High-quality local text-to-speech
- Pipeline: Whisper STT → Claude API → Kokoro TTS
- Known issue: Whisper accuracy degrades with accented speech
- More manual setup than Picovoice/Samantha but fully customizable
- Atom Echo: ~$13 hardware for dedicated voice input
- Custom wake words trainable via HA
- Streams TTS output to existing Sonos/Google/Alexa speakers
- Continuous conversation mode — no repeated wake words
- Requires Home Assistant instance with Claude integration
Messaging Bots & Channels
Messaging apps provide the most natural "Jarvis" interface — you text your AI the same way you text a person. Claude Code now has native Telegram and Discord support, plus community bridges for WhatsApp, Signal, Slack, iMessage, and Teams.
Create a bot via BotFather, run `/telegram:configure`, and text your AI agent like a person.
- First-party support, the most reliable integration path
- Full Claude Code session accessible from the messaging app
- Inherits all MCP server capabilities
- Setup: create Telegram bot via BotFather, then run `/telegram:configure`
    # 1. Create bot via Telegram @BotFather
    # 2. Get bot token
    # 3. In Claude Code: /telegram:configure
    # 4. Paste token when prompted
    # 5. Text your bot from any device
- 8 channels: WhatsApp, Telegram, Slack, Discord, Signal, iMessage, MS Teams, WebChat
- Live Canvas rendering for rich visual responses
- Multi-agent routing per channel — different behaviors per platform
- Gateway daemon runs via launchd (macOS) or systemd (Linux)
- Voice: can speak and listen on macOS/iOS/Android
- Operating cost: approximately $6/month
- Claude Code runs in response to Telegram messages
- Full tool use — file operations, git, terminal commands via chat
- Home automation integration via REST API calls
- Translation, research, anything accessible via command line
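A community-style bridge of the kind described in the last list reduces to a small handler: check who sent the message, run Claude Code headlessly with the text as the prompt, and send the output back. The sketch below assumes the `claude` CLI is on the PATH. `handle_message`, `run_claude`, and the allow-list parameter are hypothetical names. The reply is split because Telegram limits a text message to 4096 characters.

```python
import subprocess
from typing import Callable

TELEGRAM_LIMIT = 4096  # maximum characters in one Telegram text message


def run_claude(prompt: str) -> str:
    """Run Claude Code in headless mode and return whatever it printed."""
    done = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        timeout=600,
    )
    return done.stdout.strip() or done.stderr.strip()


def handle_message(
    chat_id: int,
    text: str,
    allowed_chats: set[int],
    runner: Callable[[str], str] = run_claude,
) -> list[str]:
    """Return the reply chunks to send back, or [] for unknown senders."""
    if chat_id not in allowed_chats:
        return []
    reply = runner(text) or "(no output)"
    return [
        reply[i : i + TELEGRAM_LIMIT]
        for i in range(0, len(reply), TELEGRAM_LIMIT)
    ]
```

The `runner` parameter exists so the handler can be tested without invoking the CLI. A real bridge would call this from its Telegram client's message callback.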
Home Automation Integration
Home Assistant is the clear winner for smart home integration with Claude. The official MCP server exposes 70+ tools — lights, thermostats, switches, automations — all accessible via natural language through Claude Code.
- Official integration at home-assistant.io/integrations/mcp_server/
- 70+ tools covering all HA entities and automations
- Claude Code supports remote MCP servers natively — no proxy needed
- REST API + WebSocket for real-time state updates
- Control lights, thermostats, locks, covers, media players, and more
- Alternative to official integration with community maintenance
- May have features ahead of the official server
- Same 70+ tools covering HA entities
- HA addon — installs directly within Home Assistant
- Multi-provider AI support (not locked to Claude)
- Create and debug automations via natural language
- Config file management and debugging
- SSH access: run commands directly on HA host
- SSHFS mount: edit config files as local filesystem
- Custom MCP: HA API access for entity control
- Full "vibe-coding" workflow for smart home configuration
Headless & Daemon Mode
The key to an always-on Jarvis is daemon operation — Claude Code running headless, responding to events rather than interactive prompts. Headless mode (--print) is production-ready. A combined remote-control + headless mode is the missing piece for true cloud-native agents.
- Output formats: `text`, `json`, `stream-json`
- Tool control: `--allowedTools`, `--disallowedTools`, `--tools`
- Model/limits: `--model`, `--max-turns`, `--max-budget-usd`
- MCP support: `--mcp-config`
- Perfect for CI pipelines, cron jobs, event-driven wrappers
    # Simple headless call
    claude -p "Summarize today's unread emails"

    # JSON output with tool restrictions
    claude -p "Check server status" \
      --output-format json \
      --allowedTools "Bash,Read" \
      --max-turns 5

    # With MCP servers and budget cap
    claude -p "Turn off all lights" \
      --mcp-config ~/.mcp/ha.json \
      --max-budget-usd 0.50
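For an event-driven wrapper, the same calls can be issued from Python. The sketch below only uses flags that appear in the commands above. It assumes that `--output-format json` prints a single JSON object with a top-level `result` string, which is worth confirming against your installed version. `build_headless_command`, `extract_result`, and `ask` are illustrative helper names.

```python
import json
import subprocess
from typing import Optional, Sequence


def build_headless_command(
    prompt: str,
    allowed_tools: Sequence[str] = (),
    max_turns: Optional[int] = None,
    mcp_config: Optional[str] = None,
) -> list[str]:
    """Assemble the argv for one headless call with JSON output."""
    cmd = ["claude", "-p", prompt, "--output-format", "json"]
    if allowed_tools:
        cmd += ["--allowedTools", ",".join(allowed_tools)]
    if max_turns is not None:
        cmd += ["--max-turns", str(max_turns)]
    if mcp_config:
        cmd += ["--mcp-config", mcp_config]
    return cmd


def extract_result(stdout: str) -> str:
    """Pull the answer text out of the JSON payload (assumed "result" key)."""
    payload = json.loads(stdout)
    return payload.get("result", "")


def ask(prompt: str, **options) -> str:
    """Run the command and return the answer; raises if the CLI exits non-zero."""
    done = subprocess.run(
        build_headless_command(prompt, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return extract_result(done.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the prompt comes from an untrusted channel such as a chat message.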
- Currently remote control requires interactive mode on the host
- Headless mode cannot accept remote control connections
- Combining them would enable: cloud VPS → headless daemon → phone control
- This is the most-requested feature for always-on agents
- Status: open feature request, no implementation timeline
- Chyros: Persistent, always-on background agent — leaked project name
- Conway: Unreleased always-on background AI — likely internal codename
- DAEMON mode: Persistent background process — natural evolution of headless mode
- All sourced from leaks/reports — no official confirmation
- Suggests Anthropic is actively working on this problem space
- Background processes persist beyond individual prompts
- Subagents enable parallel development workflows
- Hooks system for automatic pre/post actions
- Anthropic is explicitly investing in autonomous agent capabilities
MCP Server Ecosystem
MCP (Model Context Protocol) servers are the "hands and feet" of a Jarvis system. They extend Claude Code with structured access to external services — voice I/O, smart home control, browser automation, and system management.
- 30+ languages for speech recognition
- Remote access from phone/tablet via browser
- Optional Whisper streaming for low-latency transcripts
- One-command setup
npx @shantur/jarvis-mcp --install-claude-code-config --local
- Navigate to URLs, click elements, fill forms
- Screenshot pages for visual verification
- Evaluate JavaScript in page context
- Handle dialogs, file uploads, network request interception
- Built into Claude Code MCP ecosystem — no additional setup
- 70+ tools for smart home entity control
- REST API + WebSocket for real-time state
- Automation creation and management
- Scene activation, script execution
- File management: read, write, search, organize
- Git operations: commit, branch, merge, review
- Terminal commands: execute with output capture
- Docker: container management, logs, deployment
Architecture Patterns
Six reference architectures, ordered by complexity. Each is a viable path to a Jarvis-like system. Most real deployments combine elements from multiple patterns.
[Architecture diagrams for the six patterns; only the node labels survive.]
- Pattern 1: Claude Code; Termux / Blink; Session
- Pattern 2: Claude Code; Claude App. QR code scan initiates connection. No inbound ports. Auto-reconnect on network drops.
- Pattern 3: Telegram; Claude Code -p; Telegram
- Pattern 4: Porcupine / Vosk; Whisper; Kokoro / Orca
- Pattern 5 (Always-On Server Agent): telegram_daemon.py; CLI -p; Scheduled tasks; Telegram, HA, Coolify; Hot + Long-term
- Pattern 6: Whisper STT; + MCP Servers; TTS; Lights, Thermostat, Locks; Messaging Bridge; Playwright Automation; Git, Docker, Terminal. All patterns converge here. This is the endgame. Each MCP server is independently deployable.
Open-Source Landscape
| Project | Stars | Description | Category | Status |
|---|---|---|---|---|
| ClawdBot / OpenClaw | High | Multi-channel personal AI assistant (8 platforms) | Messaging | Active |
| Jarvis MCP | 75 | Browser-based voice I/O for AI tools | Voice | Active |
| Clarvis | 7 | Jarvis-style voice notifications via hooks | Voice | Beta |
| Samantha | 4 | Voice assistant MCP with wake word detection | Voice | Beta |
| isair/jarvis | — | 100% private AI voice assistant | Voice | Experimental |
| AgentVibes | — | Simple TTS plugin for Claude Code | Voice | Active |
| ha-claude | — | Smart home AI assistant addon for HA | Home Auto | Beta |
| homeassistant-mcp | — | HA MCP server for Claude agents | Home Auto | Active |
| awesome-claude-code-toolkit | — | 135 agents, 35 skills, 42 commands | Meta | Active |
Quick Start Guides
Copy-paste commands to get each component running. Start with Path A (fastest), then layer on additional capabilities.
    # On your desktop/laptop
    claude remote-control
    # Scan QR code with Claude app
    # Done. Full session from phone.
    # 1. Open Telegram, message @BotFather
    # 2. Send: /newbot
    # 3. Follow prompts, get token
    # 4. In Claude Code: /telegram:configure
    #    Paste your bot token
    # Text the bot from any device
    git clone https://github.com/nickpending/clarvis
    cd clarvis && bun install

    # Or for simpler TTS:
    npx agent-vibes
    # Option 1: Samantha MCP (recommended)
    pip install samantha-voice-assistant
    claude mcp add samantha -- samantha

    # Option 2: Picovoice pipeline
    pip install pvporcupine pvcheetah pvorca \
      pvrecorder pvspeaker anthropic

    # Option 3: Browser-based (Jarvis MCP)
    npx @shantur/jarvis-mcp \
      --install-claude-code-config --local
    # In Home Assistant:
    #   Settings → Integrations → Add → MCP Server
    # In Claude Code MCP config:
    #   Add your HA instance URL + token
    #   See: home-assistant.io/integrations/mcp_server/
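The "add your HA instance URL + token" step could look roughly like the following. This sketch generates a config file suitable for `--mcp-config`. It assumes Home Assistant serves the MCP endpoint over SSE at `/mcp_server/sse` and that Claude Code accepts an `mcpServers` entry with `type`, `url`, and `headers` keys. Both details should be checked against the integration page and your Claude Code version. The server name `home-assistant` and the helper names are arbitrary.

```python
import json


def ha_mcp_config(base_url: str, token: str) -> dict:
    """Build an MCP config entry for a remote Home Assistant server.

    Assumptions: SSE transport at /mcp_server/sse, authenticated with a
    long-lived access token sent as a Bearer header.
    """
    return {
        "mcpServers": {
            "home-assistant": {
                "type": "sse",
                "url": base_url.rstrip("/") + "/mcp_server/sse",
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


def render_config(base_url: str, token: str) -> str:
    """Serialize the config as indented JSON, ready to write to disk."""
    return json.dumps(ha_mcp_config(base_url, token), indent=2)
```

Writing the output of `render_config` to a file such as `~/.mcp/ha.json` would match the path used in the headless example earlier. Keep that file out of version control, since it contains the token.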
    # 1. Provision Hetzner VPS (~$5/mo)
    # 2. Install Docker + Coolify
    # 3. Deploy telegram_daemon.py
    # 4. Configure:
    #    - CLAUDE.local.md (brain)
    #    - CRONTAB.md (schedule)
    #    - memory.md (hot memory)
    #    - MCP server configs
    # 5. Set access control matrix
    # See: okhlopkov.com/always-on-ai-agent-server-setup/
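The CRONTAB.md schedule in step 4 is the least obvious piece, so here is one way the "parse every 60 seconds" loop could decide what is due. The file format is not given in the source. This sketch invents one for illustration: a markdown bullet holding a five-field cron expression, a `|`, then the prompt. Only `*`, `*/n`, single numbers, and comma lists are handled. Ranges are not supported, and day-of-month and day-of-week are ANDed, which is simpler than standard cron.

```python
from datetime import datetime
from typing import NamedTuple


class Job(NamedTuple):
    schedule: tuple[str, ...]  # minute, hour, day, month, weekday (0 = Sunday)
    prompt: str


def parse_crontab_md(text: str) -> list[Job]:
    """Collect jobs from bullets of the assumed form '- <cron> | <prompt>'."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("- ") or "|" not in line:
            continue  # headings, prose, and blank lines are ignored
        spec, prompt = line[2:].split("|", 1)
        fields = tuple(spec.split())
        if len(fields) == 5:
            jobs.append(Job(fields, prompt.strip()))
    return jobs


def _field_matches(field: str, value: int) -> bool:
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def is_due(job: Job, now: datetime) -> bool:
    """True when every cron field matches the given minute."""
    values = (now.minute, now.hour, now.day, now.month, now.isoweekday() % 7)
    return all(_field_matches(f, v) for f, v in zip(job.schedule, values))
```

A daemon would re-read the file once a minute, call `is_due` for each job with the current time, and hand the matching prompts to the headless CLI. Re-parsing on every tick is what lets the agent edit its own schedule without a restart.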
Cost Analysis
- Claude Pro subscription ($20/mo)
- Official Remote Control (included)
- Telegram bot bridge (free)
- AgentVibes TTS (free, open-source)
- Local desktop only
- Claude Pro ($20/mo)
- Hetzner VPS ($5/mo)
- Telegram daemon (always-on)
- Samantha voice MCP (free)
- Home Assistant (free, self-hosted)
- Tailscale (free tier)
- Claude Max ($100/mo) or API ($20-50)
- Dedicated VPS ($10-20/mo)
- ClawdBot multi-channel ($6/mo)
- Picovoice pipeline (free tier)
- Home Assistant + dedicated hardware
- Custom domain + TLS
- Coolify for self-deploy
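The three lists above can be totaled with a few lines of arithmetic. The sketch uses only the dollar figures stated in the lists. Items listed as free or without a price (hardware, domain, Picovoice free tier) are left out, and the tier names are placeholders because the original labels are not shown.

```python
def monthly_range(items: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum (low, high) monthly costs in USD across all line items."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


# Figures copied from the lists above; fixed prices use the same low and high.
TIER_1 = {"Claude Pro": (20, 20)}
TIER_2 = {"Claude Pro": (20, 20), "Hetzner VPS": (5, 5)}
TIER_3_MAX = {"Claude Max": (100, 100), "Dedicated VPS": (10, 20), "ClawdBot": (6, 6)}
TIER_3_API = {"Claude API": (20, 50), "Dedicated VPS": (10, 20), "ClawdBot": (6, 6)}

for name, tier in [
    ("Tier 1", TIER_1),
    ("Tier 2", TIER_2),
    ("Tier 3 (Max)", TIER_3_MAX),
    ("Tier 3 (API)", TIER_3_API),
]:
    low, high = monthly_range(tier)
    print(f"{name}: ${low}-{high}/mo" if low != high else f"{name}: ${low}/mo")
```

On these inputs the second list comes to $25 per month, and the third list ranges from $36 to $76 on API billing or $116 to $126 with Claude Max.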
Recommended Stack Breakdown
Final Verdict & Recommendation
The Jarvis interface is buildable today. Every component exists in production-ready or near-production form. The critical path has no blockers — only tradeoffs between convenience and control.
The recommended starting point is a three-layer stack: (1) Official Remote Control for immediate mobile access with zero infrastructure, (2) Telegram bot bridge for always-available messaging that survives desktop power-offs, and (3) Samantha or Clarvis for voice personality. This gives you 80% of the Jarvis experience at $25/month.
The missing piece is the combined remote-control + headless mode (GitHub issue #30447). Once Anthropic ships this, true cloud-native Jarvis becomes trivial: a $5 VPS running Claude Code in daemon mode, accessible from any device via Telegram, voice, or the Claude app. The leaked codenames (Chyros, Conway, DAEMON) suggest this is actively being developed.
For Galen's Midas Touch project specifically: the existing Telegram-based orchestrator architecture already implements Pattern 5 (Always-On Server Agent). Adding voice output via Clarvis hooks and Home Assistant MCP would complete the Jarvis analogy without architectural changes.
Recommended Implementation Order
Start with Official Remote Control (`claude remote-control`). Set up a Telegram bot via native Claude Code Channels. Install Clarvis or AgentVibes for voice output. Total cost delta: $0 (uses existing Pro subscription).