Codmir Assistant Vision
Where Codmir is headed as a local-first, privacy-preserving voice and desktop automation assistant.
Codmir is evolving from an AI coding companion into a local-first assistant that understands context, listens for your voice, sees the screen, and safely controls the desktop, all while keeping data on your machine.
Principles
- Local-first: run core components offline (wake word, transcription, TTS, capture)
- Privacy-by-design: visible indicators (blue corners), explicit arming windows, and logs
- Composable: small workers, simple APIs, dockerized
- Interoperable: can route through n8n or direct HTTP
- Safe control: dry-run modes, per-action confirmations, and time-bound sessions
Capabilities
- Wake word “codmir” to start a session
- Low-FPS, change-aware screen capture with overlay indicator
- Microphone recording and offline speech-to-text (Vosk)
- Desktop actions: open URLs/apps, type, keys, clicks, focus windows
- Session persistence: audio + frames + transcript + metadata for review
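As an illustration of what "change-aware" capture could mean, the sketch below keeps a frame only when enough pixels differ from the last kept frame. It is a minimal sketch under stated assumptions, not Codmir's actual capture code: frames are assumed to be flat sequences of grayscale values, and the names `changed_fraction`, `ChangeAwareSampler`, `pixel_tolerance`, and `min_changed_fraction` are hypothetical.

```python
from typing import Optional, Sequence


def changed_fraction(prev: Sequence[int], curr: Sequence[int],
                     pixel_tolerance: int = 8) -> float:
    """Fraction of pixels whose grayscale value differs by more than pixel_tolerance."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    if not curr:
        return 0.0
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_tolerance)
    return changed / len(curr)


class ChangeAwareSampler:
    """Hypothetical helper: keep a frame only if it differs enough from the last kept one."""

    def __init__(self, min_changed_fraction: float = 0.02) -> None:
        self.min_changed_fraction = min_changed_fraction
        self._last_kept: Optional[Sequence[int]] = None

    def should_keep(self, frame: Sequence[int]) -> bool:
        # The first frame is always kept so the session has a baseline.
        if self._last_kept is None:
            self._last_kept = frame
            return True
        if changed_fraction(self._last_kept, frame) >= self.min_changed_fraction:
            self._last_kept = frame
            return True
        return False
```

Comparing against the last kept frame, rather than the immediately preceding one, means slow cumulative changes still trigger a capture eventually. A real implementation would likely downscale frames before comparing them.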
Presentation Mode
- Demonstrate abilities on command (e.g., “open our app” on http://localhost:3000)
- Optional TTS responses (local espeak-ng by default; cloud providers documented only)
Safety
- Overlay shows when Codmir “can see the screen”
- Arm/disarm phrase for control (time-limited)
- Dry-run mode to preview actions without changing state
- Clear audit trail in session artifacts
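The safety measures above can be sketched as a single gate that every desktop action passes through. This is a hedged illustration only: `ControlGate`, `arm_seconds`, and the audit record fields are hypothetical names, and the 30-second window is an assumed default rather than a documented Codmir setting.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ControlGate:
    """Hypothetical gate combining a time-limited arming window, dry-run, and an audit log."""

    arm_seconds: float = 30.0            # assumed default window length
    dry_run: bool = True                 # preview actions unless explicitly disabled
    clock: Callable[[], float] = time.monotonic
    audit_log: List[Dict[str, str]] = field(default_factory=list)
    _armed_until: Optional[float] = field(default=None, init=False, repr=False)

    def arm(self) -> None:
        """Open the control window, e.g. after the arm phrase is heard."""
        self._armed_until = self.clock() + self.arm_seconds

    def disarm(self) -> None:
        """Close the control window immediately."""
        self._armed_until = None

    def is_armed(self) -> bool:
        return self._armed_until is not None and self.clock() < self._armed_until

    def run(self, name: str, action: Callable[[], None]) -> str:
        """Execute, preview, or block an action, and record the outcome either way."""
        if not self.is_armed():
            outcome = "blocked"
        elif self.dry_run:
            outcome = "dry-run"          # the action is logged but never called
        else:
            action()
            outcome = "executed"
        self.audit_log.append({"action": name, "outcome": outcome})
        return outcome
```

Every call to `run` appends to `audit_log`, including blocked and dry-run attempts, so the trail shows what was requested as well as what actually happened. Defaulting `dry_run` to `True` is a design choice in this sketch that makes changing state an explicit opt-in.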
Roadmap
- Wayland-native desktop control backends
- Rich intent routing and confirmations
- Pre/post-roll audio buffers and frame timelines
- In-app session viewer with search and summaries
Learn More
- Overview: Codmir Local Assistant
- Use Case: Voice Desktop Assistant (Local)
- Orchestration: Orchestrator