Chrome On-Device (Gemini Nano)
Claude Code Router can use Chrome's built-in Gemini Nano model, a ~4GB local LLM that runs entirely on your device, with zero API cost and no network round trips to external providers. The model is accessed through Chrome's Prompt API via the Chrome DevTools Protocol (CDP).
How It Works
- ccr chrome-bridge starts a bridge process that connects to Chrome's Prompt API
- The chrome-on-device transformer translates requests between OpenAI Chat Completions format and Chrome's Prompt API format
- Responses are streamed back through the router as standard SSE
- The model runs entirely locally in Chrome — no data leaves your machine
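The translation and streaming steps above can be sketched roughly as follows. This is an illustrative sketch, not the actual chrome-on-device transformer: the type names, toPromptApiCall, toSseChunk, and the choice to treat the last user message as the prompt are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real transformer may differ.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
}

// What a bridge might hand to the Prompt API: earlier turns seed the
// session, and the final user message becomes the prompt text.
interface PromptApiCall {
  initialPrompts: ChatMessage[];
  prompt: string;
}

function toPromptApiCall(req: ChatCompletionRequest): PromptApiCall {
  const last = req.messages[req.messages.length - 1];
  if (!last || last.role !== "user") {
    throw new Error("expected the final message to come from the user");
  }
  return {
    initialPrompts: req.messages.slice(0, -1),
    prompt: last.content,
  };
}

// Wrap one piece of generated text as an OpenAI-style streaming chunk,
// framed as a Server-Sent Events "data:" line.
function toSseChunk(id: string, model: string, text: string): string {
  const chunk = {
    id,
    object: "chat.completion.chunk",
    model,
    choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Sentinel that conventionally ends an OpenAI-style SSE stream.
const SSE_DONE = "data: [DONE]\n\n";
```

Each text fragment produced by the model would be passed through toSseChunk and written to the response, followed by SSE_DONE once generation finishes.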
Prerequisites
- Google Chrome (Canary or Dev channel) with Gemini Nano enabled
- Node.js installed on the host (the bridge runs on the host, not inside Docker)
- Claude Code Router running
Setup
1. Enable Gemini Nano in Chrome
- Install Chrome Canary or Chrome Dev
- Go to chrome://flags/#prompt-api-for-gemini-nano and enable Prompt API for Gemini Nano
- Go to chrome://flags/#optimization-guide-on-device-model and enable Enables optimization guide on device
- Restart Chrome
- Wait for the model to download (check chrome://components/ for Optimization Guide On Device Model)
- Verify the model is ready: open DevTools console and run:
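(await ai.languageModel.capabilities()).available
It should return "readily". On Chrome builds that expose the Prompt API as a global LanguageModel object instead of ai.languageModel, the equivalent check is await LanguageModel.availability(), which returns "available" once the model is ready.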
2. Start the Bridge
Run the bridge process on your host machine (not in Docker):
ccr chrome-bridge
The bridge listens on http://127.0.0.1:9229 by default.
3. Configure Provider
Add the Chrome provider to your ~/.claude-code-router/config.json:
{
  "Providers": [
    {
      "name": "chrome",
      "baseUrl": "http://127.0.0.1:9229",
      "apiKey": "dummy",
      "models": ["gemini-nano"],
      "transformer": {
        "use": ["chrome-on-device"]
      }
    }
  ],
  "Router": {
    "background": "chrome,gemini-nano"
  }
}
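A common mistake with a config like this is a route that names a provider or model that is not declared. The sketch below shows one way to catch that before restarting. It is not part of Claude Code Router: the interfaces mirror only the fields in the example above, checkRoutes is a hypothetical helper, and it assumes every Router value is a "provider,model" string.

```typescript
// Field names follow the example config; everything else is illustrative.
interface Provider {
  name: string;
  baseUrl: string;
  apiKey: string;
  models: string[];
  transformer?: { use: string[] };
}

interface RouterConfig {
  Providers: Provider[];
  // Assumption: each route value has the form "provider,model".
  Router: Record<string, string>;
}

// Returns a list of human-readable problems; an empty list means every
// route points at a declared provider and one of its listed models.
function checkRoutes(config: RouterConfig): string[] {
  const problems: string[] = [];
  for (const [scenario, target] of Object.entries(config.Router)) {
    const [providerName, model] = target.split(",");
    const provider = config.Providers.find((p) => p.name === providerName);
    if (!provider) {
      problems.push(`${scenario}: unknown provider "${providerName}"`);
    } else if (!model || !provider.models.includes(model)) {
      problems.push(`${scenario}: provider "${providerName}" does not list model "${model}"`);
    }
  }
  return problems;
}
```

Running checkRoutes on the parsed contents of config.json and printing the returned list gives a quick sanity check on the route names.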
4. Restart
docker compose restart ccr
Features
- Zero API cost: Fully local inference
- Zero external latency: No network requests to providers
- Streaming and non-streaming: Full SSE streaming support
- Structured output: Uses responseConstraint for reliable JSON tool calls
- Automatic stall recovery: If the model stalls (whitespace-only output), the bridge retries with higher temperature
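The stall recovery behaviour can be pictured with a small retry loop. This is a sketch of the general idea, not the bridge's code: the Generate callback, the starting temperature, the step size, and the attempt limit are assumed values chosen for illustration.

```typescript
// A response counts as stalled when it contains nothing but whitespace.
function isStalled(output: string): boolean {
  return output.trim().length === 0;
}

// Hypothetical generator signature: produce text at a given temperature.
type Generate = (temperature: number) => Promise<string>;

async function generateWithRecovery(
  generate: Generate,
  startTemperature = 0.2, // assumed default, not taken from the bridge
  step = 0.3,             // assumed increment per retry
  maxAttempts = 3,        // assumed retry limit
): Promise<string> {
  let temperature = startTemperature;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await generate(temperature);
    if (!isStalled(output)) {
      return output;
    }
    temperature += step; // retry with more randomness
  }
  throw new Error(`model stalled on all ${maxAttempts} attempts`);
}
```

In the Prompt API, sampling temperature is set when a session is created, so a real generate callback would most likely open a fresh session for each attempt.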
Limitations
- Model quality — Gemini Nano is a small on-device model, best suited for simple tasks and background work
- Chrome dependency — Requires Chrome with Gemini Nano enabled; not available in all browsers
- Node.js on host — The bridge process must run on the host, not in Docker
- No external tools — Restricted to the model's built-in capabilities
Use Cases
- Background tasks: Route background-scenario requests to Gemini Nano
- Simple queries: Quick answers, text summarization, formatting
- Offline-capable: Works without internet access once the model is downloaded
Troubleshooting
Bridge won't connect: Ensure Chrome is running with --remote-debugging-port=9229; without it, the bridge cannot reach the Prompt API.
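One way to confirm that Chrome's debugging port is reachable is to request /json/version, the HTTP endpoint that Chrome serves when remote debugging is enabled. The sketch below is a hypothetical helper, not part of ccr: probeDebugPort and FetchLike are names invented here, and the port simply follows the troubleshooting note above.

```typescript
// Minimal fetch-like signature so the check can be exercised without a
// live browser. In Node 18+ the global fetch can be passed in directly.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface DebugPortStatus {
  reachable: boolean;
  browser: string | null; // value of the "Browser" field, when present
}

async function probeDebugPort(fetchFn: FetchLike, port = 9229): Promise<DebugPortStatus> {
  try {
    const res = await fetchFn(`http://127.0.0.1:${port}/json/version`);
    if (!res.ok) {
      return { reachable: false, browser: null };
    }
    const info = (await res.json()) as { Browser?: string };
    return { reachable: true, browser: info.Browser ?? null };
  } catch {
    // A refused connection usually means Chrome was started without the flag.
    return { reachable: false, browser: null };
  }
}
```

Calling probeDebugPort(fetch) from a Node script and seeing reachable: false suggests restarting Chrome with the debugging flag before retrying the bridge.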
Model not available: Check chrome://components/ for the status of Optimization Guide On Device Model. If it has not finished downloading, wait, then restart Chrome.
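Model readiness can also be checked from code running in the page. The sketch below handles both API shapes: the older ai.languageModel.capabilities() call, which reports "readily", and the newer global LanguageModel.availability() call, which reports "available". The helper names, the Scope type, and the "prompt-api-missing" label are assumptions made for illustration.

```typescript
// Stand-in for the page's global object, limited to the parts used here.
type Scope = {
  LanguageModel?: { availability(): Promise<string> };
  ai?: { languageModel?: { capabilities(): Promise<{ available: string }> } };
};

async function modelStatus(scope: Scope): Promise<string> {
  // Newer Chrome builds expose a global LanguageModel object.
  if (scope.LanguageModel) {
    return scope.LanguageModel.availability();
  }
  // Older builds expose the API under ai.languageModel.
  if (scope.ai?.languageModel) {
    const caps = await scope.ai.languageModel.capabilities();
    return caps.available;
  }
  return "prompt-api-missing"; // hypothetical label for "no API exposed"
}

// "available" (newer API) and "readily" (older API) both mean ready to use.
function isReady(status: string): boolean {
  return status === "available" || status === "readily";
}
```

Any other status, such as a value indicating that a download is pending, means the model is not yet usable and the component page is the place to look next.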
Slow responses: The first request may be slow as the model loads. Subsequent requests should be faster.
Architecture
┌──────────┐     ┌──────────────┐     ┌──────────────────────┐
│  Claude  │────▶│  CCR Server  │────▶│   chrome-on-device   │
│   Code   │     │   (Docker)   │     │     transformer      │
└──────────┘     └──────────────┘     └──────────┬───────────┘
                                                 │
                                                 ▼
                                        ┌───────────────────┐
                                        │ ccr chrome-bridge │
                                        │  (host process)   │
                                        └────────┬──────────┘
                                                 │
                                                 ▼
                                        ┌───────────────────┐
                                        │ Chrome Prompt API │
                                        │   (Gemini Nano)   │
                                        └───────────────────┘