How to connect Silly Tavern to an API: OpenRouter vs local inference
Part 3 of “Using Silly Tavern.” Routes and tradeoffs—not a screenshot-for-every-menu guide; trust the labels in your build.
Pick a lane: cloud or local
| Priority | Lean toward |
|---|---|
| Low ops, pay per use, lighter PC | Cloud API (aggregators like OpenRouter, or a provider directly) |
| Privacy, offline, upfront hardware cost | Local inference (Koboldcpp, text-generation-webui, etc.) |
Cloud is takeout; local is your own kitchen. Neither is universally “better.”
Names, briefly
- OpenRouter: a popular API router/aggregator—model catalog and billing live on their site.
- Koboldcpp: a common local inference stack exposing an HTTP API.
- Oobabooga (text-generation-webui): another widely used local Web UI + API people call “oob.”
Names fork and versions drift—what matters is: is the server running, and does ST point at the right host:port.
Generic wiring steps (verbs)
- Obtain a base URL from the provider—or your local app’s listen address (e.g.
http://127.0.0.1:5000). - Create an API key if required; paste only in private settings.
- In ST’s connection panel, pick the right provider/compatibility mode if offered; fill URL + key.
- Save and send a test prompt; copy the exact error text when debugging.
Never embed keys inside character JSON or share screenshots of keys.
Local adds one gate
- Start Koboldcpp / Oobabooga alone; verify a browser/curl check works.
- Then paste the same URL into ST.
- If VRAM/RAM is tight, smaller models or shorter context beat random knob twiddling.
See also
Series: Using Silly Tavern · index
Manage local PNG character cards on Mac with Sillycard. Features: App Store and in-app copy.