# Ollama
Run models locally with Ollama — no API key, no cloud dependency.
## Setup
Install Ollama, then pull a model with tool-calling support:
```bash
ollama pull qwen2.5-coder:32b
ollama serve
```

Add the Vercel AI SDK provider for Ollama:

```bash
pnpm add ollama-ai-provider-v2
```

```ts
import { AiSdkProvider } from "noumen";
import { createOllama } from "ollama-ai-provider-v2";

const ollama = createOllama({
  // Point at a remote Ollama server by overriding baseURL:
  // baseURL: "http://192.168.1.10:11434/api",
});

const provider = new AiSdkProvider({
  model: ollama("qwen2.5-coder:32b"),
});
```

## Options
All connection-level options come from createOllama.
| Option | Source | Description |
|---|---|---|
| baseURL | createOllama | Ollama server URL. Defaults to http://localhost:11434/api. |
| headers | createOllama | Custom headers sent with every request. |
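For example, a remote server sitting behind a reverse proxy that expects an auth header can be reached by setting both options together (the URL and token below are placeholders, not defaults):

```ts
import { createOllama } from "ollama-ai-provider-v2";

// Placeholder values: substitute your own server address and header.
const ollama = createOllama({
  baseURL: "http://192.168.1.10:11434/api",
  headers: { Authorization: "Bearer <token>" },
});
```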
## Environment variables
| Variable | Description |
|---|---|
| OLLAMA_HOST | Override the Ollama server address (e.g. http://192.168.1.10:11434). The CLI shorthand (provider: "ollama") reads this to derive baseURL. |
When OLLAMA_HOST is set, the CLI auto-detects Ollama as the provider.
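A minimal sketch of that derivation, assuming the shorthand simply appends /api to the host (the actual CLI logic may differ):

```ts
import { createOllama } from "ollama-ai-provider-v2";

// Assumption for illustration: OLLAMA_HOST maps to baseURL by appending /api.
const host = process.env.OLLAMA_HOST ?? "http://localhost:11434";
const ollama = createOllama({
  baseURL: `${host.replace(/\/+$/, "")}/api`,
});
```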
## CLI usage
The CLI auto-detects a running Ollama server when no cloud API keys are configured:
```bash
# Ollama is detected automatically
noumen "refactor this module"

# Or be explicit
noumen -p ollama -m llama3.1:70b "explain this code"
```

## Recommended models
Ollama models must support OpenAI-compatible function calling for the full agent loop to work. Recommended choices:
- qwen2.5-coder:32b — strong coding and tool-calling (default).
- qwen2.5-coder:14b — lighter alternative with good tool-calling.
- llama3.1:70b — general-purpose with function calling.
- llama3.1:8b — fast local option for simpler tasks.
- mistral:7b — compact with tool support.
- command-r:35b — Cohere's model with native tool use.
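Switching between these only changes the model id passed to ollama(...); reusing the setup from above, for example:

```ts
// Same provider wiring as in Setup, just a lighter model id.
const provider = new AiSdkProvider({
  model: ollama("llama3.1:8b"),
});
```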
## Streaming
ollama-ai-provider-v2 speaks LanguageModelV2 directly — no OpenAI-compat layer, no HTTP-level translation in noumen. Streaming, tool calls, and structured output work identically to any other AI SDK provider.
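Because the model is a plain LanguageModelV2, it can also be driven directly with the AI SDK outside of noumen. A minimal sketch using streamText from the ai package (assumes ai is installed alongside the provider; the prompt is just an example):

```ts
import { streamText } from "ai";
import { createOllama } from "ollama-ai-provider-v2";

const ollama = createOllama();

// Streams tokens from the local Ollama server like any other AI SDK provider.
const { textStream } = streamText({
  model: ollama("qwen2.5-coder:32b"),
  prompt: "Explain what a mutex is in two sentences.",
});

for await (const chunk of textStream) {
  process.stdout.write(chunk);
}
```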