.TH ASK 1 "2026-04-13" "subtract.ing" "User Commands"
.SH NAME
ask \- query your local model (Claude without the browser)
.SH SYNOPSIS
.B ask
.I query
.br
.B ask
.I backend query
.SH DESCRIPTION
.B ask
sends a question to your local inference model and prints the answer.
.PP
If you are here, you probably got tired of browser latency, rate limits, or paying per token.
This is the same capability, running on your machine, with no network round-trip.
.PP
The simplest usage:
.PP
.RS
.nf
ask "what system call creates a child process?"
.fi
.RE
.PP
This routes to your configured local backend (ollama, llama.cpp, or LM Studio) and returns the answer.
No account.
No API key.
No outage.
.SH SETUP
The first run prompts for your local backend:
.PP
.RS
.nf
ask --setup
.fi
.RE
.PP
Supported backends:
.TP
.B ollama
Easiest.
If you have ollama installed,
.B ask
detects it.
.TP
.B llama.cpp
A direct llama.cpp server.
Lowest latency if one is already running.
.TP
.B lm-studio
LM Studio's local server endpoint.
.PP
Once configured,
.B ask
just works; you do not need to specify the backend.
.SH EXPLICIT ROUTING
After you understand the stack, you can name the backend explicitly:
.PP
.RS
.nf
ask llama.cpp "what compresses files?"
ask ollama "explain namespaces"
ask curl "weather in Chicago"
.fi
.RE
.PP
.B curl
is special: it routes externally, crossing a network boundary you do not control.
Use it when you need live data or intentionally want to query a remote API.
.PP
Naming the backend is optional but makes the mechanism visible.
When something breaks, the command itself shows you where.
.SH DISCOVERY
Many questions do not need inference at all.
.B ask
walks the tiers in order, stopping at the first hit:
.PP
.B "T0: lookdown.tsv"
.RS
Pattern match against the routing table.
No model, no network.
.RE
.PP
.RS
.nf
ask "list files" \(-> T0 hit: ls -la
.fi
.RE
.PP
.B "T0.5: apropos"
.RS
Definitional queries route to man pages.
.RE
.PP
.RS
.nf
ask "what compresses files?" \(-> apropos compression \(-> gzip, bzip2, xz...
.fi
.RE
.PP
Only on a T0/T0.5 miss does
.B ask
escalate to inference (T2 local model or T4 cloud).
This is faster, works offline, and costs nothing.
.SH EXAMPLES
Basic query (uses the configured default):
.PP
.RS
.nf
ask "how do I resize an image?"
.fi
.RE
.PP
Explicit local backend:
.PP
.RS
.nf
ask llama.cpp "what system call forks a process?"
.fi
.RE
.PP
Explicit external query:
.PP
.RS
.nf
ask curl "current bitcoin price"
.fi
.RE
.PP
Let discovery route first:
.PP
.RS
.nf
ask "what tool finds files by name?" \(-> apropos: find(1)
.fi
.RE
.SH WHY THIS EXISTS
You arrived here because the GUI got in the way.
Latency, rate limits, cost, or an outage pushed you to the terminal.
.PP
.B ask
is what remains when you subtract the browser.
Same capability.
No friction.
Runs on hardware you own.
.PP
The backend naming (llama.cpp, ollama, curl) exists so the command shows its mechanism.
This matters when debugging, when teaching, or when you want a child to see exactly what is being invoked.
A command that names its primitive cannot lie about what it does.
.SH SEE ALSO
.BR ollama (1),
.BR llama.cpp (1),
.BR curl (1),
.BR apropos (1),
.BR subtract (7)
.SH AUTHOR
03-git
.PP
Signed by hodori@subtract.ing
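The tier walk in DISCOVERY can be sketched as a small dispatcher. This is an illustrative sketch only, not ask's actual implementation: the lookdown.tsv format (one "pattern<TAB>command" line per entry), the "what " prefix heuristic for T0.5, and the printed labels are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch of the T0 -> T0.5 -> T2 tier walk.
# Assumes a hypothetical lookdown.tsv whose lines are "pattern<TAB>command".
TABLE="${TMPDIR:-/tmp}/lookdown.tsv"
printf 'list files\tls -la\n' > "$TABLE"

ask_route() {
    query="$1"
    # T0: pattern match against the routing table; no model, no network
    hit=$(awk -F'\t' -v q="$query" 'q ~ $1 { print $2; exit }' "$TABLE")
    if [ -n "$hit" ]; then
        printf 'T0 hit: %s\n' "$hit"
        return
    fi
    # T0.5: definitional queries route to man pages via apropos
    case "$query" in
        "what "*)
            printf 'T0.5: apropos %s\n' "${query#what }"
            return ;;
    esac
    # T0/T0.5 miss: escalate to inference (local model first)
    printf 'T2: local model <- %s\n' "$query"
}

ask_route "list files"                # T0 hit: ls -la
ask_route "what compresses files?"    # T0.5: apropos compresses files?
ask_route "explain namespaces"        # T2: local model <- explain namespaces
```

The point of the sketch is the ordering: the cheap, offline tiers are consulted first, and inference is reached only when both miss.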