.TH ASK 1 "2026-04-13" "subtract.ing" "User Commands"
.SH NAME
ask \- query your local model (Claude without the browser)
.SH SYNOPSIS
.B ask
.I query
.br
.B ask
.I backend query
.SH DESCRIPTION
.B ask
sends a question to your local inference model and prints the answer.
.PP
If you are here, you probably got tired of browser latency, rate limits, or paying per token.
This is the same capability, running on your machine, with no network round-trip.
.PP
The simplest usage:
.PP
.RS
.nf
ask "what system call creates a child process?"
.fi
.RE
.PP
This routes to your configured local backend (ollama, llama.cpp, or LM Studio) and returns the answer.
No account.
No API key.
No outage.
.SH SETUP
The first run prompts for your local backend:
.PP
.RS
.nf
ask --setup
.fi
.RE
.PP
Supported backends:
.TP
.B ollama
Easiest.
If you have ollama installed,
.B ask
detects it.
.TP
.B llama.cpp
A direct llama.cpp server.
Lowest latency if one is already running.
.TP
.B lm-studio
LM Studio's local server endpoint.
.PP
Once configured,
.B ask
just works; you do not need to specify the backend.
.SH EXPLICIT ROUTING
After you understand the stack, you can name the backend explicitly:
.PP
.RS
.nf
ask llama.cpp "what compresses files?"
ask ollama "explain namespaces"
ask curl "weather in Chicago"
.fi
.RE
.PP
.B curl
is special: it routes externally, crossing a network boundary you do not control.
Use it when you need live data or intentionally want to query a remote API.
.PP
Naming the backend is optional but makes the mechanism visible.
When something breaks, the command itself shows you where.
.SH DISCOVERY
Many questions do not need inference at all.
.B ask
walks the tiers in order, stopping at the first hit:
.PP
.B "T0: lookdown.tsv"
.RS
Pattern match against the routing table.
No model, no network.
.RE
.PP
.RS
.nf
ask "list files" \(-> T0 hit: ls -la
.fi
.RE
.PP
.B "T0.5: apropos"
.RS
Definitional queries route to man pages.
.RE
.PP
.RS
.nf
ask "what compresses files?" \(-> apropos compression \(-> gzip, bzip2, xz...
.fi
.RE
.PP
Only on a T0/T0.5 miss does
.B ask
escalate to inference (T2 local model or T4 cloud).
This is faster, works offline, and costs nothing.
.SH EXAMPLES
Basic query (uses the configured default):
.PP
.RS
.nf
ask "how do I resize an image?"
.fi
.RE
.PP
Explicit local backend:
.PP
.RS
.nf
ask llama.cpp "what system call forks a process?"
.fi
.RE
.PP
Explicit external query:
.PP
.RS
.nf
ask curl "current bitcoin price"
.fi
.RE
.PP
Let discovery route first:
.PP
.RS
.nf
ask "what tool finds files by name?" \(-> apropos: find(1)
.fi
.RE
.SH WHY THIS EXISTS
You arrived here because the GUI got in the way.
Latency, rate limits, cost, or an outage pushed you to the terminal.
.PP
.B ask
is what remains when you subtract the browser.
Same capability.
No friction.
Runs on hardware you own.
.PP
The backend naming (llama.cpp, ollama, curl) exists so the command shows its mechanism.
This matters when debugging, when teaching, or when you want a child to see exactly what is being invoked.
A command that names its primitive cannot lie about what it does.
.SH SEE ALSO
.BR ollama (1),
.BR llama.cpp (1),
.BR curl (1),
.BR apropos (1),
.BR subtract (7)
.SH AUTHOR
03-git
.PP
Signed by hodori@subtract.ing
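The tier walk in DISCOVERY can be sketched as a small dispatcher. This is an illustrative sketch only, not ask's actual implementation: the lookdown.tsv format (one "pattern<TAB>command" line per entry), the "what " prefix heuristic for T0.5, and the printed labels are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch of the T0 -> T0.5 -> T2 tier walk.
# Assumes a hypothetical lookdown.tsv whose lines are "pattern<TAB>command".
TABLE="${TMPDIR:-/tmp}/lookdown.tsv"
printf 'list files\tls -la\n' > "$TABLE"

ask_route() {
    query="$1"
    # T0: pattern match against the routing table; no model, no network
    hit=$(awk -F'\t' -v q="$query" 'q ~ $1 { print $2; exit }' "$TABLE")
    if [ -n "$hit" ]; then
        printf 'T0 hit: %s\n' "$hit"
        return
    fi
    # T0.5: definitional queries route to man pages via apropos
    case "$query" in
        "what "*)
            printf 'T0.5: apropos %s\n' "${query#what }"
            return ;;
    esac
    # T0/T0.5 miss: escalate to inference (local model first)
    printf 'T2: local model <- %s\n' "$query"
}

ask_route "list files"                # T0 hit: ls -la
ask_route "what compresses files?"    # T0.5: apropos compresses files?
ask_route "explain namespaces"        # T2: local model <- explain namespaces
```

The point of the sketch is the ordering: the cheap, offline tiers are consulted first, and inference is reached only when both miss.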