A thinking-OFF agentic coder that does MORE with LESS.
35B TOTAL · 3B ACTIVE · SPARSE MoE
Tuned for Execution, Not Overthinking.
It reads files, picks tools, edits code, runs tests, reacts to errors, and ships — with fewer tokens, lower latency, and steadier behavior across long agent loops.
35B / 3B MoE
Sparse mixture-of-experts → fast local inference
NextN / MTP Head
Self-speculative decoding, ~250 tok/s on ONE GPU
Agent-Harness Native
Codex / OpenHands / Claude Code / OpenCode loops
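The page does not describe how the NextN / MTP head is wired in, so the following is only a generic sketch of greedy draft-and-verify speculative decoding, the family of techniques that "self-speculative decoding" belongs to. In the self-speculative case the cheap draft would come from the model's own extra head instead of a separate model. The names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical, and the toy "models" are plain functions over integers.

```python
from typing import Callable, List

Token = int
NextFn = Callable[[List[Token]], Token]  # greedy next-token function (assumed interface)


def speculative_step(ctx: List[Token], target: NextFn, draft: NextFn, k: int) -> List[Token]:
    """One draft-then-verify step; returns the tokens accepted in this step."""
    # 1. Draft k tokens with the cheap predictor.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(ctx + proposed))

    # 2. Verify against the main model. A real system would score all k
    #    positions in one batched forward pass; here we call it per position.
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(ctx + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the main model's token, drop the rest
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the main model contributes one more token.
    accepted.append(target(ctx + accepted))
    return accepted


def generate(prompt: List[Token], target: NextFn, draft: NextFn, n_tokens: int, k: int = 3) -> List[Token]:
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(out, target, draft, k))
    return out[: len(prompt) + n_tokens]


# Toy stand-ins: the "main model" counts modulo 10; the draft is wrong after a 4.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

The property worth noticing is that the output is identical to plain greedy decoding with the main model alone. The draft only changes how many main-model steps are needed, which is where a throughput gain would come from.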
Plot Twist: It's Better with Thinking **OFF**.
Across a held-out behavioral + long-horizon battery, thinking-OFF was best-or-tied on 9 of 11 axes.
THINKING HELPED
NO-OP
HURT
Thinking helps on DECISIONS & RECALL. It hurts on PRODUCTION.
SWE-Bench Verified — Thinking Off
62.4% (171 / 274)
RESOLVED OF SUBMITTED PATCHES · EXCLUDING ERRORS · slice 0:300
RESOLVED / SUBMITTED
NON-EMPTY PATCH RATE
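As a quick check on the headline figure, the resolve rate follows directly from the two counts quoted above (171 resolved out of 274 submitted patches). The variable names are illustrative only.

```python
# Counts taken from the SWE-Bench Verified card above (thinking off, slice 0:300).
resolved = 171
submitted = 274

resolve_rate = resolved / submitted
print(f"{resolve_rate:.1%}")  # 62.4%
```

This is the rate over submitted patches with errored runs excluded, as the card states, so it is not a rate over all 300 instances in the slice.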
1/3 to 1/10 the Tokens. 100% Completion.
OFF
~323 (behavioral) / ~966 (long-horizon) tok/turn
ON
~1,023 (behavioral) / ~9,991 (long-horizon) tok/turn
THINKING OFF
Behavioral: ~323 tok/turn
Long-horizon: ~966 tok/turn
Empty/truncated: 0%
THINKING ON
Behavioral: ~1,023 tok/turn
Long-horizon: ~9,991 tok/turn
Hardest tasks: DELIVERED NOTHING
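The "1/3 to 1/10 the tokens" claim can be reproduced from the per-turn averages listed above. This is a small sketch using only those four approximate figures; the dictionary layout and names are illustrative.

```python
# Approximate tokens per turn, as quoted above.
tokens_per_turn = {
    "behavioral": {"off": 323, "on": 1023},
    "long_horizon": {"off": 966, "on": 9991},
}

# Fraction of the thinking-ON token budget that thinking-OFF uses.
ratios = {
    battery: t["off"] / t["on"]
    for battery, t in tokens_per_turn.items()
}

for battery, ratio in ratios.items():
    print(f"{battery}: OFF uses {ratio:.2f}x the tokens of ON (about 1/{1 / ratio:.1f})")
```

The behavioral battery lands near one third and the long-horizon battery near one tenth, which matches the range in the headline.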
Head-to-Head: A Wash — Qwopus Edges the Coding, Far Cheaper.
Qwopus OFF
Ornith ON
Standout edge: clean compliance — no over-gating, no needless permission-asking.
Coder Training Sharpens the Direct Policy — And Leaves Reasoning Ungrounded.
1. More Coder Training
The no-think pathway gets sharper & more reliable
→
2. Reasoning Drift
The reasoning channel isn't outcome-grounded → decouples & drifts
→
3. Long-Horizon Compounding
Harmless on one-shot decisions, but COMPOUNDS into thrash, fixation, and non-delivery
The fix isn't more reasoning-trace SFT — it's outcome-grounded RL.
PROOF: IT SHIPS.
It built AETHER DOMINION — a complete, playable single-file sci-fi RTS — entirely through an OpenCode agentic loop, thinking-off, on a local RTX 5090.