---
layout: ../layouts/Base.astro
title: Projects — Jesse Anderson
description: Personal AI tooling, the rig, OpenClaw, the operating doctrine. Peer-leaning detail.
slug: projects
---

<main class="projects">

<PageHeader eyebrow="Projects" title="The rig, the tools, the doctrine.">
Higher detail than [/work](/work). The personal AI tooling layer — local LLM infra, the Codex CLI fork, multi-agent routing, the OpenClaw bridges, and the operating doctrine. Aimed at peers who care how the sausage is made.
</PageHeader>

<Project eyebrow="codex-local" title="What 9B can actually do." featured={true}>

The reason I'm bullish on local models: **codex-local** — my fork of the Codex CLI, tuned to babysit a local model through a structured task list — drives multi-turn, multi-file, multi-task coding sessions to completion using just **Qwen3.5-9B-IQ4_XS**.

<ProjectStats stats={[
    ['9B', 'parameters (Qwen3.5 IQ4_XS)'],
    ['8+', 'turns of sustained coherence'],
    ['4+', 'files edited per session'],
    ['Full', 'TODO list completion end-to-end']
]} />

The trick isn't the model — it's the scaffolding. Specifically:

- **Docs-first.** Every repo has an `AGENTS.md` and a `TASK_STATE.md`. The model reads them on every turn. Persistent intent survives compaction.
- **State in files, not in-context.** Long runs survive crashes, context exhaustion, model swaps, OAuth expirations. The agent rehydrates from disk.
- **Explicit task tracking.** TODO list in a file, model crosses items off as it goes. No "where was I" questions.
- **Verify before declare-done.** No green-by-inference. Run the test, read the output, look at the deployed version.

Most local-model coding agents stall past trivial single-edit tasks. With those four disciplines, 9B handles real work. The frontier model is no longer the moat — the scaffolding is.
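
Concretely, a minimal TypeScript sketch of the state-in-files loop, assuming `TASK_STATE.md` keeps a markdown checklist; the `TaskState` shape and the parsing are illustrative, not codex-local's actual internals:

```ts
// Sketch of the "state in files" discipline: on every turn the agent
// rehydrates intent from disk instead of trusting its context window.
// File names (AGENTS.md, TASK_STATE.md) match the pattern above; the
// checklist format and TaskState shape are assumptions for illustration.
import { readFileSync, writeFileSync } from "node:fs";

interface TaskState {
  doctrine: string; // AGENTS.md, read verbatim every turn
  todos: { text: string; done: boolean }[];
}

function rehydrate(repo: string): TaskState {
  const doctrine = readFileSync(`${repo}/AGENTS.md`, "utf8");
  // Assumed checklist format: "- [ ] item" / "- [x] item"
  const todos = readFileSync(`${repo}/TASK_STATE.md`, "utf8")
    .split("\n")
    .filter((l) => l.startsWith("- ["))
    .map((l) => ({ done: l.startsWith("- [x]"), text: l.slice(6).trim() }));
  return { doctrine, todos };
}

function markDone(repo: string, state: TaskState, text: string): void {
  // Crossing items off on disk is what lets a run survive a crash,
  // a compaction, or a model swap: the next turn rehydrates and continues.
  const todo = state.todos.find((t) => t.text === text);
  if (todo) todo.done = true;
  const body = state.todos
    .map((t) => `- [${t.done ? "x" : " "}] ${t.text}`)
    .join("\n");
  writeFileSync(`${repo}/TASK_STATE.md`, body + "\n");
}
```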

The fork itself adjusts compaction behavior, intercepts model routing, and exposes exec-log inspection so a long unattended run can survive overnight without losing the thread.

</Project>

<Project eyebrow="The rig" title="Dual-GPU local-first AI infra.">

- **GPUs.** GTX 1080 (8GB) + RTX 3080. Two hosts, services routed by endpoint.
- **Runtime.** Ollama, tuned: `OLLAMA_FLASH_ATTENTION=1`, `OLLAMA_KV_CACHE_TYPE=q8_0`. Quantizing the KV cache from f16 to q8_0 halves its footprint, so roughly 2× the context fits in the same VRAM, at the cost of a few percent of output quality.
- **Model picks against VRAM budgets:**
  - **Qwen3 4B** for the chat gate — small footprint, fast first-token latency.
  - **Qwen3.5-9B-IQ4_XS** for the worker — sweet spot for code on the 1080.
  - **qwen3.5:35b-a3b** for harder reasoning when the 3080 has headroom.
- **Topology.** Chat gate on one host, Qwen service on another. Cross-host routing via explicit endpoint config — no service mesh, no magic.
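
A sketch of what "explicit endpoint config" means here. The hostnames and routing table are assumptions (the model tags mirror the picks above); the `/api/generate` call is Ollama's standard HTTP API:

```ts
// No service mesh: routing is a static lookup keyed by role.
// Host addresses and tags are illustrative, not the rig's real config.
const ROUTES = {
  gate:   { endpoint: "http://gtx1080-host:11434", model: "qwen3:4b" },
  worker: { endpoint: "http://gtx1080-host:11434", model: "qwen3.5-9b-iq4_xs" },
  reason: { endpoint: "http://rtx3080-host:11434", model: "qwen3.5:35b-a3b" },
} as const;

async function generate(role: keyof typeof ROUTES, prompt: string): Promise<string> {
  const { endpoint, model } = ROUTES[role]; // cross-host routing = a lookup
  const res = await fetch(`${endpoint}/api/generate`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`${role} route failed: ${res.status}`);
  return (await res.json()).response; // Ollama returns { response, ... }
}
```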

<ImagePlaceholder note="Dual-GPU rig — workspace photo TBD" />

</Project>

<Project eyebrow="OpenClaw bridges" title="3D scenes by voice.">

Built so my third son could direct his Roblox / Blender / Daz scenes through natural language (_"make Torus.1 50% thinner on the Z-axis"_) and watch the live environment update. An AI helper bridge runs through OpenClaw, with the language layer driving live scene state. It sits at the intersection of three threads: my AI work, my long-running 3DCG interest, and a kid who's already designing in Blender, Roblox Studio, and Suno.
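
A hedged sketch of the bridge's shape, under heavy assumptions: the command grammar, the `ScaleOp` type, and the `/scene/apply` endpoint are hypothetical stand-ins for OpenClaw's actual protocol:

```ts
// The example command from the prose, parsed into a scene operation.
interface ScaleOp {
  kind: "scale";
  object: string;      // e.g. "Torus.1"
  axis: "X" | "Y" | "Z";
  factor: number;      // 0.5 == "50% thinner"
}

// "make Torus.1 50% thinner on the Z-axis" -> scale Torus.1 by 0.5 on Z.
// The grammar here is a toy; a real language layer would use the model.
function parseCommand(text: string): ScaleOp | null {
  const m = text.match(/make (\S+) (\d+)% thinner on the ([XYZ])-axis/i);
  if (!m) return null;
  return {
    kind: "scale",
    object: m[1],
    axis: m[3].toUpperCase() as ScaleOp["axis"],
    factor: 1 - Number(m[2]) / 100,
  };
}

async function apply(bridgeUrl: string, op: ScaleOp): Promise<void> {
  // Hypothetical endpoint: the bridge forwards the op to the live session.
  await fetch(`${bridgeUrl}/scene/apply`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(op),
  });
}
```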

<ImagePlaceholder note="OpenClaw → Blender mid-edit" />

</Project>

<Project eyebrow="coding-agent-router" title="Multi-agent CLI routing.">

Docs-first multi-agent CLI router. Routes prompts across specialized workers with shared context, built on the assumption that no single model is the right choice for every step in a long task. Pairs with codex-local for the local side and with cloud providers for harder work.
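
A sketch of the routing shape, assuming a keyword classifier and hypothetical worker names; the real router's interface isn't shown here:

```ts
// One shared docs-first context, many specialized workers, per-step choice.
type Worker = "local-coder" | "cloud-reasoner" | "chat-gate";

interface SharedContext {
  agentsMd: string;    // doctrine, read from disk (docs-first)
  taskStateMd: string; // persistent task list
}

// A toy classifier; any real router would classify with a model.
function classify(prompt: string): Worker {
  if (/refactor|implement|fix|test/i.test(prompt)) return "local-coder";
  if (/design|architect|tradeoff|why/i.test(prompt)) return "cloud-reasoner";
  return "chat-gate";
}

async function route(
  prompt: string,
  ctx: SharedContext,
  workers: Record<Worker, (p: string) => Promise<string>>,
): Promise<string> {
  // Every worker sees the same preamble, whichever model it runs on.
  const preamble = `${ctx.agentsMd}\n\n${ctx.taskStateMd}\n\n`;
  return workers[classify(prompt)](preamble + prompt);
}
```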

</Project>

<Project eyebrow="K.O.R.A." title="Same scaffolding, in production.">

Everything above runs in production at Kora Labs as **K.O.R.A.** — the autonomous Discord ops agent. Multi-stage pipeline, multi-model routing with graceful fallback, three-tier test battery, set-me-loose-all-night execution mode. The personal tools and the production agent share the same operating doctrine. See [/work](/work) for the platform context.
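
Graceful fallback reduces to a small loop. A sketch, with the chain and error handling as illustrative assumptions rather than K.O.R.A.'s actual code:

```ts
type Caller = (prompt: string) => Promise<string>;

// Try models in preference order; degrade, don't die.
async function withFallback(chain: [string, Caller][], prompt: string): Promise<string> {
  let lastErr: unknown;
  for (const [name, call] of chain) {
    try {
      return await call(prompt); // first healthy model wins
    } catch (err) {
      lastErr = err;
      console.warn(`model ${name} failed, falling back`, err);
    }
  }
  throw new Error(`all models in chain failed: ${String(lastErr)}`);
}
```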

</Project>

<Project eyebrow="Operating doctrine" title="How I actually work with AI.">

- **"Mitigations, fallbacks, band-aids are the worst. Find the upstream fix."** *The* doctrine. Encoded in personal AGENTS.md, enforced across every repo.
- **Docs-first.** PRD → spec → architecture → AGENTS.md in every repo. Docs aren't notes; they're load-bearing artifacts the agent reads on every turn.
- **No green-by-inference.** Verify before declaring done. Check the deployed version, inspect the running process, read the Discord channel.
- **Persistent state in files.** TODO + TASK_STATE patterns survive crashes and compactions. Agents revive from disk.
- **AI as real infrastructure, not autocomplete.** Instrument it, test it in tiers (hermetic → real-process → live-canary), and surface its silent failures (codex OAuth heartbeat) the same way you'd surface a dead database; see the sketch after this list.
- **Long autonomous runs are normal.** *"Set me loose all night"* is a working mode, not a stunt.
- **Platform-engineering reflexes are non-negotiable.** Redundancy, failover, quick recovery, auto-scaling, reproducible deployment — with minimal overhead. Higher dollar cost beats higher human-operations cost.
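
For the heartbeat bullet above, a minimal sketch of surfacing a silent failure, assuming a hypothetical status file the codex process touches on each successful OAuth refresh; the path, threshold, and alert hook are all illustrative:

```ts
import { statSync } from "node:fs";

const HEARTBEAT_FILE = "/var/run/codex/oauth-heartbeat"; // hypothetical path
const MAX_AGE_MS = 15 * 60 * 1000; // assumed threshold: 15 minutes

function checkHeartbeat(alert: (msg: string) => void): void {
  try {
    const age = Date.now() - statSync(HEARTBEAT_FILE).mtimeMs;
    if (age > MAX_AGE_MS) {
      // Same severity as a dead database: page, don't log-and-forget.
      alert(`codex OAuth heartbeat stale for ${Math.round(age / 60000)} min`);
    }
  } catch {
    alert("codex OAuth heartbeat file missing; treat as hard outage");
  }
}

setInterval(() => checkHeartbeat(console.error), 60_000);
```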

</Project>

</main>
