Signal-to-system playbooks

Custom Model Training

Practical guidance for teams deciding when a frontier signal deserves a trained model, a dataset, an eval gate, or a safer rollout path.

Core Playbooks

Start with decision quality, then move into data quality, then enforce ruthless evaluation before launch.

Nanochat / SLM Series

Field notes and a four-part series connecting Eric's local-inference experiments, pico-LLM work, nanochat, small-language-model research, and practical ability training.

Field note Writing loop

Write As You Go. Schedule Ahead.

A project-writing field note on capturing lessons while the system is still warm, drafting ahead, and turning publication into a reusable learning loop.

Read field note →
Field note NPU

Tinkering with an NPU

A first look at the Qualcomm Hexagon NPU in the Snapdragon X Elite (X1E-80-100) as a local-inference path for low-cost agentic tokens.

Read field note →
Field note NPU + Gemma

NPU and Gemma4

A local-inference field note on Gemma, ONNX, battery-powered AI, and the harness needed to turn a small local model into a useful agent.

Read field note →
Field note Gemma harness

Experimenting with a Custom Gemma Harness

A short field note on Snapdragon NPU experiments, local Gemma sessions, and the slash commands that make a tiny-model harness usable.

Read field note →
Part 1 Pico models

Why Tiny Specialists Matter

The preserved 64 MB RPG-state experiment reframed around constrained domains, token contracts, failure-first evals, and capability-per-megabyte.

Read part 1 →
Part 2 nanochat

The Depth Dial and Miniseries

How nanochat's depth dial turns training into a comparable family of compute-optimal models instead of one-off checkpoint luck.

Read part 2 →
Part 3 Economics

GPT-2 Economics Under $100

What changes when GPT-2-level capability becomes cheap enough to repeat, and why data and evals become the real constraint.

Read part 3 →
Part 4 Abilities

Training Small Model Abilities

How synthetic data, token-visible task design, and identity tuning turn small models into useful narrow specialists.

Read part 4 →

Operating Principles

  • Use the smallest sufficient intervention: if prompt design solves it, do not train.
  • Ground decisions in production pain: train against real failures, not vibes.
  • Version everything: data, prompts, eval sets, and model artifacts need traceability.
  • Gate every release: no pass on evals means no launch, even when deadlines scream.
  • Measure drift continuously: a model can degrade quietly while dashboards still look pretty.
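
The versioning, gating, and drift principles above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular pipeline: the names `EvalResult`, `release_gate`, `fingerprint`, and `has_drifted`, the manifest keys, and the thresholds are all hypothetical, and a real setup would pull scores from an eval harness rather than hard-coded values.

```python
import hashlib
import json
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class EvalResult:
    """One eval suite's score and the minimum score required to ship (hypothetical shape)."""
    name: str
    score: float
    threshold: float

    @property
    def passed(self) -> bool:
        return self.score >= self.threshold


def fingerprint(manifest: dict[str, str]) -> str:
    """Short, order-independent hash of the data/prompt/eval/model versions in a release."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def release_gate(results: list[EvalResult]) -> tuple[bool, list[str]]:
    """Return (ok, failing_names). Running no evals fails closed."""
    if not results:
        return False, ["<no evals run>"]
    failures = [r.name for r in results if not r.passed]
    return not failures, failures


def has_drifted(baseline: float, recent_scores: list[float], tolerance: float = 0.02) -> bool:
    """True when the mean of recent scores falls more than `tolerance` below the baseline."""
    return bool(recent_scores) and (baseline - mean(recent_scores)) > tolerance


if __name__ == "__main__":
    # Assumed example values, for illustration only.
    manifest = {"data": "v3", "prompts": "v7", "evals": "v2", "model": "ckpt-0142"}
    results = [
        EvalResult("regression", score=0.93, threshold=0.90),
        EvalResult("safety", score=0.80, threshold=0.95),
    ]
    ok, failures = release_gate(results)
    print(f"release {fingerprint(manifest)}: ok={ok}, failing={failures}")
    print("drift:", has_drifted(baseline=0.90, recent_scores=[0.85, 0.86]))
```

The gate fails closed on purpose: an empty result list counts as a failure, so a skipped eval run cannot slip through as a pass. The fingerprint ties each gate decision to the exact data, prompt, eval, and model versions it was made against.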