When Custom Training Is Actually Worth It

The order of operations

Prompting first: fastest iteration loop, cheapest experimentation, easiest rollback.
RAG second: closes knowledge gaps without changing model weights.
Fine-tuning third: good for style consistency, formatting reliability, and correcting behavior errors that keep recurring.
Custom training last: reserved for domain behavior that the earlier layers (prompting, RAG, fine-tuning) cannot reach.
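
The ordering above can be sketched as a small escalation ladder. This is a minimal illustration, not a prescribed implementation: the `Step` enum and the `next_step` helper are hypothetical names introduced here, and the only logic is "pick the cheapest option that has not been tried yet".

```python
from enum import IntEnum
from typing import Optional, Set


class Step(IntEnum):
    """Hypothetical labels for the four options, cheapest first."""
    PROMPTING = 1
    RAG = 2
    FINE_TUNE = 3
    CUSTOM_TRAINING = 4


def next_step(tried: Set[Step]) -> Optional[Step]:
    """Return the cheapest step not yet tried, or None if all are exhausted."""
    for step in Step:  # IntEnum members iterate in definition order
        if step not in tried:
            return step
    return None


if __name__ == "__main__":
    # Prompting and RAG did not fix the failure, so fine-tuning is next.
    print(next_step({Step.PROMPTING, Step.RAG}).name)
```

The point of the sketch is that custom training is only returned once every cheaper option is already in `tried`.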

Rule: if you cannot reproduce the failure in an eval set, you are not ready to train. You are still debugging requirements.
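
One way to make this rule concrete is a gate that only reports "ready" when at least one eval case fails on every run. The sketch below is an assumption-laden illustration: `failing_cases`, `ready_to_train`, the `(prompt, checker)` case shape, and the default of three runs are all hypothetical choices, and `model` stands in for whatever callable wraps your actual system.

```python
from typing import Callable, List, Tuple

# Hypothetical case shape: a prompt plus a checker that returns True on a pass.
EvalCase = Tuple[str, Callable[[str], bool]]


def failing_cases(model: Callable[[str], str],
                  cases: List[EvalCase],
                  runs: int = 3) -> List[str]:
    """Return prompts that fail on every run, i.e. reproducible failures."""
    reproducible = []
    for prompt, passes in cases:
        results = [passes(model(prompt)) for _ in range(runs)]
        if not any(results):  # failed all runs, not just intermittently
            reproducible.append(prompt)
    return reproducible


def ready_to_train(model: Callable[[str], str],
                   cases: List[EvalCase],
                   runs: int = 3) -> bool:
    """Gate: training is only on the table if a failure reproduces."""
    return len(failing_cases(model, cases, runs)) > 0


if __name__ == "__main__":
    # Stub model for illustration only: it just upper-cases the prompt.
    stub = lambda prompt: prompt.upper()
    cases = [
        ("abc", lambda out: out == "ABC"),     # passes
        ("date", lambda out: out == "2024"),   # fails every run
    ]
    print(failing_cases(stub, cases))
    print(ready_to_train(stub, cases))
```

Intermittent failures are deliberately excluded here; whether they should count is a requirements question, which is the kind of thing the rule says to settle before training.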

Situation	Best first move	Training needed?
Wrong answers because source facts are missing	RAG with source quality controls	Usually no
Inconsistent output format across repeated tasks	Prompt contract + schema validation	Sometimes (fine-tuning)
Missing domain-specific reasoning patterns	Task decomposition + eval harness	Possibly yes
Edge-case safety failures under pressure	Red-team evals + policy scaffolding	Maybe, after controls are in place
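
As an illustration of the "prompt contract + schema validation" row, the sketch below checks a model's raw output against a small contract using only the standard library. The `REQUIRED_FIELDS` contract and the `validate_output` helper are hypothetical; a real project might use a dedicated schema library instead.

```python
import json
from typing import List

# Hypothetical contract: field name -> expected Python type after JSON parsing.
REQUIRED_FIELDS = {"title": str, "tags": list}


def validate_output(raw: str) -> List[str]:
    """Return a list of contract violations; an empty list means the output passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors


if __name__ == "__main__":
    print(validate_output('{"title": "Q3 report", "tags": ["finance"]}'))
    print(validate_output('{"title": 1}'))
```

If violations persist across repeated tasks even with the contract stated in the prompt, that recurring, measurable failure is the kind of evidence that can justify fine-tuning.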