
This is still evolving at a fast clip. ChatGPT image2 imagines a form for every character in a style reference, based only on an orchestrator's observations of each agent's nuances, with no character references or context priors. TL;DR: I thought it was a fun image.
What makes this useful as a field note is the constraint: the model is not working from a traditional character sheet. It infers shape, mood, and grouping from the behavior around the agents. That kind of translation from orchestration context into visual form is still rough, but it is also exactly where the interesting frontier keeps showing up.
The practical read is simple: style references are becoming less like static prompts and more like compressed observations. When the system can turn behavior, roles, and implied relationships into an image that feels coherent, the boundary between prompt, memory, and art direction gets softer.
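To make "compressed observations" concrete, here is a minimal sketch of how an orchestrator's notes about each agent could be folded into one style-reference prompt. Everything in it is hypothetical and for illustration only: the `AgentObservation` fields, the `compress_observations` helper, and the sample agents are assumptions, not the setup behind the image above. It makes no image API call and only builds the text.

```python
from dataclasses import dataclass


@dataclass
class AgentObservation:
    """What an orchestrator noticed about one agent (hypothetical fields)."""
    name: str
    role: str
    traits: list[str]
    works_with: list[str]


def compress_observations(observations, style="a single shared illustration style"):
    """Fold per-agent observations into one prompt string, one line per agent."""
    lines = [f"Draw every character below in {style}."]
    for obs in observations:
        traits = ", ".join(obs.traits) if obs.traits else "no notable traits observed"
        line = f"- {obs.name}: {obs.role}; behaves as {traits}"
        # Implied relationships become grouping hints rather than explicit layout rules.
        if obs.works_with:
            line += f"; usually seen near {', '.join(obs.works_with)}"
        lines.append(line)
    return "\n".join(lines)


# Made-up sample notes, not real orchestration data.
notes = [
    AgentObservation("planner", "breaks work into steps", ["methodical", "calm"], ["reviewer"]),
    AgentObservation("reviewer", "checks results", ["skeptical"], []),
]
prompt = compress_observations(notes)
print(prompt)
```

The point of the sketch is that no line describes what a character looks like. The prompt carries only behavior, roles, and proximity, and leaves the visual form to whatever model reads it.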