I experimented with procedurally generating Waldo-style scavenger images using Flux models, with rather disappointing (if unsurprising) results.
If you asked me what I expected, since this one has "thinking", it's that it would have thought to generate the image without Waldo first, then insert Waldo somewhere into that image as an "edit".
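To make that two-step idea concrete, here's a toy sketch. It is not how Flux or any image model works: the "scene" is just a grid of characters, and `generate_scene`, `insert_target`, and `make_puzzle` are names I made up as stand-ins for the "generate without Waldo" and "edit Waldo in" calls. The point is only that doing the insertion as a separate step means you know exactly where the target ended up.

```python
import random

BACKGROUND = ".:*o"  # assumed filler "crowd" symbols; the target symbol is not among them


def generate_scene(width, height, rng):
    """Stand-in for step 1: produce a busy scene that contains no target."""
    return [[rng.choice(BACKGROUND) for _ in range(width)] for _ in range(height)]


def insert_target(scene, sprite, rng):
    """Stand-in for step 2: paste the sprite at a random spot as an 'edit'.

    Returns the edited copy and the (x, y) of the sprite's top-left corner,
    so the puzzle's answer is known by construction.
    """
    height, width = len(scene), len(scene[0])
    sprite_h, sprite_w = len(sprite), len(sprite[0])
    x = rng.randrange(width - sprite_w + 1)
    y = rng.randrange(height - sprite_h + 1)
    edited = [row[:] for row in scene]  # leave the original scene untouched
    for dy in range(sprite_h):
        for dx in range(sprite_w):
            edited[y + dy][x + dx] = sprite[dy][dx]
    return edited, (x, y)


def make_puzzle(width=40, height=12, seed=0):
    """Run both steps with a seeded RNG so the result is reproducible."""
    rng = random.Random(seed)
    scene = generate_scene(width, height, rng)
    sprite = [["W", "W"], ["W", "W"]]  # hypothetical 2x2 'Waldo'
    return insert_target(scene, sprite, rng)


if __name__ == "__main__":
    puzzle, answer = make_puzzle()
    print("\n".join("".join(row) for row in puzzle))
    print("Waldo's top-left corner:", answer)
```

With a real model you'd swap the two stand-ins for a generation call and an inpainting/edit call, but the structure (and the free ground-truth location) would be the same.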