Weird that they restrict the resolution so much. Does it fall apart with more detail (when zoomed in) or does the cost just skyrocket?
reply
It's usually down to what they've been trained on. There aren't many models that'll do higher resolutions outside of Seedream, and even there prompt adherence is worse.
reply
Processing power, not training. The larger the scene in 2D, the more you need to compute; the resolution itself is not flexible. Imagine painting a plain white canvas: the model still runs a pixel-per-pixel algorithm that burns GPU power, even though it would be the easiest thing to produce without one.

You can create larger images by creating separate parts you recombine. But they may not perfectly match their borders.

It is a compute-complexity (big-O) thing, not a training thing. The whole idea of a generative model is to work on the unknown.
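A minimal sketch of the tile-and-recombine idea mentioned above (hypothetical tile size and overlap; real pipelines blend the overlapping seams, e.g. with a linear alpha ramp, and this assumes the image is at least one tile in each dimension):

```python
def tile_coords(width, height, tile=1024, overlap=128):
    """Yield (left, top, right, bottom) boxes that cover the full image.

    Adjacent tiles share `overlap` pixels so the seams can be blended
    when the separately generated tiles are recombined.
    Assumes width >= tile and height >= tile.
    """
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Make sure the last row/column of tiles reaches the image edge.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]
```

For a 2048×2048 target this yields a 3×3 grid of 1024² tiles, each overlapping its neighbours by 128px.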

reply
It depends on the model. Diffusion models, which are among the more popular approaches, are typically trained at a specific image resolution.

For example, SDXL was trained on ~1MP images, which is why, if you try to generate much larger than 1024×1024 without techniques like a high-res fix or region-by-region image-to-image, you quickly end up with Cthulhu nightmare fuel.
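As a rough illustration of that 1MP budget: given a desired aspect ratio, pick a width/height whose area stays near 1024×1024, rounded to a multiple of 64 (a common latent-size constraint; the exact resolution buckets SDXL was trained on are a fixed list, this just approximates them):

```python
def near_1mp(aspect, budget=1024 * 1024, multiple=64):
    """Pick (width, height) with width/height ≈ aspect and area ≈ budget,
    both snapped to a multiple of 64 (SDXL's VAE downsamples by 8, and
    many pipelines additionally want dimensions divisible by 64)."""
    def snap(v):
        return max(multiple, round(v / multiple) * multiple)

    height = (budget / aspect) ** 0.5
    width = aspect * height
    return snap(width), snap(height)
```

A 16:9 request lands on 1344×768, which is one of SDXL's known training buckets.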

reply
You'd need a model trained on close-up/macro shots of everything to use for upscaling, then run it, kernel-style, over the whole image.
reply
Exactly what I was thinking
reply
Actually, gpt-image-2 is VERY flexible with resolution: you can request arbitrary dimensions within the max pixel budget.
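One way such a budget might work (a hypothetical helper, not the actual API's logic): only scale the requested size down when it exceeds the budget, preserving aspect ratio.

```python
def fit_budget(width, height, max_pixels=3840 * 2160):
    """Uniformly scale (width, height) down if it exceeds the pixel
    budget; sizes already within budget pass through unchanged."""
    pixels = width * height
    if pixels <= max_pixels:
        return width, height
    scale = (max_pixels / pixels) ** 0.5
    return int(width * scale), int(height * scale)
```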
reply
Generate a lower resolution image and upscale to the resolution you need.
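The simplest version of that is nearest-neighbour upscaling (real upscalers use learned models such as ESRGAN, or the provider's own upscale endpoint, to add plausible detail rather than just duplicating pixels):

```python
def upscale_nearest(pixels, factor):
    """Nearest-neighbour upscale of a 2D grid of pixel values by an
    integer factor: each source pixel becomes a factor×factor block."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(factor)]
        out.extend([wide[:] for _ in range(factor)])
    return out
```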
reply
It can generate 3840x2160
reply
Interesting, I wonder why larger outputs are more expensive than smaller square ones on v2, while it’s the other way around in v1.
reply