by raincole17 hours ago |

by rc16 hours ago|

Isn’t it still? Anecdotally, I work with lots of creators who still prefer it because of its subjective qualities.

by Mashimo16 hours ago|

Whatever happened to Midjourney?

by Lalabadie12 hours ago|

No external funding raised. They're not on the VC path, so no need to chase insane growth. They still have around 500M USD in ARR.

In my (very personal) opinion, they're part of a very small group of organizations that sell inference under a sane and successful business model.

by aenvoker9 hours ago|

Not on the VC path. Not even on the max-profit path. Just on the "Have fun doing cool research" path.

I was a mod on MJ for its first few years and got to know MJ's founder through discussions there. He already had "enough" money for himself from his prior sale of Leap Motion to do whatever he wanted. And, he decided what he wanted was to do cool research with fun people. So, he started MJ. Now he has far more money than before and what he wants to do with it is to have more fun doing more cool research.

by spaceman_202010 hours ago|

Aesthetically, still unmatched

by cubefox27 minutes ago|

Apparently image models have to choose between aesthetics and photorealism. Many aren't good at either.

by echelon7 hours ago|

They're working on a few really lofty ideas:

1. Real-time world models for the "holodeck". It has to be fast, high quality, and inexpensive for lots of users. They started on this two years ago, before "world model" hype was even a thing.

2. Some kind of hardware to support this.

David Holz talks about this on Twitter occasionally.

Midjourney still has incredible revenue. It's still the best looking image model, even if it's hard to prompt, can't edit, and has artifacting. Every generation looks like it came out of a magazine, which is something the other leading commercial models lack.

by wongarsu16 hours ago|

They have image and video models that are nowhere near SOTA on prompt adherence or image editing but pretty good on the artistic side. They lean into features like reference images so objects or characters have a consistent look, biasing the model towards your style preferences, or using moodboards to generate a consistent style.

by vunderba11 hours ago|

A lot of people started realizing that it didn’t really matter how pretty the resulting image was if it completely failed to adhere to the prompt.

Even something like Flux.1 Dev which can be run entirely locally and was released back in August of 2024 has significantly better prompt understanding.

by cubefox23 minutes ago|

Yeah, though there is the same issue the other way round: Great prompt understanding doesn't matter much when the result has an awfully ugly AI fake look to it.

by vunderba16 minutes ago|

That's definitely true, and the medium also makes a big difference (photorealism, digital painting, watercolor, etc.).

Though in some cases, it is a bit easier to fix visual artifacts (using second-pass refiners, Img2Img, ultimate upscale, stylistic LoRAs, etc.) than a fundamental coherency problem.

by cubefox7 minutes ago|

I was disappointed when Imagen 4 (and therefore also Nano Banana Pro, which clearly uses Imagen 4 internally to some degree) had a significantly stronger tendency to drift from photorealism to AI fake aesthetics than Imagen 3. This suggests there is a tradeoff between prompt following and avoiding slop style. Perhaps this is also part of the reason why Midjourney isn't good at prompt following.

by raincole16 hours ago|

Not much, while everything happened at OpenAI/Google/Chinese companies. And that's the problem.

by KeplerBoy16 hours ago|

How is it a problem? There simply doesn't seem to be a moat or secret sauce. Who cares which of these models is SOTA? In two months there will be a new model.

by waldarbeiter15 hours ago|

There does seem to be a moat: infrastructure/GPUs and talent. The best models right now come from companies with considerable resources/funding.

by esperent14 hours ago|

Right, but that's a short-term moat. If they pause on their incredible levels of spending for even 6 months, someone else will take over having spent only a tiny fraction of what they did. They might get taken over anyway.

by raincole13 hours ago|

> someone else will take over having spent only a tiny fraction of what they did

How? By magic? You fell for 'Deepseek V3 is as good as SOTA'?

by Gud12 hours ago|

By reverse engineering, sheer stupidity from the competition, corporate espionage, ‘stealing’ engineers, and sometimes a stroke of genius, the same as it’s always been.

by qingcharles10 hours ago|

They still have a niche. Their style references feature is their key differentiator now, but I find I can usually just drop some images of an MJ style into Gemini and get it to give me a text prompt that works just as well as MJ srefs.

by gamma-interface11 hours ago|

The pace of commoditization in image generation is wild. Every 3-4 months the SOTA shifts, and last quarter's breakthrough becomes a commodity API.

What's interesting is that the bottleneck is no longer the model — it's the person directing it. Knowing what to ask for and recognizing when the output is good enough matters more than which model you use. Same pattern we're seeing in code generation.

by sincerely5 hours ago|

PLEASE STOP POSTING AI GENERATED COMMENTS

by SV_BubbleTime9 hours ago|

SOTA shifts, yes. But the average person doing the work has been very happy with SDXL-based models. And that was released two years ago.

The fight right now, outside of API SOTA, is over who will replace SDXL as the “community preference”.

It’s now a three-way race between Flux2 Klein, Z-Image, and Qwen2.

by echelon7 hours ago|

I'm happy the models are becoming commodity, but we still have a long way to go.

I want the ability to lean into any image and tweak it like clay.

I've been building open source software to orchestrate the frontier editing models (skip to halfway down), but it would be nice if the models were built around the software manipulation workflows:

https://getartcraft.com/news/world-models-for-film