I hope there is something in the report for everyone. We included a fair bit on the actual training and data infrastructure, which usually isn't written about much and which I think will be interesting to people here. There's more that didn't fit, so I'm happy to answer questions!
We also had 0-day support from people in the open source community like Ostris and ComfyUI.
We recommend training off the undistilled, Raw checkpoint, and then applying the LoRA to the Turbo model for inference.
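To make the Raw-to-Turbo recommendation a bit more concrete, here is a minimal sketch of why it can work, assuming both checkpoints share the same layer shapes. A LoRA is a low-rank delta (`lora_up @ lora_down`, scaled by `alpha / rank`), so a delta trained against one set of weights can be added onto another. All names and numbers here are hypothetical toy values, not Krea's actual code or API; in practice you would use a LoRA trainer and loader rather than doing this by hand.

```python
# Hypothetical sketch: names, shapes, and values are illustrative only.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(weight, lora_down, lora_up, alpha):
    """Return weight + (alpha / rank) * (lora_up @ lora_down)."""
    rank = len(lora_down)  # lora_down has shape (rank, in_features)
    scale = alpha / rank
    delta = matmul(lora_up, lora_down)  # shape (out_features, in_features)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

# The same toy 2x2 layer taken from two checkpoints with identical architecture.
raw_weight = [[1.0, 0.0], [0.0, 1.0]]    # stands in for the undistilled checkpoint
turbo_weight = [[0.9, 0.1], [0.1, 0.9]]  # stands in for the distilled checkpoint

# A rank-1 adapter, imagined as having been trained against raw_weight.
lora_down = [[1.0, 2.0]]   # shape (rank=1, in_features=2)
lora_up = [[0.5], [0.25]]  # shape (out_features=2, rank=1)

# At inference time the same delta is added to the distilled weights instead.
turbo_with_lora = apply_lora(turbo_weight, lora_down, lora_up, alpha=1.0)
```

The delta itself never references the base weights, which is what makes the transfer mechanically possible. How well it transfers depends on how far the distilled weights have drifted from the undistilled ones.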
The original Flux.1 Krea is actually in my GenAI Showdown benchmark site from all the way back in July of last year (which feels like a lifetime in this space), so I’m looking forward to putting this new one through its paces.
I am Diego Rodriguez, Co-founder & CTO at Krea.
We are releasing the weights and a _juicy_ technical report, at least by current industry standards. In it we describe data curation/captioning, model architecture, post-training, RL pipelines, prompt expansion, style references, and our infrastructure in great detail.
When it comes to the weights themselves, there are actually 2 releases:
* Krea 2 Turbo. This model is both guidance- and timestep-distilled for faster inference.
* Krea 2 RAW. This model is actually meant to be hackable/fine-tunable.
One of the things we think the (open) LLM community does well is release models in different sizes and also at different stages of the training pipeline; we are releasing two checkpoints, at the mid-training and post-training stages. This is rare in the image & multimedia community, so we can't help but feel proud of this release.
We are on par with Nano Banana in terms of image quality as per Artificial Analysis text-to-image benchmarks (https://artificialanalysis.ai/image/leaderboard/text-to-imag...).
We also attached a permissive license for individuals and small businesses.
Useful links:
- Marketing page around the OSS release: https://www.krea.ai/krea-2-open-source
- Hugging Face model: https://www.krea.ai/krea-2/huggingface
- GitHub repository: https://www.krea.ai/krea-2/github
- Reddit AMA: https://www.reddit.com/r/StableDiffusion/comments/1udnm0a/we...
- Technical report: https://www.krea.ai/blog/krea-2-technical-report
Thank you and I hope you enjoy this release. Happy hacking!
Some of our team members will be answering questions since we are on the front page for now (thank you HN!).
Happy hacking!
I also like the "keep the manifold wide" approach of trying to make a model capable of many styles as opposed to getting it "dialed in" for a dozen style presets.
But it does feel very much like "fighting the past war" - now that advanced "image-to-image"/"agentic composition" models like Nano Banana 2 or Images 2.0 are out there in force.
I seriously doubt that the basic Qwen 3 VL in cross can get anywhere near that level of I2I. And robust I2I is very desirable - editing, adjustment, character consistency, the generalization of whatever you're doing with style transfer now (underexplained BTW).
Trying to hit that level of I2I is not by any means easy, but it's pretty clear to me that this is where the next frontier for image models lies. Feels like Ideogram might be building up to it, but I've yet to see it anywhere else in the open-weight space.
Also, we are on par with them in t2i benchmarks; check the Artificial Analysis link I posted in my top comment.
And you cannot re-train Nano Banana or ChatGPT to understand your brand, which is what our customers complain about constantly.
Plus open-source! It’s hard to do an apples-to-apples comparison.
"Edit model" is a part of it, yes. So is style transfer. But less as an endpoint and more of a subset of what advanced I2I enables.
"Re-train to understand your brand" is a fine marketing pitch, but in practical terms, it's hard to justify burning a LoRA for most uses. Enthusiasts absolutely do it, but enthusiasts are built different. Robust I2I can accomplish a lot of the same, but with a workflow that's closer to "drag and drop your references" than to "try to get a LoRA to do what you wanted it to do on a very slim set of images".
Modern LoRA pipelines are getting closer to "reliable" and "braindead simple", but you can't escape the "wait N hours for the GPUs to churn" of fine-tuning no matter what you do. And iteration time kills - a lot of the value of AI in workflows is that it does what it does fast and allows you to iterate at speed.
You can think of "LoRA vs I2I" as an image twin of "SFT vs in-context learning" in LLM land. Both are useful, neither substitutes for the other fully, but there's a reason why most reach for the latter way before they reach for the former.
I like the T2I from what I've seen, mind. Perhaps more than Images 2.0 or even NB2. I just think that focusing solely on T2I to the exclusion of advanced editing and composition capabilities is a very 2024 thing.
I tried two of the Krea 2 models in LM Studio, but loading the downloaded models errored out. (Maybe I'm doing it wrong, since it's an image model.)
Previously: https://news.ycombinator.com/item?id=47800562
You can try it right away at krea.ai/image (warning: you need to sign up).
May I ask how much the training cost you?
Please edit out swipes from your HN comments, as the guidelines request: https://news.ycombinator.com/newsguidelines.html.
Edit: your account has unfortunately been breaking the site guidelines like this in other places as well (e.g. https://news.ycombinator.com/item?id=48567675). Can you please fix this? I don't want to ban you, but we've already had to ask you this before.