Nvidia Nemotron is also an openly trained model, though a portion of its dataset remains proprietary.
Quoting lambda's comment:
> Note that the Nemotron models are generally stronger than Olmo and K2 Think V2 (according to Artificial Analysis benchmarks), and there is a lot of overlap in their datasets (lots of datasets are based on the same sources with different filtering, Olmo and K2 Think V2 both have used some Nemotron datasets).
> But yeah, Nemotron is a modern and fairly capable LLM, even the 122b is more capable than Deepseek R1 (a 671b model) on most benchmarks, and there's also the recently released 550b Ultra now.
In fact, if the frontier companies had taken their approach, it would have started much slower, but I think we would be far more advanced by 2035. Instead we have a majority of society that wants to see AI fail.
Do you talk to regular people? I work out of coffee shops routinely and literally like 90% of laptops have ChatGPT or Claude open. I was shocked at how many of my friends love the silliest of AI features (like a Slack bot summarizing your day or your upcoming meetings), and a lot of decks, proposals, SOWs, etc. are (at least in part) generated with AI these days.
Young people who want to have secure jobs and who have any kind of experience with creativity see AI coming for their livelihoods and their joy simultaneously.
Middle-aged IT industry people like me, many of us are grudgingly learning it but believe it to be an obvious net negative the way it is currently deployed; it feels like we're automating all the wrong stuff.
I wouldn't go around talking as if people think AI is great. A solid proportion of the population would be tempted to push AI influencers under buses and trains.
Everyone wants at least some of the utility.
Few want to reach the end of the road we’re said to be walking: AI companies and the CEOs of megacorps. Everyone else is being sold a doomsday scenario (true or not).
https://www.pewresearch.org/chart/about-a-quarter-of-u-s-adu...
It all, of course, depends on what people mean by "AI" (I think the question basically defeats itself, it's akin to asking someone about "databases", given that it covers image generation, self driving cars, TikTok feeds, drug discovery and chatbots) but AI sentiment at large is more negative than positive.
https://www.pewresearch.org/chart/americans-predict-ais-impa...
So, depending on where you sit: Sure, most people will use "AI", meaning a chatbot (probably ChatGPT: https://www.pewresearch.org/chart/americans-report-using-cha...). 90% in coffee shop land, why not.
But that does not mean that they are not weary of the consequences, and are growing more weary. I think, predictably, the better situated you are and the more your direct livelihood is at stake. That's just the animal we are.
Does that mean that we should have slowed down? Matter of opinion. My take: Absolutely not. The people who need it the most around the world will have dramatically improved lives, because of access to better medical advice or information about institutions and systems, to start things and help them in their daily lives.
But this is one of those unique situations where wary (cautiousness, concernedness, preparedness, tinged with fear) and weary (exhaustion with a mental component) are overlapping into one horrible thing.
So I'm not correcting you because I think basically both are right: we are going through both of these at once because anxiety is what Scam and Wario in particular are selling.
I was at my daughter's football game, and another father from the club came up to me and asked if I were in IT and knew how AI worked. He then asked if I could help him set up an AI agent to generate passive income.
We're at the equivalent of December 2017 for crypto. Hang on to your hats!
Was it a two-part question converted into one with a gate at the beginning, or was it a general question about occupations and abilities?
I hate cars but I still drive to the office 1x / week because I have to.
Or is it just vibes?
https://gizmodo.com/people-hate-ai-even-more-than-they-hate-...
Was discussed just recently, and there are multiple articles and surveys on AI sentiment.
IsaacSim was (and might still be) the best robotic learning sim and I ran MLAgents.
It's always funny to see people tempted to call open-blobs/open-weights, which are literally shareware like WinRAR or Adobe PDF Viewer, open source, and then need to invent a new term for what is actually open source.
I empathize with this, but I'm curious what would make any other country a better safe haven for your data. I personally like the EU's approach to data safeguards, but are there other locales/data protections you have in mind that would keep your data "safe"?
I purchase open model tokens for agent programming assistance, and I like lumo+ for everything else.
Another option is DuckDuckGo’s Duck.ai subscription, but I slightly prefer ProtonMail’s lumo+ packaging as a product.
How about deporting people without a hearing or opportunity to present evidence about their charges. And then violating the judge's order to turn the planes around.
How about systematically ignoring judicial rulings.
How about detaining people based on the color of their skin and spoken language/accent.
How about violating the emoluments clause of the constitution by accepting a personal airplane.
How about sending your son-in-law, who hasn’t been appointed to any office with the advice and consent of congress as required by the constitution.
How about refusing to seat elected congress members for reasons for months.
How about singling out companies like Intel for targeted trade restrictions and then demanding equity in order to lift them.
What about threatening to delay or deny a merger of a media company unless your ally is allowed to buy them.
What about refusing to enforce the TikTok ban until you can arrange a buy out to an ally.
What about a formal market with a known price for pardons and commutations.
What about starting multiple wars without congressional approval.
What about creating a fake department named Doge that withholds funds apportioned by congress and breaks contracts that have explicit obligations for payment that results in more termination fees and losses than the savings. All without congressional approval.
How about threatening to withhold federal funds from states with governors of the opposing political party but not your own? Remember the president is supposed to execute the law congress passes not make law or arbitrarily enforce it based on their own political needs or values.
Not to detract from your general point about the US, your first point is something that's happened recently in Switzerland:
https://truthout.org/articles/swiss-police-arrest-deport-pal...
[1] https://www.bvger.ch/en/newsroom/media-releases/fedpol-must-...
There are always incidents in all democracies with millions of people, that contravene the expectations of rule of law and integrity of its systems.
The US has degenerated significantly in the past few years, to the point that when someone asks “can you give examples”, I expect a disingenuous ploy more than genuine ignorance. The list of breaches is so long, that listing it results in numbness and exhaustion of the mental muscles responsible for being aghast.
Compel you to reveal your secrets, including your passwords by threatening to arrest and detain you without legal proceedings for an unspecified period.
Deny your basic human rights, particularly at the borders, especially if you aren’t a citizen.
And more.
It is a commonly accepted "fact" right now, outside the US, that the US is not to be trusted (right now), due to some orange guy, and his mates, manipulating markets, running their mouths, doing all kinds of criminal and/or infantile shit.
I'd say there is quite a bit of evidence for this all around.
I think it’s valid to not trust the US with your data. But if the reason is some TDS “Orange Man Bad”, it’s you that’s acting infantile.
Ask Intel, Paramount, TikTok or Anthropic if they feel law will be applied equally to all companies.
Ask the blue states that had FEMA funding withheld when it went to red states.
Ask black families that haven’t gotten reparations when Jan 6 rioters that beat and killed cops to overturn an election will get almost $2b in reparations and then had the Supreme Court throw out their votes in Louisiana in the middle of an election to overturn the voting rights act, redraw districts, overturn their own case law and the principle that judicial review shouldn’t happen too close to an election so they could redraw the districts.
Business leaders are sucking up to curry favor. That by definition isn’t the rule of law; it’s the rule of dispensation. It’s the spoils system.
If you have a counter argument you’d better make it now or you will tip your hand.
Frankly, I'm surprised there's not more urgency on the part of Europeans to reduce dependence on US tech. I don't like it. I'm an American in tech. But, the US can't be trusted, at this time. And, given how irresponsible tech leadership has been, in kowtowing to Trump, I don't see how they can reasonably be trusted, either.
I invest in startups, and companies at every stage are losing contracts in Europe specifically for this risk. I can’t say who, but it’s a multi-front trend.
I am also assembling the largest in-home robotics training data set available, which will be open source.
Want to help?
I was hoping the European AI companies and projects like Mistral and Apertus would, you know, do something good. But their models trail not only US models, but Chinese models, including smaller ones, by a significant amount. I guess there's also the ethical component. Mistral is reportedly not plagiarizing like US companies, and isn't distilling US models like the Chinese companies. Cheating gives one a leg up if there are no referees.
Anyway, I work for a robotics company, and I'm always interested in what's happening with open robotics stuff, including AI.
And really, the topic here is preventing a transgressive President from infringing on tech activities elsewhere (it used to be mainly about surveillance, but then Trump happened).
They decided that spying on me in a commune in Hawaii, and then following me after to other public spaces was fine. I'm certain something was put in my food based on behavior I saw in communal meals, and I can't say I took video or photo evidence though I wish I did.
I'm of Pakistani descent, formerly held a secret clearance, and I did not break any oaths or violate any laws, though the way I was treated was certainly how the above person described rule of law: our spy agencies, for example, operate completely without accountability and regularly commit atrocious behavior against US citizens beyond just me.
Let's say Gemini gets to AGI by tomorrow, will my Google account access, or Gemini apps access and data be blocked if I'm not a US citizen? (Anthropic did it with a 5% better model).
If the US is classifying model access based on citizenship, that's similar to treating it as a Defense capability.
You can already imagine Anthropic working with a bunch of shady brokers to "remedy" this situation.
This particular order wasn't actually about citizenship at all. It seems the administration simply believed restricting the order to non-citizens would make it easier to defend in court, but they made it knowing full well that the only way to implement it would be to completely shut off access for everyone.
Stallman was correct in the 80s and is correct now about libre software
From a practical perspective, I'm not sure any servers are safe anywhere...depending on who may want your data.
I'm surprised there isn't a lot more attention to encrypted, distributed, erasure-encoded stores.
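For anyone unfamiliar with the erasure-coding part of that idea, here is a minimal sketch, assuming the simplest possible scheme: k data shards plus one XOR parity shard, which survives the loss of any single shard. Real stores typically use Reed-Solomon codes to tolerate several losses, and would encrypt the data before sharding; neither is shown here. The function names `encode` and `decode` are made up for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int):
    """Split data into k equal shards and append one XOR parity shard.

    Returns (shards, original_length). Each shard could live on a
    different server; any single shard can be lost and rebuilt.
    """
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity], len(data)


def decode(shards, original_length: int) -> bytes:
    """Rebuild the data; at most one entry of shards may be None (lost)."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single-parity XOR can only recover one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing one.
        shards[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and the zero padding.
    return b"".join(shards[:-1])[:original_length]


if __name__ == "__main__":
    shards, n = encode(b"hello distributed world", k=4)
    shards[2] = None  # simulate one server disappearing
    print(decode(shards, n))
```

The point of the design is that no single server holds the whole file, and losing one does not lose the data.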
> What most people miss IMO is that this is not a team who is doing this for the fourth time like virtually any other LLM provider and who could learn from its own past experiences. I bet if the team would do another model training they could get way better results at one fourth of the costs.
i doubt they are including a lot of training data labeled with the language.
"how to say X in language Y" is a different task from saying X in language Y
My last hope for sovereign AI is from Chinese open models
If you want to mix models like this, check out https://github.com/deepbluedynamics/nemesis8
The way forward would be open-source, open-data and open-recipe models like this, possibly someday even with the training, if not the inference, being crowdsourced like the BitTorrent model.
Lastly, even Chinese models (GLM, Deepseek, MiMax) work really, really well, and any user would testify that they do not miss OpenAI/Anthropic/Gemini at all when using them. That is argument enough that, with models like this one, no one is going to miss the Chinese models either.
The training data and the Apertus LLM may contain or generate information that directly or indirectly refers to an identifiable individual (Personal Data). You process Personal Data as independent controller in accordance with applicable data protection law. SNAI will regularly provide a file with hash values for download which you can apply as an output filter to your use of our Apertus LLM. The file reflects data protection deletion requests which have been addressed to SNAI as the developer of the Apertus LLM. It allows you to remove Personal Data contained in the model output. We strongly advise downloading and applying this output filter from SNAI every six months following the release of the model.
Also even after you do that, and start a chat, you currently get:
"JSON.parse: unexpected character at line 1 column 1 of the JSON data"
so it's not quite there yet.
Who confirms those requests are legit?
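To make the hash-file output filter from the license text above more concrete, here is a rough sketch of how such a filter could work. Everything about the format is an assumption on my part: I'm assuming one SHA-256 hex digest per line, computed over a lowercased, punctuation-stripped phrase, and the names `normalize`, `load_blocklist` and `filter_output` are hypothetical. The actual SNAI file format and matching rules may be completely different.

```python
import hashlib
import string


def normalize(phrase: str) -> str:
    """Assumed normalization: trim whitespace and punctuation, lowercase."""
    return phrase.strip().strip(string.punctuation).lower()


def sha256_hex(phrase: str) -> str:
    """Hash a phrase the way the (assumed) blocklist file was built."""
    return hashlib.sha256(normalize(phrase).encode("utf-8")).hexdigest()


def load_blocklist(lines) -> set:
    """Read one hex digest per line, ignoring blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def filter_output(text: str, blocked: set, max_ngram: int = 3,
                  mask: str = "[REDACTED]") -> str:
    """Mask any run of up to max_ngram words whose hash is blocklisted."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase first so full names win over single words.
        for n in range(min(max_ngram, len(words) - i), 0, -1):
            if sha256_hex(" ".join(words[i:i + n])) in blocked:
                out.append(mask)
                i += n
                break
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    blocked = load_blocklist([sha256_hex("Jane Doe")])
    print(filter_output("Contact Jane Doe today", blocked))
```

Note that the filter only ever sees hashes, so whoever applies it cannot read the names in the deletion requests. This sketch also collapses whitespace and drops punctuation attached to a masked phrase.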
Why the emphasis on sovereign? Open is good enough. No?
Why do we need capabilities in Europe? Because Trump and Xi can't be trusted to keep providing us with new frontier models in the next years.
Nothing below that really seems to be good for anything other than training for specific tasks. I have not been impressed by the earlier Apertus 8B model, which doesn't feel like it really responds to nudges.
I am a strong believer in smaller models, so I might try one of these out of curiosity to see if it might do useful things in limited contexts.
> Fully open model: open weights + open data + full training details including all data and training recipes
There are equally open, much more useful models out there: https://artificialanalysis.ai/?models=nvidia-nemotron-3-ultr...
That doesn't mean much to the many people I know of who refuse to use a technology that they see as being unethically created using the work of others without compensating them.
I continue to hope that someone will train a "vegan" model on licensed or out-of-copyright data so those people can experience the benefits of this class of technology.
(I compare them to vegans because, like vegans, I think their ethical position is credible and has merit even though I do not choose the same ethical framework for myself.)
How many normal people do you know who use "ChatGPT"? A lot, probably.
How many even know what "Gemma" is, let alone have downloaded llama.cpp, a GGUF file from Hugging Face, and run "llama-server" from a text console with all the correct command arguments? How many are thinking about this use case when speccing out their next computer? Where is the breathless marketing copy boasting x tok/s?
We are sleepwalking into slavery.
Yes, I realise this isn't "running a local model", but it's using models that can be grabbed and run locally. For my pipelines, I feel far more confident when I use an open model (even one like GLM-5.2 that would be expensive for me to run) since I have a backup plan if the hosted/cloud option becomes unworkable for me. If that happens to me with Opus, I have zero options.
This choice is made for us. The deciding factors will be convenience and economics.
My sense is that just like Web 2.0 SaaS we are destined for servitude.
A better strategy is to play an asymmetrical game IMO. Don't let your would-be master write the rules by which you play.
What do you mean by this? Do you have an example in the given context?
You would also be shocked at what's possible on a 64GB Mac Studio, which isn't that unattainable.
I can see this as a future battleground but access to frontier models (which you cannot run locally) seems a lot more relevant today.
It's important that people get used to the idea that your interactions with a language model are a highly personal thing. LLMs can perceive and categorize us in ways we can't even imagine, far more violently than the simple algorithmic feeds which have already corroded public discourse so much. LLMs can control us. LLMs warp the information landscape more radically than even the internet did. Even now you are likely underestimating their role in future society.
The principles of software freedom are becoming existentially important.
Of course the frontier will always be unattainable, but that's like pointing out that I couldn't buy my own Cray supercomputer.
That’s a bit hyperbolic…
Yep. I'm an old time Linux sysadmin, but I am COMPLETELY baffled as to what I can or cannot run on my 32GB R9700 with 128GB main CPU memory.
If I want something Claude- or Codex-like, what do I use that would be useful? If I want a chat system, what do I use? Images: apparently ComfyUI for setup, but after that what do I do?
I don't even mind spinning up something in the cloud for a bit, but I need to know how I'm going to get data up and down without racking up massive bandwidth charges.
I'd love to do some tinkering, but the field is moving so fast and so full of charlatans that cleaning the dross out is almost impossible.
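On the "what can I even run" question, a back-of-envelope check helps: weight memory is roughly parameter count times bits per weight divided by 8. The sketch below applies that to a 32GB card. The 1.2x overhead factor for KV cache and runtime buffers is my own rough guess, not a measured number, and real usage depends on context length and the runtime.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead: float = 1.2) -> bool:
    """Rough check; overhead is an assumed fudge factor, not a measurement."""
    return weight_gb(params_billion, bits_per_weight) * overhead <= vram_gb


if __name__ == "__main__":
    # Hypothetical model sizes, checked against a 32GB card.
    for params, bits in [(8, 16), (8, 4), (32, 4), (70, 4)]:
        verdict = "fits" if fits_in_vram(params, bits, 32) else "does not fit"
        print(f"{params}B at {bits}-bit: ~{weight_gb(params, bits):.0f} GB, {verdict}")
```

Anything that does not fit on the card can often still run with layers offloaded to the 128GB of system memory, just more slowly.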
I don't have recommendations for images because I haven't played with those.
The jokes write themselves.
no, actually, from the docs it sounds mainly motivated by the country's unique linguistic requirements.
the swiss have no gpus
I can run the 8B version of this swiss-ai model on a ten year old GPU. For the larger one, $2000 consumer hardware can run it fine. Beyond that, there are plenty of places where time on a GPU can be rented, and if the model is good, there will be hardware to run it.
My charitable reading of GP's point is that the bottleneck for true compute sovereignty is the chips, not the models.
There were a number of use cases where we needed to use Gemini (audio modality), and Ultra has been a VERY cost-effective alternative once we got through the nuances.
Not looking good so far
I think a problem with open-weight models is that while you can improve them, you are not going to create the next generation of LLMs by fine-tuning. We are at the mercy of frontier labs for access to SOTA LLMs. For example, Anthropic recently started requiring identity verification for Claude [0], same for OpenAI [1].
If one day China's distillation labs stop releasing their LLMs as open-weight, I doubt American labs will continue to release free LLM weights without that competition.
That's where fully open pipelines shine: they enable the community to create the next generation of SOTA LLMs. That is the only way LLMs truly become sovereign.
This notion that Chinese labs are merely distilling frontier models is quite an unwarranted slur. Those labs have published WAY more useful research than US labs on RL techniques, novel model architectures, training pipelines, etc. They have also hit intelligence-per-parameter densities that US labs have yet to attain.
Apart from that, merely training a model on outputs from another model, off policy and without the logits, doesn’t really work that well.
The Chinese labs know how to build frontier level models. GLM-5.2 shows that they no longer even need Nvidia chips to do it.
Chinese labs are basically just telling everyone, out in the open, what they're doing and how to do it, and the answer from American frontier labs is "Well, they couldn't possibly be getting the results they're getting without just distilling our models," and the American labs aren't even trying to do some of the stuff like DS's aggressive caching to get costs down.
it happens to all models…when the internet is increasingly generated, things happen
I disagree with this use of SOTA, and this topic is why.
Anthropic and OpenAI have “cutting-edge” models. These are beyond the state of the art but they are closed, secretive, hard to quantify.
The “state of the art” is open source, open weights models that can be inspected, studied, shared and critiqued, because that is what is meant by “the art”: it is the knowledge and principles and evidence and materials available to all. The “state of the art” is the highest point of that.
I wish we could make this distinction and stop blessing two secretive, unverifiable loss-making companies with so much power.
(Putting that aside, I suspect, without evidence, mind you, that the endless march to solving models by making them bigger is not the solution anyway.)
Chinese models like GLM are getting better at coding tasks, and they're cheaper. Microsoft's GitHub Copilot has had to switch to token-based billing. The cost of AI has increased since agents came into play. Whoever can offer cheaper tokens to get the task done will win.
Even Microsoft is looking into DeepSeek for cheap tokens.
https://www.axios.com/2026/06/16/microsoft-copilot-cowork-to...
But "state of the art" implies the highest state of general availability, not just in terms of access to some product, but of use of the ideas, concepts, methodologies etc.
Anthropic and OpenAI have "cutting edge" models; the state of the art is behind the cutting edge.
The state of the art is the best open source, open weights model available. More or less by definition.
I am probably tilting at windmills here.
But the way SOTA is generally understood by other users of the language, it refers to exactly the team, technology, & techniques defining the cutting edge in any field, regardless of whether the technology & techniques are available outside of that team...
https://english.stackexchange.com/questions/239963/do-state-...
It's things you would be trained in as part of a bachelor's degree and some graduate coursework.