I agree, I don't think it is the core problem.

Meta doesn't seem to be able to produce anything close to a frontier model. The selling of compute capacity seems to be acceptance of "compute is wasted on this crappy avocado model, we'd be better off allowing something better to run".

The problem is clearly in the model architecture, the training, and the data fed into the model, and that is what's causing them to give up on using their compute exclusively for their own models. They can't get it right, so they may as well sell the compute to someone who can.

reply
If their training base is dominated by Facebook and Instagram posts then it makes sense that their model is full of shit.
reply
A modern instance of that old saw "you are what you eat".
reply
Meta has made some very strange decisions in terms of who it's hired to lead various aspects of AI, including the model-building efforts. Also lots to marvel at re: its ability to coordinate (or not coordinate) various efforts by all these big brains.

Can't help but think that Meta's digital networking expertise is built atop a human-networking clusterf*ck.

reply
I was never really sold on their acquihire of Alexandr Wang as head of AI being a coherent strategic decision. I just don’t see how his experience and background actually apply to building frontier LLMs.

I think there would easily be a few hundred other engineers and execs at frontier labs who are more in the loop on cutting-edge architecture/secret sauce, with a track record of actually doing it, and who could be had for a fraction of the price.

reply
From the outside, Meta's attempt to pivot from open source releases to fast-follow closed models fell flat when they tried to monetize them prematurely. They could have owned the open-weight model world but tried to pivot to closed-weight chatbots before an actually viable revenue model appeared.
reply
Does Meta have the research talent to create a SOTA frontier model? Yann LeCun has left Meta and I don’t think either Alexandr Wang or Zuck has enough credibility to attract the talent to create one.
reply
It's possible Yann LeCun wasn't the right guy either. He seemed to be more focused on finding the next model architecture than on iterating on the current LLM architecture to build a competitive frontier model.
reply
If Meta is selling their compute and Twitter is selling their compute and the stuff doesn't do anything, you don't need an economics degree to figure out what's going to happen to the price of compute. In particular because 'compute' is a euphemism: this is far from general-purpose capacity, those are specialized chips that largely do one thing.

All these companies are going to sit on their gazillion data centers once the mania dies down, and they will have a big problem figuring out what to do with their mountain of hardware.

reply
Well, Google refused to increase Meta's quota of tokens. Even Google can't supply as many (paid) tokens as Meta is burning.
reply