by Hello999990118 hours ago |

by RobotToaster10 hours ago|

> I'm curious why this isn't getting much attention from larger companies.

I can see two potential reasons:

1) Most of the big players seem convinced that AI is going to continue to improve at the rate it did in 2025. If that assumption is somehow correct, any chip would be obsolete by the time it entered mass production.

2) The business model of the big players is to sell expensive subscriptions, and to train on and sell the data you give them. Chips that allow for relatively inexpensive offline AI aren't conducive to that.

by JKCalhoun11 hours ago|

Apple should have done this yesterday. A local AI on my phone/MacBook is all I really want from this tech.

The cloud-based AIs (OpenAI, etc.) are today's AOL.

by wmf2 hours ago|

https://developer.apple.com/documentation/FoundationModels

by fennecbutt3 hours ago|

They did do it yesterday.

And it produced fake headlines and summaries, which drew threats of lawsuits from the people involved.

Apple usually waits until somebody else has refined a technology to "invent" it, but I guess they couldn't wait for this one.

by Aurornis9 hours ago|

The die size is huge. This isn’t the kind of chip that would go into your MacBook, let alone an iPhone.

It’s for cloud-based servers.

by adeelk938 hours ago|

And computers used to be the size of a room. I think they can get it to iPhone size in the future; this is an early prototype.

by MarsIronPI6 hours ago|

Well, there's a limit to how small we can make transistors with our current technology. As I understand it, Intel is already running into those limits with their new CPUs (they had to redesign the fins IIRC). I can imagine that without an actual breakthrough in chip manufacturing the size could stay large. That's not to say that a breakthrough won't happen, though.

by wmf5 hours ago|

That's the part that people are missing: it won't get smaller. It already required heroic optimization to get 8B on one megachip. Taalas is more expensive but faster. It is cheaper per token when running 24x7 but not cheap to buy. It will never be small and never be cheap.

by JKCalhoun1 hours ago|

"It will never be small and never be cheap."

Will your comment age well? We'll see.

We might all be surprised if (somehow, ternary logic?) models come down drastically in size. It doesn't have to be the hardware getting more dense.

by post-it10 hours ago|

The hardware isn't there yet. Apple's neural engine is neat and has some uses but it just isn't in the same league as Claude right now. We'll get there.

by roncesvalles16 hours ago|

Well, even programmable ASICs like Cerebras and Groq give a many-multiple speedup over GPUs, and the market has hardly reacted at all.

by brainless14 hours ago|

Seems both Nvidia (Groq) and OpenAI (Codex Spark) are now invested in the ASIC route one way or another.

by mips_avatar10 hours ago|

The problem with Groq was that they only allowed LoRA on Llama 8B and 70B, and you had to have an enterprise contract; it wasn't self-service.

by fooker15 hours ago|

> market has hardly reacted at all

Guess who acqui-hired Groq to push this into GPUs?

The name GPU has been an anachronism for a couple of years now.

by IshKebab11 hours ago|

Cerebras gives a many-multiple speedup, but it's also many multiples more expensive.

by theptip4 hours ago|

> I'm curious why this isn't getting much attention from larger companies

I would be shocked if Google isn’t working on this right now. They build their own TPUs; this is an extremely obvious direction from there.

(And there are plenty of interesting co-design questions that only the frontier labs can dabble with. Taalas is stuck working around architectural quirks like “top-8 MoE”; Google can just rework the architecture hyperparameters to whatever gets best results in silico.)

by hrn_frs7 hours ago|

> I'm curious why this isn't getting much attention from larger companies.

Time is money, and when you're competing with multiple companies with little margin for error, you'll focus all your effort on releasing things quickly.

This chip is "only" a performance boost. It will unlock a lot of potential, but startups can't divide their attention like this. Big companies like Google are surely already investigating this avenue, but they might lack hardware expertise.
