by ipdashc 1 day ago

by topspin 1 day ago

I have also been thinking about this a lot, and share your belief that this is inevitable.

Taalas has a running demo here: https://chatjimmy.ai/

It's eye-opening: it generated an AVX-512 optimized Mersenne Twister in C in 0.076 s, at 13,706 tok/s. That's too fast for the tok/s figure to be terribly accurate.

by mdp2021 17 hours ago

> It's odd to me that I haven't heard anything about this approach ... I wonder if it's being worked on in secret, if there's something about it that makes it infeasible

The studies and efforts are ongoing and public, and there are technical hurdles to be faced - but the relevant work goes back quite a long way, and there is heightened interest in it now.

It seems that you simply took the "hyped headlines" for the whole of the work.

by ipdashc 4 hours ago

> It seems that you simply took the "hyped headlines" for the whole of the work.

Well, yeah, that's what I'm saying. It's odd that there haven't been any major headlines (customer interest, competitors' announcements, etc) other than their initial demo. Good to hear it's being worked on though!

by mdp2021 3 hours ago

Did we not play with MNIST and place some calculated bets on NNs well before Yann LeCun started the fire with the explosive success of convolutional NNs?

I'd say it pretty consistently starts in the underground.

The real revolution in this context is that it /could/ be done practically - overcoming the hurdles. But as far as interest in the matter is concerned, I'd say there can hardly be greater interest at this stage: making NNs efficient. This must be absolutely evident, as evident as it is that the separation of memory and processor runs against the idea of NNs, and as evident as it is that multiplication is achievable purely physically.

Of course many have seen that and set about studying it. As soon as it becomes optimally practical...

by coder543 16 hours ago

> It's odd to me that I haven't heard anything about this approach since.

It has only been four months since they unveiled their first prototype. I don't understand your confusion. Chip development does not happen overnight...?

Their initial blog post laid out a roadmap, so theoretically they should have another thing to demonstrate this summer.

by ipdashc 4 hours ago

In the sense of interested customers, online discussion, other companies doing the same thing, etc. Of course it takes time to get actual results, but from an outsider's perspective it's surprising that it was basically just their initial demo and that's more or less it so far. Excited to see if they come out with something this summer though!

by mdp2021 16 hours ago

You are focusing on Taalas, but (specific) analogue computing, electronic NNs, compute-in-memory etc. - the field that includes the approach in question - date back to Rosenblatt.

by coder543 16 hours ago

Yes, I’m focused on the topic at hand that the person I replied to was also talking about.

The person I replied to was acting as if Taalas was ancient history. I was pointing out it has only been a few months.

by mdp2021 13 hours ago

I'd say the original remark was more general («this approach (baking LLMs/weights into silicon directly) [... as if] worked on in secret») - which is salient, because when I investigated weeks ago, I found a large number of attempts at CIM and, more generally, at departing from the Von Neumann architecture in order to optimize NN implementations in HW.

Universities are studying, startups are proposing - the «approach» sits below the level of big headlines but is quite lively. Not just Taalas, not just their way - which remains remarkable in the scene, as the HW is achieved, working, online, available... and amazing.

by coder543 7 hours ago

CIM does not bake the weights into silicon. The optimization you can do, down to the last transistor, when the weights are fixed is on an entirely different level from CIM, where you still need general purpose ALUs all over the place.

by mdp2021 4 hours ago

> CIM does not bake the weights into silicon

If that were the extent of the terms, then what could we call "baking the weights into silicon"? Setting parts of the circuits to determined values for multiplication is like printing a Read-Only Memory. (And you compute at it: Compute In Memory.)

> CIM where you still need general purpose ALUs all over the place

If that were so, then why do taxonomists present analogue computing as part of CIM? Ohm's Law does not constitute an "ALU" the way you intend it.

Simply, I used CIM, "Compute In Memory", for lack of a better term - for "store data where you modify it", for "beyond Von Neumann's separation of data storage and processor".
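
To make the "Ohm's Law as multiplier" point concrete, here is a minimal sketch of an idealized analogue multiply-accumulate. It assumes each fixed weight is stored as a conductance G, each input arrives as a voltage V, and the per-device currents I = G * V add up on a shared wire (Kirchhoff's current law, which is not mentioned above but is what performs the summation). The function name and the values are hypothetical, and real devices add noise, nonlinearity, and a scheme for negative weights that this ignores.

```python
def crossbar_column_current(conductances_siemens, voltages_volts):
    """Idealized analogue dot product for one column of a crossbar.

    Each weight is a fixed conductance (the "read-only" part); Ohm's law
    gives the per-device current, and the shared wire sums the currents.
    """
    if len(conductances_siemens) != len(voltages_volts):
        raise ValueError("need one input voltage per stored conductance")
    return sum(g * v for g, v in zip(conductances_siemens, voltages_volts))


# Hypothetical values, chosen only for illustration.
fixed_weights = (1e-6, 2e-6, 3e-6)   # conductances in siemens, set at fabrication
input_voltages = (0.5, 1.0, 0.2)     # inputs in volts
column_current = crossbar_column_current(fixed_weights, input_voltages)
print(f"output current: {column_current:.2e} A")
```

No ALU appears anywhere in this picture: the stored values and the arithmetic live in the same physical elements, which is the sense of "compute at the memory" used above.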

by coder543 3 hours ago

EDIT: It's just not even worth arguing this point, so deleting my original, much longer comment. Abstract taxonomies can claim that Taalas is CIM, but this entirely and utterly misses the point, and misses what makes Taalas' approach special. If you told a room full of chip architects to go build "CIM for AI", they would not build a Taalas-like totally specialized chip, therefore it is not sufficient, and just muddies the conversation from my point of view. People have been doing "CIM" for decades and yet I've never seen anyone build a totally specialized chip at the scale of Taalas. And yes, you can (in theory) build an analog version of any computer, so of course you can build analog CIM, but "analog compute" is not inherently CIM, so conflating the two is just confusing.

by mdp2021 3 hours ago

I can't check everything right now, but for example, the explainer by Rakesh Kumar mentions "Analogue CIM".

And I do not get your rant about "analog computing", which has everything to do with NNs (otherwise, well, prove it): they started with that - they are basically that, in fact. Analogue computing is a very great temptation, since it would solve the inefficiency of digital NNs. Unfortunately, it has drawbacks which are massive for big NNs. Taalas' approach seems to be the best compromise.

by wmf 1 day ago

Good models will require multiple Taalas chips, but Groq and Cerebras also require a lot of chips and that hasn't stopped them.

by ipdashc 4 hours ago

> Good models will require multiple Taalas chips

I guess that makes sense. Is this feasible, or does the added latency between chips kill any of the performance gains?

by wmf 3 hours ago

Using multiple chips seems to work fine for Cerebras and Groq, so it should also work for Taalas. It does sound challenging to reach >10K tok/s, but latency could be below 1 µs, which is a small part of the token budget.
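
As a rough back-of-the-envelope check of those figures: at 10K tok/s each token has a budget of 100 µs, so a 1 µs chip-to-chip hop costs 1% of it. The sketch below just does that arithmetic; the function names and the count of 4 hops per token are hypothetical, picked only to show how the share scales with the number of chips a token has to cross.

```python
def per_token_budget_us(tokens_per_second: float) -> float:
    """Time available per generated token, in microseconds."""
    return 1_000_000 / tokens_per_second


def hop_latency_share(tokens_per_second: float,
                      hop_latency_us: float,
                      hops_per_token: int) -> float:
    """Fraction of the per-token budget spent on chip-to-chip hops."""
    total_hop_us = hop_latency_us * hops_per_token
    return total_hop_us / per_token_budget_us(tokens_per_second)


budget_us = per_token_budget_us(10_000)            # 10K tok/s target
share = hop_latency_share(10_000, 1.0, 4)          # 4 hops is an assumed figure
print(f"budget per token: {budget_us:.0f} us, hop share: {share:.0%}")
```

Under these assumptions the inter-chip latency stays a small slice of the budget; it would only start to dominate if the hop count or per-hop latency were one to two orders of magnitude higher.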