No. You cannot. It's the wrong tool for the problem.

That little "add" of yours carries real overhead: the LLM has to emit it as a tool call, inference has to pause while the call resolves, and the result then has to be encoded back into tokens and fed into the context.

A "transformer-native" addition circuit, by contrast? It executes within a single forward pass at trivial cost, produces transformer-native representations, works in both prefill and autoregressive generation, and more. It's cheaper.
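To make that round trip concrete, here's a toy sketch of what a tool call forces the serving loop to do. Everything here is hypothetical mock code standing in for an LLM runtime, not any real API:

```python
import json

# Toy illustration of the tool-call round trip described above.
# All names are hypothetical stand-ins, not a real serving stack.

def mock_generate(prompt: str) -> str:
    """Stand-in for one autoregressive decoding run: the model decides
    to delegate the arithmetic and emits a structured tool call."""
    return '{"tool": "add", "args": [2, 3]}'

def run_tool(call: dict) -> int:
    """Host-side tool execution. Decoding is paused while this runs."""
    assert call["tool"] == "add"
    return sum(call["args"])

def answer_with_tool(prompt: str) -> str:
    # 1. Forward passes until the model emits the tool call.
    raw = mock_generate(prompt)
    call = json.loads(raw)
    # 2. Inference pauses; the runtime resolves the call.
    result = run_tool(call)
    # 3. The result is re-encoded as tokens and appended to the
    #    context for another round of decoding.
    return f"{prompt} {result}"

print(answer_with_tool("2 + 3 ="))  # → "2 + 3 = 5"
```

Three separate stages, two of them serialized around a host round trip, just to get "5" back into the context — versus an addition the network could have done inside the forward pass it was already running.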

reply
I can't tell if this is satire or not. A1 top-tier.
reply
giggle
reply
"smallest supercomputing cluster that can add two numbers"
reply
I mean, yeah, no need to put a bunch of high-powered cars on a circular track to watch them race really close to each other at incredible speeds, causing various hazards, either. Especially since city buses have been around for ages.
reply
I would similarly criticise a race car being used to do a city bus's job of getting a lot of people from point A to B.

Although the converse would be interesting: racing city buses.

reply
Nobody has suggested using this for addition tasks in production. It's an academic exercise. What are you on about?
reply