upvote
I'm not too familiar with the hardware world, but does EML look like the kind of computation that's hardware-friendly? Would love for someone with more expertise to chime in here.
reply
Yes actually, it is very regular which usually lends itself to silicon implementations - the paper event talks about this briefly.

I think the bigger question is whether it will be more energy-optimal or silicon density-optimal than math libraries that are currently baked into these processors (FPUs).

There are also some edge cases "exp(exp(x))" and infinities that seem to result in something akin to "division by zero" where you need more than standard floating-point representations to compute - but these edge cases seem like compiler workarounds vs silicon issues.

reply
A similar function operating on the real domain for powers and logs of 2 would be extremely hardware friendly. You can build it directly out of the floating point format. First K significand bits index a LUT. Do that for each argument and subtract them.

It gets a bit more difficult for the complex domain because you need rotation.

reply
This paper seems to suggest that a chip with 10 pipeline stages of EML units could evaluate any elementary function (table 4) in a single pass.

I'm curious how this would compare to the dedicated sse or xmx instructions currently inside most processor's instruction sets.

Lastly, you could also create 5-depth or 6-depth EML tree in hardware (fpga most likely) and use it in lieu of the rust implementation to discover weight-optimal eml formulas for input functions much quicker, those could then feed into a "compiler" that would allow it to run on a similar-scale interpreter on the same silicon.

In simple terms: you can imagine an EML co-processor sitting alongside a CPUs standard math coprocessor(s): XMX, SSE, AMX would do the multiplication/tile math they're optimized for, and would then call the EML coprocessor to do exp,sin,log calls which are processed by reconfiguring the EML trees internally to process those at single-cycle speed instead of relaying them back to the main CPU to do that math in generalized instructions - likely something that takes many cycles to achieve.

reply
You could also make an analog EML circuit in theory, using electrical primitives that have been around since the 60s. You could build a simple EML evaluator on a breadboard. Things like trig functions would be hard to reproduce, but you could technically evaluate output in electrical realtime (the time it takes the electrical signal to travel though these 8-10 analog amplifier stages).
reply
deleted
reply