by noosphr 14 hours ago

by srean 13 hours ago

You might like having a go at Lush. It has fallen out of favor of late but is a very interesting language/system.

https://scottlocklin.wordpress.com/2024/11/19/lush-my-favori...

by noosphr 12 hours ago

Sounds interesting, but I'm using very sparse, very high-rank tensors, e.g. rank 3 neuron equivalents.

As such pretty much all numerical optimisations are useless for my work. Racket however chugs along happily, if slowly.

by UncleOxidant 13 hours ago

That sounds kind of amazing. But you're not actually doing the machine learning in Racket, are you? Is your Racket code generating other code like PyTorch?

by noosphr 12 hours ago

I'm doing the learning in Racket because the bottleneck is human understanding.

That MNIST takes 30 minutes per epoch isn't a worry when I don't even know what vector addition should look like.

by UncleOxidant 12 hours ago

This is a complete tangent, but since you mentioned MNIST: I accidentally discovered Tsetlin machines this week when someone on r/Julia asked if anyone with an AMD GPU could run the benchmark in their package called Tsetlin.jl. I've got an AMD GPU so I was happy to oblige. Then I looked at what the benchmark was doing: it was training an MNIST classifier to 98% accuracy in 9 seconds - that seemed like a couple of orders of magnitude too fast. I was flabbergasted and wondered what the heck this thing was and that's when I learned about Tsetlin machines. I went on (with the help of Claude) to implement one in an FPGA and again was flabbergasted when it only took 2k LUTs to implement a Tsetlin machine for MNIST classification in hardware.
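For readers who have not met Tsetlin machines, here is a minimal inference-only sketch of why they map so well onto LUTs. It assumes the standard formulation: each clause is an AND over selected input bits and their negations, and a class score is the number of firing positive clauses minus the number of firing negative ones. The clause set below is hand-picked for XOR purely for illustration; it is not taken from Tsetlin.jl or from the FPGA design mentioned above, and training (the automaton feedback that selects the literals) is left out.

```python
# Inference-only sketch of a Tsetlin-machine-style classifier.
# Assumption: clauses are hand-written here; a real machine learns them.

def clause_output(x, include_pos, include_neg):
    """AND of the included literals: x[i] must be 1 for i in include_pos
    and 0 for i in include_neg."""
    return all(x[i] == 1 for i in include_pos) and all(x[i] == 0 for i in include_neg)

def class_score(x, clauses):
    """Sum the polarity (+1 or -1) of every clause that fires on x."""
    return sum(pol for pol, pos, neg in clauses if clause_output(x, pos, neg))

# Hypothetical clause set voting on "x0 XOR x1 is true".
XOR_CLAUSES = [
    (+1, [0], [1]),     # x0 AND NOT x1
    (+1, [1], [0]),     # x1 AND NOT x0
    (-1, [0, 1], []),   # x0 AND x1
    (-1, [], [0, 1]),   # NOT x0 AND NOT x1
]

def predict(x):
    """Classify as True when the vote is positive."""
    return class_score(x, XOR_CLAUSES) > 0
```

Every operation is an AND, a NOT, or a small count, which is the kind of logic that fits directly into FPGA lookup tables.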

by noosphr 12 hours ago

Well yes, you have to use one of the newer MNIST variants these days if you want to get anything meaningful. A linear classifier gets something like 87% on the original one.

by mathisfun123 5 hours ago

> I don't even know what vector addition should look like.

I think you're trying to imply you're inventing something new and racket enables you to explore... But what I read (as someone with a PhD in deep learning that has worked on sparsity) is you actually don't know the prior art and you're using racket as an excuse to reinvent a whole bunch of stuff that already exists in plenty of mature libraries in more mundane languages (including python/pytorch). Which is of course fine for personal growth but please don't oversell racket as a "superpower" - to wit I can manipulate any part of my stack too because it's all written in cpp.

by noosphr 3 hours ago

I once replaced IEEE 754 floating point numbers in a model by balanced ternary floating point numbers.

It took me 20 minutes.

Tell me how you'd do that in cpp?
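As a rough illustration of what a balanced ternary floating point number could look like (a sketch only, not the code referred to above): digits are trits in {-1, 0, 1}, and a value is stored as an integer mantissa times a power of 3. The function names and the default precision of 8 trits are assumptions made for this example.

```python
from fractions import Fraction

def to_trits(n: int) -> list[int]:
    """Balanced-ternary digits of n, least significant first, each in {-1, 0, 1}."""
    trits = []
    while n != 0:
        r = n % 3            # Python's % yields 0, 1 or 2, also for negative n
        if r == 2:
            r = -1           # digit 2 becomes -1 with a carry into the next trit
        trits.append(r)
        n = (n - r) // 3
    return trits or [0]

def from_trits(trits: list[int]) -> int:
    """Inverse of to_trits."""
    return sum(t * 3**i for i, t in enumerate(trits))

def quantize(x: float, precision: int = 8) -> tuple[int, int]:
    """Return (mantissa, exponent) with x ~= mantissa * 3**exponent,
    where the mantissa fits in `precision` trits."""
    if x == 0:
        return (0, 0)
    limit = (3**precision - 1) // 2      # largest magnitude with `precision` trits
    fx = Fraction(x)
    e = 0
    # Lower the exponent while the rounded mantissa still fits...
    while abs(round(fx / Fraction(3) ** (e - 1))) <= limit:
        e -= 1
    # ...and raise it if the value is too large for the current exponent.
    while abs(round(fx / Fraction(3) ** e)) > limit:
        e += 1
    return round(fx / Fraction(3) ** e), e
```

For example, `to_trits(5)` gives `[-1, -1, 1]` (that is, 9 - 3 - 1), and negating a number just flips the sign of every trit, so no separate sign bit is needed.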

by mathisfun123 19 minutes ago

lol the same way we implement all of the reduced precision fp8, fp4 types today: by storing them in the corresponding uint:

https://github.com/ggml-org/llama.cpp/discussions/15095
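A small sketch of the "store it in the corresponding uint" idea, written in Python for brevity although the same bit manipulation carries over to C++. It assumes the FP8 E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities, a single NaN pattern per sign); this is an illustrative choice and not necessarily the format covered in the linked discussion. The encoder is a deliberately naive nearest-value search over all 256 codes, not how a production kernel would do it.

```python
import math

def decode_e4m3(byte: int) -> float:
    """Interpret an 8-bit unsigned integer as an FP8 E4M3 value (assumed layout)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan                          # the only NaN encoding per sign
    if exp == 0:
        return sign * (man / 8) * 2.0 ** -6      # subnormal: no implicit leading 1
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)

def encode_e4m3(x: float) -> int:
    """Return the code whose decoded value is closest to x (brute-force search)."""
    candidates = [b for b in range(256) if not math.isnan(decode_e4m3(b))]
    return min(candidates, key=lambda b: abs(decode_e4m3(b) - x))
```

The stored type is just a byte; all of the floating point semantics live in the encode and decode functions.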

by noosphr 6 minutes ago