the irony of modern software engineering: we spent decades perfecting deterministic algorithms, and now we're basically just shaking a black box and hoping the magic rocks align.
reply
Quantum physics teaches us that at the most fundamental level, reality itself is probabilistic. A probability distribution collapsing to a discrete outcome is a nice parallel between LLM sampling and quantum measurement.
reply
It's a little disturbing, but also very fun to just discover by probing, building and breaking.
reply
This is an AI bot btw. (sarcasm, metaphor that doesn't make sense)
reply
Me or the new account?
reply
Not you!
reply
apparently you can straight up duplicate/add/rearrange layers without changing any of the weights and get better results as well - https://dnhkng.github.io/posts/rys/
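
For anyone wondering what "duplicate/rearrange layers without changing any of the weights" means mechanically, here is a toy sketch. The layers are stand-in functions I made up (`make_layer`, `forward`, and `order` are hypothetical names), and it only shows the plumbing: the same layer object is reused at two positions in the stack. It says nothing about whether the results improve, and I have not checked it against the exact procedure in the linked post.

```python
def make_layer(scale, shift):
    """Build a toy 'layer': a fixed affine map standing in for a trained block."""
    def layer(x):
        return [scale * v + shift for v in x]
    return layer

# Three toy layers with fixed "weights" (made-up values).
layers = [make_layer(2.0, 0.0), make_layer(1.0, 1.0), make_layer(0.5, 0.0)]

def forward(x, order):
    """Run x through the layers in the given order of indices.

    Repeating an index reuses the same layer, so no weights are copied or changed.
    """
    for idx in order:
        x = layers[idx](x)
    return x

original = forward([1.0], [0, 1, 2])     # the stack as built
repeated = forward([1.0], [0, 1, 1, 2])  # layer 1 applied twice in a row

print(original, repeated)
```

The only thing that changes between the two calls is the `order` list; the `layers` list itself is untouched.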
reply
Neat!

> This is probably due to the way larger numbers are tokenised, as big numbers can be split up into arbitrary forms. Take the integer 123456789. A BPE tokenizer (e.g., GPT-style) might split it like: ‘123’ ‘456’ ‘789’ or: ‘12’ ‘345’ ‘67’ ‘89’
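
To make the quoted point concrete, here is a minimal sketch of how the same digit string can come out as different token sequences depending on the vocabulary. It uses a toy greedy longest-match tokenizer, not real BPE (which applies learned merge rules), and both vocabularies are hypothetical, chosen only to reproduce the two splits in the quote.

```python
def greedy_tokenize(text, vocab):
    """Split text by repeatedly taking the longest vocab entry that matches.

    Falls back to a single character when nothing in the vocab matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

# Two hypothetical vocabularies, not taken from any real tokenizer.
vocab_a = {"123", "456", "789"}
vocab_b = {"12", "345", "67", "89"}

split_a = greedy_tokenize("123456789", vocab_a)
split_b = greedy_tokenize("123456789", vocab_b)

print(split_a)  # ['123', '456', '789']
print(split_b)  # ['12', '345', '67', '89']
```

Neither split lines up with place value, which is one way to see why digit-level arithmetic can be awkward for a model to learn from tokens.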

One of the craziest LLM hacks that doesn't get enough love is https://polymathic-ai.org/blog/xval/

xVal basically says "tokenizing numbers is hard: what if instead of outputting tokens that combine to represent numbers, we just output the numbers themselves, right there in the output embedding?"

It works! Imagine you're discussing math with someone. Instead of saying "x is twenty five, which is large" in words, you'd say "x is", then switch to making a whistling noise in which the pitch of your whistle, in its position within your output frequency range, communicated the concept of 25.00 +/- epsilon. Then you'd resume speech and say "which is large".
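
Here is a rough sketch of the input side of that idea as I understand it: every number becomes the same placeholder token, and the number's value scales that token's embedding. The names (`NUM`, `encode`, `embed`, `decode`) and the tiny 3-dimensional embedding are my own assumptions for illustration, not the xVal code. As far as I recall, the real method also rescales values into a bounded range first and uses a separate output head to predict the scalar; both are left out here.

```python
import re

NUM = "[NUM]"  # single placeholder token shared by every number

def encode(text):
    """Replace each number with NUM and keep its value in a parallel list."""
    tokens, values = [], []
    for word in text.split():
        if re.fullmatch(r"-?\d+(\.\d+)?", word):
            tokens.append(NUM)
            values.append(float(word))
        else:
            tokens.append(word)
            values.append(1.0)  # ordinary tokens keep their embedding unscaled
    return tokens, values

# Hypothetical embedding table with made-up values.
embedding = {NUM: [0.5, -1.0, 2.0]}
DEFAULT = [1.0, 1.0, 1.0]  # stand-in embedding for every other token

def embed(tokens, values):
    """Scale each token's embedding by its value; only NUM tokens really change."""
    return [[v * x for x in embedding.get(t, DEFAULT)]
            for t, v in zip(tokens, values)]

def decode(tokens, predicted_values):
    """Fill each NUM placeholder back in with a scalar value."""
    return " ".join(format(v, "g") if t == NUM else t
                    for t, v in zip(tokens, predicted_values))

tokens, values = encode("x is 25 which is large")
vectors = embed(tokens, values)

print(tokens)      # ['x', 'is', '[NUM]', 'which', 'is', 'large']
print(vectors[2])  # [12.5, -25.0, 50.0]
print(decode(tokens, values))
```

The "whistle" in the analogy is the scaled vector at the `[NUM]` position: its direction says "a number goes here" and its magnitude carries the value.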

I think the sentiment is that today's models are big and well-trained enough that receiving and delivering quantities as tokens representing numbers doesn't hurt capabilities much, but I'm still fascinated by xVal's much more elegant approach.

reply
I was having some issues with IP address representation; this might solve it.
reply
This is crazy, thank you for the link!
reply
wow that's fascinating
reply