Shockingly, we seem to have found a self-attention mechanism of that quality; it just has the sad property of growing as O(N^2) in compute and memory, where N is the context length.
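A minimal sketch of why the cost is quadratic (toy scaled dot-product attention in NumPy; the shapes and dimensions here are illustrative, not any particular model's):

```python
import numpy as np

def attention_scores(n, d=64):
    """Toy scaled dot-product attention scores for a sequence of length n.

    The score matrix Q @ K.T has shape (n, n), so memory and compute
    grow quadratically in the context length n.
    """
    rng = np.random.default_rng(0)
    q = rng.standard_normal((n, d))  # queries, one per token
    k = rng.standard_normal((n, d))  # keys, one per token
    return q @ k.T / np.sqrt(d)      # shape (n, n)

# Doubling the context length quadruples the score matrix:
print(attention_scores(128).size)  # 16384
print(attention_scores(256).size)  # 65536
```

Doubling N doubles the rows *and* the columns of that score matrix, which is exactly the O(N^2) blow-up.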
Nvidia uses ML for fine-tuning and architecting their chips. This might be one use case.
Another one would be to put EVERYTHING from your company into the context window. That would make it easier to create 'THE' model for every company or person. It might also be safer than training a model on your data, because you never end up with a model that contains all your data, only a memory of it.
But maybe that’s enough tokens to feed in an entire lifetime of user behaviour for the digital-twin dystopia?