You can then reconstruct the original image by doing the reverse, extracting frames from the video, then piecing them together to create the original bigger picture
Results seem to really depend on the data. Sometimes the video version is smaller than the big picture. Sometimes it’s the other way around. So you can technically compress some videos by extracting frames, composing a big picture with them and just compressing with jpeg
Interesting, when I heard about it, I read the readme, and I didn't take that as literal. I assumed it was meant as we used video frames as inspiration.
I've never used it or looked deeper than that. My LLM memory "project" is essentially a `dict<"about", list<"memory">>` The key and memories are all embeddings, so vector searchable. I'm sure its naive and dumb, but it works for my tiny agents I write.