The difference is the layer. sqlite-vec gives you vec_distance_cosine() in SQL. Wax gives you: hand it a .mov file, get back token-budgeted, LLM-ready context from keyframes and transcripts, with EXIF-accurate timestamps and hybrid BM25+vector search via RRF fusion — all on-device.
It's the difference between a B-tree and an ORM. You'd still need to write the entire ingestion pipeline, media parsing, frame hierarchy, token counting, and context assembly on top of sqlite-vec. That's what Wax is.
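For anyone unfamiliar with the "RRF fusion" part: here is a minimal sketch of reciprocal rank fusion over two ranked lists, one from BM25 and one from vector search. This is not Wax's code. The function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are all assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a Wax setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties by doc ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword index and a vector index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists float to the top, and no score normalization between BM25 and cosine distance is needed, which is the usual reason to pick RRF for hybrid search.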
This seems to be based on your not liking that the author said it would be the "sqlite of RAG" (which, notably, does not at all imply the use of sqlite; if anything, it suggests this is an alternative to sqlite).
As for your other question about whether the file format is actually a sqlite database, the readme does address that: https://github.com/christopherkarani/Wax?tab=readme-ov-file#...
The fact that the format is append-only seems to rule out that it is sqlite, but I could be wrong.
Nothing is very clear here. The benchmarks might just be comparing WAL mode on vs. off, or something else entirely; SQLite does not have 150ms latency on such a small database.
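If someone wants to check the WAL on vs. off guess themselves, a rough sketch with Python's built-in `sqlite3` module is below. It is not the benchmark from the repo. The helper name `time_inserts`, the table layout, and the row count are assumptions, and the numbers will vary a lot by disk and OS.

```python
import os
import sqlite3
import tempfile
import time


def time_inserts(journal_mode, n=200):
    """Return (effective journal mode, mean seconds per committed insert)."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "bench.db"))
        # The PRAGMA returns the mode actually in effect, e.g. "wal" or "delete".
        mode = conn.execute(f"PRAGMA journal_mode={journal_mode}").fetchone()[0]
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, body TEXT)")
        conn.commit()

        start = time.perf_counter()
        for i in range(n):
            conn.execute("INSERT INTO t (body) VALUES (?)", (f"row {i}",))
            conn.commit()  # one transaction per row, the worst case for latency
        elapsed = time.perf_counter() - start

        conn.close()
        return mode, elapsed / n


if __name__ == "__main__":
    for requested in ("DELETE", "WAL"):
        mode, per_insert = time_inserts(requested)
        print(f"{mode}: {per_insert * 1000:.3f} ms per committed insert")
```

Committing every row is deliberate: it is the pattern most likely to show a gap between the two journal modes, so it gives an upper bound on what a small database should cost per write.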
It would be like commenting "If any other developers were involved in this project you should mention them."