It most likely isn't, but it seems like this project was more for learning purposes than for anything else. In that case, why not go for the "production-ready", "highly scalable" solution? I sometimes do the same for my personal projects. I over-architect them not because it's necessary, but because I want to get my hands dirty and learn something new.
For voice conversations, the issue can be latency more than filling the context. Without knowing the site it's hard to say, but if he had multiple pages' worth of text (dunno, types of cars, procedures, some emotional story, etc.) and a "slower" model, it might be worth using RAG to quickly preselect a small portion and then have the LLM refine the answer.
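To make the "preselect, then refine" idea concrete, here is a rough sketch. It is not what the project does: plain word overlap stands in for a real embedding search, and `top_k_chunks`, `build_prompt`, and the sample chunks are all made up for illustration. The point is only that the model sees a short prompt instead of the whole site.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_chunks(question, chunks, k=2):
    """Rank chunks by how often they contain the question's words.

    Word overlap is a crude stand-in for embedding similarity (assumption).
    Chunks with no overlap at all are dropped.
    """
    q_words = set(tokenize(question))
    scored = []
    for i, chunk in enumerate(chunks):
        counts = Counter(tokenize(chunk))
        score = sum(counts[w] for w in q_words)
        # -i keeps the earlier chunk first when scores tie.
        scored.append((score, -i, chunk))
    scored.sort(reverse=True)
    return [chunk for score, _, chunk in scored[:k] if score > 0]


def build_prompt(question, chunks, k=2):
    """Build a short prompt containing only the preselected chunks."""
    context = "\n\n".join(top_k_chunks(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Hypothetical site content, split into chunks.
    site_chunks = [
        "Oil change costs 40 dollars.",
        "We tow cars within the city.",
        "Our founder started the shop in 1990.",
    ]
    print(build_prompt("How much is an oil change?", site_chunks))
```

The shorter prompt is where the latency win would come from, assuming prompt length is what slows the model down.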
Yeah, just stuff the entire website and pricing table into the context window.
Completely agree. I think the whole thing can fit in context.
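A quick back-of-the-envelope check, as a sketch only: it assumes the rough rule of thumb of about 4 characters per token for English text, and the 128,000-token window and 2,000-token reply reserve are placeholder numbers, not any particular model's limits. A real tokenizer would give exact counts.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; ~4 chars per token is a heuristic (assumption)."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(pages, context_window=128_000, reserve_for_reply=2_000):
    """Return (fits, estimated_tokens) for a list of page texts.

    context_window and reserve_for_reply are hypothetical defaults.
    """
    total = sum(estimate_tokens(page) for page in pages)
    return total + reserve_for_reply <= context_window, total


if __name__ == "__main__":
    # Ten made-up pages of 4,000 characters each.
    pages = ["a" * 4000] * 10
    print(fits_in_context(pages))
```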
Yeah, this architecture is completely unnecessary.