I tried to do RAG on my laptop by setting it all up myself, but the actual LLM gave poor results (I have a small thin-and-light fwiw, so I could only run weak models). The vector search itself actually ended up being a little more useful.
If you don’t mind a little instability while I work out the bugs, you might be interested in my project (https://github.com/rmusser01/tldw_server). It’s not quite fully ready yet, but the backend API is functional and has a full RAG system with a customizable and tweakable local-first ETL, so you can use it without relying on any third-party services.
Oh! Same! I made an R/Shiny-powered RAG/research app that hooks into OpenAlex (for papers) and lets you generate NotebookLM-like outputs. Just got slides working with images from the papers injected in, super fun. Works with OpenRouter or local LLMs (if that's your thing). Network graphs too! https://github.com/seanthimons/serapeum/