That directory is huge already! I guess the index.md helps the agent find what it needs, but even the markdown file is very long - this would consume a ton of tokens.

Also I wonder who/what decides what papers go in there.

In the blog post, the agent is allowed to do its own search.
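
To put a rough number on the token concern above, here is a minimal sketch. It assumes about 4 characters per token, which is a common rule of thumb for English text and not a real tokenizer count; the function names and the `index.md` path are just illustrative.

```python
from pathlib import Path

# Assumption: ~4 characters per token. This is a rough heuristic,
# not the output of any actual model tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def estimate_file_tokens(path: str) -> int:
    """Estimate the token cost of loading one file into the context."""
    return estimate_tokens(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Hypothetical path; point this at the index file you want to measure.
    index = Path("index.md")
    if index.exists():
        print(f"{index}: ~{estimate_file_tokens(str(index))} tokens")
```

Running this against the index file (and summing over the directory) would show whether the index alone already eats a meaningful share of the context window.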

reply
Check out the Researcher and Process Leads skill in ctoth/research-papers-plugin. I have basically completely automated the literature review.
reply
Having an "indexed global data collection" of the markdown would be a kumbaya moment for AI. There's so much data out there but finite disk space. Maybe torrents or IPFS could work for this?
reply
I'm actually sort of working on this! https://github.com/ctoth/propstore -- it's like Cyc, but there is no one answer. Plus knowledge bases are literally git repos that you can fork/merge. Research-papers-plugin is the frontend, we extract the knowledge, then we need somewhere to put it :)
reply
Wow, this is amazing. Did you write all those MD files by hand, or did you use an LLM for the simple stuff like extracting abstracts?
reply
I used https://github.com/ctoth/research-papers-plugin to produce the annotations. The thing that's really cool is how they surface the cross-links in the collection. For instance, look at https://github.com/ctoth/Qlatt/blob/master/papers/Fant_1988_...

Claude is much faster and better at reading papers than Codex (some of this is nested skill dispatch), but they both work incredibly well for this. Compile your set of papers, queue it up, hit /ingest-collection, go to sleep, and come back to a remarkable knowledge base :)

reply