Personally I had similar results when searching for known terms for certain concepts that I didn't know the name for. And usually I had to guide the process to find the actual expression used in the domain (it usually made up a lot of fancy and well-fitted names itself). And sometimes it helped to gonback and change the query. Harder if you have to wait that long, though :)
So.. I guess I would trust its process (mentioning "research" a lot) more than its chosen result :) Not sure if it you wanted it to be a synthesizer or rather an assistant. I'd go with what is closer to your intention or what you think it resulted in in the end.
An interesting observation might be that guiding that (I don't dare to say it, but anyway) research process might still be an important part and the network's self-evaluation might not be as good as one would need it to be at this point in time. I'd guess it's about adding ones personal judgement early to not end up at the wrong spot after a long processing time. Then again you are using a certain selection of simple and more complex models, so I can't say if there might be a way to have that kind of harsher judgement that one would apply oneself emulated in the proces and what side effects might come form that choice of models you made.
In the end I was just surprised by the number of picky replies your post got, so I just thought that you discussing the perfect description with the LLM might make be a fun solution. Personally I am a big fan of interactively talking to LLMs at the moment though, so I might be the wrong guy to know your intended audience and use case enough. Just couldn't see how using the term "research" would be the problem.
I re-built with CrewAI but was limited so now doing the same again but using AutoGen.
The criticism is important to this, as is any research project. That's why good research works, if it is challenged. I'm implementing that "challenging" directly into the LLM as separate agents.
Balancing token efficiency with regular check-ins and critique is important, and something I want to ensure the user has control over.
Nearly to version 0.1.3, and much to learn still.
I loved that use case! I will keep testing, refining and hoping others will do until we get there. Persistence is what matters, to try our best and strive towards collaboration, openness and accuracy.
Obscurity is dispelled by augmenting the light of discernment, not by attacking the darkness.” - Quote by Socrates.