I bet there's money to be made for building a drop-in to either of those two that requires less memory, would save companies a bundle, and make other companies a bundle as well.
That's because it's much, MUCH faster to do it that way, though if you can deal with certain type of latency trade offs for throughput something like turbopuffer can do wonders for your costs.
> why would you not want to index?
Because if you don't need an index it wastes RAM, as you've learned. Maintaining indices also has a cost. Index only what you need.
In the sense of the blog post: A senior with decent DB experience would have told you. ;)
In all fairness this was my first job a few years ago as a developer, I deep dove MongoDB but I was also one of the only devs using it at this place.
My previous experience with MongoDB had been in college and more limited.
I am not experienced with MongoDB, I don't know if previous comment reports were the users fault or MongoDB's. But one thing is clear to me, complaining it uses too much RAM and not knowing the reasons for it, is a user problem. A common mistake is to setup a DB and expect it just magically does works. DBs are complicated beasts, you have to know how to deal with them.
I think these are realistic expectations for most apps. Obviously the likes of Netflix and Uber get orders of magnitude more, but 99.9% of apps aren't a Netflix or an Uber, and you don't have to optimize for scaling until your app is on a trajectory to become one, and putting your database on an SSD already let's you handle several thousand concurrent users with ease.
Of course everything depends on use case and constraints. I highlight the extremes here, the initial confusion was why DBs require so much RAM. Traditional DBs are optimized around RAM, that's where they perform best. You can abuse that, but it's not the best they can be in terms of latency, predictability and stability.
At some point they added the docValues configuration option per-field to do the transformation during indexing and store it to disk instead, so none of it had to be stored in the heap. Instead what you're supposed to do is rely on the OS disk cache, which handles eviction automatically, so you can run with significantly less memory but get performance improvements by adding memory without having to change any configuration further.