by amtamt 7 hours ago

by compumike 6 hours ago

> Redis (and memcache) are memory caches and should be treated like that

If you haven't come across Kvrocks yet, it may be worth a look: https://github.com/apache/kvrocks https://kvrocks.apache.org/ . It's a database with a Redis-compatible wire protocol, but the database is stored on disk. This means your working set is not limited by RAM and can be a few orders of magnitude larger! On modern SSDs this is still very fast. I think it improves the durability story as well. But the big win is the orders of magnitude larger database space.

As I've been improving my side project https://totalrealreturns.com/ recently I've ended up using both Redis and Kvrocks together. Redis is great for small global state that needs to be super fast. Kvrocks is great for larger bulk data storage (large precomputed datasets), but also supports a lot of the Redis data structures as well as Lua scripts.

by n_e 7 hours ago

Redis is used for plenty of things, not just memory caches.

For example if you use it for session storage, you can't have your application read from a random instance that may or may not contain the session.

by tossandthrow 5 hours ago

This case is exactly what he talks about. To get HA, just set up more than one Redis cache - or rebuild the session if it was lost in the Redis cache.

by 9dev 5 hours ago

It’s not. Imagine a web app that stores your user information in a session store, mapped by your cookie-provided session ID. Your web app searches redis 1 for the session id, but since that key is on redis 2, the lookup fails and the application thinks there is no such session, and rejects the request.

Now you could solve this specific case by sharding by prefix, or by querying all instances, but then you still do not have high availability: if the instance a specific session is on is down, these users cannot authenticate. At that point you’re better off with a single instance.
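To make the failure mode above concrete, here is a minimal Python sketch. It assumes two in-memory dicts standing in for the Redis instances and a hash-based routing function; `instances`, `down`, `shard_for`, `put_session` and `get_session` are hypothetical names for illustration, not part of any Redis client.

```python
import hashlib

# Hypothetical stand-ins for two independent Redis instances.
instances = [dict(), dict()]
down = set()  # indexes of instances that are currently unreachable


def shard_for(session_id: str) -> int:
    """Deterministically map a session ID to an instance index."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % len(instances)


def put_session(session_id: str, data: dict) -> None:
    """Store the session on the instance its ID hashes to."""
    instances[shard_for(session_id)][session_id] = data


def get_session(session_id: str):
    """Look the session up on its instance; fail if that instance is down."""
    idx = shard_for(session_id)
    if idx in down:
        raise ConnectionError(f"instance {idx} is unreachable")
    return instances[idx].get(session_id)


put_session("abc123", {"user": "alice"})
assert get_session("abc123") == {"user": "alice"}  # routing is consistent

# Sharding fixes the lookup, but not availability: take the owning
# instance down and this session can no longer be read.
down.add(shard_for("abc123"))
try:
    get_session("abc123")
except ConnectionError as exc:
    print("lookup failed:", exc)
```

Deterministic routing makes reads and writes agree on an instance, but each session still lives in exactly one place, which is the availability gap described above.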

by olavgg 5 hours ago

But that is his point. If you cannot find the session ID in Redis, you log in again. If your Redis server crashes, you start a new one and everyone just logs in again. No data is lost.

by 9dev 5 hours ago

Sure the data is lost. A session commonly holds arbitrary state, even if it’s just the login information. This is ridiculous.

by tossandthrow 2 hours ago

Obviously these are application decisions.

You, obviously, don't commit important data only to a session that you can lose, if the application does not allow it.

We use redis as infrastructure. To route events and as a cache.

For us redis could go down and we would merely see a degradation of our service with no data loss.

I recommend using redis like that. And then use a database that supports transactions for real data problems.

But we are different. And that's OK.
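The "degradation with no data loss" setup described above can be sketched roughly as follows. This is a minimal illustration, assuming an in-memory stand-in for the cache and a placeholder for the authoritative database; `FlakyCache`, `CacheUnavailable`, `load_from_database` and `fetch` are hypothetical names, not anyone's actual code.

```python
class CacheUnavailable(Exception):
    """Raised when the (simulated) cache cannot be reached."""


class FlakyCache:
    """Hypothetical in-memory stand-in for a Redis cache that can go down."""

    def __init__(self) -> None:
        self.data: dict = {}
        self.up = True

    def get(self, key: str):
        if not self.up:
            raise CacheUnavailable(key)
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.up:
            raise CacheUnavailable(key)
        self.data[key] = value


def load_from_database(key: str) -> str:
    """Placeholder for the slow, authoritative lookup."""
    return f"value-for-{key}"


def fetch(cache: FlakyCache, key: str) -> str:
    """Read through the cache, falling back to the database if it is down."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
        value = load_from_database(key)
        cache.set(key, value)
        return value
    except CacheUnavailable:
        # Degraded mode: skip the cache entirely, still return correct data.
        return load_from_database(key)


cache = FlakyCache()
print(fetch(cache, "user:1"))  # cache miss, value is loaded and cached
cache.up = False
print(fetch(cache, "user:1"))  # cache down, served from the database
```

This only works when everything in the cache can be rebuilt from the database, which is exactly the application decision being argued about in this thread.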

by 9dev 1 hour ago

This discussion is a bit weird. We started off from the premise that Redis should have better availability guarantees, specifically to avoid the degradation of service you described.

But that requires running on multiple instances, which in turn requires sharing the data across all replicas.

by trumpdong 4 hours ago

If you consider it important, you have to store it in a real database. No buts. If you don't consider it important, sharded redis works fine.

by 9dev 4 hours ago

Redis is a real database. If I wasn’t convinced it could retain the data I hand it, I wouldn’t use it in the first place.

Just because it works for your use case right now doesn’t mean there isn’t room for improvements to support others too.

by trumpdong 3 hours ago

> Redis is a real database.

Oh good, then you don't need to do any of the stuff that you suggested doing.

by tossandthrow 2 hours ago

I don't think you understand what HA means.

The app would look up in both databases. If it exists in any, there would be a session.

This is strictly different from partitioning, which I think you are mixing it up with.

Partitioning is for performance, not HA.

by n_e 2 hours ago

> The app would look up in both databases. If it exists in any, there would be a session.

And if you find the session with differing values in both databases, how do you know which one is up-to-date?

You need an algorithm to pick which data is right, such as electing a master instance.

And that brings us back to the original discussion: to manage sessions (unlike caches) in a highly available way, you need to set up HA (or reimplement it, which obviously is a bad idea). You can't read round robin from multiple non-HA instances.
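A small Python sketch of the "look up in both, then pick one" problem above. It assumes two in-memory dicts as the non-replicated instances and a per-entry version counter as the tiebreaker; the version field, `replica_a`, `replica_b` and `read_any` are all hypothetical and added only for illustration.

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for two independent, non-replicated instances.
# Each value is (version, payload); the version counter is an assumption.
Entry = Tuple[int, dict]
replica_a: dict = {}
replica_b: dict = {}


def read_any(session_id: str) -> Optional[Entry]:
    """Read from every replica and keep the entry with the highest version."""
    found = [r[session_id] for r in (replica_a, replica_b) if session_id in r]
    if not found:
        return None
    return max(found, key=lambda entry: entry[0])


# A write that reached only one replica leaves the two copies diverged.
replica_a["abc123"] = (1, {"cart": ["book"]})
replica_b["abc123"] = (2, {"cart": ["book", "pen"]})

print(read_any("abc123"))  # the version-2 entry wins
```

The tiebreaker only works if versions are totally ordered, for example because a single writer assigns them. Two clients writing to different instances at the same time could both produce version 2, and then something still has to decide which copy is authoritative.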

by tossandthrow 2 hours ago

Yes, you are pointing out exactly how HA is difficult.

There is a whole slew of downstream things you need to take into consideration.

by 9dev 2 hours ago

That’s the precise point I’m making.

by marklubi 4 hours ago

For the project I've been working on for more than 15 years, we make extensive use of the pub/sub functionality for distributing live data. Pub/sub scales well across the cluster. Publish to one, and it goes out to subscribers on any of the nodes that they've connected to.

With millions of users, high availability is critical for this functionality.

by 9dev 7 hours ago

Redis doesn't necessarily have to be used as a cache. Streams, for example, make it a great message queue; but a single-node message queue is a single point of failure and thus not viable for many setups.

by acejam 6 hours ago

That's why you run Redis Sentinel in production.

by 9dev 5 hours ago

That you do. Until you realise that there is only a single writer in that scenario, it doesn’t address any sharding concerns, you need to use compatible clients that opt into the sentinel protocol, during failover you’ll see client errors… there’s lots of room for improvement on redis HA.

by lukaslalinsky 5 hours ago

With the number of problems I had using Redis Sentinel, I really wish there was another way. On multiple occasions, with completely different deployments, it got itself into a non-repairable state where the only option was to drop it and set up the replicas manually. I was hoping someone would do a Patroni-like project for Redis, but I've not found it yet. I've moved all persistent data to PostgreSQL and use a number of Valkeys behind Envoy proxy as a cache.

by yxhuvud 2 hours ago

Redis has many use cases, and acting as a cache is only one of them. One very common usage is as a backend for background worker jobs. That can need HA.

by __s 6 hours ago

Years ago I enabled durability on Redis and used it as the database for an online card game.
