60 * 7 is not all that great either if you get cascading and clumping as people type at the same time. Coalescing the outbound updates still seems like a good idea, and since the game is turn-based you know it's not really going to affect gameplay. You've basically made yourself a first-person-shooter networking problem for a game that's slower than WoW. That feels like overkill in terms of self-imposed obstacles.
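To make "coalescing the outbound updates" concrete, here is a minimal sketch of what I mean, not a claim about how your server is built. The `CoalescingBroadcaster` class, the `send` callback, and the 50 ms interval are all hypothetical names and values I made up for illustration:

```python
import asyncio
from collections import defaultdict


class CoalescingBroadcaster:
    """Buffers per-room updates and flushes them as one batch per tick."""

    def __init__(self, send, interval=0.05):
        self._send = send          # hypothetical: async callable (room_id, batch)
        self._interval = interval  # flush period in seconds (assumed 50 ms)
        self._pending = defaultdict(dict)

    def queue(self, room_id, key, value):
        # A later write to the same key overwrites the earlier one, so a burst
        # of keystrokes collapses into a single "latest value" update.
        self._pending[room_id][key] = value

    async def flush(self):
        # Swap the buffer first so updates queued during sends land in the next batch.
        pending, self._pending = self._pending, defaultdict(dict)
        for room_id, batch in pending.items():
            await self._send(room_id, batch)

    async def run(self):
        # Flush forever at a fixed interval; cancel the task to stop it.
        while True:
            await asyncio.sleep(self._interval)
            await self.flush()
```

With something like this, each room sends at most one outbound message per interval no matter how many keystrokes arrive, at the cost of up to one interval of added delay.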
Ahhhh I see what you mean now. You just gave me some good ideas. Alas, because of the nature of my game, it will always have first-person-shooter-esque networking problems despite being turn-based. But it's good to know that I'm dealing with a non-trivial level of throughput.
> there should only ever be 1 person typing at once (in a given room)

Have you verified that is the case?

Yep, just triple-checked. If distributing the load on a single server by adding more backend containers doesn't decrease ping, then maybe this is just the natural upper bound for my particular game... The only shared bottleneck between all backend containers I can think of right now is at the OS or network interface layer, but things still lagged even after I increased the OS networking limits:

  net.core.wmem_max = 16777216
  net.core.rmem_max = 16777216
  net.ipv4.tcp_wmem = 4096 65536 16777216
  net.ipv4.tcp_rmem = 4096 87380 16777216

Perhaps the reality for low-latency multiplayer games is to embrace horizontal scaling rather than vertical scaling? Not sure.
Networking bottlenecks are not always on your box; they could be on the router your box is talking to. Or, depending on load, the Ethernet packets themselves could be crowding the physical subnet. Do you have a way to mock 500 users playing the game that would truly keep all the traffic internal to your OS? Because if that works, but the lag persists for real players, the problem is external to your OS.
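By "internal to your OS" I mean something like the rough sketch below, where bots and server talk over 127.0.0.1 so nothing touches the NIC or the router. I'm assuming a plain TCP echo server as a stand-in for your websocket backend, and `handle_echo`, `bot`, and `loopback_test` are hypothetical names, so you'd swap in your real server and bot logic:

```python
import asyncio
import time


async def handle_echo(reader, writer):
    # Stand-in for the game server: echo each line back to the sender.
    try:
        while data := await reader.readline():
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def bot(port, messages):
    # One mock player: send a line, wait for the echo, record the round trip.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    latencies = []
    for i in range(messages):
        start = time.perf_counter()
        writer.write(f"move {i}\n".encode())
        await writer.drain()
        await reader.readline()
        latencies.append(time.perf_counter() - start)
    writer.close()
    await writer.wait_closed()
    return latencies


async def loopback_test(bots=500, messages=20):
    # Port 0 lets the OS pick a free port; 127.0.0.1 keeps traffic on loopback.
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0, backlog=bots)
    port = server.sockets[0].getsockname()[1]
    try:
        results = await asyncio.gather(*(bot(port, messages) for _ in range(bots)))
    finally:
        server.close()
    flat = sorted(t for r in results for t in r)
    return {"count": len(flat), "p50": flat[len(flat) // 2], "max": flat[-1]}
```

Running `asyncio.run(loopback_test())` gives you a baseline with the network taken out of the picture. Note that both ends live in one process here, so it needs roughly two file descriptors per bot and the numbers include event-loop scheduling overhead.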
Good point. I actually don't know what performance looks like with 500 real users. The way I'm mocking right now is by running a script on my local machine that generates 500+ bots that listen to events to auto-join and play games. I tried to implement the bots to behave as closely to humans as possible. I'm not sure if this is what you mean by keeping traffic internal to my box's OS, but right now this approach creates lag. I didn't consider whether spinning up hundreds of websocket connections from a single source (my local machine) would have any implications when load testing, hm.
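Two things that could matter on the single source machine are the per-process file descriptor limit (each socket uses one) and the ephemeral port range for outbound connections. A quick way to check both might look like the sketch below; `fd_headroom` and `parse_port_range` are hypothetical helpers I'm making up here, the `reserved` value is a guess, and the `/proc` path in the usage note is Linux-only:

```python
import resource


def fd_headroom(connections, reserved=64):
    """Descriptors left under the soft limit after `connections` sockets.

    `reserved` is an assumed allowance for stdio, logs, and so on.
    Returns None when the soft limit is unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - reserved - connections


def parse_port_range(text):
    # Input looks like "32768\t60999\n"; returns how many source ports that is.
    low, high = map(int, text.split())
    return high - low + 1
```

On Linux you could feed it the live value with `parse_port_range(open("/proc/sys/net/ipv4/ip_local_port_range").read())`. If `fd_headroom(500)` comes back negative, the bot script itself would hit "too many open files" before the server is even stressed.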
Networking often scales better horizontally.

Computation can sometimes scale well vertically, but proprietary OSes are more likely to be tuned for it… as a premium feature.
