Hmm... The biggest hype in process engineering in the 00s was buffer size reduction, because those buffers interfere with each other in chaotic ways, and they tend to turn "small problems that blow up soon, with small consequences" into "a huge accumulated problem that blows up hours after it appeared, with business-risking consequences".

Queues are pretty similar.

Reducing buffer size puts back pressure on the whole system, which can be valuable for managing load (though it often throttles faster stages, and that throttling makes people uncomfortable). A meaningful metric is how much of the buffer is in use at any given time, together with the throughput. If the buffer is backed up, there is a bottleneck on the consumption side of the buffer and more bandwidth is needed there. For whatever reason, adjusting buffer sizes is the more common action taken. A buffer provides throughput management, but it also provides info/metrics about the operation of the system.

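To make the buffer-metrics point concrete, here is a toy tick-based simulation in Python. Everything in it (the `simulate` function, the per-tick rates, the capacities) is made up for illustration and not taken from any real system. With a producer at 3 items per tick and a consumer at 2, throughput stays pinned at the consumer's rate whatever the capacity; the bigger buffer only postpones the moment the producer starts getting blocked.

```python
from collections import deque


def simulate(capacity, produce_per_tick, consume_per_tick, ticks):
    """Run a toy producer -> buffer -> consumer stage and report metrics."""
    buffer = deque()
    consumed = 0
    blocked = 0        # items the producer could not enqueue (back pressure)
    fill_samples = []  # buffer occupancy as a fraction, sampled at end of tick

    for _ in range(ticks):
        # Producer: enqueue until the buffer is full, then count the rest as blocked.
        for _ in range(produce_per_tick):
            if len(buffer) < capacity:
                buffer.append(object())
            else:
                blocked += 1

        # Consumer: drain at most its per-tick rate.
        for _ in range(min(consume_per_tick, len(buffer))):
            buffer.popleft()
            consumed += 1

        fill_samples.append(len(buffer) / capacity)

    return {
        "mean_fill": sum(fill_samples) / ticks,
        "throughput": consumed / ticks,
        "blocked": blocked,
    }


# Same rates, different buffer sizes (hypothetical numbers).
slow_consumer = simulate(capacity=10, produce_per_tick=3, consume_per_tick=2, ticks=100)
bigger_buffer = simulate(capacity=100, produce_per_tick=3, consume_per_tick=2, ticks=100)

print(slow_consumer)
print(bigger_buffer)
```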
You can also observe this in games like Dyson Sphere Program [1] (which is all workers, queues, and buffers), where adding a buffer storage section to a belt only hides the fact that you are under-producing one of the required components.

The buffer smooths out bursty flow but you don't want that in the middle of the pipeline, as it actually represents mid-pipeline inefficiency. You should actually be fixing the upstream or downstream problem.

[1] Or other automation games like Factorio or Mindustry.

I'll note that speedrunners absolutely buffer mid-pipeline in Factorio, and not just for hand-crafting purposes. Sometimes you're waiting for R&D, sometimes you're just running half the machines for twice as long, giving you the same output while saving on build costs. The actual bottlenecks are constantly shifting. "I'm not speedrunning!" you might say, but every regular game could've started as a speedrun that could've gotten you to where you are faster.

Understanding the tendency of mid-pipeline buffers to hide problems is useful, but scorning them entirely is also suboptimal.

That's a very good analogy. Queues are not there to solve overload (and they never were); they are there as an *architecture tool* that allows decoupling and *can* (not always) make it easier to scale the workers that process the queue.

I think back-pressure should always be implemented from the very beginning, as it also helps define the requirements of what the service should be able to handle.
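
For what it's worth, a minimal sketch of what "back-pressure from the beginning" could look like with Python's standard `queue.Queue`. The `submit` helper, the `Overloaded` exception, and the capacity of 2 are hypothetical names and numbers chosen for the example; the only real API used is `queue.Queue` with `maxsize`, `put`, `put_nowait`, and `queue.Full`.

```python
import queue


class Overloaded(Exception):
    """Tells the caller to slow down or retry later (hypothetical error type)."""


def submit(jobs, job, timeout=0.0):
    """Enqueue a job, or push back on the caller when the queue is full."""
    try:
        if timeout > 0:
            # Block the producer for a bounded time: it slows to the consumer's pace.
            jobs.put(job, timeout=timeout)
        else:
            # Fail fast: reject immediately instead of letting work pile up.
            jobs.put_nowait(job)
    except queue.Full:
        raise Overloaded(f"queue full ({jobs.maxsize} jobs waiting)") from None


# The explicit capacity doubles as a written-down requirement for the service.
jobs = queue.Queue(maxsize=2)

accepted, rejected = [], []
for job in ["a", "b", "c"]:
    try:
        submit(jobs, job)
        accepted.append(job)
    except Overloaded:
        rejected.append(job)

print(accepted, rejected)
```

Picking `maxsize` forces the "how much should this handle?" conversation early, and the rejected jobs become a visible signal rather than a silently growing backlog.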

You're essentially describing what a silicon engineer would call independent "clock domains" (the stations) and "clock-domain-crossing signals" (the workpieces). And, indeed, you would also tend to handle clock-domain-crossing signals by sticking an async FIFO between the two clock domains.

In manufacturing you have mass. Stuff has weight to it and sometimes I think it would be best to imagine data has mass.

In the widget factory there is the option to put stuff in the warehouse until you need it. Great in principle, but if you are the guy having to do the heavy lifting to get stuff crammed into the warehouse, and retrieved, then you can end up wondering why you are in the job, which promised so much more than spending all day in the warehouse rather than making stuff.

With web applications we will gladly get gigabytes of stuff from the other side of the world, just in case we need it. If all of that data weighed grams or even tonnes, then we would do things very differently, to be more like the Toyota Way, with just-in-time and the rest of it.

Hence my suggestion: when building for the web, imagine every byte has mass. Design accordingly.
