by applfanboysbgon 1 day ago

by cogman10 22 hours ago

> Numeric characteristics are absolutely still a consideration for game designers even in 2026, one that influences what numbers they use in their game designs. The good ones, anyways.

I used to think like this, not anymore.

What convinced me that these sorts of micro-optimizations just don't matter is reading up on the cycle counts of modern processors.

On a Zen 5, integer addition takes a single cycle, multiplication 3, and division ~12. But that's not the full story. The CPU can have 5 multiplications in flight simultaneously. It can have about 3 divisions in flight simultaneously.

Back in the days of RCT, there was much less pipelining. On the original Pentium, a multiplication took 11 cycles and a division could take upwards of 46 cycles. Those CPUs ran at around 100 MHz. So not only did each operation take more cycles and couldn't be pipelined, the CPUs were also running at 1/30th to 1/50th the clock rate of common CPUs today.

And this isn't even touching on SIMD instructions.

Integer tricks and optimizations are pointless. Far more important than those in a modern game is memory layout. That's where the CPU is actually going to be burning most of its time. If you can create and do operations on an int[], you'll be MUCH faster than if you are doing operations against a Monster[]. A cache miss is going to mean anywhere from a 100 to 1000 cycle penalty. That blows out any sort of hit you take cutting your cycles from 3 to 1.
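To make the int[] versus Monster[] point concrete, here is a minimal C++ sketch. The `Monster` struct and its fields are invented for illustration, and the 64-byte figures assume a typical platform and cache line size. Summing health over an array of structs drags every other field through the cache, while a separate contiguous array of health values touches only the bytes it needs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical entity type; the fields are invented for illustration.
// On typical platforms this struct is 64 bytes.
struct Monster {
    float x, y, z;
    std::int32_t health;
    char name[48];
};

// Array-of-structs: each health read pulls a whole Monster into the cache.
std::int64_t total_health_aos(const std::vector<Monster>& monsters) {
    std::int64_t total = 0;
    for (const Monster& m : monsters) total += m.health;
    return total;
}

// Struct-of-arrays: health values are contiguous, so one 64-byte cache line
// (an assumed, common size) holds 16 of them.
std::int64_t total_health_soa(const std::vector<std::int32_t>& health) {
    std::int64_t total = 0;
    for (std::int32_t h : health) total += h;
    return total;
}
```

Both functions return the same total; the difference is only in how many bytes have to be fetched to compute it, which is something to measure on real data rather than assume.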

by moregrist 18 hours ago

> Integer tricks and optimizations are pointless.

They’re not pointless; they’re just not the first thing to optimize.

It’s like worrying about cache locality when you have an inherently O(n^2) algorithm and could have an O(n log n) or O(n) one. Fix the biggest problem first.

Once your data layout is good and your cpu isn’t taking a 200 cycle lunch break to chase pointers, then you worry about cycle count and keeping the execution units fed.

That’s when integer tricks can matter. Depending on the micro arch, you may have twice as many execution units that can take integer instructions. And those instructions (outside of division) tend to have lower latency and higher throughput.

And if you’re doing SIMD, your integer SIMD instructions can be 2 or 4x higher throughput than float32 if you can use int16 / int8 data.

So it can very much matter. It’s just usually not the lowest hanging fruit.

by Dylan16807 14 hours ago

> And if you’re doing SIMD, your integer SIMD instructions can be 2 or 4x higher throughput than float32 if you can use int16 / int8 data.

Your float instructions can also be 2x the throughput if you use f16. With no need to go for specific divisors.

For values that even can pack into 8 bits, you rarely have a way to process enough at once to actually get more throughput than with wider numbers.

I'm sure there's a program where it very much matters, but my bet is on it not even mildly mattering, and there basically always being a hundred more useful optimizations to work on.

by benchloftbrunch 11 hours ago

Problem with f16 is that hardware support is still "new" and can't be relied on in consumer grade CPUs yet.

by Pannoniae 21 hours ago

This is all true, but IMO it misses the forest for the trees. For example, the compiler basically doesn't do anything useful with your float math unless you enable fastmath. Period. Very few transformations are done automatically there.

For integers the situation is better, but even there it hugely depends on your compiler and how much it cheats. You can't replace trig with intrinsics in the general case (it sets errno, for example), and inlining is at best an adequate heuristic which completely fails to take into account what the hot path is unless you use PGO and keep it up to date.

I've managed to improve a game's worst-case performance by about 50% just by shrinking a method's code size from 3000 bytes to 1500. Barely even touched the hot path there, keep in mind. Mostly due to icache usage.

The takeaway from this shouldn't be that "computers are fast and compilers are clever, no point optimising" but more that "you can afford not to optimise in many cases, computers are fast."

by WalterBright 21 hours ago

I originally got into writing compilers because I was convinced I could write a better code generator. I succeeded for about 10 years in doing very well with code generation. But then all the complexities of the evolving C++ (and D!) took up most of my time, and I haven't been able to work much on the optimizer since.

Fortunately, D compilers gdc and ldc take advantage of the gcc and llvm optimizers to stay even with everyone else.

by tialaramex 8 hours ago

The thing which would really help IMNSHO is to nail down the IR to eliminate weird ambiguities where OK optimisation A is valid according to one understanding, optimisation B is valid under another but alas if we use both sometimes it breaks stuff.

by WalterBright 2 hours ago

Yes, one of the unexpected problems I ran into is one optimization undoing another one, and the optimizer would flip-flop between the two states.

by cogman10 21 hours ago

I actually agree with you.

My point wasn't "don't optimize" it was "don't optimize the wrong thing".

Trying to replace a division with a bit shift is an example of worrying about the wrong thing, especially since that's a simple optimization the compiler can pick up on.
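As a small illustration of why the hand-written shift buys nothing, here is a C++ sketch (the function names and the divisor 32 are made up for the example). For unsigned values, dividing by a constant power of two and shifting right compute exactly the same result, and optimizing compilers typically emit the shift on their own.

```cpp
#include <cassert>
#include <cstdint>

// Written the readable way; optimizing compilers typically lower this
// division by a constant power of two to a single shift.
std::uint32_t tiles_from_pixels(std::uint32_t pixels) {
    return pixels / 32;
}

// The hand-"optimized" form computes exactly the same value for unsigned input.
std::uint32_t tiles_from_pixels_shift(std::uint32_t pixels) {
    return pixels >> 5;
}
```

For signed integers the two are not identical (division rounds toward zero, an arithmetic shift rounds toward negative infinity), so the compiler adds a small fix-up, which is one more reason to write the division and let the compiler choose.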

But as you said, it can be very worth it to optimize around things like the icache. Shrinking and aligning a hot loop can ensure your code isn't spending a bunch of time loading instructions. Cache behavior, in general, is probably the most important thing you can optimize. It's also the thing that can often make it hard to know if you actually optimized something. Changing the size of code can change cache behavior, which might give you the mistaken impression that the code change was what made things faster when in reality it was simply an effect of the code shifting.

by YesBox 6 hours ago

> A cache miss is going to mean anywhere from a 100 to 1000 cycle penalty. That blows out any sort of hit you take cutting your cycles from 3 to 1.

A good example of this is using std::vector<bool> vs. std::vector<uint8_t> in the debug build vs release build.

vector<bool> is much slower to access (it's a dynamic bitset). If you have a hot part of the code that frequently touches a vector<bool>, you'll see a multiple X slowdown in the debug build.

However, in the release build, there is no performance difference between the two (for me at least, I'm making a fairly complicated game). The cache misses bury it.

by fragmede 4 hours ago

Fascinating, that's counterintuitive. I'd think the point of using vector<bool> is that the compiler would optimize it into a bit field, which is fewer bits and thus smaller and thus faster than using vector<uint8_t>. How did you come to figure that out?

by YesBox 3 hours ago

I don't know how it's implemented by the standard/compiler (not my domain). The performance differences are well documented though.

I've used both in my pathing code and tested each in debug/release.

Even if the std:: implementation was as fast as possible, you're still adding bit manipulation on top of accessing the element, so it will be slower no matter what you do.
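The "bit manipulation on top of accessing the element" can be sketched in C++ like this. `PackedBits` is a hypothetical class that does roughly what a packed bitset has to do conceptually; it is not any standard library's actual vector<bool> implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed bitset: one bit per element, 64 elements per word.
class PackedBits {
public:
    explicit PackedBits(std::size_t n) : size_(n), words_((n + 63) / 64, 0) {}

    bool get(std::size_t i) const {
        assert(i < size_);
        // Extra work per read: index divide, shift, and mask.
        return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
    }

    void set(std::size_t i, bool value) {
        assert(i < size_);
        const std::uint64_t mask = std::uint64_t{1} << (i % 64);
        // Extra work per write: read-modify-write of the containing word.
        if (value) words_[i / 64] |= mask;
        else       words_[i / 64] &= ~mask;
    }

private:
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};
```

A vector<uint8_t> read is a single byte load by comparison. The packed form is 8x smaller, though, so which one wins in an optimized build depends on whether the extra instructions or the extra cache footprint dominates.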

by rcxdude 21 hours ago

Also, it's unusual for a game to be CPU bottlenecked nowadays, and if it is, it's probably more constrained on memory bandwidth than raw FLOPS.

by eru 15 hours ago

Yes, though it depends a bit on your style of game.

by timschmidt 1 day ago

Absolutely. I have written a small but growing CAD kernel which is seeing use in some games and realtime visualization tools ( https://github.com/timschmidt/csgrs ) and can say that computing with numbers isn't really even a solved problem yet.

All possible numerical representations come with inherent trade-offs around speed, accuracy, storage size, complexity, and even the kinds of questions one can ask (it's often not meaningful to ask if two floats equal each other without an epsilon to account for floating point error, for instance).
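The float-equality caveat can be shown with a short C++ sketch. `nearly_equal` is a hypothetical helper, and the default tolerance is an arbitrary illustrative choice rather than a recommendation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: compare two doubles with a relative tolerance
// instead of exact equality.
bool nearly_equal(double a, double b, double rel_tol = 1e-9) {
    assert(rel_tol >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

With IEEE 754 doubles, `0.1 + 0.2 == 0.3` is false while `nearly_equal(0.1 + 0.2, 0.3)` is true. A purely relative tolerance like this one breaks down when comparing against exactly zero, where an absolute tolerance is needed as well, which is one small instance of the trade-offs described above.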

"Toward an API for the Real Numbers" ( https://dl.acm.org/doi/epdf/10.1145/3385412.3386037 ) is one of the better papers I've found detailing a sort of staged complexity technique for dealing with this, in which most calculations are fast and always return (arbitrary precision calculations can sometimes go on forever or until memory runs out), but one can still ask for more precise answers which require more compute if required. But there are also other options entirely like interval arithmetic, symbolic algebra engines, etc.

One must understand the trade-offs else be bitten by them.

by ryandrake 22 hours ago

Back in the early, early days, the game designer was the graphic designer, who also was the programmer. So, naturally, the game's rules and logic aligned closely with the processor's native types, memory layout, addressing, arithmetic capabilities, even cache size. Now we have different people doing different roles, and only one of them (the programmer) might have an appreciation for the computer's limits and happy-paths. The game designers and artists? They might not even know what the CPU does or what a 32 bit word even means.

Today, I imagine we have conversations like this happening:

Game designer: We will have 300 different enemy types in the game.

Programmer: Things could be really, really faster if you could limit it to 256 types.

Game designer: ?????

That ????? is the sign of someone who is designing a computer program who doesn't understand the basics of computers.

by WalterBright 21 hours ago

I wrote the Intellivision Mattel Roulette cartridge game back in the 1970s. It was all in assembler on a 10 bit (!) CPU. In order to get the game to fit in the ROM, you had to do every feelthy dirty trick imaginable.

by biglost 18 hours ago

Please write a comment, pastebin, gist, or whatever. I would love to read it; those are the stories in computer science I enjoy the most.

by DougN7 19 hours ago

I would love to hear more about that.

by WalterBright 2 hours ago

I wish I'd kept a listing of that and other projects I worked on. But that never occurred to me.

A friend of mine wrote the Mattel Intellivision poker game. I was playtesting it (a very boring job), and got suspicious. I walked over to his desk and said the program was cheating. It was looking at my hole cards. He sighed and asked how I knew, and I replied it was obvious. He said he didn't have room to add code to improve its play otherwise. I don't know if he fixed it or not.

by applfanboysbgon 19 hours ago

Not all, or even most, games are made by billion dollar studios. Overlapping roles are still the norm in small studios. And even those that do have bespoke designer roles would likely benefit from telling them that computers have certain limitations where trade-offs in game design need to be selected for, because many AAA games run like shit. Many times for reasons other than the game design, sure, but also sometimes because of ways that could be worked around more easily if the game design were accommodating the trade-offs.

by danbolt 18 hours ago

Yeah, I’m quite surprised at this comment. Commercial video games are mass-produced products, and as much as I dislike designers being bogged down in technical minutiae, having a sense of industrial design for the thing you’re making is an incredible boon.

Fumito Ueda was notably quite concerned with the technical/production feasibility of his designs for Shadow of the Colossus. [1] Doom was an exercise in both creativity and expertise.

[1] https://www.designroom.site/shadow-of-the-colossus-oral-hist...

by astrange 2 hours ago

> Fumito Ueda was notably quite concerned with the technical/production feasibility of his designs for Shadow of the Colossus. [1]

And he didn't really achieve it - the game runs very slowly and has a good deal of cut content.

(I once got him in trouble because I found a GPL violation in ICO. I assume the developer didn't pursue it because I don't see the source code up anywhere.)

by lukan 23 hours ago

"and in many cases it is one of many silent contributing factors to a noticeable decrease in the quality of their game"

Game designers are not so constrained anymore by the limits of the hardware, unless they want to push boundaries. The quality of a game is not just the most efficient runtime performance; it is mainly a question of whether the game is fun to play. Do the mechanics work? Are there severe bugs? Is the story consistent and are the characters relatable? Is something breaking immersion? So ... frequent stuttering because of bad programming is definitely a sign of low quality, but if it runs smoothly on the target audience's hardware, improvements should rather be made elsewhere.

by crq-yml 18 hours ago

There's an artistic thread in game coding - one that isn't the norm, but which I think RCT is exemplary of - that holds that mechanical sympathy is important to the game design process. A limit set around NPOT maximums and divisions and lengths of pathfinding is allowing the machine to opine, "you will actually do less work if you set the boundary here". Setting those limits tends to inform the shape of resulting assets as something tiny and easy to hardcode.

The thing that changed during the 90's is that mechanical sympathy became optional to achieving a large production. The data input defining the game world was decoupled into assets authored in disconnected ways and "crunched down" to optimized forms - scans, video, digital painting, 3D models. RCT exhibits some of this, too, in that it's using PCM audio samples and prerendered sprites. If the game weren't also a massive agent simulator it would be unremarkable in its era. But even at this time more complex scripting and treating gameplay code as another form of asset was becoming normalized in more genres.

From the POV of getting a desired effect and shipping product, it's irrelevant to engage with mechanical sympathy, but it turns out that it's a thing that players gradually unravel, appreciate and optimize their own play towards if they stick with it and play to competitive extremes, speedrun, mod, etc.

The 64kb FPS QUOD released earlier this year is a good example of what can happen by staying committed to this philosophy even today: the result isn't particularly ambitious as a game design, but it isn't purely a tech demo, nor does it feel entirely arbitrary, nor did it take an outrageous amount of time to make (about one year, according to the dev).

by scns 23 hours ago

> it is mainly a question if the game is fun to play.

10000x this. Miyamoto starts with a rudimentary prototype and asks himself this. Sadly it seems for many fun is an afterthought they try to patch in somehow.

by calvinmorrison 20 hours ago

When Halo 2 (anniversary edition?) was released, there was also a video in the game about the development. The point that always stuck with me was "you must nail that 2 seconds that will keep people playing forever". The core mechanic of that game is just excellent.

by ukuina 18 hours ago

"30 seconds of fun"

https://youtu.be/0q69Msy8ttM?t=287

by sublinear 23 hours ago

This way of thinking has caused at least a few prominent recurring bugs I can think of.

Texture resolution mismatches causing blurriness/aliasing, floating point errors and bad level design causing collision detection problems (getting stuck in the walls), frame rate and other update rates not being synced causing stutter and lag (and more collision detection problems), bad illumination parameters ruining the look they were going for, numeric overflow breaking everything, bad approximations of constants also breaking everything somewhere eventually, messy model mesh geometry causing glitches in texturing, lighting, animation, collision, etc.

There's probably a lot more I'm not thinking of. They have nothing to do "with the hardware", but the underlying math and logic.

They're also not bugs to "let the programmer figure out". Good programmers and designers work together to solve them. I could just as easily hate on the many criminally ugly, awkward, and plain unfun games made by programmers working alone, but I'll let someone else do that. :)

by WalterBright 21 hours ago

> getting stuck in the walls

I remember the early Simpsons video game. Sometimes, due to some bug in it (probably a sign error), you could go through the walls and see the rendered scenery from the other side. It was like you went backstage in a play. It would have made a great Twilight Zone episode!

by jkestner 9 hours ago

That immediately made me think of the Treehouse of Horror episode where Homer got stuck in the third dimension.

by WalterBright 2 hours ago

Maybe that episode was inspired by the game bug!

by lukan 14 hours ago

I experienced those bugs in all sorts of games, and via cheats (fly mode) I intentionally went backstage to explore.

by lukan 23 hours ago

Game designer != game engine designer

(But it definitely helps if the game designer knows of the technical limits)

by sublinear 23 hours ago

Sorry, I'm not super familiar with professional game dev, but I am familiar with professional web dev. The problems seem similar, as evidenced by the constant complaining here on HN about the state of the web.

Who formats or cleans up the assets and at least oversees that things are done according to a consistent spec, process, and guidelines? Is that not a game designer or someone under their leadership?

I think in all the cases I gave, what might be completely delegated to "engine design" really should be teamwork with game design and art direction too. This is what the top-level comment was talking about. Even when a game is "well made", they just adopted someone else's standards and that sucks all the soul out of it. This is a common problem in all creative work.

(adding this due to reply depth): Coordination is a big aspect of design and can often be the most impactful to the result.

by lukan 23 hours ago

It depends on how big the studio is, but the job of a game designer is usually not cleaning up assets. It is to, well, design the game. The big picture.

by lifis 22 hours ago

That makes no sense, since multiplication has been fast for the last 30 years (since PS1) and floating point for the last 25 years (since PS2). Anyway, numbers relevant for game design are usually used just a few times per frame, so only program size matters, and that has not been significantly constrained for the last 40 years (since NES).

by applfanboysbgon 19 hours ago

I wasn't talking about the specific example in the article. There are many, many other ways in which numeric characteristics can constrain game design, particularly if your game has any kind of scale to it (say, simulations with tons of moving parts or many NPCs, like RCT, or large open worlds like Minecraft, or large multiplayer games like WoW, as examples all mentioned in the thread).

If your game is small-scale, something like Super Mario Bros., you should be able to get away with not thinking about it in theory. But even then people manage to write simple games with bloated loading times and stuttery performance, so never underestimate the impressive ability of people who are operating solely at the highest level of abstraction to make computers cry.

by exmadscientist 21 hours ago

Related to that, for a consumer electronics product I worked on using an ARM Cortex-M4 series microcontroller, I actually ended up writing a custom pseudorandom number generation routine (well, modifying one off the shelf). I was able to take the magic mixing constants and change them to things that could be loaded as single immediates using the crazy Thumb-2 immediate instructions. It passed every randomness test I could throw at it.

By not having to pull in anything from the constant pools and thereby avoid memory stalls in the fast path, we got to use random numbers profligately and still run quickly and efficiently, and get to sleep quickly and efficiently. It was a fun little piece of engineering. I'm not sure how much it mattered, but I enjoyed writing it. (I think I did most of it after hours either way.)

Alas, I don't think it ever shipped because we eventually moved to an even smaller and cheaper Cortex-M0 processor which lacked those instructions. Also my successor on that project threw most of it out and rewrote it, for reasons both good and bad.

by WalterBright 21 hours ago

I remember the older driving games. They'd progressively "build" the road as you progressed on it. Curves in the road were drawn as straight line segments.

Which wasn't a problem, but it clearly showed how the programmers improvised to make it perform.

by Sharlin 12 hours ago

Limiting the drawing distance and rendering as little geometry as possible is absolutely still a thing, devs just can afford to hide it better these days. The golden rule of graphics programming has always been "cheat as much as you can get away with, and then a bit more".

by rkagerer 21 hours ago

Now that's what being a full stack programmer really means.

by eru 15 hours ago

Constraints breed creativity.

by 7bit 10 hours ago

Today, constraints are simply ignored. (Looking at you, web devs and Microsoft devs.)

by edflsafoiewq 1 day ago

Examples?

by mort96 23 hours ago

I think Minecraft's lighting system is a good example: there are 16 different brightness levels, from 0 to 15. This allows the game to store light levels in 4 bits per block.

Similarly, redstone has 16 power levels: 0 to 15. This allows it to store the power level using 4 bits. In fact, quite a lot of attributes in Minecraft blocks are squeezed into 4 bits. I think the system has grown to be more flexible these days, but I'm pretty sure the chunk data structure used to set aside 4 bits for every block for various metadata.

And of course, the world height used to top out at Y = 255, so every block's Y position could be expressed as an 8-bit integer.

A voxel game like that is a good example of where this kind of efficiency really matters since there's just so much data. A single 16×16×256 chunk is 65.5k blocks. If a game designer says they want to add a new light source with brightness level 20, or a new kind of redstone which can go 25 blocks, it might very well be the right choice to say no.
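Here is a rough C++ sketch of what "4 bits per block" looks like in storage. `NibbleArray` is a hypothetical class written for this example; it shows the idea only and is not Minecraft's actual chunk format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical storage for one 4-bit value (0 to 15) per block, two per byte.
class NibbleArray {
public:
    explicit NibbleArray(std::size_t count)
        : count_(count), bytes_((count + 1) / 2, 0) {}

    std::uint8_t get(std::size_t i) const {
        assert(i < count_);
        const std::uint8_t byte = bytes_[i / 2];
        // Even indices live in the low nibble, odd indices in the high nibble.
        return static_cast<std::uint8_t>((i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
    }

    void set(std::size_t i, std::uint8_t value) {
        assert(i < count_ && value <= 15);
        std::uint8_t& byte = bytes_[i / 2];
        if (i % 2 == 0) byte = static_cast<std::uint8_t>((byte & 0xF0) | value);
        else            byte = static_cast<std::uint8_t>((byte & 0x0F) | (value << 4));
    }

    std::size_t size_bytes() const { return bytes_.size(); }

private:
    std::size_t count_;
    std::vector<std::uint8_t> bytes_;
};
```

For a 16×16×256 chunk this takes 32,768 bytes per attribute instead of 65,536 with one byte per block. A brightness level of 20 would not fit in 4 bits, so allowing it would mean moving every block to a wider field.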

by tosti 22 hours ago

I don't think Minecraft would be considered a cornerstone of optimal programming.

by helterskelter 21 hours ago

The 4 bit stuff is a hangover from Mojang having to squeeze every bit of perf from their Java based engine that they could. Their original sound engine was so sketchy that C418's (music composer) minimalist sound is partly because it really couldn't handle much more than what got released.

MS has been loosening up on the 4 bits limit and have created a CPP variant of Minecraft which performs better, but they've also introduced their unified login garbage that has almost made me give up Minecraft completely.

by Pannoniae 21 hours ago

Hey, this isn't entirely accurate!

The 4-bit stuff is a hangover from Notch doing this (I'd maybe even say a similar-calibre programmer to Chris Sawyer...). The sound has nothing to do with technical limits, that's a post-facto rationalisation.

The game never played midi samples, it was always playing "real" audio. The style was an artistic choice, many similar retro-looking games were using chiptune and the sorts. It's a deliberate juxtaposition...

The CPP variant doesn't really perform better anymore either.

by helterskelter 20 hours ago

Fair enough, I mostly meant to point out some of those design decisions predate MS, as much as I love to hate on them. The music was just an interesting bit of trivia I read the other day.

by Pannoniae 19 hours ago

Yeah, 100% :) Ironically, the design constraints are one of the big things which made it work so much! If it was designed in a "traditional" way, it would have been much less ambitious.

by imtringued 12 hours ago

Bedrock Edition has a smaller simulation distance, which is kind of the opposite of what you'd expect from the more "optimized" version.

by mort96 20 hours ago

Minecraft is, and always has been, handling vast amounts of data at pretty good performance. It's not an impossibly difficult task, many other people have made voxel game engines which are better, but it's something you can't do without paying attention to these things. Every voxel engine with remotely reasonable performance needs to carefully count bits used per block.

by nitwit005 2 hours ago

You can find other people discussing implementing similar games on YouTube, and the need to cram the representation of blocks into as small a size as possible always comes up.

Information about blocks is the overwhelmingly dominant thing being stored in memory for those games, so naturally reducing the size of that data becomes important.

by kulahan 21 hours ago

The entire program doesn't need to be a cornerstone of optimal programming for this one example to hold true.

by andai 1 day ago

https://en.wikipedia.org/wiki/Nuclear_Gandhi

From what I heard, there was a Civilization game which suffered from an unsigned integer underflow error where Gandhi, whose aggression was set to 0, would become "less aggressive" due to some event in the game, but due to integer underflow, this would cause his aggression to go to 255, causing him to nuke the entire map.

The article says this was just an urban legend though. Well, real or not, it's a perfect example of the principle!
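The mechanism in the legend is easy to reproduce in C++, whatever the truth about the game. The function names below are invented for illustration and have nothing to do with any Civilization source code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: lower an 8-bit aggression score with no floor check.
// The result is reduced modulo 256, so 0 - 1 becomes 255.
std::uint8_t lower_aggression_wrapping(std::uint8_t aggression, std::uint8_t amount) {
    return static_cast<std::uint8_t>(aggression - amount);
}

// The same adjustment clamped at zero.
std::uint8_t lower_aggression_saturating(std::uint8_t aggression, std::uint8_t amount) {
    return aggression > amount ? static_cast<std::uint8_t>(aggression - amount)
                               : std::uint8_t{0};
}
```

With these definitions, `lower_aggression_wrapping(0, 1)` yields 255 while `lower_aggression_saturating(0, 1)` yields 0, which is the kind of numeric edge a design has to account for once a stat is stored in a byte.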

by luaKmua 1 day ago

Indeed an urban legend. Sid Meier himself debunked it in his memoir, which is a pretty great read.

by mrguyorama 6 hours ago

It's fascinating to live through the entire lifecycle of:

Weird thing happens. People make up reasons why. One reason is possible. That becomes THE reason, and spreads wildly, without confirmation, as an accurate explanation. "Actually that's not true". Now the fact that it's not the reason is widely disseminated, and if we are lucky the original meme dies out!

But it took 30 years. For a very meaningless rumor.

by bombcar 22 hours ago

Read all of the Factorio Friday Facts https://factorio.com/blog/ - a number of the more obscure bug/performance issues come down to making something fit naturally into a value the CPU can handle.

by hcs 1 day ago

Not the same thing but I was reminded of a joke about the puzzle game Stephen's Sausage Roll:

> I have calculated the value of Pi on Sausage Island and found it to be 2.

https://web.archive.org/web/20240405034314/https://twitter.c...

by Waterluvian 1 day ago

Not really an example that proves any point, but one that comes to mind from a 20-year-old game:

World of Warcraft (at least originally) encoded every item as an ID. To keep the database simple and small (given millions of players with many characters with lots of items): if you wanted to permanently enchant your item with an upgrade, that was represented essentially as a whole new item. The item was replaced with a different item (your item + enchant). Represented by a different ID. The ID was essentially a bitmask type thing.

This meant that it was baked into the underlying data structures and deep into the core game engine that you could never have more than one enchant at a time. It wasn't like there was a relational table linking what enchants an item in your character's inventory had.

The first expansion introduced "gems" which you could socket into items. This was basically 0-4 more enchants per item. The way they handled this was to just lengthen item IDs by a whole bunch to make all that bitmask room.

I might have gotten some of this wrong. It's been forever since I read all about these details. For a while I was obsessed with how they implemented WoW given the sheer scale of the game's player base 20 years ago.

by plopz 22 hours ago

One of the main issues with Kerbal Space Program is instability caused by floating point numbers. I know Starcraft 2 was built upon integers.

by Gigachad 22 hours ago

Floating point issues are less a problem of performance here but one of precision. Particularly being a space game, the coordinates can be massive resulting in the precision deteriorating enough to cause issues.
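How fast precision deteriorates with coordinate size can be checked with a short C++ sketch. `spacing_at` is a hypothetical helper, and treating the coordinate as metres in the note below is an assumption for illustration, not a statement about any particular game.

```cpp
#include <cassert>
#include <cmath>

// Gap between a 32-bit float and the next representable value above it.
// Positions that differ by less than this gap cannot be told apart
// at that coordinate.
float spacing_at(float coordinate) {
    assert(std::isfinite(coordinate));
    return std::nextafter(coordinate, INFINITY) - coordinate;
}
```

With IEEE 754 single precision, the gap is about 1.2e-7 at 1.0, 0.0625 at 1,000,000, exactly 1.0 at 10,000,000, and 64 at 1,000,000,000. If the unit were metres, an object 10,000 km from the origin could only be placed on whole-metre steps, which is why large-world games tend to use doubles, integers, or an origin that moves with the player.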

by ErroneousBosh 23 hours ago

Going way back into history, the Alesis MIDIVerb reverb unit had a really simple DSP core made out of discrete logic chips. It could add a memory location to an accumulator and divide it by two, invert it, add it and divide it by two, or store it in ram either inverted or not and divide the accumulator by two.

Four instructions, in about eight chips.

By combining shifts and adds Keith Barr was able to devise all the different filter and delay coefficients for 63 different reverb programs (the 64th one was just dead passthrough).
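A rough C++ sketch of how "add, then halve" steps can stand in for a multiplier. This models only the general idea from the description above; the function names, word size, and the 7/8 coefficient are assumptions for illustration, not the actual MIDIVerb instruction set or any of its programs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one "add memory to accumulator, divide by two" step.
// Inputs are assumed to be 16-bit-range samples so the sum cannot overflow.
std::int32_t add_and_halve(std::int32_t acc, std::int32_t sample) {
    assert(sample >= -32768 && sample <= 32767);
    assert(acc >= -32768 && acc <= 32767);
    return (acc + sample) >> 1;  // arithmetic shift on mainstream compilers
}

// Three add-and-halve steps scale a sample by 1/2 + 1/4 + 1/8 = 7/8
// without any multiply instruction.
std::int32_t scale_by_seven_eighths(std::int32_t sample) {
    std::int32_t acc = 0;
    for (int step = 0; step < 3; ++step) acc = add_and_halve(acc, sample);
    return acc;
}
```

For a sample of 800 the accumulator goes 400, 600, 700. Mixing in the inverted variants and different step counts gives other coefficients built from powers of one half, which is the flavour of trick the comment describes.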
