I really encourage people to read the Donald Knuth essay that features this sentiment. Pro tip: You can skip to the very end of the article to get to this sentiment without losing context.
Here ya go: https://dl.acm.org/doi/10.1145/356635.356640
Basically, don't spend unnecessary effort increasing performance in an unmeasured way before its necessary, except for those 10% of situations where you know in advance that crucial performance is absolutely necessary. That is the sentiment. I have seen people take this to some bizarre alternate insanity of their own creation as a law to never measure anything, typically because the given developer cannot measure things.
Similar to the "code should be self documenting - ergo: We don't write any comments, ever"
1) Most bugs are integration bugs. Whereby multiple systems are glued together but there’s something about the API contract that the various developers in each system don’t understand.
2) Most performance issues are architectural. Unnecessary round trips, doing work synchronously, fetching too much data.
Debuggers and profilers don’t really help with those problems.
I personally know how to use those tools and I do for personal projects. It just doesn’t come up in my enterprise job.
He stopped me an said he was just looking to see if I knew what an INT 3 was. He said few engineers he interviewed had any idea.
(Then, shortly afterward I also tried to find a new job, realized the entire industry had changed, and was fortunate enough to decide it wasn't worth the trouble.)
That's likely thanks to C which goes to great pains to not specify the size of the basic types. For example, for 64 bit architectures, "long" is 32 bits on the Mac and 64 bits everywhere else.
The net result of that is I never use C "long", instead using "int" and "long long".
This mess is why D has 32 bit ints and 64 bit longs, whether it's a 32 bit machine or a 64 bit machine. The result was we haven't had porting problems with integer sizes.
I've met very few folks who understand the overheads involved, and how extreme the benefits can be from avoiding those.
The sort of insane stuff I've seen on the dotnet repo where people are trying to tear apart the entire type system just because they think they've cracked some secret performance code.
You mean the .net compiler/runtime itself? I haven't looked at it, but isn't that the one place you'd expect to see weirdly low-level C# code?
And you have a frame with an operands stack where you should be able to store at least a 32-bit value. `double` would just fill 2 adjacent slots.
And references are just pointers (possibly not using the whole of the value as an address, but as flags for e.g. the GC) pointing to objects, whose internal structure is implementation detail, but usually having a header and the fields (that can again be reference types).
Pretty standard stuff, heap allocating stuff is pretty common in C as well.
And unlike C, it will run the exact same way on every platform.
If you ask a typical grad the size of a bool they will inevitably say one bit, but, CPUs and RAM, etc don't work like that, typically they expect WORD sized chunks of memory - meaning that the boolean size of one but becomes a WORD sized chunk, assuming that it hasn't been packed
To be fair, though, I come up short on a lot of things comp sci graduates know.
It's why Andrei Alexandrescu and I made a good team. I was the engineer, and he the scientist. The yin and the yang, so to speak.
And yet even more of a fun time with porting pointer code was going from the various x86 memory models[0] to 32-bit. Depending on the program, the pain was either near, far, or huge... :-D
The integer representation wasn't always two's complement in the early days of computing, so you couldn't even assume that. C++ only required integer representations to be two's complement as of C++20, since the last architectures that don't work this way had effectively been dead for decades.
In that context, an 'int' was supposed to be the native word size of an integer on a given architecture. A long time ago, 'int' was an abstraction over the dozen different bit-widths used in real hardware. In that context, it was an aid to portability.
I suggested to him that he'd have a hard time finding any existing C code that ran correctly on it. After all, how are you going to write a byte to memory if you've only got 32 bit operations?
Anyhow, after 20 years of programming C, I took what I learned and applied it to D. The integral types are specified sizes, and 2's complement.
One might ask, what about 16 bit machines? Instead of trying to define how this would work in official D, I suggested a variant of D where the language rules were adapted to 16 bits. This is not objectively worse than what C does, and it works fine, and the advantage is there is no false pretense of portability.
If the number of bits isn't actually included right in the type name, then be very sure you know what you're doing.
The senior engineer answer to "How many bits are there in an int?" is "No, stop, put that down before you put your eye out!" Which, to be fair, is the senior engineer answer to a lot of things.
On the other, the right answer is 16 or 32. It's not the correct answer, strictly speaking, but it is the right one.
I haven't used a debugger much at work for years because it's all Docker (I know it's possible but lots of hoops to jump through, plus my current job has everything in AWS i.e. no local dev).
It should be to the greatest extent possible. Strive to write literate code before writing a comment. Comments should be how and why, not what.
> - ergo: We don't write any comments, ever"
Indeed this does not logically follow. Writing fluent, idiomatic code with real names for symbols and obvious control flow beats writing brain teasers riddled with comments that are necessary because of the difficulty in parsing a 15-line statement with triply-nested closures and single-letter variable names. There's a wide middle ground where comments are leveraged, not made out of necessity.
My counterpoint: Code can be self-documenting, reality isn't. You can have a perfectly clear method that does something nobody will ever understand unless you have plenty of documentation about why that specific thing needs to be done, and why it can't be simpler. Like having special-casing for DST in Arizona, which no other state seems to need:
I know it may be hard for me to understand the need for writing in english what is obvious (to me) in code. I also know i have read a stupid amount of code.
My rule is simple, if the comment repeats verbatim the name of a variable declaration or function name, it has to go. Anything else we can talk about.
Even 'grug brained' isn't about not thinking, it's about keeping capacity in reserve for when the shit hits the fan. Proper Grug Brain is fully compatible with Kernighan's Law.
I'm still salty about that time a colleague suggested adding a 500 kb general purpose js library to a webapp that was already taking 12 seconds on initial load, in order to fix a tiny corner case, when we could have written our own micro utility in 20 lines. I had to spend so much time advocating to management for my choice to spend time writing that utility myself, because of that kind of garbage opinion that is way too acceptable in our industry today. The insufferable bastard kept saying I had to do measurements in order to make sure I wasn't prematurely optimizing. Guy adding 500 kb of js when you need 1 kb of it is obviously a horrible idea, especially when you're already way over the performance budget. Asshat. I'm still salty he got so much airtime for that shitty opinion of his and that I had to spend so much energy defending myself.
OR, perhaps its the case that different contexts have different levels of effort. Running a spike can be an important way to promote new ideas across an org and show how things can be done differently. It can be a political tool that has positive impact, because there's a lot more to a business than simply writing good code. However if your org is horrible then it can backfire in the way that was described. Maybe business are too aggressive and trample on dev, maybe dev doesn't have a spine, maybe nobody spoke up about what a fucking disaster it was going to be, maybe they did and nobody listened. Those are all organisational issues akin to an exploitable code base but embedded into the org instead of the code.
These issues are not the direct fault of the spike, its the fault of the org, just like the idiot that took your poorly formatted comment and put it on the front page of Vogue.
I mean I could take a toddlers tricycle and try to take it onto the motorway. Can we blame the toy company for that? It has wheels, it goes forward, its basically a car, right? In the same way a spike is basically something we can ship right now.
I’m being a bit provocative here, just to make two points:
a) Software development back in the day, especially when it comes to service, reach, security, etc., was completely different from today. Black Friday, millions of users, SLAs, 24-hour service... these didn’t exist back then.
b) Because of so many conditions — some mentioned in point (a) - prematurity ends when the code is live in production. End.
"You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is."
Moreso, in my personal experience, I've seen a few speed hacks cause incorrect behavior on more than one occasion.
Parent is talking about building software that is inherently non-performant due to abstractions or architecture with the wrong assumption that it can be optimized later if needed.
The analogy is trying to convert a garbage truck into a race car. A race car is built as a race car. You don't start building a garbage truck and then optimize it on the race course. There are obvious principles and understanding that first go into the building of a race car, assuming one is needed, and the optimization happens from that basis in testing on and off the track.
Which is pretty close to just saying "don't do anything unless you have a good reason for doing it."
Yeah like, NOT indexing any fields in a database, that'll become a problem very quickly. ;)
For example, in Java I usually use ConcurrentHashMap, even in contexts that a regular HashMap might be ok. My reasoning for this is simple: I might want to use it in a multithreaded context eventually and the performance differences really aren't that much for most things; uncontested locks in Java are nearly free.
I've gotten pull requests rejected because regular HashMaps are "faster", and then the comments on the PR ends up with people bickering about when to use it.
In that case, does it actually matter? Even if HashMap is technically "faster", it's not much faster, and maybe instead we should focus on the thing that's likely to actually make a noticeable difference like the forty extra separate blocking calls to PostgreSQL or web requests?
So that's the premature optimization that I think is evil. I think it's perfectly fine at the algorithm level to optimize early.
The work can be done in future to migrate to using ConcurrentHashMap when the feature to add multithreading support is added. There's no sense to add groundwork for unplanned, unimplemented features - that is premature optimisation in a nutshell.
Make a (very) good argument, and suggest a realtistic path to change the whole codebase, but don't create inconsistency just because it is "better". It is not.
It makes no difference to the outside code.
2) Locks are cheap
3) I seriously doubt that the difference between a Map and a ConcurrentHashMap is measurable in your app
Which means that both, the comments on your PRs are irrelevant and you are still going too far in your thread-safety. So you are both wrong.
What you are right about is to focus on network calls.
ConcurrentHashMap has the advantage of hiding the locking from me and more importantly has the advantage of being correct, and it can still use the same Map interface so if it’s eventually used downstream somewhere stuff like `compute` will work and it will be thread safe without and work with mutexes.
The argument I am making is that it is literally no extra work to use the ConcurrentHashMap, and in my benchmarks with JMH, it doesn’t perform significantly worse in a single-threaded context. It seems silly for anyone to try and save a nanosecond to use a regular HashMap in most cases.
Thinking about the overall design, how its likely to be used, and what the performane and other requirements are before aggregating the frameworks of the day is mature optimization.
Then you build things in a reasonable way and see if you need to do more for performance. It's fun to do more, but most of the time, building things with a thought about performance gets you where you need to be.
The I don't need to think about performance at all camp, has a real hard time making things better later. For most things, cycle counting upfront isn't useful, but thinking about how data will be accessed and such can easily make a huge difference. Things like bulk load or one at a time load are enormous if you're loading lots of things, but if you'll never load lots of things, either works.
Thinking about concurrency, parallelism, and distributed systems stuff before you build is also pretty mature. It's hard to change some of that after you've started.
I want it in a t-shirt. On billboards. Everywhere :)
I also find it a bit annoying is that most people just make shit up about stuff that is "faster". Instead of measuring and/or looking at the compiled bytecode/assembly, people just repeat tribal knowledge about stuff that is "faster" with no justification. I find that this is common amongst senior-level people at BigCos especially.
When I was working in .NET land, someone kept telling me that "switch statements are faster" than their equivalent "if" statements, so I wrote a very straightforward test comparing both, and used dotpeek to show that they compile to the exact same thing. The person still insisted that switch is "faster", I guess because he had a professor tell him this one time (probably with more appropriate context) and took whatever the professor said as gospel.
Generally I've found that the penalty, even without contention, is pretty minimal, and it almost always wins under contention.
It's particularly the kind of people who like to say "hur hur don't prematurely optimize" that don't bother writing decent software to begin with and use the term as an excuse to write poor performing code.
Instead of optimizing their code, these people end up making excuses so they can pessimize it instead.
I'm actually considering, for the first time since 2013/14 when I worked on a Visual Studio extension, creating a piece of desktop software - and a piece of cross-platform desktop software at that. Given that Microsoft's desktop story has descended into a chaotic mishmash of somewhat conflicting stories, and given it will be a cold day in hell before I choose Electron as the solution to any problem I might have, most likely I will roll with Qt + Rust, or at least Qt + something.
20-odd years ago I might have opted for Java + Swing because I'd done a lot of it and, in fairness to Swing, it's not a bad UI toolkit and widget set. These days I simply prefer the svelte footprint and lower resource reuqirements of a native binary - ideally statically linked too, but I'll live with the dynamic linking Qt's licensing necessitates.
It is written in Rust.
Usually those people also have a good old whinge about the premature optimization quote being wrong or misinterpreted and general attitudes to software efficiency.
Not once have I ever seen somebody try to derail a process of "ascertain speed is an issue that should be tackled" -> "profile" -> fix the hot path.
Ascertain an issue is too late for bad software. The technical term is polishing a turd.
Not that what you're describing doesn't happen, people trying to make something irrelevant fast, but that's not the big problem we face as an industry. The problem is bad software.
There's too little appreciation today for a well designed system. And the "premature optimization" line is often used to justify not thinking about things because, hey, that's premature. Just throw something together.
Many things need to be optimized before you can easily profile them, so at this stage its already too late and your software will forever be slow.
That's because your boss will never in a 1000 years hire the type of dev who can do that. And even if you did, there will be team members who will fight those fixes tooth and nail. And yes, I have a very cynical view of some devs but they earned that through some of the pettiest behavior I have ever seen.
People write some code, test it, ship it, then get some ideas that its too slow and make it faster.
The nice thing about doing it with shipped code is you can actually measure where time is spent insyead of guessing.
Your users are not going to notice. Sure, it's faster but it's not focused on the problem.
This doesn't make sense. Why is performance (via architectural choices) more important today than then?
You can build a snappy app today by using boring technology and following some sensible best practices. You have to work pretty hard to need PREMATURE OPTIMIZATION on a project -- note the premature there
Optimization of bandwidth-bound code is almost purely architectural in nature. Most of our software best practices date from a time when everything was computation-bound such that architecture could be ignored with few bad effects.
So a lot of code quality debates don’t matter for the typical enterprise app. While a dev spends their afternoon shaving off 100 nanoseconds in the hot path, a second developer on a deadline added a poorly thought out round trip that adds 800milliseconds.
This architectural problems are also more difficult to unwind later since they tend to have cascading effects.
If you are building something with similar practical constraints for the Nth time this is definitely true.
You are inheriting “architecture” from your own memory and/or tools/dependencies that are already well fit to the problem area. The architectural performance/model problem already got a lot of thought.
Lots of problems are like that.
But if you are solving a problem where existing tools do a poor job, you better be thinking about performance with any new architecture.
There were fewer available layers of abstraction.
Whether you wrote in ASM, C, or Pascal, there was a lot less variance than writing in Rust, JavaScript, Python.
I agree but I don't think this discredits the "premature optimisation is the root of all evil" thing, aside from the fact that it's a heavy exaggeration.
The trouble is, people read it as "don't optimise" which is an incredibly bad decision.
Especially in the data world though, I've seen lots of teams really struggle with problems caused by using technologies they don't need (normally spark or kubernetes or both) just because they might need them later.
I think that type of pitfall is what the original quote is warning against.
SOLID isn't bad, but like the idea of premature optimization, it can easily lead you into the wrong direction. You know how people make fun of enterprise code all the time, that's what you get when you take SOLID too far.
In practice, it tends to lead to a proliferation of interfaces, which is not only bad for performance but also result in code that is hard to follow. When you see a call through an interface, you don't know what code will be run unless you know how the object is initialized.
The problem is that SOLID on its own does nothing for you. It's a set of (vague) rules, but not a full framework for how to design software. I would even argue that SOLID is actively harmful if used on its own.
Things like Clean Architecture and Domain-Driven Design are a lot closer to being true frameworks for software design, and a lot of their basic principles are actually really good (like the core of the application being made up of objects which perform calculations, validations and business rules with no side effects), but the complexity of those architectures is a problem in itself.
And, even aside from that, I think the industry in general reached a point where people decided that, principled object-oriented design is just not worth it. Why spend all this effort worrying about the software remaining maintainable for decades, when we could instead just throw together something that works, then IPO, then rewrite the whole thing once we have money.
I think the most important principle above all is knowing when not to stick to them.
For example if I know a piece of code is just some "dead end" in the application that almost nothing depends on then there is little point optimizing it (in an architectural and performance sense). But if I'm writing a core part of an application that will have lots of ties to the rest, it totally does make sense keeping an eye on SOLID for example.
I think the real error is taking these at face value and not factoring in the rest of your problem domain. It's way too simple to think SOLID = good, else bad.
[1] https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...
This should be the header of the website. I think the core of all these arguments is people thinking they ARE laws that must be followed no matter what. And in that case, yeah that won't work.
SOLID approaches aren't free... beyond that keeping code closer together by task/area is another approach. I'm not a fan of premature abstraction, and definitely prefer that code that relates to a feature live closer together as opposed to by the type of class or functional domain space.
For that matter, I think it's perfectly fine for a web endpoint handler to make and return a simple database query directly without 8 layers of interfaces/classes in between.
Beyond that, there are other approaches to software development that go beyond typical OOP practices. Something, something, everything looks like a nail.
The issues that I have with SOLID/CLEAN/ONION is that they tend to lead to inscrutable code bases that take an exponentially long amount of time for anyone to come close to learning and understanding... Let alone the decades of cruft and dead code paths that nobody bothered to clean up along the way.
The longest lived applications I've ever experienced tend to be either the simplest, easiest to replace or the most byzantine complex monstrosities... and I know which I'd rather work on and support. After three decades I tend to prioritize KISS/YAGNI over anything else... not that there aren't times where certain patterns are needed, so much as that there are more times where they aren't.
I've worked on one, singular, one application in three decades where the abstractions that tend to proliferate in SOLID/CLEAN/ONION actually made sense... it was a commercial application deployed to various govt agencies that had to support MS-SQL, Oracle and DB2 backends. Every, other, time I've seen an excess of database and interface abstractions have been instances that would have been better solved in other, less performance impacting ways. If you only have a single concrete implementation of an interface, you probably don't need that interface... You can inherit/override the class directly for testing.
And don't get me started on keeping unit tests in a completely separate project... .Net actually makes it painful to put your tests with your implementation code. It's one of my few actual critiques about the framework itself, not just how it's used/abused.
Even his "critique" of Demeter is, essentially, that it focuses on an inconsequential aspect of dysfunction—method chaining—which I consider to be just one sme that leads to the larger principle which—and we, apparently, both agree on this—is interface design.
The only part of SOLID that is perhaps OO-only is Liskov Substitution.
L is still a good idea, but without object-inheritance, there's less chance of shooting yourself in the foot.
If you follow SOLID, you'll write OOP only, with always present inheritance chains, factories for everything, and no clear relation between parameters and the procedures that use them.
L and I are both pretty reasonable.
But S and D can easily be taken to excess.
And O seems to suggest OO-style polymorphism instead of ADTs.
That's how I view it. You should design your application such that extension involves little modifying of existing code as long as it's not necessary from a behavior or architectural standpoint.
Bunch of stuff is done for us. Using postgres having indexes correct - is not premature optimization, just basic stuff to be covered.
Having double loop is quadratic though. Parallelism is super fun because it actually might make everything slower instead of faster.
Not if your optimization for performance is some Rube Goldberg assemblage of microservices and an laundry list of AWS services.
And as I point out, what Knuth was talking about in terms of optimization was things like loop unrolling and function inlining. Not picking the right datastructure or algorithm for the problem.
I mean, FFS, his entire book was about exploring and picking the right datastructures and algorithms for problems.
its my favorit quotes
premature optimization nowadays looks like choosing microservice when monolith can works just fine
Decades in, this is the worst of all of them. Misused by laziness or malice, and nowhere near specific enough.
The graveyard of companies boxed in by past poor decisions is sprawling. And the people that made those early poor decisions bounce around field talking about their "successful track record" of globally poor and locally good architectural decisions that others have had to clean up.
It touches on a real problem, though, but it should be stricken form the record and replaced with a much better principle. "Design to the problem you have today and the problems you have in 6 months if you succeed. Don't design to the problems you'll have have next year if it means you won't succeed in 6 months" doesn't roll off the tongue.
One thing that came out of the no-sql/new-sql trends in the past decade and a half is that joins are the enemy of performance at scale. It really helps to know and compromise on db normalization in ways such as leaning on JSON/XML for non-critical column data as opposed to 1:1/children/joins a lot of the time. For that matter, pure performance and vertical scale have shifted a lot of options back from the brink of micro service death by a million paper cuts processes.
You are right about the origin of and the circumstances surrounding the quote, but I disagree with the conclusion you've drawn.
I've seen engineers waste days, even weeks, reaching for microservices before product-market fit is even found, adding caching layers without measuring and validating bottlenecks, adding sharding pre-emptively, adding materialized views when regular tables suffice, paying for edge-rendering for a dashboard used almost entirely by users in a single state, standing up Kubernetes for an internal application used by just two departments, or building custom in-house rate limiters and job queues when Sidekiq or similar solutions would cover the next two years.
One company I consulted for designed and optimized for an order of magnitude more users than were in the total addressable market for their industry! Of that, they ultimately managed to hit only 3.5%.
All of this was driven by imagined scale rather than real measurements. And every one of those choices carried a long tail: cache invalidation bugs, distributed transactions, deployment orchestration, hydration mismatches, dependency array footguns, and a codebase that became permanently harder to change. Meanwhile the actual bottlenecks were things like N+1 queries or missing indexes that nobody looked at because attention went elsewhere.
I was quite literally asked to implement an in-memory cache to avoid a "full table scan" caused by a join to a small DB table recently. Our architect saw "full table scans" in our database stats and assumed that must mean a performance problem. I feel like he thought he was making a data-driven profiling decision, but seemed to misunderstand that a full-table scan is faster for a small table than a lookup. That whole table is in RAM in the DB already.
So now we have a complex Redis PubSub cache invalidation strategy to save maybe a ms or two.
I would believe that we have performance problems in this chunk of code, and it's possible an in-memory cache may "fix" the issue, but if it does, then the root of the problem was more likely an N+1 query (that an in-memory cache bandaids over). But by focusing on this cache, suddenly we have a much more complex chunk of code that needs to be maintained than if we had just tracked down the N+1 query and fixed _that_
Yes. When I was a young engineer, I was asked to design something for a scale we didn’t even get close to achieving. Eventual consistency this, event driven conflict resolution that… The service never even went live because by the time we designed it, everyone realized it was a waste of time.
I learned it makes no sense to waste time designing for zillions of users that might never come. It’s more important to have an architecture that can evolve as needs change rather than one that can see years into the future (that may never come).
This is too true. However, often you dont fully know the shape of the domain until you swing at it and fail.
I don't blame Knuth, he's talking about focusing on micro-optimizations, but a lot of devs nowadays don't even care to get basic performance right.
In these domains, algorithm selection, and fine tuning hot spots pays off significantly. You must hit minimum speeds to make your application viable.
Anyone who has done optimization even a little knows that it isn't very difficult, but you do need to plan and architect for it so you don't have to restructure you whole program to get it to run well.
Mostly it's just rationalization, people don't know the skill so they pretend it's not worth doing and their users suffer for it.
If software and website were even reasonably optimized people could just use a computer as powerful as a rasberry pi 5 (except for high res video) for most of what they do day to day.