Yes, or pretty close to it. What we don't know how to do (AFAIK) is do it at a cost that would be acceptable for most software. So yes, it mostly gets done for (components of) planes, spacecraft, medical devices, etc.
Totally agreed that most software is a morass of bugs. But giving examples of buggy software doesn't provide any information about whether we know how to make non-buggy software. It only provides information about whether we know how to make buggy software—spoiler alert: we do :)
I have to disagree here. All of the ones you mentioned regularly have bugs. Multiple spacecraft got lost because of them. For planes there's the not-so-distant Boeing 737 MAX fiasco (admittedly this was bad software behavior caused by sensor failure). And as for medical devices, news about their bugs pops up semi-regularly. So while the software for these might do a bit better than the rest, it certainly is not anywhere close to being bug-free.
And the same goes for the specifications the software is based on. Those aren't bug-free either. And writing software based on a flawed specification will inevitably result in flawed software.
That's not to say we should give up on trying to write bug-free software. But we currently don't know how to do so.
You could say that Python is designed around preventing these memory bugs.
Security and reliability are also parameters that exist on a sliding scale; the industry has simply chosen to slide the "cost" parameter all the way to one end of the spectrum. As a result, the number of bugs and hacks observed is far enough from the desired value of zero that it's clear the true requirements for those parameters cannot be honestly said to be zero.
Zero is not the desired number, particularly not when discussing "hacks". This may not matter in the current situation, but there's a lot of "security maximalism" in the industry conversations today, and people seem to not realize that dragging the "security" slider all the way to the right means not just the costs becoming practically infinite, but also the functionality and utility of the product falling down to 0.
Mind, I'm not talking about financial overhead for the company/developer(s), but rather a UX overhead for the user. It often increases friction and might even need education/training to make use of the software it's attached to. It's much like how body armor increases the weight one has to carry and decreases mobility; security has (conceptually) very similar tradeoffs (cognitive instead of physical overhead, and time/interactions/hoops instead of mobility). Likewise, sometimes one might pick a lighter Kevlar suit, whereas other times a ceramic plate is appropriate.
Now, body armor is still a very good idea if you're expecting to be engaged in a fight, but I think we can all agree that not everyone on the street in, say, a random village in Austria, needs to wear ceramic plates all the time.
The analogy does have its limits, of course ... for example, one issue with security (which firmly slides it towards erring on the safe side) as compared to warfare is that you generally know if someone shot at you and body armor saved you; with security (and, again, privacy), you often won't even know you needed it even if it helped you. And both share the trait that if you needed it and didn't have it, it's often too late.
Nevertheless, whether worth it or not (and to be clear, I think it's very worth it), I think it's important that people don't forget that this is not free. There's no free lunch --- security & privacy are no exception.
Ultimately, you can have a super-secure system with an explicit trust system that will be too much for most people to use daily, or something simpler (e.g. Signal) that sacrifices a few guarantees to make it easier to use, with the lower barrier to entry ensuring more people have at least a baseline of security & privacy in their chats.
Both have value and both should exist, but we shouldn't pretend the latter is worthless because there are more secure systems out there.
Today a bank really sent me a legitimate email about trying their new site. Went over, it was their site alright, logged in with correct username and password - poof, instantly blocked for suspicious access (from my usual home machine), call helpline to fix.
Now that's safe ... and useless. But safe.
I still wonder what I did wrong (support isn't responsive). But it's true that we're both safe from having a user/vendor relationship now.
You could make a car that's safer than others at 10x the price but what would the demand look like at that price?
Would you pay 2x for your favourite software and forego some of the more complex features to get a version with half the security issues?
Well.. except that I never want either of those. So sometimes I want Kate editor and sometimes I want Akelpad.
The answer to the above question will reveal whether someone is an engineer or an electrician/plumber/code monkey.
In virtually every other engineering discipline engineers have a very prominent seat at the table, and the opposite is only true in very corrupt situations.
Even basic theorems of science are incorrect.
It depends on exactly what you are doing, but there are many languages that are efficient to develop in if less efficient to execute, like Java, JavaScript and Python, which are better in many respects, and other languages that are less efficient to develop in but more efficient to run, like Rust. So at the very least it is a trilemma and not a dilemma.
One of these is not like the others...
Java (JVM) is extremely fast.
Hot take, but: Performance hasn’t been a major factor in choosing C or C++ for almost two decades now.
A while back, when my son was playing chess, I wrote a chess engine in Python and then tried to make a better one in Java that could respect time control. It was not hard to make the main search routine work without allocating memory, but when I tried to do transposition tables with Java objects it made the engine slower, not faster. I could have implemented them with off-heap memory, but around that time my son switched from chess to guitar, so I started thinking about audio processing instead.
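For anyone curious what the allocation-free version of that might look like: below is a minimal sketch, not the engine described above. It assumes positions are already reduced to a 64-bit hash (e.g. Zobrist), and it uses plain on-heap primitive arrays rather than off-heap memory, which already avoids per-entry objects. The class name `TranspositionTable`, the `MISS` sentinel, and the depth-preferred replacement rule are all illustrative choices of mine.

```java
// Hypothetical sketch: a fixed-size transposition table stored in parallel
// primitive arrays, so store() and probe() never allocate an object.
final class TranspositionTable {
    // Sentinel returned by probe() when there is no usable entry.
    static final int MISS = Integer.MIN_VALUE;

    private final long[] keys;    // full 64-bit hash, used to detect slot collisions
    private final int[] scores;   // score recorded for the position
    private final byte[] depths;  // search depth + 1, so 0 means "empty slot"
    private final int mask;       // size - 1, valid because size is a power of two

    TranspositionTable(int log2Size) {
        int size = 1 << log2Size;
        keys = new long[size];
        scores = new int[size];
        depths = new byte[size];
        mask = size - 1;
    }

    // Fold the 64-bit hash to 32 bits, then keep the low bits as the slot index.
    private int index(long key) {
        return ((int) (key ^ (key >>> 32))) & mask;
    }

    // Assumes 0 <= depth < 127 so that depth + 1 fits in a byte.
    void store(long key, int depth, int score) {
        int i = index(key);
        boolean occupiedByOther = depths[i] != 0 && keys[i] != key;
        // Depth-preferred replacement: a deeper result for a different
        // position is kept instead of being overwritten by a shallower one.
        if (occupiedByOther && depths[i] - 1 > depth) {
            return;
        }
        keys[i] = key;
        scores[i] = score;
        depths[i] = (byte) (depth + 1);
    }

    // Returns the stored score, or MISS if the slot is empty, holds another
    // position, or was searched to less than minDepth.
    int probe(long key, int minDepth) {
        int i = index(key);
        if (depths[i] == 0 || keys[i] != key || depths[i] - 1 < minDepth) {
            return MISS;
        }
        return scores[i];
    }
}
```

Usage would be something like `tt.store(hash, depth, score)` after searching a node and `tt.probe(hash, depth)` before searching it, treating `MISS` as "search normally". A real engine would also store bound types and a best move, and whether this actually beats an object-based table would need measuring.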
The Rust vs Java comparison is also pointed. I was excited about Rust the same way I was excited about Cyclone when it came out, but seeing people struggle with async is painful for me to watch and makes it look like the whole idea doesn’t really work when you get away from what you can do with stack allocation. People think they can’t live with Java’s GC pauses.
Writing code that carefully, however, is really not something you just do; it would require a massive improvement of skills overall. The majority of developers simply aren't skilled enough to write something anywhere near the quality of qmail.
Most software also doesn't need to be that good, but then we need to be more careful with deployments. The fact that someone just installs WordPress (which itself is pretty good in terms of quality) and starts installing plugins from untrusted developers indicates that many still don't have a security mindset. You really should review the code you deploy, but I understand why many don't.
Djb didn’t allow forking and repackaging, so qmail did not keep up with an increasingly hostile environment. It got so bad that when the Love Letter virus came out it was insufficient to add content filtering to qmail, and I had to write scripts that blocked senders at the firewall. Security was no longer a 0 and 1 problem. It was certainly possible to patch up and extend qmail to survive in that environment, but there was something to be said for having it all in one nice package… And once the deliverability crisis started, I gave up on running email servers entirely.
We built a weird solution where two systems would sync data via email. Upstream would do a dump from an Oracle database, pipe it to us via SMTP and a hook in qmail would pick up the email, get the attachment and update our systems. I remember getting a call one or two years after leaving the organisation, the new systems administrator wanted to know how their database was always kept up to date. It worked brilliantly, but they felt unsafe not knowing how. I really should have documented that part better.
I’m not sure I’d go quite as far as GP, but they did caveat that we often choose not to write software with few bugs. And empirically, that’s pretty true.
The software I’ve written for myself, or where I’ve taken the time to do things better or rewrite parts I wasn’t happy with, has had remarkably few bugs. I have critical software still running, unmodified, at former employers, which hasn’t been touched in nearly a decade. Perhaps not totally bug-free, but close enough that the bugs haven’t been noticed or mattered enough to bother pushing a fix and cutting a release.
Personally I think it’s clear we have the tools and capabilities to write software with one or two orders of magnitude fewer bugs than we choose to. If anything, my hope for AI-coded software development is that it drops the marginal cost difference between writing crap and writing good software, rebalancing the economic calculus in favor of quality for once.
Blame PMs for this. Delivering by some arbitrary date on a calendar means that something is getting shipped regardless of quality. Make it functional for 80% of use, then we'll fix the remaining bits in later releases. However, that doesn't happen, as the team is assigned new tasks because new tasks/features are what bring in new users, not fixing existing problems.
We have literally countless examples of software that devs have released entirely of their own volition when they felt it was ready.
If anything, in my experience, software that’s written a little slower and to a higher standard of quality is faster-releasing in the long (and medium) run. You’d be shocked at how productive your developers are when they aren’t task-switching every thirty minutes to put out fires, or when feature work isn’t constantly burdened by having to upend unrelated parts of the code due to hopelessly interwoven design.
https://sqlite.org/chronology.html
Regular releases for over a quarter of a century now, and it's renowned for its reliability.
There is also the angle of asking for an estimate without allocating time for the estimation itself.
For lack of a better word, I think it should derive from "complexity". The firmness of an estimate should be inversely proportional to the complexity. Adding a field to a UI when it is also exposed via the API is generally low complexity, so my estimate would likely hold. We can provide an estimate for a major change, but the estimate would be soft and subject to stretch, and it is the role of the PM to communicate it accordingly to the stakeholders.
Then I ask: why not add a week to how long that thing will take, meaning it stretches two sprints (or whatever you call it).
Add upfront. Then if you get to the hard convo where someone says “do it sooner” you say “not possible.”
Developers aren't alone in adhering to schedules. Many folks in many roles do it. All deal with missed deadlines, success, expectation management, etc. No one operates in magical no-timeline land unless they do not at all answer to anyone or any user. Not the predominant model, right?
So rather than just say "you can blame the PMs" I'd love to hear a realistic-to-business flow idea.
I am not saying I have the answers or a "take". I've both asked for and been asked for estimates and many times told people "I can't estimate that because I don't know what will happen along the way."
So, it's not just PMs. It's the whole system. Is there a real solution or are we pretending there might be? Honest inquiry.
It's inevitable that work will slip. That doesn't necessarily mean the release will slip. Sometimes you actually need the thing, but often the work is something you want to include in the release but don't absolutely have to. Then you can decide which tradeoff you prefer, delaying the release or reducing its scope.
Earlier discussion focuses on writing software at a slower pace to inject more accuracy and robust thinking/design/code. Conceptually, yes, I get it!
But in numerous practical scenarios, some adherence to a recurring schedule seems like the only way to align software to business outcomes. My thinking is tied more to enterprise products (both external and internal) rather than open-source.
I like an active dialog with engineers. (I'm neither SWE nor PM). Let's talk together about estimates. What's possible and not possible. Where do you feel most uncertain, most certain. What dependencies/externalities do you expect to cause problems.
Those conversations help me (business/analytics-side) do things like adjust my own deadlines, schedules. Communicate with c-suite to realign on what's possible and not. Adjust time.
I feel for anyone that has to wrangle these tasks into a business-consumable time frame.
What a great articulation. Completely agree.
This is why I don't blame PMs any more than devs, any more than business folks throwing requirements at PMs. It's possible to find fault everywhere.
I think the broader problem is scale and growth. Many people in many roles are caught in growth-mind or scale-mind companies where the business wants to operate at a velocity that may not align with the realistic development work we're discussing. PMs are similarly caught with less time to understand, scope, plan, etc. Business folks ask questions like "why isn't this ready" to devs that may not understand the reasons why the business operates the way it does, or the business at all.
Full disclosure: I'm in insurance. Seeing lots of these problems play out in front of me. C-suite moving at speed 100, devs moving at a perceived speed of 50. Silos and communication problems and unclear requirements up and down the stack.
So, in my interactions, the way I try to help is just to understand the most basic components and their ability to come alive or not. Is there anything to show? Yes, ok - let's celebrate a small win. Is there a rather large delay? Why - ok, let's use that to reinforce building something robust vs. crap.
But, there are schedules! Someone above mentioned sqlite. Another example comes to mind: Obsidian. I think they're anomalies (good ones) rather than examples that broadly prove the point to slow down.
Because it has to be released at some point, and without picking a point in advance, you can never reach it.
Not all orgs, of course. But most I’ve personally seen seem to be like this.
The main point is that there are super widespread software systems in use that we know aren't secure, and we certainly could do better if we (as the industry, as customers, as vendors) really wanted.
A prime example is VPN appliances ("VPN concentrators") to enable remote access to internal company networks. These are pretty much by definition Internet-facing, security-critical appliances. And yet, all such products from big vendors (be they Fortinet, Cisco, Juniper, you name it) had a flood of really embarrassing, high-severity CVEs in the last few years.
That's because most of these products are actually from the 80s or 90s, with some web GUIs slapped on, often dredged through multiple company acquisitions and renames. If you asked a competent software architect to come up with a structure and development process that are much less prone to security bugs, they'd suggest something very different, more expensive to build, but also much more secure.
It's really a matter of incentives. Just imagine a world where purchasing decisions were made to optimize for actual security. Imagine a world where software vendors were much more liable for damage incurred by security incidents. If both came together, we'd spend more money on up-front development / purchase, and less on incident remediation.
Yes. There’s a ton of lessons learned, best practices, etc. We’ve known for decades.
It’s just expensive and difficult. Since end users seem to have no issue paying for crud, why bother?
> Do we, really? Because a week doesn’t go by when I don’t run into bugs of some sort.
I mean, we do know how to do it, but we don't because business needs tend to throw quality under the bus in exchange for almost everything else: (especially) speed to develop, but also developer comfort, feature cram, visual refreshes, and so on always trump bugs, so every project ends up with bugs.
I have a few hobby projects which I would stick my neck out and say have no bugs. I know, I'm going to get roasted for this claim, but the projects are ultra simple enough in scope, and I'm under no pressure to ever release them publicly, so I was able to prioritize getting them right. No actual businesses are going to be doing this level of polish and care, and they all need to cut corners and actually ship, so they have bugs. And no ultra-complex project (even if it's done with love and care) is capable of this either, purely due to its size and number of moving parts.
So, it's not like we don't know how to do it, but that we choose not to for practical reasons.
1. Freeze the set of features.
2. Continue to pay programmers to polish the software for several years while it is being actively used by many people.
3. Resist adding new features or updating the software to feel modern.
If you do that, your program will asymptotically approach zero bugs. Of course, your users will complain about missing features, how ugly and ancient your products look, and how they wish you were more like your buggy competitors.
And if your users are unhappy, then you probably lose the "used heavily by a lot of people" part that reveals the bugs.
Formal verification to EAL7[0] in theory, as long as your requirements are correct.
In practice I'm not aware of any bugs being discovered in any EAL7 software, but it's so expensive there isn't a lot of it.
That’s the beauty of OSS: the level at which we could write code is way above the level the culture / timescale / management allows. I've recently come to see OSS as akin to (good) journalism for enterprise, asking why this hidden part of society is not doing the minimum (jails, corruption etc).
Free software does sooo much better compared to much in-house software that it is like sunlight.
The issue is almost always feature management.
Back in the day I was making Flash games, usually a 3-5 week job, with no real QA, and the project was live for 3-5 months. Every time I was ahead of schedule someone came with a brilliant idea to test a few odd things and add a couple of new features that were not discussed prior. Sometimes literally hours before the launch.
Every time, I made the argument that adding one new feature would create two bugs. And almost always I was right about it.
Fast forward and I'm working for BigCo. A few gigs back I was working for a major bank which employed a super efficient and accountable workflow: every release had to be composed of business-specific commits, and commits that were not backed by explicit tickets were not permitted.
This resulted in the team having to literally cheat and lie to smuggle in refactors and optimizations.
Add to that that most enterprise projects start not because the requirements were gathered but because the budget was secured and you have a recipe for disaster.