Nearly the same experience. I had to fix an issue in a boot loader that came down to improper setup of the memory controller's ECC engine. It would correct and ignore a single-bit fault, but if you managed to get two faults it would raise an exception that was not handled, and the boot would fail. For the customer this meant that a reboot might randomly brick the unit until someone went in and manually power cycled it.
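
For anyone who hasn't run into this behaviour, here is a rough sketch of the correct-one, detect-two (SECDED) idea using a toy Hamming(8,4) code. This is not the controller I was working with. The 4-bit word size and the names `encode`, `decode` and `UncorrectableError` are all made up for illustration, and real ECC engines work on much wider words in hardware.

```python
class UncorrectableError(Exception):
    """Hypothetical stand-in for the exception a double-bit fault raises."""


def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    code = [0] * 8
    # Data bits live at positions 3, 5, 6, 7 (non-powers of two).
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    # Hamming parity bits at positions 1, 2, 4.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 holds overall parity so the whole word has even weight.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (nibble, status); raise UncorrectableError on a double fault."""
    code = list(code)
    # XOR of the positions of all set bits is 0 for a valid codeword,
    # and equals the flipped position after a single-bit error.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treated as one flipped bit (syndrome 0 means the
        # overall parity bit itself). Fix it silently.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        # Detectable but not correctable, so someone has to handle this.
        raise UncorrectableError("double-bit fault")

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status
```

The `raise` branch is the analogue of the exception nobody was handling in my case: single faults get fixed quietly, so everything looks fine until the second bit goes.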

Just convincing them that their problem boiled down to a single incorrect bit was difficult enough, but having to build and successfully run a test harness in a single day to prove the fix worked was the real stress.

I do not miss embedded engineering.
