25 crazy software bugs explained

Fireship
Find all the best dev content at https://daily.dev/fireship Let’s explore 25 crazy software bugs th...
Video Transcript:
"It's not a bug, it's a feature." Programmers often use that excuse to justify their garbage code, and sometimes it actually works. In the original Sid Meier's Civilization game, Gandhi's aggression level was stored as an unsigned integer with a value of 1, making him a super chill pacifist. But when a civ adopts democracy, its aggression drops by 2, and in an 8-bit unsigned integer 1 - 2 = 255, an unsigned integer underflow that maxed out Gandhi's aggression and turned him into a diabolical thermonuclear enthusiast. Civ players loved this bug so much that it became a feature. That's the urban legend, anyway. But
it's all fun and games until people start losing money and getting blown up in real life. Real men test in production, and in today's video we're going to take this carbon fiber submarine and travel down the software bug iceberg to look at 25 examples of bad code that changed the world, where the deeper we go, the more severe the consequences. If you survive until the end, you'll understand why "TODO: fix it later" might be the last line of code you ever write. Our journey begins in the year 2008, a time when people still used devices
like iPods to listen to music. Microsoft had its own crappy knockoff called Zune. The device worked great until December 31st, New Year's Eve, when every Zune around the world froze and became a very stylish brick. The screen displayed a loading bar that would get stuck at 100% and then just stay there. It could only be fixed manually by removing the battery. This happened because it wasn't programmed to account for the extra day in a leap year. When reaching the 366th day of a leap year, the software attempted to reset, but a logic error in handling
the day count meant the device could not exit the loop, causing it to freeze permanently. This code could be easily fixed by simply breaking out of the loop when the day count is exactly 366 in a leap year, but not all bugs are that easy to fix, like the infamous Pentium FDIV bug of '94. Occasionally, when a program did floating-point division on a Pentium chip, it would return an incorrect value. It only happened in about 1 in 9 billion division operations, but it was discovered by a professor at Lynchburg College while doing computational number theory. Originally he thought
it was his own bug, but after extensive testing he reported it to Intel. Intel tried to downplay it, but then IBM suspended shipments of its Pentium-based PCs and it turned into a massive PR nightmare. The core issue is related to the SRT division algorithm, which is a way to speed up division by using a lookup table to estimate the next digit of a quotient. The lookup table contains 1,066 entries, but the problem in this bug is that five entries were missing, causing certain combinations of numbers to return incorrect results at the hardware level. That's pretty bad.
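To make the FDIV story concrete, here is a minimal sketch built on the test case that was widely circulated at the time: dividing 4195835 by 3145727. The flawed quotient is hard-coded from published reports of the bug, since modern hardware cannot reproduce it, and the constant and function names are just illustrative choices.

```python
# Widely cited FDIV test case. The "flawed" quotient is quoted from published
# reports of the bug; it is an assumption here, not something this script
# can compute on a modern CPU.
NUMERATOR = 4195835.0
DENOMINATOR = 3145727.0
FLAWED_PENTIUM_QUOTIENT = 1.333739068902037589


def division_residual(quotient: float) -> float:
    """Return x - q * y, which should be close to 0 when q is an accurate x / y."""
    return NUMERATOR - quotient * DENOMINATOR


correct_quotient = NUMERATOR / DENOMINATOR

print(f"this machine:   {correct_quotient:.15f}")
print(f"flawed Pentium: {FLAWED_PENTIUM_QUOTIENT:.15f}")
print(f"residual here:            {division_residual(correct_quotient):.6f}")
print(f"residual with flawed FDIV: {division_residual(FLAWED_PENTIUM_QUOTIENT):.1f}")
```

The two quotients already differ in the fifth significant digit, which is why multiplying back by the denominator leaves a residual of roughly 256 instead of roughly 0.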
But this next bug was found inside an Apple. In 2019, you could use your iPhone to start a FaceTime call with someone, then while the call was ringing, swipe up to add another person, making it a group call. Add your own phone number as the additional person, and FaceTime would glitch out and think the group call was active, and now you could eavesdrop on the audio from the original recipient's phone. And then if the recipient pressed the power button to dismiss the call, it would activate their camera. What's even crazier is that this bug
was discovered by a 14-year-old while trying to set up a group chat for playing Fortnite. He tells his mom, and his mom tries to report it to Apple. Apple thinks its code is perfect and just ignores them, but then about a week later the bug starts to go viral on social media and everybody's doing it on live streams. At this point Apple disabled group FaceTime entirely and released a patch about a week later. They did make things right with the 14-year-old, though, and awarded him a bug bounty and some education funding. But the
root of the problem appears to be the fact that they weren't checking the state of the call before activating the audio stream. Normally software bugs cause people to lose money, but sometimes the opposite is true, like the 2024 Chase ATM glitch. Most banking systems run on proprietary code that was written 50 years ago using ancient languages like COBOL. Usually when you deposit a check, it needs a few days to clear before you can withdraw those funds from an ATM, but at JPMorgan Chase some sort of glitch bypassed that safeguard, and tons of viral videos
hit TikTok of people withdrawing tens of thousands of dollars that they never had by simply depositing a fake check and then withdrawing money right away. I'm no fan of the banking system, but unfortunately this also constitutes check fraud. A few days later, these people had bank accounts deep in the red, and they are now being sued by the bank and may face criminal charges. That brings us to the end of tier 1, and things are going to get a lot crazier from here. But if you like crazy programming stories, a website you'll want to check
out is daily.dev, the sponsor of today's video. If you're learning to code or learning a new language or framework, it's the one place you can curate all the latest news, tutorials, videos, and content filtered by the topics you're interested in. When you join a squad, it'll create a personalized news feed on that topic, but more importantly, connect you to a community of like-minded developers where we can all suffer together. And these aren't bots but real people of all skill levels who can help you build your network in the tech industry. And did I mention
it's all entirely free? So check it out with my invite link at daily.dev/fireship. But now it's time to dive below the surface into tier 2, starting with the AT&T long-distance crash of 1990. Cell phones were hardly a thing back then, and if you wanted to communicate with someone outside of your village, you'd have to pay for a long-distance network. After a software update, a single faulty line of code caused a network switch to crash and automatically reboot. One failed switch is no big deal, but when it would reboot, it would
cause neighboring switches to fail as well, resulting in a cascade of failures that took out the entire network. The bug was in a C program that included a break statement within an if clause within a switch clause. When a second message was received while the switch was still processing the first, the program dropped out of the if clause prematurely and overwrote crucial data, causing that switch to reset, and this created a domino effect that blocked 50 million calls around the world. A dropped call is annoying, but the last place you want a bug is in an aircraft,
especially when it's in the oxygen system of an F-35 fighter jet. In 2012, pilots were experiencing hypoxia-like symptoms during flights, like dizziness, disorientation, and overconfidence, just like a drunk driver. These jets have a complex oxygen generation system called OBOGS. Usually it worked fine, but the underlying software wasn't robust enough to account for real-time variables like rapid altitude changes or variations in pilot breathing patterns. When it comes to data, the old saying goes "garbage in, garbage out," and when that data controls how much oxygen you get while flying a $100 million lethal weapon, you better be
damn sure you test your code for all possible variables. In this case the code is top secret, but it's probably C++, and luckily nobody got blown up and it was later fixed with a software update. Speaking of airplanes, though, back in 2008 the £4.3 billion Heathrow Terminal 5 opened, and it only took a few days for a software bug to result in over 500 canceled flights and 42,000 lost bags. The new terminal had a complex automated baggage handling system with 30 miles of conveyor belts, plus RFID tags and barcodes, to manage up to 12,000
bags per hour. When they pushed to prod, several different software systems failed to communicate: employees couldn't log in and RFID tracking systems couldn't track bags, among many other issues. This led to an entire breakdown of the system, costing 16 million to fix, which is a prime example of when testing in production goes wrong. But well-tested code breaks too. In 1982, the Vancouver Stock Exchange index began to slowly decrease in value due to a rounding error. The error was so minor, though, that nobody even noticed. The value of the total index started at 1,000, but then
over the course of 2 years dropped to 520, until one day somebody was like, "What the hell is going on here?" It was a software bug, but had the index increased in value, it would have been a feature. What happened was the software was truncating the index to three decimal places after each stock price change instead of rounding it. One truncation only has a minor impact, but over thousands of updates it creates a cumulative error, and in this case they had to completely recalculate the value of the index from scratch. And that brings us to
November 1988, when Robert Morris accidentally created the Morris worm on this here floppy disk. Morris was not an evil hacker but a graduate student at Cornell who released a self-replicating computer program designed to gauge the size of the internet, which was tiny back then. But it rapidly spread across Unix-based systems and crashed thousands of computers in the process by overwhelming their resources. It took out 6,000 computers, which at that time was around 10% of the total internet. The worm itself exploited weaknesses in things like the sendmail program and a buffer overflow vulnerability in
the finger program, which provided information about users on the network. The program would then execute code remotely on these machines and send itself to other machines on the network. His plan almost worked, but there was a bug in his code that would reinfect the same computer over and over again. Eventually it took out MIT, UC Berkeley, and NASA, and resulted in Morris getting the first felony conviction under the Computer Fraud and Abuse Act. But now it's time to dive deeper into tier 3, where things actually start to blow up, like NASA's Mars Climate Orbiter in
1999, which torched $125 million in taxpayer money when it burned up entering the atmosphere of Mars. That happened because one software team used imperial units, pound-force seconds, while the other team used metric units, newton-seconds, and that's kind of a big issue when it comes to space travel. Especially when just a few years earlier, in 1996, bad software caused the Ariane 5 rocket to explode exactly 37 seconds after liftoff. A software bug in the inertial reference system led to a critical data conversion error, where a 64-bit floating-point number was incorrectly converted to a
16-bit integer. This made the rocket think it was 90° off course, and when it went to correct at high velocity, it went boom. Luckily there was nobody on board, but in 2010 the Toyota Prius put a bunch of people in danger due to a bug in its anti-lock braking system, which would cause momentary delays in braking under certain conditions, like on icy roads or potholes. "There's something wrong with the brakes." The computer created a 0.4-second delay when switching from regenerative braking to friction braking, and this resulted in a recall of 400,000 vehicles, which was a beautiful
thing because it meant there weren't as many Prius drivers driving slow in the fast lane. That was a big mistake made by low-level systems engineers, but developers screw things up too, like the 2020 Citibank bad UI disaster. Citibank was intending to make an interest payment of $8 million. However, due to a poorly designed UI, they accidentally transferred the full loan amount of approximately $900 million. The software had a confusing three-screen process for payments, and the interface made it appear as though checking certain boxes would ensure only interest was paid, but in
reality it did the opposite. Now, after Citibank accidentally paid back this money, they asked the lenders if they could have it back, and they were like, "No, bro." So Citibank took them to court, but the court ruled in favor of the lenders and let them keep the $900 million that Citibank accidentally transferred due to the bad UI. That's pretty bad, but this tier wouldn't be complete if we didn't talk about the Y2K bug. In the early days of programming, no one thought the world would still be around by the year 2000, so
they typically used only two digits to represent years. As the year 2000 approached, people realized this and began to panic that computers would interpret 00 as the year 1900 instead of 2000. What's funny about Y2K is that this issue never caused any widespread disasters, but the media hysteria around it led to billions of dollars being spent in preparation for the end of the world. But now it's time to ratchet up the pressure in tier 4, which includes critical bugs that caused widespread damage around the world, like the Knight Capital money-burn speedrun. This company
used algorithms to perform trades in the stock market. However, the developers accidentally used a variable name linked to an outdated testing algorithm called Power Peg, which had been inactive since 2003. This algorithm was designed to manipulate virtual markets by buying high and selling low. Now, if you don't know much about investing, usually you want to buy low and then sell high. That's not financial advice, though. When they pushed to prod, the algorithm went haywire and started flooding the New York Stock Exchange with incorrect trades, 4 million trades in just 45 minutes to be exact. Knight
Capital lost $440 million in a single day and wiped out 75% of investors' equity. But a far scarier bug is Heartbleed of 2014, which was a vulnerability in OpenSSL, a library that's supposed to make internet communication secure. It was caused by a simple coding oversight: a missing bounds check in the implementation of the Transport Layer Security, or TLS, heartbeat extension. Normally, when a client sends a heartbeat request, it includes a payload and expects the server to reply with the same payload. However, due to improper input validation, an attacker could send a malicious heartbeat request
that appeared legitimate, and this allowed attackers to repeatedly request memory contents from the server, potentially gaining access to highly confidential data without any way to be detected. That's unimaginably bad, because two-thirds of the internet's servers were vulnerable. Now, we already talked about Toyota's braking problem, but they also implemented an acceleration problem, where your car would magically speed up uncontrollably and try to kill you. It led to 6,200 complaints, 52 injuries, and 89 deaths. This one could have gone in tier 5, but it was a multitude of issues that caused the problem, not just
the software. But the investigation revealed that a software bug in the electronic throttle control system was partly to blame. Apparently the code had terrible error handling logic and lacked redundancy, and it was structured in a way that allowed multiple failure points, which could cascade through the entire system. That's not good when that software is in control of your throttle. Toyota had to recall 9 million vehicles and pay $1.2 billion in fines and settlements. In the previous tier, Y2K was a bit disappointing, but just a few months ago CrowdStrike gave us the Y2K
we deserved. As a cybersecurity company, they convince companies to install this thing called the Falcon sensor, which has kernel-level access to all their employees' machines. Everything was going great until one day somebody pushed a bad configuration file called Channel File 291 to production. This resulted in millions of Windows machines entering the blue screen of death. Hospitals shut down, flights were canceled, and people nearly starved to death after the Arby's drive-through went down. It was bad, but not as bad as the Northeast blackout, which occurred on August 14th, 2003, when nearly 50 million people
in the United States and Canada lost power. FirstEnergy Corporation had a monitoring system for the power grid, but its error handling code wasn't so good. If multiple alarms were generated in a short period of time, the entire system would enter an unrecoverable state without notifying the operators, and this meant that no new alarms would be shown and no existing ones would be cleared. So it looked like everything was normal to the operators, but in reality half the damn country didn't have power. Luckily nobody died, but now our submarine is making a
weird cracking noise as we get down to the bottom tier. In 1994, a military helicopter of the Royal Air Force took off in foggy conditions and crashed in Scotland. The aircraft had an automatic throttle control that takes inputs from sensors placed around the aircraft, but in these challenging conditions the system became overloaded and didn't know what to do. When things started getting weird, the pilots tried to regain manual control but eventually lost total control, and 29 people died in the process. The investigation found that the throttle control software was not adequately tested under these conditions, but perhaps
the most famous deadly software bug is the Therac-25 radiation machine. This was a device used to treat cancer patients by providing a therapeutic dose of radiation. Unfortunately, it was yet another device with inadequate error handling, and when it encountered a race condition it wouldn't just break, it would deliver a lethal dose of radiation to the patient. At least three patients died after receiving radiation doses 100 times higher than intended. In addition to the race condition bug, the design also removed the mechanical interlocks that provide a physical safeguard against things like this happening, and
instead chose to rely entirely on software, and that was a fatal mistake. But usually when software kills people, it has to do with the military-industrial complex. In 1991, the United States was fighting the Gulf War in Iraq, and Saddam launched a Scud missile that should have been easily intercepted by the Patriot missile system. Unfortunately, there was a bug in the system's clock and timing calculations. It tracked time in tenths of a second using a 24-bit register, but after it was in operation for over 100 hours without a reset, the tiny rounding error in each tick had accumulated into a significant clock drift, causing it to
report incorrect information about incoming threats, and that incoming Scud missile killed 28 American soldiers. For that issue you can blame a back-end developer, but the 1988 Aegis combat system disaster was caused by a front-end developer. One day it accidentally shot down a civilian plane from Iran with 290 souls on board. The investigation revealed that a lack of user-friendly information on the system's display was one of the key reasons that they misidentified the friendly plane as a threat. There was a timing lag that led to misleading altitude data, which ultimately led to a tragic mistake. One
of the biggest military contractors, though, is Boeing, and they make many of the jets we fly on, like the 737 Max. Thanks to programmers, these modern jets practically fly themselves, but maybe that's not a good thing. These planes were updated with a thing called the Maneuvering Characteristics Augmentation System, or MCAS, which would automatically push the nose down to prevent a stall. There were two angle-of-attack sensors hooked up to the system, but it would initiate the nose-down sequence if only one of them provided weird data. That was a major oversight in the
programming, because if one of the sensors provided faulty data, it would start pushing the nose down, and that's exactly what happened on two tragic flights, Lion Air Flight 610 in 2018 and Ethiopian Airlines Flight 302 in 2019, where 346 passengers and crew lost their lives. Boeing's reputation was permanently bricked after all 737 Max planes were grounded, and the fix was to simply make sure that both sensors were providing consistent data before pushing the nose down. This final tier was tragic, but on the bright side, good code has saved a lot more lives than the lives
lost by bad code. Writing entirely bug-free code is difficult, but remember: there are two ways to write error-free programs; only the third one works. Thanks for watching, and I will see you in the next one.
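The 737 Max fix mentioned in the last tier, requiring both angle-of-attack sensors to agree before pushing the nose down, can be sketched roughly as follows. This is only an illustration of the cross-check idea, not Boeing's actual logic: the function names and both threshold values are assumptions made up for the example.

```python
# Hypothetical thresholds, chosen only for illustration.
STALL_THRESHOLD_DEG = 14.0    # assumed angle of attack that suggests a stall
MAX_DISAGREEMENT_DEG = 5.5    # assumed largest tolerated gap between sensors


def single_sensor_nose_down(aoa_deg: float) -> bool:
    """Original-style logic: trust one sensor, even if it is broken."""
    return aoa_deg > STALL_THRESHOLD_DEG


def cross_checked_nose_down(left_aoa_deg: float, right_aoa_deg: float) -> bool:
    """Only act when both sensors agree with each other and both indicate a stall."""
    if abs(left_aoa_deg - right_aoa_deg) > MAX_DISAGREEMENT_DEG:
        return False  # sensors disagree: do nothing and leave it to the pilots
    return min(left_aoa_deg, right_aoa_deg) > STALL_THRESHOLD_DEG


# A faulty left sensor reads 74.5 degrees while the right one reads 15.0.
print(single_sensor_nose_down(74.5))        # True: one bad reading triggers the system
print(cross_checked_nose_down(74.5, 15.0))  # False: the disagreement blocks activation
print(cross_checked_nose_down(16.0, 15.5))  # True: both sensors agree on a high angle
```

The design choice is to fail passive: when the inputs cannot be trusted, the automation stays out of the way instead of acting on the more alarming reading.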