International Blue Screen Day: Lessons from a Major Windows Crash

26 July 2024


In the aftermath of last week's global IT outage, speculation is running rampant. Several theories have emerged to explain the unprecedented disruption.

Was it merely a human error, a simple mistake that snowballed into chaos? Or perhaps a sophisticated cyber attack, concealed under a veil of secrecy? Some even ponder if the outage was planned years ago by shadowy figures with unknown motives.

Why did this happen?

Let us fact-check.

On July 19th, 2024, around 4:40 PM NZ time, millions of computers worldwide were brought down by a faulty CrowdStrike Falcon channel update that caused Windows to crash with the infamous Blue Screen of Death (BSOD). This incident has since been dubbed International Blue Screen Day. It wreaked havoc globally, disrupting payment services, banks, billboards, and airports. NZCS remained unaffected because we do not use CrowdStrike Falcon. However, the incident is a stark reminder that a similar faulty update could just as easily have come from one of the security vendors we do use.

Understanding CrowdStrike Falcon

CrowdStrike Falcon is a leading cybersecurity tool designed to protect computers and networks from cyber threats. At its core, the CrowdStrike Falcon agent is a small program installed on your computer that monitors activity for suspicious behaviour. This data is then sent to the cloud, where advanced algorithms analyse it in real-time. If a threat is detected, the agent can take immediate action, such as isolating the computer or blocking harmful processes, to protect your system.

CrowdStrike Falcon operates in kernel mode, which gives it the highest level of access and control for monitoring and managing system activity. That depth is what lets the software detect and stop threats effectively, but it also removes the safety net that ordinary applications have. A fault that would crash only a single program in user mode, such as an invalid memory access, instead forces Windows to halt the entire system to protect itself, which is what produces the BSOD. Kernel mode therefore offers powerful protection, but any code that runs there must be written and tested with great care.

[Image: photos via X]

The Fallout and Fix

The BSOD incident quickly went viral, with Reddit and X buzzing about the widespread disruptions. For many affected systems, the fix involves a somewhat complex process:

  1. Boot Windows into Safe Mode or the Windows Recovery Environment.
  2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.
  3. Locate the file matching “C-00000291*.sys” and delete it.
  4. Boot the host normally.
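
To illustrate steps 2 and 3, here is a minimal sketch in Python of locating and deleting the matching channel file. It is an illustration only: Python is usually not available in Safe Mode or the Windows Recovery Environment, where the same steps would be done by hand or from a command prompt. The function names and the dry-run default are our own assumptions, not part of any vendor tooling.

```python
from pathlib import Path

# File pattern taken from the manual fix above.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def find_faulty_channel_files(driver_dir: Path) -> list[Path]:
    """Return the files in driver_dir that match the faulty channel file pattern."""
    return sorted(p for p in driver_dir.glob(CHANNEL_FILE_PATTERN) if p.is_file())


def remove_faulty_channel_files(driver_dir: Path, dry_run: bool = True) -> list[Path]:
    """Delete matching channel files and return the list of matches.

    With dry_run=True (the default) nothing is deleted, so the matches
    can be reviewed before any change is made.
    """
    matches = find_faulty_channel_files(driver_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

On an affected host, one would call `remove_faulty_channel_files(Path(r"C:\Windows\System32\drivers\CrowdStrike"))` first to review the matches, then repeat the call with `dry_run=False`. Defaulting to a dry run is a deliberate choice, since deleting the wrong file from a drivers directory can cause problems of its own.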

However, this process can be complicated by factors like disk encryption (e.g., BitLocker) or a lack of admin rights. Users may find themselves locked out entirely when the recovery key they need is stored on the now-inaccessible device.

Consider an environment with 100,000 devices and remote staff worldwide. For many, sending out new devices might be cheaper than spending countless hours guiding users through the manual fix. IT departments worldwide are grappling with this challenge. Shortly after the incident, over 30 new fake websites emerged, created by bad actors looking to capitalise on the chaos.

Key Takeaways

  1. Be Prepared for Disruptions: Human or AI mistakes are inevitable. We must be prepared to operate even if our computers are down.
  2. Updates Are Essential: While updates sometimes break things, they are crucial for security. It’s like wearing a seatbelt—sometimes accidents happen, but you’re safer with it on.
  3. Timing Matters: Avoid pushing updates on a Friday or before a long weekend to ensure IT support is available if something needs to be fixed.
  4. Diversify Your Vendors: Using multiple vendors across your environment reduces the risk that one faulty product becomes a single point of failure.
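
The timing rule in takeaway 3 is simple enough to automate as a pre-deployment check. The sketch below is one possible reading of it, not an established policy: it treats Fridays, weekends, public holidays, and the day before a public holiday as unsafe. The function name `is_safe_deploy_day` and the `holidays` parameter are hypothetical, and the list of holidays has to be supplied by the caller.

```python
from datetime import date, timedelta
from typing import Iterable


def is_safe_deploy_day(day: date, holidays: Iterable[date] = ()) -> bool:
    """Return True if `day` looks like a reasonable day to push an update.

    Assumed rule: no Fridays or weekends, no public holidays, and no
    deployments on the day immediately before a public holiday.
    """
    holiday_set = set(holidays)
    if day.weekday() >= 4:  # Monday is 0, so 4, 5, 6 are Fri, Sat, Sun
        return False
    if day in holiday_set:
        return False
    if day + timedelta(days=1) in holiday_set:
        return False
    return True


# The outage day, 19 July 2024, was a Friday.
print(is_safe_deploy_day(date(2024, 7, 19)))  # False
# A Thursday before a Friday public holiday is also rejected.
print(is_safe_deploy_day(date(2025, 4, 17), holidays=[date(2025, 4, 18)]))  # False
```

A check like this does not make an update safe. It only makes it more likely that IT support is on hand if something goes wrong.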

Looking Ahead

CrowdStrike Falcon is well-regarded in the industry, and we hope the company, alongside Microsoft, will develop a solution to make recovery from such mistakes easier. This incident underscores the importance of robust cybersecurity measures and the resilience needed to handle unexpected challenges.
