Windows outage: CrowdStrike blames bug in testing software; CEO says 'deeply sorry'
Windows Blue Screen Outage: CrowdStrike says it will improve its software resilience and ramp up testing.
CrowdStrike's faulty update for millions of Windows users brought the world to a standstill last Friday, when affected computers got stuck in boot loops and displayed the ‘Blue Screen of Death’ error. CrowdStrike had already acknowledged that a bad update was to blame; the American company now says the update reached customers because a bug in its testing software let the faulty content through.
Why is CrowdStrike Blaming its Testing Software?
CrowdStrike says that the content update was intended to “gather telemetry on possible novel threat techniques,” but it contained a problematic content configuration that reportedly caused over 8.5 million computers running sensor version 7.11 and above to crash.
Now, here’s the interesting bit: CrowdStrike says it delivers Rapid Response Content as ‘Template Instances,’ each of which is an instantiation of a given Template Type. CrowdStrike had already shipped a Template Type for the new sensor on March 4, 2024, and it worked without any issue. The problem arose when one of the Template Instances deployed on July 19 (the day of the outage) passed validation despite containing problematic content data. This was due to a “bug in the Content Validator.”
When this update reached Windows devices, “problematic content in Channel File 291 resulted in an out-of-bounds memory read triggering an exception. This unexpected exception could not be gracefully handled, resulting in a Windows operating system crash (BSOD).”
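To make the failure mode concrete, here is a minimal Python sketch of the kind of check that stops a malformed content file from triggering an out-of-bounds read. It is an illustration only, not CrowdStrike's code: the file layout (a header holding an entry count and an entry size) and the names HEADER, read_entries and load_or_keep_previous are all assumptions made up for this example.

```python
import struct

# Hypothetical layout: a 4-byte header (entry count, entry size in bytes),
# followed by `count` entries of `size` bytes each.
HEADER = struct.Struct("<HH")


def read_entries(blob: bytes) -> list[bytes]:
    """Return the entries in blob, or raise ValueError if the header overstates the data."""
    if len(blob) < HEADER.size:
        raise ValueError("blob too short for header")
    count, size = HEADER.unpack_from(blob, 0)
    needed = HEADER.size + count * size
    # Check the claimed length before touching the data, so a bad header
    # can never make us read past the end of the buffer.
    if needed > len(blob):
        raise ValueError(f"header claims {needed} bytes, blob has {len(blob)}")
    start = HEADER.size
    return [blob[start + i * size : start + (i + 1) * size] for i in range(count)]


def load_or_keep_previous(blob: bytes, previous: list[bytes]) -> list[bytes]:
    """Handle bad content gracefully: keep the last known-good entries instead of crashing."""
    try:
        return read_entries(blob)
    except ValueError:
        return previous
```

The point of the second function is the "graceful handling" the statement says was missing: a reader that rejects bad content and falls back to the previous version keeps running, whereas an unchecked read inside operating-system-level code can take the whole machine down.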
What is CrowdStrike Going to Do to Prevent This from Happening Again?
CrowdStrike says it will improve its software resilience and ramp up testing of Rapid Response Content. To that end, it plans to add local developer testing, content update and rollback testing, stress testing and fault injection, stability testing, and content interface testing. It will also add validation checks to the Content Validator for Rapid Response Content.
Further, CrowdStrike will adopt a staggered deployment strategy for Rapid Response Content and give customers more control over these updates by letting them select when and where they are installed. The updates will also come with release notes.
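For readers unfamiliar with the idea, a staggered deployment sends an update to a small share of machines first and widens the rollout only if those machines stay healthy. The Python sketch below shows one common way to do this; it is not CrowdStrike's implementation, and the stage percentages, the crash-rate threshold and the function names are assumptions chosen for illustration.

```python
import hashlib

# Illustrative rollout stages: percent of hosts that should have the update.
STAGES = [1, 10, 50, 100]


def bucket(host_id: str) -> int:
    """Map a host ID to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_receive(host_id: str, rollout_percent: int) -> bool:
    """A host gets the update once the rollout percentage exceeds its bucket."""
    return bucket(host_id) < rollout_percent


def next_percent(current: int, crash_rate: float, max_crash_rate: float = 0.001) -> int:
    """Return the next rollout percentage, or 0 to halt the rollout."""
    if crash_rate > max_crash_rate:
        return 0  # too many crashes in the current stage: stop and roll back
    later = [p for p in STAGES if p > current]
    return later[0] if later else current
```

Because each host's bucket is derived from a hash of its ID, a machine that received the update at 10 percent is still included at 50 percent, and a spike in crashes among the first 1 percent halts the rollout before the remaining machines are touched.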
Company CEO Says 97% of Affected Windows PCs Are Back Online
In good news, the company’s CEO, George Kurtz, said that over 97% of the computers affected by the outage are functional again. “I want to share that over 97% of Windows sensors are back online as of July 25. This progress is thanks to the tireless efforts of our customers, partners, and the dedication of our team at CrowdStrike. However, we understand our work is not yet complete, and we remain committed to restoring every impacted system,” Kurtz said on LinkedIn.
He also apologised, saying, “I am deeply sorry for the disruption this outage has caused and personally apologize to everyone impacted. While I can’t promise perfection, I can promise a response that is focused, effective, and with a sense of urgency.”
Written by: Shaurya Sharma, HT Tech