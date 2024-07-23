Around 8.5 million PCs. This number, confirmed by Microsoft and later by CrowdStrike, illustrates the scale of last Friday's global outage of Windows PCs managed by enterprise IT departments, caused by a faulty update from third-party cybersecurity company CrowdStrike. Work came to a standstill across industries, including aviation, healthcare and fintech. It soon became painfully clear that organisational IT processes worldwide are woefully unprepared for such an eventuality.

A typical Blue Screen of Death, or BSOD, error on Windows PCs, which often appears when system privileges are incorrectly altered. (Screenshot)

Is there a way for organisations to be better prepared? “The disruption has underscored vulnerabilities in our IT infrastructure, and an over-reliance on certain providers,” said Piyush Goel, founder and CEO of software development company Beyond Key, during a conversation with HT.

The weakest link in the chain proved to be security software categorised as endpoint protection. This is more versatile than the antivirus app you may be familiar with, and helps IT teams monitor and secure the devices that employees use on their network. By nature, this software requires elevated privileges, or access to Windows system configuration. That is why the faulty CrowdStrike update, rolled out on Friday morning, caused Windows to crash soon after installation.

For enterprises, any PC connecting to their network is an endpoint.

With millions of PCs impacted, Microsoft confirms it is also working with Amazon Web Services and Google Cloud Platform to speed up availability of a fix. “CrowdStrike has helped us develop a scalable solution that will help Microsoft’s Azure infrastructure accelerate a fix for CrowdStrike’s faulty update,” says David Weston, vice-president for enterprise and OS security at Microsoft, in a statement shared with HT.

“Together with customers, we tested a new technique to accelerate impacted system remediation. We’re in the process of operationalizing an opt-in to this technique,” confirms CrowdStrike, in a statement. But getting the affected 8.5 million PCs back into working order and connected to the internet will be a painstaking process.

Take the example of PCs showing the dreaded BSOD, or blue screen of death, which renders them unusable. Each one needs a manual fix: someone has to reboot it into the Windows ‘safe’ or recovery mode, and then either delete CrowdStrike’s buggy update file by hand or connect to the internet to download a corrective patch. Getting every impacted PC back in action could take days, if not weeks.
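For readers curious about what the "delete the buggy file" step amounts to, here is a minimal Python sketch. It is illustrative only and not CrowdStrike's or Microsoft's tooling. The file pattern `C-00000291*.sys` is taken from CrowdStrike's publicly posted remediation guidance rather than from this report, and the function names, the `driver_dir` parameter and the dry-run default are hypothetical choices made for the example. In practice the step was typically carried out by hand from the recovery environment.

```python
from pathlib import Path

# Assumption: file pattern named in CrowdStrike's public remediation guidance.
# It is not mentioned in this article.
DEFAULT_PATTERN = "C-00000291*.sys"


def find_faulty_files(driver_dir, pattern=DEFAULT_PATTERN):
    """Return the files in driver_dir whose names match pattern, sorted."""
    return sorted(p for p in Path(driver_dir).glob(pattern) if p.is_file())


def remove_faulty_files(driver_dir, pattern=DEFAULT_PATTERN, dry_run=True):
    """Delete matching files and return their names.

    With dry_run=True (the default) nothing is deleted; the function only
    reports which files would be removed.
    """
    names = []
    for path in find_faulty_files(driver_dir, pattern):
        if not dry_run:
            path.unlink()
        names.append(path.name)
    return names
```

Calling `remove_faulty_files(some_directory)` first lists the candidates without touching them; passing `dry_run=False` performs the deletion. Which directory to pass is left to the caller, since the location on a given machine should be confirmed against the vendor's guidance.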

The scale of this outage may well illustrate the need for enterprises to stress-test their infrastructure and put fail-safes in place. It is what Jake Moore, Global Cybersecurity Advisor at security company ESET, calls a ‘cyber-resilience plan’. Yet it may not always be possible to simulate an incident of this scale and magnitude during tests.

“The inconvenience caused by the loss of access to services for thousands of people serves as a reminder of our dependence on Big Tech such as Microsoft in running our daily lives and businesses. Upgrades and maintenance to systems and networks can unintentionally include small errors, which can have wide-reaching consequences as experienced today by CrowdStrike’s customers,” Moore points out.

Ashish Tandon, who is Founder and CEO of Indusface, which makes application security solutions, agrees IT systems need to be designed better with failure points in focus. “This incident highlights the importance of robust contingency planning and shared responsibility between software vendors and businesses. By incorporating fail-safes and redundancy into system architectures and developing comprehensive backup plans, we can enhance the resilience of our digital infrastructure and ensure the continuity of critical services,” he says.

Could an indigenized security solution, had it been deployed by IT departments in Indian companies, have helped cushion the impact? Beyond Key’s Goel believes that’s likely. “Leveraging indigenized solutions can offer certain advantages, such as better alignment with local regulatory requirements and potentially quicker response times during incidents,” he says.

This is a thought echoed by Jay Kotak, co-head of Kotak811, a digital banking platform. “Globalization is good, but guardrails are needed. Sovereign nations must drive localized data and tech stacks, sectoral foreign ownership, management norms etc., to mitigate such risks,” he wrote in a post on X.

However, the choice between global and local solutions comes with a caveat: Goel believes it should rest on a thorough risk assessment based on an organization’s specific needs. A one-size-fits-all approach, such as shifting completely to locally developed solutions, wouldn’t work for every industry.

Being less reliant on a handful of solution providers would tackle what ESET’s Moore calls a problem of variety. “Where diversity is low, a single technical incident, not to mention a security issue, can lead to global-scale outages with subsequent knock-on effects,” he says.

It’s early days in assessing the fallout of this global PC outage, but experts including the Anderson Economic Group believe the bill of losses could top $1 billion.

“This incident demonstrates an interconnected nature of our broad ecosystem, including global cloud providers, software platforms, security vendors, and customers. It’s also a reminder of how important it is for all of us across the tech ecosystem to prioritize operating with safe deployment and disaster recovery using the mechanisms that exist,” explains Microsoft’s Weston.

The coming weeks will show whether organisational IT processes see an overhaul, to be better prepared and less vulnerable to external factors such as faulty software. CrowdStrike, which reported annual revenue of $3.06 billion for FY 2023-24, may rightly be concerned about increased competition for enterprise endpoint security business from rivals including Symantec, McAfee, Sophos and SentinelOne.