The frequency of large-scale attacks on corporate IT is increasing. This is not unusual or unexpected, as companies spend heavily on cyber defense in an asymmetric war against hackers who can put together a few lines of code and wreak havoc.
But Friday’s largest-ever IT outage, which resulted from a CrowdStrike software bug pushed to Microsoft operating systems rather than from any malicious attack, shows a type of technology threat that is growing alongside hacking but getting less attention: the single point of failure, in which a fault in one part of a system sets off a technical disaster across industries, operations, and interconnected communications networks; a massive domino effect.
Earlier this year, AT&T had a nationwide outage attributed to a technical update. Last year, the FAA had an outage that occurred after a single person replaced a critical file in a route update (the FAA now has a backup system in place to prevent this from happening again).
“It’s more frequent even when it’s just routine patches and updates,” Chad Sweet, co-founder and CEO of The Chertoff Group and former chief of staff at the Department of Homeland Security, told CNBC on Friday.
Some digital signs in Times Square in New York, United States, displayed a blue screen and some screens turned completely black on July 19, 2024, during the global communications outage caused by CrowdStrike, which provides cyber security services to the American technology company Microsoft.
Selcuk Acar | Anadolu | Getty Images
Managing the risk of single-point failure is an issue that companies need to plan for and protect against. There’s no software in the world that goes out that doesn’t later need to be patched or updated, and security best practices extend long after a production release to cover ongoing software maintenance, Sweet said.
The companies the Chertoff Group works with are taking a close look at their software development and update standards after the CrowdStrike outage. Sweet pointed to a set of protocols already provided by the government, the Secure Software Development Framework (SSDF), which may give the market an idea of what to expect as Congress begins to look more closely at the issue. That scrutiny is likely after the recent spate of incidents, from AT&T to the FAA to CrowdStrike, as this type of technical failure has now been shown to affect the lives of citizens and the operations of critical infrastructure on a widespread basis.
“Get ready on the corporate side,” Sweet said.
Aneesh Chopra, Arcadia’s chief strategist and former White House chief technology officer, told CNBC on Friday that critical sectors such as energy, banking, healthcare and airlines have separate regulations that oversee the risk, and measures can be unique in the most regulated sectors. But for any business leader, the question now is, “Supposing systems go down, what’s plan B? We’re going to see a lot more scenario planning, and if that’s not the No. 1 job, it’s the No. 2 job or 3 to have these scenarios described,” he said.
Chopra noted that, unlike many issues in DC, critical infrastructure and systemic risk draw bipartisan commitment, and technical standards are a “stamp” of the US system. There may now be efforts, which he described as designed to “improve competition,” as a means of strengthening accountability.
“If there is a mechanism to inform in a more open and competitive way, there can be pressure to ensure that this is done in a way where the i’s and t’s are dotted and crossed,” Chopra said.
Sweet said this will inevitably lead to concerns from the business world about the risk of overregulation. While there’s no way to know for sure now whether CrowdStrike could have worked through a more open process that would not have allowed for a single point of failure, he said it’s a reasonable question to ask.
The best method to avoid overregulation, according to Sweet, is to look to market-enhancing mechanisms, such as the insurance industry. “The short answer is, ‘Let the free market do it, through things like the insurance industry, which will reward good actors with lower premiums,’” he said.
Sweet also said more companies should embrace the idea of “anti-fragile” organizations, a term coined by risk analyst Nassim Nicholas Taleb, as he advises his clients to do. “Not just an organization that is resilient after a disruption, but an organization that thrives and innovates and outperforms competitors,” he said. In his view, any legislation or regulation would struggle to keep up with both malicious attacks and technical updates pushed with unintended consequences.
“It’s definitely a wake-up call,” Chopra said.