The Great IT Outage of 2024 is a wake-up call about digital public infrastructure
On July 19, the world experienced its largest global IT outage to date, affecting 8.5 million Microsoft Windows devices. Thousands of flights were grounded. Surgeries were canceled. Users of certain online banks could not access their accounts. Even operators of 911 lines could not respond to emergencies.
The cause? A single faulty section of code in a software update.
The update came from CrowdStrike, a cybersecurity firm whose Falcon Sensor software many Windows users employ against cyber breaches. Instead of providing improvements, the update caused devices to shut down and enter an endless reboot cycle, driving a global outage. Reports suggest that insufficient testing at CrowdStrike was likely the cause.
However, this outage is not just a technology error. It also reveals a hidden world of digital public infrastructure (DPI) that deserves more attention from policymakers.
What is digital public infrastructure?
DPI, while an evolving concept, is broadly defined by the United Nations (UN) as a combination of “networked open technology standards built for public interest, [which] enables governance and [serves] a community of innovative and competitive market players working to drive innovation, especially across public programmes.” In other words, DPI comprises the essential digital systems that support critical societal functions, much as physical infrastructure, including roads, bridges, and power grids, is essential for everyday activities.
Microsoft Windows, which runs CrowdStrike’s Falcon Sensor software, is a form of DPI. Other examples of DPI within the UN definition include digital health systems, payment systems, and e-governance portals.
As the world scrambles to fix its Windows systems, policymakers need to pay particular attention to the core DPI issues that underpin the outage.
The problem of invisibility
DPI, such as Microsoft Windows, is ubiquitous but also largely invisible, which is a significant challenge when it comes to managing risks associated with it. Unlike physical infrastructure, which is tangible and visible, DPI powers essential digital services without drawing public awareness. Consequently, the potential risks posed by DPI failures—whether stemming from software bugs or cybersecurity breaches—tend to be underappreciated and underestimated by the public.
The lack of a clear definition of DPI exacerbates the issue of its invisibility. Not all digital technologies are public infrastructure: Companies build technology to generate revenue, but many of them do not directly offer critical services for the public. For instance, Fitbit, a tech company that creates fitness and health tracking devices, is not a provider of DPI. Though it utilizes technology and data services to enhance user experience, it does not provide essential infrastructure such as internet services, cloud computing platforms, or large-scale data centers that support public and business digital needs. That said, Fitbit’s new owner, Google, known for its widely used browser, popular cloud computing services, and efforts to expand digital connectivity, can be considered a provider of DPI.
Other companies that do not start out as DPI may become integral to public infrastructure by dint of becoming indispensable. Facebook, for example, started out as a social network, but it and other social media platforms have become a crucial aspect of civil discourse surrounding many elections. Regulating social media platforms as simple technology products could ignore their role as public infrastructure, which often deserves extra scrutiny to mitigate potential detrimental effects on the public.
The recent Microsoft outage, from which airlines, hospitals, and other companies are still recovering, should now sharpen the focus on the company as a provider of DPI. However, the invisibility of DPI and the absence of appropriate policy guidelines for measuring and managing its risks result in two complications. First, most users who interact with DPI do not recognize it as such. Second, this invisibility leads to a misplaced trust in major technology companies, as users fail to recognize how high the collective stakes of a DPI failure might be. Market dominance and effective advertising have helped major technology companies publicize their systems as benchmarks of reliability and resiliency. As a result, the public often perceives these systems as infallible, assuming they are more secure than they are, until a failure occurs. At the same time, an overabundance of public trust and comfort with familiar systems can foster complacency within organizations, which can lead to inadequate internal scrutiny and security audits.
How to prevent future disruptions
The Great IT Outage of 2024 revealed just how essential DPI is to societies across the globe. In many ways, the outage serves as a symbolic outcry for solution-oriented policies and accountability to stave off future disruptions.
To address DPI invisibility and misplaced trust in technology companies, US policymakers should first define DPI clearly and holistically while accounting for its status as an evolving concept. It is equally crucial to distinguish which companies are currently providers of DPI, and to educate leaders, policymakers, and the public about what that means. Such an initiative should provide a clear definition of DPI, its technical characteristics, and its various forms, while highlighting how commonly used software such as Microsoft Windows is a form of DPI. A silver lining of the recent Microsoft/CrowdStrike outage is that it offers a practical, recent case study to present to the public as real-world context for understanding the risks when DPI fails.
Finally, Microsoft has outlined technical next steps to prevent another outage, including extensive testing frameworks and backup systems. Industry-driven self-regulation of this kind is crucial, but it is not enough on its own. Regulation that enforces and standardizes backup systems is also necessary, not just for Microsoft but for other technology companies that may become providers of DPI. Doing so will help prevent future outages, ensuring the reliability of infrastructure which, just like roads and bridges, props up the world.
Saba Weatherspoon is a young global professional with the Atlantic Council’s Geotech Center.
Zhenwei Gao is a young global professional with the Cyber Statecraft Initiative, part of the Atlantic Council Technology Programs.