Last updated October 5, 2021

What Is Border Gateway Protocol: Reason Behind Facebook’s Outage

For about six hours, Facebook, Instagram, and WhatsApp, which command a total of over 3.5 billion monthly core users, were not working.

Published on October 6, 2021

by Shraddha Goled

The top social media apps Facebook, Instagram, and WhatsApp suffered one of their longest global outages last night. For about six hours, all three apps, which command a total of over 3.5 billion monthly core users, were not working. The outage shaved about $6 billion off Facebook founder Mark Zuckerberg’s fortune and pushed him down a few positions on the list of the world’s richest people.

According to an official statement from Facebook, the cause of the unprecedented outage was a configuration change in the backbone routers that coordinate network traffic between its data centres. This had a cascading effect and brought all Facebook services to a halt. In layman’s terms, everything that Facebook runs disappeared from the Internet for a period of time.

We’re aware that some people are having trouble accessing our apps and products. We’re working to get things back to normal as quickly as possible, and we apologize for any inconvenience.
— Meta (@Meta) October 4, 2021

What Went Wrong

Website security firm Cloudflare explains the role of the Border Gateway Protocol (BGP) in the outage in a detailed blog post. BGP is a mechanism for exchanging routing information among autonomous systems. Internet routers constantly update their list of possible routes. Without BGP, these routers would not know where to send traffic, and the Internet would stop working. So while DNS (Domain Name System) is the address book that maps each website’s name to its IP address, BGP is the roadmap that helps find the most efficient route to that address.
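The address-book and roadmap roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name, addresses, and peer labels are hypothetical (the addresses come from ranges reserved for documentation), and the route choice uses longest-prefix matching, which is a simplification of how real routers pick a path.

```python
import ipaddress

# Hypothetical DNS records: website name -> IP address.
DNS_RECORDS = {"example.test": "192.0.2.10"}

# Hypothetical routes learned via BGP: prefix -> neighbour to forward to.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "peer-A",
    ipaddress.ip_network("192.0.0.0/16"): "peer-B",
}


def resolve(name):
    """DNS step: return the IP address for a name, or None if unknown."""
    return DNS_RECORDS.get(name)


def find_route(ip, routes):
    """Routing step: pick the most specific prefix that contains the address."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None  # no known route: the address is unreachable
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]


ip = resolve("example.test")
print(ip, find_route(ip, ROUTES))
```

Both steps have to succeed for a visitor to reach a site: a name with no DNS answer, or an address with no route, is equally unreachable.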

An individual network with a unified internal routing policy is called an Autonomous System (AS), and each AS has an Autonomous System Number (ASN). An AS can originate prefixes (declare that it controls a group of IP addresses) and transit prefixes (declare that it knows how to reach specific groups of IP addresses).
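One way to picture the difference between originating and transiting a prefix is the list of ASNs an announcement passes through. The sketch below is a simplified model, not a real BGP implementation: the `Announcement` class and helper names are hypothetical, the ASNs are taken from the range reserved for documentation, and it assumes the common convention that the originating AS appears last in the path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    prefix: str     # a group of IP addresses, e.g. "192.0.2.0/24"
    as_path: tuple  # ASNs the announcement travelled through; origin is last


def origin_asn(announcement):
    """The AS that originates the prefix (it controls those addresses)."""
    return announcement.as_path[-1]


def transit_asns(announcement):
    """The ASes that pass the prefix along (they know how to reach it)."""
    return announcement.as_path[:-1]


route = Announcement(prefix="192.0.2.0/24", as_path=(64500, 64499, 64496))
print("origin:", origin_asn(route))
print("transit:", transit_asns(route))
```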

As per Cloudflare, Facebook stopped announcing the routes to its DNS prefixes, which made its DNS servers unreachable. A BGP UPDATE message tells other routers about a change to a prefix advertisement or withdraws the prefix entirely. Cloudflare said that its chart of BGP updates from Facebook is normally quiet, as the social media giant does not make many changes to its network minute by minute. Shortly before the global outage, however, Cloudflare observed a burst of routing changes from Facebook. Routes were withdrawn, and Facebook’s DNS servers went offline. With these changes, Facebook and its associated sites were effectively disconnected from the Internet.
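The effect of a withdrawal can be illustrated with a toy routing table. This is a minimal sketch, assuming a router that simply adds announced prefixes and deletes withdrawn ones; the `RoutingTable` class, the prefix, the DNS server address, and the ASN are all hypothetical and are not Facebook's real values.

```python
import ipaddress


class RoutingTable:
    """Toy table of prefixes a router currently knows how to reach."""

    def __init__(self):
        self.routes = {}  # prefix -> ASN that originated it

    def apply_update(self, announced=(), withdrawn=(), origin=None):
        """Apply one UPDATE: drop withdrawn prefixes, then add announced ones."""
        for prefix in withdrawn:
            self.routes.pop(ipaddress.ip_network(prefix), None)
        for prefix in announced:
            self.routes[ipaddress.ip_network(prefix)] = origin

    def reachable(self, ip):
        """True if any known prefix contains the address."""
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.routes)


table = RoutingTable()
dns_server = "198.51.100.53"  # hypothetical DNS server address

table.apply_update(announced=["198.51.100.0/24"], origin=64496)
print("before withdrawal:", table.reachable(dns_server))

table.apply_update(withdrawn=["198.51.100.0/24"])
print("after withdrawal:", table.reachable(dns_server))
```

Once the prefix is gone from the table, the router has nowhere to send packets for the DNS server, even though the server itself may still be running.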

Credit: Cloudflare

Facebook struggled because it could not fix the issue quickly. The company’s own internal systems run from the same place, so it was difficult for staff to resolve the problem: they were cut off from their own communication tools and could not enter their offices because the security pass system was also affected by the outage. As per reports, Facebook sent a technical team to California to manually reset the servers where the problem originated.

Credit: Cloudflare

Interestingly, when Facebook and its related apps were down, people started looking for alternatives. DNS queries for Twitter, Signal and other social media apps increased.

hello literally everyone
— Twitter (@Twitter) October 4, 2021


Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.
