Facebook Blames Global Outage on Configuration Error
Facebook has apologized for a major global outage that left users unable to access the social network and other platforms for hours, blaming the incident on a configuration error.
The outage began at around 11:40 Eastern Time on Monday morning and lasted well into the evening of the same day, affecting not just Facebook and Messenger but also Instagram and WhatsApp.
The recovery effort was also hampered, as Facebook engineers found it difficult to access internal tooling that relied on the same internet infrastructure. Global staff were left high and dry for similar reasons.
The issue appears to have stemmed from an update to the firm’s Border Gateway Protocol (BGP) records. BGP is critical to the seamless functioning of the internet, allowing networks of addresses such as Facebook’s to advertise their presence to others.
“It’s a mechanism to exchange routing information between autonomous systems (AS) on the internet,” explained Cloudflare in a technical blog about the incident.
“The big routers that make the internet work have huge, constantly updated lists of the possible routes that can be used to deliver every network packet to their final destinations. Without BGP, the internet routers wouldn’t know what to do, and the internet wouldn’t work.”
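To illustrate the idea Cloudflare describes, here is a minimal Python sketch of a router's route table. It is a toy model, not how any real router or Facebook's network is implemented: the `RoutingTable` class, the documentation-only address ranges (192.0.2.0/24) and the AS numbers (64496, 64497) are all assumptions chosen for illustration. It shows that once every route covering an address is withdrawn, the router has nowhere left to send the packet.

```python
import ipaddress


class RoutingTable:
    """Toy route table: maps announced prefixes to the AS that originated them."""

    def __init__(self):
        self.routes = {}  # ip_network -> origin AS number

    def announce(self, prefix, origin_as):
        # A network advertises that it can reach this block of addresses.
        self.routes[ipaddress.ip_network(prefix)] = origin_as

    def withdraw(self, prefix):
        # The network retracts the advertisement; the route disappears.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Pick the most specific (longest) prefix that contains the address.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the packet cannot be delivered
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("192.0.2.0/24", 64496)  # hypothetical prefix and AS number
    print(table.lookup("192.0.2.10"))      # 64496

    table.withdraw("192.0.2.0/24")         # the advertisement is withdrawn
    print(table.lookup("192.0.2.10"))      # None
```

In this simplified model, the servers behind the prefix could be running perfectly well; with the advertisement gone, other networks simply have no path to them.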
Although some commentators had speculated about foul play, the cause of the outage appears to be human error.
Santosh Janardhan, Facebook's vice president of infrastructure, said no user data was compromised and that the root cause of the issue was a “faulty configuration change.”
“Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our datacenters caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our datacenters communicate, bringing our services to a halt,” he explained.
“People and businesses around the world rely on us every day to stay connected. We understand the impact outages like these have on people’s lives, and our responsibility to keep people informed about disruptions to our services. We apologize to all those affected, and we’re working to understand more about what happened today so we can continue to make our infrastructure more resilient.”