A Brief Look at the Facebook Outage
The graph below shows coarse-grained Facebook (ASN 32934) traffic statistics from 60 randomly selected ISPs around the world. While most press and blog coverage (e.g. Gigaom’s “Facebook Sees Major Outage”) pegged the disruption at 5:30 pm ET, the traffic data suggest Facebook’s problems began much earlier in the day.
Normally, Facebook’s diurnal traffic follows the same pattern as other social media and interactive consumer sites. Facebook traffic generally reaches a low overnight at 2am, grows to its daily peak at 5pm EDT, declines briefly, and then rises to a second, smaller peak at 9pm ET (the peaks likely matching the North American end of the work day and prime time across PDT and EDT).
But beginning Friday morning at 2am, Facebook saw dozens of modest traffic drops (each of a few gigabits) before plummeting 30 Gbps at 5pm EDT for roughly twenty minutes.
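For readers who want to try this kind of comparison on their own traffic data, here is a minimal sketch of flagging drops against an expected diurnal baseline. It is not the method behind the graph above: the function name `find_drops`, the 2 Gbps threshold, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def find_drops(
    baseline_gbps: List[float],
    observed_gbps: List[float],
    threshold_gbps: float = 2.0,  # assumed cutoff, not taken from the measured data
) -> List[Tuple[int, float]]:
    """Return (sample index, shortfall in Gbps) wherever observed traffic
    falls more than threshold_gbps below the expected baseline."""
    if len(baseline_gbps) != len(observed_gbps):
        raise ValueError("baseline and observed series must have the same length")

    drops = []
    for i, (expected, actual) in enumerate(zip(baseline_gbps, observed_gbps)):
        shortfall = expected - actual
        if shortfall > threshold_gbps:
            drops.append((i, shortfall))
    return drops


# Hypothetical samples in Gbps (illustrative values only).
baseline = [40.0, 50.0, 60.0, 70.0, 65.0]
observed = [40.0, 47.0, 59.0, 40.0, 64.0]

for index, shortfall in find_drops(baseline, observed):
    print(f"sample {index}: {shortfall:.1f} Gbps below baseline")
```

In practice the baseline would likely come from the same hours on previous days, since the normal daily cycle alone moves traffic by far more than a few Gbps.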
What happened to Facebook?
While there is no shortage of speculation on Twitter and operations mailing lists, Facebook so far is not saying. I think a recent post to an engineering outage discussion list sums up the situation:
“Given Facebook’s complexity, who knows what the problem was. Load balancer or layer 7 filter/re-writer (think F5) issues? Back-end server problems? Software misconfiguration? … Some developer deciding to just roll something out in the middle of the day (as is quite common with social networking sites these days)? We’ll probably never know.”
Facebook has come a long way from a few hundred Harvard freshmen looking for dates. As Facebook accelerates past 400 million users and pursues nothing short of taking over the web, the social media giant has become critical infrastructure, at least from the perspective of millions of consumers and ISP support desks.
In an upcoming series of blog posts, we’ll explore the growing Internet infrastructure footprint of Facebook, Google and other dominant Internet content companies.