Tag Archives : YouTube March 2010

Google Blip

By: Craig Labovitz -

While Google’s YouTube outage today generated a steady stream of tweets and blog posts, a quick look at traffic across 50 or so small / mid-size ISPs around the world suggests this was more of a “blip” than a global outage.


Certainly the outage was nowhere near as large or as prolonged as the great “GoogleLapse” last year.

Below is a graph of traffic originating in Google (AS 15169) over the last 24 hours, using data from 50 ISPs around the world selected at random. All times are EDT. It looks like a small outage overnight preceded the larger traffic drop-off at 8am EDT.

Google Blip

And a quick aside, my intent is not to pick on Google (unless, of course, they do not pick Ann Arbor) — all providers have outages. I just find Google an especially interesting case study given their size and overall impact on the Internet.

How Big is Google?

By: Craig Labovitz -

Google’s recent FTTH announcement generated a wave of media coverage and industry discussion. Responses ranged from exuberant local communities racing to sign up to anti-competitive howls from incumbent carriers.

Industry pundits wondered: what is Google up to? What will the search giant do with 1Gbps to the home? And more ominously, is Google getting too big?

While this blog post won’t explore the politics / strategy behind Google’s FTTH initiative (except to suggest Google should choose Ann Arbor), we will share some data on Google’s relative size and growth from a global Internet perspective.

Google is big.

And by “big”, I mean really big. If Google were an ISP, it would be the fastest growing and third largest global carrier. Only two other providers (both of whom carry significant volumes of Google transit) contribute more inter-domain traffic. But unlike most global carriers (i.e. the “tier1s”), Google’s backbone does not deliver traffic on behalf of millions of subscribers or thousands of regional networks and large enterprises. Google’s infrastructure supports, well, only Google.

Based on anonymous data from 110 ISPs around the world, we estimate Google contributed somewhere between 6% and 10% of all Internet traffic globally as of the summer of 2009.

The graph below shows the weighted average percentage of all Internet traffic contributed by Google ASNs between June 2007 and July 2009. Most of Google’s rapid growth comes after the acquisition of YouTube in late 2006.

Google's Contribution to Global Internet Traffic
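The post does not say how the "weighted average" is weighted. As a rough sketch, assuming each ISP's Google share is weighted by that ISP's total inter-domain traffic (an assumption, not the study's documented method), the calculation might look like this. The `IspSample` type and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IspSample:
    """One ISP's measurement for a single period (all values hypothetical)."""
    total_bps: float   # all inter-domain traffic seen by this ISP
    google_bps: float  # the part originating in Google ASNs


def weighted_share(samples):
    """Traffic-weighted average share, in percent.

    Weighting each ISP's google/total ratio by its total traffic
    reduces to sum(google) / sum(total).
    """
    total = sum(s.total_bps for s in samples)
    if total == 0:
        raise ValueError("no traffic in samples")
    return 100.0 * sum(s.google_bps for s in samples) / total


def unweighted_share(samples):
    """Plain mean of per-ISP percentages, for comparison."""
    ratios = [100.0 * s.google_bps / s.total_bps
              for s in samples if s.total_bps > 0]
    if not ratios:
        raise ValueError("no traffic in samples")
    return sum(ratios) / len(ratios)


samples = [
    IspSample(total_bps=100.0, google_bps=10.0),  # large ISP, 10% Google
    IspSample(total_bps=10.0, google_bps=0.5),    # small ISP, 5% Google
]
print(round(weighted_share(samples), 2))    # 9.55
print(round(unweighted_share(samples), 2))  # 7.5
```

Under this assumption the large ISP dominates the result, which is why the weighted figure sits closer to 10% than the plain mean does.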

Before getting much further, a few words about what we’re measuring. Traffic volumes provide only the most indirect measure of a network’s size or popularity (for example, it takes tens of thousands of Tweets to match the bandwidth of a single HD video). Our anonymous data also does not include internal provider services (e.g. IPTV or VPN) nor data served from co-located caches within provider data centers. Rather, we’re measuring inter-domain traffic, i.e. the traffic between providers (the “inter” in “Internet”).

With all of the above said, inter-domain traffic volumes provide a key metric for understanding Internet topology and the evolution of Internet traffic patterns.

But even traffic volumes tell only part of the story.

The competition between Google, Microsoft, Yahoo and other large content players has long since moved beyond just who has the better videos or search. The competition for Internet dominance is now as much about infrastructure: raw data center computing power and how efficiently (i.e. quickly and cheaply) you can deliver content to the consumer.

And here again, Google is at the head of the pack.

In 2007, Google used transit providers (including Level(3)) for the majority of their Internet traffic. But over the last three years, Google has both built out their global data center and content distribution capability and aggressively pursued direct interconnection with most consumer networks.

The graph below shows an estimate of the average percentage of Google traffic per month using direct interconnection (i.e. not using a transit provider). As before, this estimate is based on anonymous statistics from 110 providers. In 2007, Google required transit for the majority of their traffic. Today, most Google traffic (more than 60%) flows directly between Google and consumer networks.
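To make "direct interconnection" concrete, here is a minimal sketch of how such a percentage could be computed. It assumes each flow record carries the AS path from the observing provider back to the origin, which is an assumption about the data rather than the study's documented method. The function name, the record layout and the byte counts are hypothetical; AS 64500 and AS 64501 are documentation-reserved ASNs standing in for real networks.

```python
GOOGLE_ASN = 15169  # Google's origin AS, as cited earlier in this archive


def direct_share(flows, observer_asn):
    """Percent of Google-origin bytes that reach the observer without transit.

    flows: iterable of (as_path, byte_count) pairs, where as_path lists
    ASNs from the observer back to the origin, e.g. [observer, 3356, 15169].
    """
    direct = transit = 0
    for as_path, nbytes in flows:
        if (not as_path or as_path[0] != observer_asn
                or as_path[-1] != GOOGLE_ASN):
            continue  # not Google-origin traffic seen by this observer
        if len(as_path) == 2:
            direct += nbytes   # observer is adjacent to Google
        else:
            transit += nbytes  # at least one intermediate AS
    total = direct + transit
    return 100.0 * direct / total if total else 0.0


flows = [
    ([64500, 15169], 600),        # direct peering
    ([64500, 3356, 15169], 400),  # via a transit provider
    ([64500, 3356, 64501], 999),  # not Google, ignored
]
print(direct_share(flows, observer_asn=64500))  # 60.0
```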


But even building out millions of square feet of global data center space, turning up hundreds of peering sessions and co-locating at more than 60 public exchanges is not the end of the story.

Over the last year, Google deployed large numbers of Google Global Cache (GGC) servers within consumer networks around the world. Anecdotal discussions with providers suggest more than half of all large consumer networks in North America and Europe now have a rack or more of GGC servers.

So, after billions of dollars of data center construction, acquisitions, and creation of a global backbone to deliver content to consumer networks, what’s next for Google?

Well, I’m hoping for delivery of content directly to the consumer via a nice, fat 1 Gbps FTTH pipe.

Google, please choose Ann Arbor.

The Internet After Dark (Part 2)

By: Craig Labovitz -

This post completes our informal three-week study of daily Internet traffic patterns. Using data from the Internet Observatory, we analyzed weekday application traffic across 110 geographically diverse ISPs, including some of the largest carriers in North America and Europe. We believe this report (and upcoming paper) represents the largest study of Internet traffic temporal characteristics to date.

In the first half of this post, we showed that, unlike European Internet traffic, which peaks in the early evening and then drops off until the next day’s business hours, US Internet traffic reaches its peak at 11pm EDT and then stays relatively high until 3am.

The question is what are Internet users doing after dark?

The answer: long after Exchange and Oracle business traffic slows to a crawl, Internet users turn to the web to surf, watch videos, send IMs and happily try to kill each other.

We illustrate these trends with graphs of four application categories below.


The top two graphs show the daily average traffic fluctuations of TCP / UDP ports related to two popular multi-player online game platforms: World of Warcraft and Steam (which includes many popular first-person shooter games like Half-Life). The bottom two graphs show common video and instant messaging protocols. As in the earlier analysis, we take the average of North American consumer / regional provider traffic over 10 weekdays in July. To make the graphs more readable, we show traffic as a percentage of peak traffic levels. All times are EDT.
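The normalization described above (average across weekdays, then express each point as a percentage of the peak) can be sketched as follows. This is an illustration with made-up numbers and a hypothetical `percent_of_peak` helper, not the Observatory's actual pipeline. Four samples per day stand in for a full day of measurements.

```python
def percent_of_peak(daily_series):
    """Average several days of samples, then scale so the peak is 100.

    daily_series: list of days, each a list of traffic values taken at
    the same times of day (for example, 24 hourly values).
    """
    if not daily_series:
        raise ValueError("need at least one day")
    slots = len(daily_series[0])
    if any(len(day) != slots for day in daily_series):
        raise ValueError("all days must have the same number of samples")
    # Mean traffic in each time slot across all days.
    mean = [sum(day[i] for day in daily_series) / len(daily_series)
            for i in range(slots)]
    peak = max(mean)
    if peak == 0:
        return [0.0] * slots
    return [100.0 * value / peak for value in mean]


days = [
    [10, 20, 40, 30],  # weekday 1 (hypothetical values)
    [30, 20, 40, 10],  # weekday 2
]
print(percent_of_peak(days))  # [50.0, 50.0, 100.0, 50.0]
```

Scaling to the peak hides absolute volumes, which keeps applications of very different sizes comparable on the same axis.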

Some observations:

  • Gamers Come Out at Night: Unlike most Internet applications, which peak midday or late afternoon, online game traffic grows by more than 60% after 2pm. Gaming prime time appears to be between 8pm and 11pm EDT on weekday nights (corresponding to the traditional and now declining television prime time hours). By comparison, web traffic levels remain relatively constant through the late afternoon and peak much earlier, at 5pm.
  • A Guild that Plays Together Stays Together: Unlike other online game traffic, World of Warcraft’s Battle.net shows a distinct 30% jump at exactly 8pm EDT every evening. In-house WoW level 80 colleagues suggest 8pm is a common time for guilds to set out on quests. Also unlike other game traffic, WoW declines rapidly after 11pm every night. Again, we suspect WoW traffic patterns are related to the large-group, social nature of World of Warcraft.
  • Midnight Video: Of all Internet applications, streaming video protocols reach their traffic peak the latest, around midnight EDT every evening. We do not have very good visibility into what Internet users are watching this late, but correlation with large content site traffic patterns (below) provides some clues.
  • Always in Touch: Beginning at 9am EDT and lasting through midnight, Internet users IM constantly. The IM graph above shows traffic reaches 80% of peak by 10am and stays above 80% until midnight (with a 5pm EDT peak, perhaps related to millions of users making dinner plans). Interestingly, email exhibits a very different pattern and plummets by more than 30% immediately after 5pm EDT.

As mentioned earlier, we do not have detailed visibility into what Internet users are watching at midnight, but ASN-level traffic analysis provides some hints. Predictably, traffic grows dramatically to consumer sites like Google’s YouTube and large CDN / video providers. Also not surprisingly, we see a large jump in traffic to colo / hosting companies with adult content, such as a 40% jump to ISPrime (AS23393) between 10pm and 1am EDT. We will explore one of the fastest growing and largest nighttime sites, Carpathia Hosting (AS29748), in an upcoming post.

Editor’s Note: This post is the third in a series of weekly posts leading up to the publication of the joint University of Michigan, Merit Network and Arbor Networks “2009 Internet Observatory Report”. The full technical report, available this October, goes into detail on the evolving Internet topology, commercial ecosystem and traffic patterns. Next week: “Who Put the IPv6 in My Internet?”


DPI is not a Four-letter Word!

By: Kurt Dobbins -

As founder and CTO of Ellacoya Networks, a pioneer in DPI, and now having spent the last year at Arbor, a pioneer in network-based security, I have witnessed firsthand the evolution of Deep Packet Inspection. It has evolved from a niche traffic management technology to an integrated service delivery platform. Once relegated to the dark corners of the central office, DPI has become the network element that enables subscriber opt-in for new services, transparency of traffic usage and quotas, fairness during peak busy hours and protection from denial of service attacks, all the while protecting and maintaining the privacy of broadband users.

Yet, DPI still gets a bad rap. Guilty until proven innocent! Why is that?

DPI means different things, because it is an overloaded term. I can think of at least four separate product categories of DPI:

1) Traffic Management: DPI that classifies application traffic by examining the headers, without looking into the actual content itself.

2) Surveillance: DPI that logs, reconstructs, or plays back communication exchanges.

3) Ad-Insertion (and profiling): DPI that profiles subscriber web browsing or search activities, inserts cookies, or logs URLs visited by a subscriber.

4) Security: DPI that examines content for viruses, trojans, or other forms of vulnerabilities.
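To illustrate the first category, here is a minimal sketch of header-only classification in its simplest form: mapping the transport protocol and port number to an application class without reading any payload. The `classify` function and the `PORT_CLASSES` table are hypothetical and cover only a few IANA-registered ports. This is not how any particular DPI product works, and commercial classifiers generally rely on more than port numbers.

```python
# Illustrative table only: a handful of IANA-registered ports.
PORT_CLASSES = {
    ("tcp", 25): "email",  # SMTP
    ("tcp", 80): "web",    # HTTP
    ("tcp", 443): "web",   # HTTPS
    ("tcp", 5222): "im",   # XMPP client connections
}


def classify(protocol, src_port, dst_port):
    """Return an application class using header fields only.

    The destination port is checked first, then the source port, so
    that both directions of a connection receive the same label.
    """
    proto = protocol.lower()
    for port in (dst_port, src_port):
        label = PORT_CLASSES.get((proto, port))
        if label is not None:
            return label
    return "unknown"


print(classify("tcp", 51234, 80))    # web (client to server)
print(classify("TCP", 25, 40000))    # email (server to client)
print(classify("udp", 5000, 6000))   # unknown
```

Nothing in this sketch touches the packet body, which is the distinction the list draws between traffic management and the surveillance or ad-insertion categories.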

Paramount to each of these product categories is privacy. Service providers and consumers share in concerns over privacy, as do industry luminaries. Yesterday, according to ZDNet, Sir Tim Berners-Lee, “inventor” of the World Wide Web, spoke out against the use of deep packet inspection citing concerns over how snooping on clicks and data reveals more information about people than listening to their conversations.

His concerns are valid. And I can attest, having worked with service providers around the globe, that service providers are deeply aware of how important it is to protect consumer privacy. That is why service providers are becoming more transparent and giving consumers choices with opt-in and opt-out capabilities. This new era of transparency is as much a result of consumer interests, service provider best practices, and increasing regulatory pressures, as it is an indication of the broader shift of how DPI-based services are being used.

That is why Phorm, the targeted advertising service company mentioned in the ZDNet article which uses DPI, has a technology that can’t know who users are and allows users to switch it off or on at any time (opt-out or opt-in).

But transparency and consumer opt-out are not limited to broadband service providers and DPI. Yesterday, Google launched “interest-based” advertising on their partner sites and on YouTube, where ads will associate categories of interest based on the types of sites you visit and the pages you view. And, in line with DPI and service provider models of transparency and consumer choice, Google is offering transparency, choice with Ads Preference Manager, and a non-cookie based opt-out capability.

So at the heart of any service over broadband, not just DPI-based services, is the need for transparency, fairness, consumer choice and protection while preserving the privacy of individuals. These are the new discussion points that need to transcend specific technologies in the network. The public debate and regulatory directions have to be centered on these key areas – stay tuned as Arbor becomes more active in these arenas.

As for DPI itself, it has proven to be a critical network element in service provider networks, by providing those things that we all hold dear: privacy, protection, fairness and transparency. DPI is not a four-letter word!
