Additional Discussion of the April China BGP Hijack Incident

By: Craig Labovitz -

My blog post last week on the April 8th China BGP hijack incident generated significant discussion and raised additional questions in both the media and research / engineering community.

In particular, I agree with Dmitri Alperovitch’s recent McAfee blog post that “This topic is highly technical and very difficult to explain to people not fully immersed into the BGP routing jargon… this incident underscores the very serious problems that exist on the Internet due to the system of trust…” It is also clear from my reading of articles and Dmitri’s blog that much of the press mischaracterized his initial estimate of the hijack impacting “15% of the Internet” — Dmitri was referring to routes and not traffic.

In his blog, Dmitri goes on to argue that any analysis of the incident should focus on the upstream ISP (AS4134) to the Chinese provider (AS23724) responsible for the hijack.

Both at the time of the incident in April and prior to my posting of this China hijack blog, I had private conversations with operations staff at several of AS23724′s upstreams, network operators around the world, collaborators in other security companies, and Arbor’s own resident engineers in the region. All of these private discussions reflect the sentiment espoused in public engineering forums that the China hijack had modest to minimal impact on Internet traffic volumes, including this RIPE statement, NANOG discussion thread and even the BGPMon blog at the heart of the controversy.

In the below graph, I chart traffic from 80 ATLAS providers around the world that terminates or transits AS4134 (the primary upstream to the Chinese company responsible for the BGP hijack). Traffic is shown as a weighted average percentage of all inter-domain traffic using the peak five minute daily value for the month of April 2010. As in my earlier blog, the day of the hijack (April 8) is highlighted in yellow.

china traffic

The main take-away from the above graph is that ATLAS data shows no statistically significant increase for either AS4134 or AS23724. While we did observe modest changes in traffic volumes for carriers within China, the BGP hijack had limited impact on traffic volumes to or from the rest of the world.

As a couple readers of my blog observed (link to comments), traffic volumes provide an awkward measure of the security implications of a BGP hijack. In particular, the volume of hijacked traffic change depends on:

    1. The scope of the hijack. Specifically, how far and well the routes propagated. In this case, the additional ASPath length meant few peers and upstreams preferred the AS23724 routes. In other words, the leak had limited scope.

 

    1. Termination of the traffic. Did the China ISP (AS23724) drop hijacked packets or complete the connections? For example, in the former drop scenario, my laptop might just send 40 byte TCP Syn packets to an unresponsive destination. Since the TCP connection does not complete, my laptop will never send any significant volume of data traffic — China would only get lots and lots of Syns (and the rest of the world ICMP unreachables in exchange). UDP and ICMP, of course, are slightly different stories. On the other hand, if the traffic transits China or Chinese computers / VMs otherwise respond to the TCP requests, than significantly larger volumes of hijacked data traffic would flow from the rest of the world to China.

 

  1. Objective of the hijack. Though some of the media have drawn cyber-war conclusions, we may likely never know if this was a misconfiguration, practice run, or intentional hijack. In any case, traffic volumes do not map well to the different possible security threats. For example, if the goal was to disrupt Internet communication and “blackhole” hijacked traffic, then we would expect to see a global decrease in Internet traffic and a large volume of Syn directed at China. However, the technical particulars of the April hijack were not particularly well-suited for this type of large-scale Internet disruption (see this article or an earlier blog post for examples on how to do this correctly).Alternatively, the intent could be a trial run exploring worm-like attacks against the global routing infrastructure. In this scenario, a small set of well-crafted malformed routing messages (hidden in a hijack of thousands of other routes) quickly propagates across the Internet crashing core routers and switches. Or something a little like this event in August (as a side note, Xiaowei did absolutely nothing wrong in her August experiment and is a really nice person to boot). I also note that ATLAS routeviews data did not show any increase in dropped sessions nor unusual (other then the fact the routes were hijacked) BGP activity.

    If the intent was to hijack traffic for a small set of sensitive US government machines, then we might see TCP connections diverted for only a few machines in a man-in-the-middle attack, relatively low volumes of diverted traffic, and thousands of bogus routes announced as a smokescreen (credit for this scenario to my colleague Danny McPherson in a NYTimes interview). In other words, basically close to what we observed on April 15th.

    Or maybe, of course, this was just a typo in a configuration file.

As I observed in my earlier blog, inadvertent BGP route leaks and intentional hijacks have been part and parcel of Internet routing for the last twenty years. BGP hijacks happen all the time. The research and operations community have written hundreds of papers on the topic (including my own small contributions).

If I have not been clear up to this point, we have a problem. We need to address BGP security (as well as DNSSec, botnets, DDoS and other critical infrastructure threats) as quickly as possible. The Internet’s future may depend on it.
- Craig


Comments

  1. How can a third party AS detect if a route has been hijacked ?