Earlier this week AS16735 (Companhia de Telecomunicacoes do Brasil Central – CTBC) of Brazil had a bit of a routing snafu that resulted in an [apparently] accidental attempt to hijack a large number of prefixes spread across the whole of the Internet routing address space. In fairness, the terms “route leak” and “route hijack” are used almost interchangeably nowadays, with the former seemingly carrying unintentional connotations and the latter malicious ones. Either way, the result is the same. As for the impact of this incident, well, that’s a metric that depends largely on perspective, with the impact to CTBC itself perhaps being most pronounced.
Interestingly, this incident came to light on the NANOG mailing list only after a number of folks indicated they’d received route hijack alerts for one or more of their prefixes, with AS16735 listed as the culprit origin AS. I suppose that with the numerous incidents over the past year (e.g., the YouTube hijack by Pakistan Telecom, the AFOL-KE hijack by AboveNet, the L-Root fiasco, and the DEFCON “Stealing the Internet” proof-of-concept exercise) it’s fortunate that many folks have finally begun to employ Internet routing system hijack alert systems (e.g., PHAS, BGPmon.net, watchmy.net, Renesys). Unfortunately, as the Renesys folks pointed out here, not all of the route hijack monitoring systems seem to have detected the incident, which illustrates the autonomous, fragmented and topologically sensitive nature of inter-domain routing on the Internet.
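To make the alerting idea concrete, here is a minimal sketch of the origin-AS check that such systems are generally understood to perform. It is not the implementation of any of the services named above. The prefixes, the AS numbers (drawn from the documentation ranges) and the names `EXPECTED_ORIGINS` and `origin_alert` are all hypothetical.

```python
import ipaddress

# Hypothetical registry: monitored prefix -> set of origin ASNs its owner expects.
EXPECTED_ORIGINS = {
    ipaddress.ip_network("192.0.2.0/24"): {64500},
    ipaddress.ip_network("198.51.100.0/24"): {64501, 64502},
}

def origin_alert(prefix, as_path, expected=EXPECTED_ORIGINS):
    """Return an alert string if the origin AS is unexpected, else None."""
    announced = ipaddress.ip_network(prefix)
    origin = as_path[-1]  # the origin AS is the rightmost AS in the path
    # Monitored prefixes that equal or cover the announcement,
    # so that more-specific announcements are caught as well.
    covering = [
        p for p in expected
        if p.version == announced.version and announced.subnet_of(p)
    ]
    if not covering:
        return None  # not a prefix we monitor
    best = max(covering, key=lambda p: p.prefixlen)
    if origin in expected[best]:
        return None
    return (f"ALERT: {announced} originated by AS{origin}, "
            f"expected {sorted(expected[best])}")

print(origin_alert("192.0.2.128/25", [64510, 64511]))
```

A check like this can only fire on announcements the monitor actually receives, which is why the vantage point of each monitoring system matters so much.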
Fortunately, as noted by Eduardo Ascenço Reis in a message to the NANOG mailing list in response to the thread above, it appears as though the route hijack alerts generated were triggered by a sole RIPE route server (rrc15) that’s part of RIPE’s Routing Information Service (RIS) project, which is physically located in Brazil with PTTMetro-SP, and has a topologically localized view of routing information. The fact that the routes were not propagated into the global routing system is a very good thing, and kudos to the networks involved that proactively implemented policies that scoped the leaking and otherwise wider propagation of those routes.
So, what was the impact? Well, as already noted, the incident perhaps didn’t much impact anyone upstream of CTBC or within the global Internet routing system. However, the route leaks do appear to have impacted much of CTBC’s own IP address space and many of its Internet transit customers. A RIPE BGPlay topology diagram during the incident (for one sample prefix, 220.127.116.11/20) is provided above. Interestingly, AS path correlation of Internet traffic (illustrated in the traffic graph above) from the Internet Observatory to and from AS 16735 originated BGP routes (i.e., only AS 16735 originated prefixes, not downstream networks or IP address blocks) seems to indicate that rather than picking up a substantial amount of traffic as a result of leaking routes that belong to others on the Internet (which would typically be the case), Internet traffic for AS16735 essentially slowed to a trickle for many hours.
This significant drop in observed traffic could have occurred for any number of reasons: folks may have filtered or stopped propagating AS 16735 routing announcements at their AS borders; local configuration or instability within the involved routing domains may have resulted in BGP route flap damping suppression; maximum-prefix limits may have triggered BGP session tear-down; or local advertisement policies within CTBC’s routing domain may have played a part.
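As an illustration of just one of those possibilities, the toy model below sketches how a maximum-prefix limit could cut off a neighbor's legitimate routes along with the leaked ones. It is an assumption-laden simplification, not a description of what any CTBC peer actually did. The class name `PeerSession` and the limit of 3 are hypothetical, and real routers offer variations such as warning-only modes and restart timers.

```python
class PeerSession:
    """Toy model of a BGP session protected by a maximum-prefix limit."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.established = True

    def receive(self, prefix):
        """Accept a prefix; tear the session down if the limit is exceeded."""
        if not self.established:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Tear-down discards every route learned from the peer,
            # including the ones it legitimately originates.
            self.prefixes.clear()
            self.established = False
            return False
        return True

session = PeerSession(max_prefixes=3)
legit = ["192.0.2.0/24", "198.51.100.0/24"]
leaked = ["203.0.113.0/25", "203.0.113.128/25"]
for p in legit + leaked:
    session.receive(p)
print(session.established, len(session.prefixes))
```

In this model the neighbor loses the two legitimate prefixes the moment the second leaked one arrives, which is one way a leak could reduce, rather than increase, traffic toward the leaking AS.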
It is peculiar that the routes leaked by AS 16735 were not simply existing paths, but rather routes that listed AS 16735 as the BGP path origin. Normally, when routes are leaked, the origin AS and other AS path attributes are preserved and the leaking AS is simply inserted somewhere along the path. Of course, when people do odd things (e.g., redistribute routes, or bork things as AS 7007 did many moons ago) it’s difficult to pinpoint precisely what the cause was without detailed information from the sources themselves.
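The distinction drawn above, between a leak that preserves the origin and one that rewrites it, can be sketched as a simple AS path check. This is a rough illustration only. The function name `classify`, its labels and the AS numbers other than 16735 are hypothetical, and AS paths are written as lists with the origin last.

```python
def classify(observed_path, known_origins, suspect_as):
    """Roughly classify the role a suspect AS plays in an observed AS path."""
    if suspect_as not in observed_path:
        return "unrelated"
    origin = observed_path[-1]  # rightmost AS is the origin
    if origin == suspect_as:
        # The suspect claims to originate the prefix itself.
        if suspect_as in known_origins:
            return "legitimate origin"
        return "re-originated"
    if origin in known_origins:
        # The true origin is preserved and the suspect sits mid-path:
        # either normal transit or a classic route leak.
        return "in path (possible leak)"
    return "unknown"

# Classic leak: the real origin (AS64500) survives at the end of the path.
print(classify([64510, 16735, 64505, 64500], {64500}, 16735))
# What was reported here: the leaking AS appears as the origin.
print(classify([64510, 16735], {64500}, 16735))
```

Telling normal transit apart from a leak in the preserved-origin case needs knowledge of the business relationships between the ASes, which this sketch does not attempt to model.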
I suppose there are several takeaways here:
- prefix hijack alerting seems to be more widely employed, and topological diversity among those systems is a good thing
- while the propagation of leaked routes in this event seems to have been quite limited in scope, which is certainly a good thing, the incident apparently had a very significant impact on CTBC’s overall Internet availability
- with control plane data (i.e., BGP routing information) alone it can be very difficult to gauge full impact when incidents of this sort occur
- when hijacking the Internet, one should take care not to shoot oneself in the foot
As for why the routes were leaked in the first place, it could have been for any of an array of reasons, most of which have been outlined in the past either here or at one of the above references. Absent the application of secure inter-domain routing solutions, incidents of this sort will surely continue to result in outages, and on a more frequent basis, be they the result of fat fingers or malicious intent, and will continue to illustrate just how vulnerable the Internet routing system is.