Ahh, The Ease of Introducing Global Routing Instability

By: Danny McPherson -

Today’s global routing instability trigger (flavor of the month), that of extremely long AS paths, seems to be a bit of a repeat.  Some of you may remember this occurring over 5 years ago, and presumably, that same bug in Cisco IOS (CSCdr54230 – inadequate buffer sizing, fixed along with the addition of a knob to limit maximum AS path lengths) is the culprit here as well.  This seems to have triggered a great deal of widespread routing system instability (and underlying connectivity issues) for a few hours this morning, as illustrated both in the chart below (GMT -7), and by the volume of related discussion on a number of operational mailing lists.

In short, AS 47868 (SUPRO-AS) apparently took a notion to prepend its own AS number some ~251 times (I say 251 because it would normally appear there at least once anyway when sending to external BGP peers (+1), in case you’re counting and get 252) to its route advertisements for prefix 94.125.216.0/21, before announcing the route to its peers.  Some implementations, e.g., Jurassic (> 3 years) versions of Cisco IOS, didn’t allocate enough buffer space for silly long AS paths, and so they blow chunks when they receive the update.  However, other vendor implementations and patched versions happily propagate the update, apparently through numerous intermediate ASes, so seemingly random sets of BGP routers in the routing system were taking a dump, or dropping sessions with malformed AS path complaints (which may be a slightly different problem).

The alleged update looked something like this in your average Cisco command line interface (CLI) output, albeit with the leftmost AS numbers varying depending on topological perspective:

*  94.125.216.0/21    xxx.xxx.xxx.xxx       0    100      0 3356 29113
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868 47868
i

I’d highly recommend folk that haven’t patched in a few years do so ASAP (duh), else you are vulnerable to remote targeted attacks, period.  Also, I suspect the SUPRO-AS folks could have easily accomplished their traffic engineering goals (assuming that was the intent???) with just a few ASNs prepended, and remind them that many folks auto-suppress routes with AS paths containing more than n ASes, e.g., the knob added to IOS when the IOS buffer sizing issue was fixed.
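The "more than n ASes" policy mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not how IOS implements its knob: the function name `accept_route` and the limit of 75 are assumptions picked for the example, and the incident path is modeled on the output shown earlier (two transit ASes plus 252 copies of 47868).

```python
def accept_route(as_path, max_as_limit=75):
    """Return True if the AS path is short enough to accept.

    as_path is a list of AS numbers as received; max_as_limit is a
    hypothetical local policy value, not a vendor default.
    """
    return len(as_path) <= max_as_limit


# Path as seen via 3356 and 29113: 2 transit ASes plus 252 copies of 47868.
incident_path = [3356, 29113] + [47868] * 252

# A modest amount of prepending that would likely achieve the same
# traffic engineering goal.
normal_path = [3356, 29113, 47868, 47868, 47868]

print(accept_route(normal_path), accept_route(incident_path))
```

With a limit anywhere in the tens, a handful of prepends passes and the 254-AS path is dropped before it can be propagated further.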

I’ve seen a few different failure modes reported for this incident, and suspect there may be another issue with storing such large AS paths, or behavior related to sending error notifications and session tear down in other IOS versions or different implementations, in particular, when AS path lengths approach 255 (one byte, indicating number of ASes that can be contained in a single AS_SEQUENCE segment, or dealing with the sum of ASes contained in multiple AS_SET and AS_SEQUENCE segments).  I don’t have more details on this at the moment, unfortunately, but will continue digging.
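To make the one-byte limit concrete, here is a minimal sketch of encoding a list of 2-octet AS numbers into AS_SEQUENCE segments. The segment type code (2) and the type/length/value layout come from RFC 4271; the function name `encode_as_sequence` is made up for this example, and real implementations differ in how they handle paths that spill past 255 entries.

```python
import struct

AS_SEQUENCE = 2          # segment type code from RFC 4271
MAX_SEGMENT_ASNS = 255   # the segment length field is a single octet


def encode_as_sequence(asns):
    """Encode 2-octet AS numbers as one or more AS_SEQUENCE segments.

    Paths longer than 255 ASes are split across several segments,
    because one segment cannot count higher than 255.
    """
    out = b""
    for start in range(0, len(asns), MAX_SEGMENT_ASNS):
        chunk = asns[start:start + MAX_SEGMENT_ASNS]
        # Header: segment type, then the NUMBER of ASes (not a byte count).
        out += struct.pack("!BB", AS_SEQUENCE, len(chunk))
        out += struct.pack("!%dH" % len(chunk), *chunk)
    return out


incident = encode_as_sequence([3356, 29113] + [47868] * 252)
print(len(incident))  # 2 header octets + 254 * 2 octets of AS numbers
```

A path of 254 ASes still fits in a single segment, but two more prepends by upstream networks would push it over the boundary, which is one place where differing implementations could plausibly diverge.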

Ohh, and as usual, I’ll note that absent explicit origin AS and path validation in a secure routing protocol, we’ll continue to see problems along these lines.

Some background information on this and a few related problems, for those interested…

In BGP, autonomous system (AS) numbers are used to represent unique routing domains, which typically correlate to administrative domains.  In order to avoid routing information loops, BGP routers, when sending a routing update message to external (not in the local AS) BGP peers, prepend their local AS number to the AS path in the routing update message.  While the AS path information primarily serves as a path vector to provide a mechanism for loop detection, it is also considered in BGP’s best path selection algorithm (shorter AS paths preferred over longer ones, hence the common prepending observed in the Internet routing system, namely for traffic engineering purposes), and various routing policies are often applied based on the length and/or contents of the AS path.
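The three roles of the AS path described above (loop detection, prepending on the way out, and length as a tie-breaker) can be sketched as follows. This is a simplified illustration: the function names are invented for the example, the AS numbers 64500 and 64501 are from the range reserved for documentation, and real best path selection considers several other attributes before and after AS path length.

```python
def has_loop(as_path, local_as):
    """A received path that already contains our own AS indicates a loop."""
    return local_as in as_path


def advertise_to_ebgp_peer(as_path, local_as, extra_prepends=0):
    """Prepend the local AS once, plus optional traffic-engineering copies."""
    return [local_as] * (1 + extra_prepends) + list(as_path)


def prefer_shorter(path_a, path_b):
    """Pick the shorter AS path; ties go to path_a in this simplified sketch."""
    return path_a if len(path_a) <= len(path_b) else path_b


# AS 64500 originates a route and announces it two ways.
plain = advertise_to_ebgp_peer([], 64500)                     # [64500]
padded = advertise_to_ebgp_peer([], 64500, extra_prepends=3)  # four copies

print(prefer_shorter(plain, padded))  # the un-padded path wins
print(has_loop(plain, 64500))         # AS 64500 would reject its own route
```

Three extra copies are already enough to lose the length comparison here, which is why a few prepends usually suffice for traffic engineering.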

The AS path (AS_PATH) attribute is composed of one or more TLV-encoded segments, of type AS_SEQUENCE (ordered set of AS numbers the update has traversed) or AS_SET (unordered set of AS numbers the update has traversed – used mostly in proxy aggregation and present on fewer than one hundred of the ~300k prefixes in the global routing system today).   As noted, an AS path can have multiple segments of both types, but when sending an update to an external BGP peer the leftmost segment type needs to be an AS_SEQUENCE, and the local AS needs to be placed in the leftmost position in that AS_SEQUENCE.  If the leftmost segment type is not an AS_SEQUENCE, then the local BGP speaker needs to create one and place the local AS in the leftmost position.  If it is of type AS_SEQUENCE, then the local BGP speaker simply places the local AS number in the leftmost position.  If there’s no AS path data (e.g., the route was originated locally) and the update is being sent to an internal BGP peer, then the local BGP speaker sends an empty (length == 0) AS_PATH attribute.  The length field associated with each AS path segment identifies the number of AS numbers contained in that segment, not the size in octets of the segment.
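The prepending rules above can be sketched against a simple in-memory model where an AS path is a list of `(segment_type, [as_numbers])` tuples. That model and the function name `prepend_for_ebgp` are assumptions made for this example. The overflow branch (start a new AS_SEQUENCE when the leading one already holds 255 ASes) is not described above; it follows the recommendation in RFC 4271.

```python
AS_SET, AS_SEQUENCE = 1, 2   # segment type codes from RFC 4271
MAX_SEGMENT_ASNS = 255       # per-segment count is a single octet


def prepend_for_ebgp(segments, local_as):
    """Return a new segment list with local_as in the leftmost position."""
    if segments:
        head_type, head_asns = segments[0]
        if head_type == AS_SEQUENCE and len(head_asns) < MAX_SEGMENT_ASNS:
            # Leading AS_SEQUENCE with room: just put the local AS in front.
            return [(AS_SEQUENCE, [local_as] + head_asns)] + segments[1:]
    # Empty path, leading AS_SET, or a full leading segment:
    # create a fresh AS_SEQUENCE holding only the local AS.
    return [(AS_SEQUENCE, [local_as])] + list(segments)


print(prepend_for_ebgp([], 64500))
print(prepend_for_ebgp([(AS_SET, [64501, 64502])], 64500))
print(prepend_for_ebgp([(AS_SEQUENCE, [64501])], 64500))
```

The input list is never modified, so the same received path can be re-advertised to several peers.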

So, all of that matters because if a given BGP speaker receives an AS path that has an invalid length, or is in some manner malformed, then the receiving BGP speaker should send a NOTIFICATION message to the peer indicating “Malformed AS_PATH” and tear the BGP session down.  Ideally, when malformed AS paths (or other attributes) are generated the BGP sessions would have problems only at the originating AS and the immediately adjacent upstream, and not affect other networks – i.e., implicitly be squelched as close to the source of the problem as possible, and not result in wider instability in the routing system.  However, because different routers on the Internet run different versions of BGP routing software, and some vendor implementations are more forgiving than others, what you end up getting is unpredictable propagation of malformed updates (e.g., those the Cisco knob noted above introduces), updates that result in BGP routing session tear down multiple AS hops upstream from the originator or source of the malformed update.   I should also note that the originator may well not be the source of the malformed update, as any intermediate BGP speaker within a given AS rebuilds the update and could introduce problems.
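As a rough sketch of what "invalid length, or in some manner malformed" means on the receiving side, the following parser walks the raw AS_PATH attribute value and raises an exception wherever a real speaker would send the NOTIFICATION. It assumes 2-octet AS numbers and an external peer (so only AS_SET and AS_SEQUENCE are acceptable); the names `parse_as_path` and `MalformedASPath` are invented for the example.

```python
import struct

VALID_SEGMENT_TYPES = {1, 2}  # AS_SET, AS_SEQUENCE (RFC 4271)


class MalformedASPath(Exception):
    """Stands in for NOTIFICATION: UPDATE Message Error, Malformed AS_PATH."""


def parse_as_path(data):
    """Parse a 2-octet AS_PATH attribute value into (type, [asns]) segments."""
    segments = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise MalformedASPath("truncated segment header")
        seg_type, count = struct.unpack_from("!BB", data, offset)
        offset += 2
        if seg_type not in VALID_SEGMENT_TYPES:
            raise MalformedASPath("unexpected segment type %d" % seg_type)
        end = offset + 2 * count  # count is in ASes, each 2 octets wide
        if end > len(data):
            raise MalformedASPath("segment length overruns the attribute")
        asns = struct.unpack_from("!%dH" % count, data, offset)
        segments.append((seg_type, list(asns)))
        offset = end
    return segments


good = struct.pack("!BBHH", 2, 2, 3356, 29113)
print(parse_as_path(good))
```

Note that the caller has no good option once the exception fires: the error applies to the whole UPDATE, which is why the session, and every route learned over it, goes away.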

Furthermore, it’s important to realize that when you tear down a BGP session with a peer, all advertised routes in each direction must be removed, not just the route(s) in the update(s) that triggered the session tear down.  After a session is torn down, a clever implementation might opt to keep some amount of state as to exactly which update triggered the NOTIFICATION – if it can glean such information from the received NOTIFICATION (which is usually unlikely), so that when an attempt to re-establish the session occurs – which is usually near immediate, advertisement of that update is suppressed so as to avoid triggering another session tear down.  However, such specification tweaks by clever implementations are rare in practice, and for good reason.  If it were to suppress the update that triggered the session tear down, you’d get non-deterministic reachability for the prefix contained in the update, and that’d be bad.  So instead, what usually happens is that the session flaps and “clever” implementations exponentially back off session re-establishment times until the problem is resolved, with a lot of instability triggered during any intermediate state.  This works if it’s between adjacent networks near the origin of the problem, but is really pretty ugly and unacceptable when such malformed updates can be propagated far and wide before session resets are triggered.
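The exponential backoff behavior can be illustrated with a tiny helper. The base delay of one second and the cap of 300 seconds are arbitrary values chosen for the example, not taken from any particular implementation, and `reconnect_delays` is a made-up name.

```python
def reconnect_delays(attempts, base=1.0, cap=300.0):
    """Delay in seconds before each successive session re-establishment.

    The delay doubles after every failed attempt and is clamped at cap.
    """
    return [min(cap, base * 2 ** n) for n in range(attempts)]


print(reconnect_delays(5))
print(reconnect_delays(12)[-1])
```

Each retry still brings the session up long enough to exchange routes and receive the offending update again, so the flapping continues at a slower pace until someone removes the bad route.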

A while back, we updated the BGP Confederations RFC 3065 with RFC 5065 to specify that BGP AS confederation segments (AS_CONFED_SET and AS_CONFED_SEQUENCE) must not be sent to peers outside the local BGP confederation, and added error handling procedures specifying that if those segment types are received from an external BGP peer, a “Malformed AS_PATH” NOTIFICATION message must be sent, and the session torn down.  This was on the heels of an event where an implementation was sending those segments to external BGP peers, and sessions were oscillating as a result.  This was between adjacent networks only.
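The RFC 5065 receive-side rule boils down to a small check. In this sketch an AS path is again modeled as a list of `(segment_type, [as_numbers])` tuples; that model, the function name `check_path_from_peer`, and the string return values are assumptions for illustration, while the segment type codes are those assigned in RFC 5065.

```python
AS_SET, AS_SEQUENCE = 1, 2                 # RFC 4271
AS_CONFED_SEQUENCE, AS_CONFED_SET = 3, 4   # RFC 5065

CONFED_TYPES = {AS_CONFED_SEQUENCE, AS_CONFED_SET}


def check_path_from_peer(segments, peer_in_confederation):
    """Return "accept" or "malformed" for a received AS path.

    Confederation segments are only legitimate when the sending peer is
    a member of the local confederation.
    """
    if not peer_in_confederation:
        if any(seg_type in CONFED_TYPES for seg_type, _ in segments):
            return "malformed"  # send NOTIFICATION, tear the session down
    return "accept"


leaked = [(AS_CONFED_SEQUENCE, [64512]), (AS_SEQUENCE, [64500])]
print(check_path_from_peer(leaked, peer_in_confederation=False))
print(check_path_from_peer(leaked, peer_in_confederation=True))
```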

However, another recent BGP incident highlighted similar concerns, that of inclusion of BGP Confederation segments (AS_CONFED_SET and AS_CONFED_SEQUENCE) in AS4_PATH attributes.  In short, AS4_PATH attributes are used to “tunnel” 4-octet AS numbers across 2-octet-only ASes.  An implementation was including these AS_CONFED_* segments in the AS4_PATH data, they were being tunneled, and it was triggering session resets on remote networks when they were being un-encapsulated.

One might surmise that specially crafted AS_PATHs and update messages might well allow an attacker to launch remote targeted session disruption attacks with a technique such as this…

Comments

  1. Darrel Lewis 02/16/2009, 10:44 pm

    Danny,

    I’ll note that when we have explicit origin AS and path validation in a secure routing protocol, we’ll have a whole shitload of new problems due to the complexity of the system and the same bad code that causes the current problems.

    -Darrel

  2. Danny McPherson 02/16/2009, 10:47 pm

    Hah, I certainly can’t disagree with that Darrel, it’s just that the same old problems get boring after a while :-)

  3. Danny,
    Nice post. I think the problem here is that a single prefix can cause the entire routing table to be reset. A way to get around these cases where there’s a malformed BGP message is to first check if the NLRI was read OK. If it was, and there was a glitch in some other attribute, a route-refresh feature per NLRI (instead of per peer) would fix this. An alternative is to just drop the update, and do a session reset after some time.
    cheers,
    –Ricardo

  4. Jakub Urbanec 02/17/2009, 6:12 am

    I have contacted the ISP here in the Czech Republic (Sloane). They said:
    “Oh that thing from yesterday? It was just a tiny little bug…” :-(

  5. Hi,

    Has this issue been resolved already? Is there already a permanent fix for this problem? Is this an IOS-related problem? Setting bgp maxas-limit seems to be just a workaround…

    Regards,
    Ron

  6. It was a previously unknown IOS bug. The preventive measure has been available for years and should have been used by everyone but was obviously not. See Oversized AS paths: Cisco IOS bug details.