Talk:Border Gateway Protocol

From Wikipedia, the free encyclopedia
WikiProject Computing (Rated C-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has not yet received a rating on the project's importance scale.
WikiProject Internet (Rated Start-class)
This article is within the scope of WikiProject Internet, a collaborative effort to improve the coverage of the Internet on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale.
This article has not yet received a rating on the project's importance scale.

See archive of May, 2006 "Requested move" discussion at /Capitalization

Clarification Request: I'm not qualified to edit the page, but it appears that there is no exit from the Idle state in the BGP state machine diagram, while the text beside it, in the Finite State Machine section, states that the Idle state will initiate the Connect state. I don't know which is correct or I would edit it myself. —Preceding unsigned comment added by (talk) 17:05, 25 February 2008 (UTC)

Plain English

Anyone care to offer a non-technical explanation of the purpose of BGP, in language that my mum would understand? Such an overview would give any reader a conceptual framework into which the technical details here could fit. —Preceding unsigned comment added by Ianeiloart (talkcontribs) 10:44, 26 August 2009 (UTC)

I agree. This is incomprehensible to people who don't already understand everything about the Internet except BGP. dweinberger 16:25, 7 August 2012 (UTC) — Preceding unsigned comment added by Dweinberger (talkcontribs)

I had never heard of a BGP until I read the hosting service report on the cause of a 4 hour outage this morning. I agree that the page needs to be de-babbled. It is incomprehensible to anyone who does not already know what a BGP is! (talk) 21:01, 29 October 2016 (UTC)

IPv6 advantage

I removed this text:

One of the many advantages of IP version 6's huge address space potentially is to solve this by better use of route aggregation.

because it's not really true, and it would be confusing to explain why in a page that's basically about something else.

There is no mechanism associated with that larger address space that is explicitly designed to reduce the size of routing tables. In fact, the larger address space would allow for larger routing tables if it is not managed properly.

The two issues (address space size, and degree of aggregation) are totally orthogonal.

There is a mechanism that is part of IPv6 that might provide some help, the fast renumbering stuff. However, for that to be of any use, people must be willing to renumber their networks, to produce greater aggregation in the routing tables, and there is no empirical evidence that this will actually happen. However, this has little to do with the size of the address (only that large addresses allow use of hardware derived low-order parts - except that this is now deprecated on privacy grounds).

Similarly, if Multi6 actually agrees on a mechanism, and it gets OK'd by the IPv6 community, and it is adopted, implemented, and deployed, then that would help too - but again, this has nothing to do with the size of addresses. Noel 20:55, 14 Sep 2004 (UTC)

⇒ I think it is true. The huge address space of IPv6 enables you to assign a prefix to a customer or (from a RIR perspective) to a service provider, and that assigned prefix would be so large that the service provider wouldn't have to come back for a larger prefix for many years. Today, most larger service providers have hundreds or thousands of IPv4 prefixes to cover their address needs. With IPv6, they would typically have one prefix, thus leading to a smaller routing table. Just look at DTAG or FT, who both have an IPv6 /19 that will probably serve their needs for the next ten years if not more. I do however agree that this information might not really belong on the BGP page ;) Kll (talk) 20:54, 19 July 2009 (UTC)

Terms: path vector versus vector routing protocol

"path vector".. don't you mean "vector routing protocol"? I am not confident enough to edit it myself.

Remembering my courses on BGP, I think that it's been given many names, depending for example on whether people wanted to insult it (e.g. for marketing reasons). It's a kind of vector routing protocol (since it only knows the next hop and a kind of sophisticated cost; it doesn't know the whole link through to the end), but it's really much more than a vector routing protocol, since beyond direction and fixed cost it also knows the "ASPATH", which lists the networks that will be traversed in that direction. This allows political decisions (I would rather my traffic didn't go across ZUZONET since they monitor traffic). That's why it is sometimes called something like a "path vector" protocol; see Google... first several links look good. Mozzerati 05:59, 2004 May 27 (UTC)

"path vector" is the proper term, invented by Yakov Rekhter to distinguish it from "destination vector" protocols such as RIP. Noel 20:57, 14 Sep 2004 (UTC)
I was not sufficiently precise above, I wrote in haste, sorry. "path vector" is a sub-set of "destination vector". Note also that "distance vector" != "destination vector".
All "destination vector" means is that the data that is passed from one router to another is a table (vector) of information about destinations; about basically (modulo policy constraints) all destinations, in fact. (Think of it as the complete routing table.) Contrast this with link-state routing (which I recently re-wrote in a major way to provide a precise, and hopefully readable, description), which is fundamentally radically different.
BGP does carry the complete AS path for each destination (for loop-prevention as well as making it available for policy decisions - initially the former was the more important, but nowadays the latter is), so your statement ("it doesn't know the whole link through to the end") isn't quite correct - yes, the BGP information doesn't specify each individual router in the path, but it does show which AS's are in the path which packets to that destination will take.
(Sigh, I need to redo the destination-vector routing page too, and also distance-vector routing, to make them all equally precise. The history of the terminology is a little confusing. At first there was only "distance" vector, which implied that the only data in each element of the vector, other than the destination identity, was a single metric, the "distance". [That's very ancient terminology, I'd have to do some research to track down its source.] Then Yakov came up with the PV term to emphasize that each vector element included an entire path. So then we started calling the entire class "destination" vector [in part so that the "DV" acronym didn't have to change ;-] to emphasize that its fundamental nature - as opposed to LS - hadn't changed. You still gave your neighbour the entire routing table, and the path selection computation was still a distributed one, as opposed to LS, where the entire path-selection algorithm runs in parallel on each node.)
I hope this makes things a bit clearer (until I get around to re-writing those two articles). Noel (talk) 17:40, 27 Dec 2004 (UTC)
Let me try to clarify the terminology. BGP is path vector, which, like distance vector (e.g., RIP), determines the preferred routes based on incremental calculations, rather than link state (e.g., OSPF), in which each routing instance has full information and calculates the full set of routes at once. Link state is more computationally intensive and, with present technology, cannot scale to the size of the global Internet routing table. Path vector differs from distance vector in a number of ways, chief among which is the loop detection mechanism. If a given AS finds its own AS number in a path vector received from another AS, the AS knows the path is looping and discards it. Note that in the technique of AS path prepending to make certain paths less desirable, the AS is adding its own AS number to the path, which is a different and non-looping condition than receiving one's own ASN from another AS. Hcberkowitz 21:56, 17 May 2007 (UTC)
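
The loop check and the prepending case described above can be sketched in a few lines of Python. This is only an illustration, not how any router implements it: the function names `accept_route` and `advertise` are made up here, and the AS numbers are hypothetical values taken from the range reserved for documentation.

```python
def accept_route(local_asn, as_path):
    """Return False when the received AS_PATH already contains our own ASN."""
    return local_asn not in as_path


def advertise(local_asn, as_path, prepend=0):
    """Build the AS_PATH sent to an external neighbour.

    The local ASN is always added once; `prepend` adds extra copies so the
    path looks longer (and therefore less attractive) to that neighbour.
    """
    return [local_asn] * (1 + prepend) + list(as_path)


# Hypothetical ASNs, for illustration only.
received = [64500, 64501, 64502]
print(accept_route(64510, received))          # our ASN is absent: keep the route
print(accept_route(64501, received))          # our ASN is present: looping path
print(advertise(64510, received, prepend=2))  # our ASN appears three times
```

A neighbour that receives the prepended path still accepts it, because its own ASN is not in the list; only the advertising AS appears more than once.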

All ISPs?

Not all ISPs use BGP - smaller ISPs (eg tier 3) may be part of the upstream provider's AS. Unless someone objects, I'll soon amend the relevant text.

--Thedangerouskitchen 11:06, 3 Apr 2005 (UTC)

Seems like a good idea to me... That being said, the ISPs that don't run BGP tend to be very small with only a few exceptions. --Jwvo

I object, as a downstream ISP frequently runs BGP to its upstream. In such cases, the smaller ISP may use an AS number from the "private" AS number block that is never advertised to the public Internet. The downstream could be a simple external BGP peering or a confederation AS within the larger AS. Hcberkowitz 01:46, 24 May 2007 (UTC)
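
A small sketch of the private AS number idea, assuming the private-use ranges as later documented in RFC 6996 (which postdates this comment). The helper names `is_private_asn` and `strip_private` are hypothetical, and real routers apply more nuanced rules when removing private ASNs from a path.

```python
# Private-use ASN ranges, assumed here as documented in RFC 6996.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private-use range
    (4200000000, 4294967294),  # 32-bit private-use range
)


def is_private_asn(asn):
    """True when the ASN falls in one of the private-use ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def strip_private(as_path):
    """Simplified: drop every private ASN before advertising to the Internet."""
    return [asn for asn in as_path if not is_private_asn(asn)]


# A downstream using private AS 65001 behind a (hypothetical) upstream AS 64496.
print(strip_private([64496, 65001]))
```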

Requested move

In May, 2006, the article was moved to Border gateway protocol. After a "Requested move" discussion, it was moved back to Border Gateway Protocol. See the archive of the discussion at /Capitalization --NealMcB 22:25, 30 May 2006 (UTC)

I summarized the discussion very briefly at: Wikipedia_talk:Naming_conventions_(capitalization)#Capitalizing_standardized_names_for_protocols.2C_etc. and further discussion should probably happen there. --NealMcB 17:15, 31 May 2006 (UTC)

Expert request: Path Selection not part of standard?

I'm wondering if the section on path selection is really about BGP, or about a non-standardized algorithm for using the info that BGP provides. E.g. it says "Prefer the path with the highest weight (Only on Cisco routers)", but that would clearly not be in a standard. I see no use of the term 'path selection' in RFC 4271. Can someone who knows this stuff better address this? --NealMcB 00:38, 1 June 2006 (UTC)

⇒ <SSPecteR> As far as I know, there is no standard for path selection in BGP. The criteria are free for the AS to choose. Someone confirm this. It would be nice to have a section with general preferences to determine the best path. But as it stands (Quote: "BGP uses the following criteria ...") it is wrong. For now, I'm taking this area off. Saturday, 2006-06-10, 06:22 (UTC)
⇒ Path selection is totally up to the ASes, and in fact a certain set of preferences of the ASes in the network may cause route oscillations (that may never converge). It may be worthwhile to mention this. Here's a reference on this: Timothy G. Griffin, F. Bruce Shepherd, and Gordon Wilfong. Policy disputes in path-vector protocols. [Wed. December 20th 2006] —The preceding unsigned comment was added by (talk) 06:54, 20 December 2006 (UTC).
⇒ Totally up to the ASes is not true. Furthermore route oscillations occur in case of misconfigurations of the set of preferences with the complete set of rules of the standard Route Selection. The set of preferences should be seen as the arguments of the function 'Route Selection'. The RFC4271 defines the rules to apply when performing the Route Selection (section Having said that, it is true that vendors implement mechanisms which allow network operators to disable some rules. It should not be used. Each vendor has also specific rule(s) added to the standard ones. Sebastien Tandel —The preceding unsigned comment was added by (talk) 18:26, 19 January 2007 (UTC).
Hopefully, I've covered the basic rules from the standard, although there are lots of vendor-specific knobs to tweak them. On Cisco, for example, you can change attributes such as LOCAL_PREF or MED based on a regular expression pattern match to a community/communities, AS_PATH, or IP prefix range. You can also conditionally inject a route based on the presence or absence of a route in the routing table.
Path vector remains loop free only if you don't use any of the other selection attributes. I'm not saying that you should not use them; realistic routing policy implementation requires doing so. Nevertheless, the first rule of most Internet operational engineering groups is Do Things Only When You Have A Clue About What You Are Doing. A classic North American Network Operators Group curse is "You have no clue. You couldn't get a clue if you stripped naked, smeared yourself with clue musk, and hurled yourself into a field of horny clues during clue mating season." Howard C. Berkowitz 03:11, 10 July 2007 (UTC)

It seems like the path selection process given in this article is very confusing, and from what I've been learning in the Train Signal CBT for CCNP, incomplete. According to Chris Bryant, CCIE #12933, the BGP Path Selection process is as follows:

  1. Highest weight (Cisco Proprietary, as mentioned above)
  2. Highest local preference (LOCAL_PREF)
  3. Originated locally
  4. Shortest AS_PATH
  5. Best origin code (IGP, EGP, incomplete/? - in that order)
  6. Lowest MED (multi-exit discriminator)
  7. Prefer any eBGP over iBGP path
  8. Lowest IGP metric to BGP next-hop
  9. Use most recent path
  10. Lowest BGP RID (the RID comes from the peer IP Address, but the two are distinctly different)
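
To make the ordering in the list above concrete, here is a hedged Python sketch that turns it into a sort key. It follows the list as quoted, not RFC 4271. Step 9 is left out because it needs arrival timestamps, MED is compared unconditionally (a simplification), and origin codes are modelled as the three ORIGIN values IGP, EGP and incomplete. The `Route` class and its field names are invented for this example.

```python
from dataclasses import dataclass
from typing import Tuple

# Lower rank is better: IGP, then EGP, then incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    # Hypothetical container; real implementations carry many more attributes.
    as_path: Tuple[int, ...]
    router_id: int
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0


def preference_key(route):
    """Sort key: the smallest tuple is the preferred route."""
    return (
        -route.weight,                 # 1. highest weight
        -route.local_pref,             # 2. highest LOCAL_PREF
        not route.locally_originated,  # 3. locally originated first
        len(route.as_path),            # 4. shortest AS_PATH
        ORIGIN_RANK[route.origin],     # 5. best origin code
        route.med,                     # 6. lowest MED (always compared here)
        not route.ebgp,                # 7. eBGP over iBGP
        route.igp_metric,              # 8. lowest IGP metric to the next hop
        route.router_id,               # 10. lowest router ID
    )


def best_path(routes):
    """Pick the preferred route from an iterable of candidates."""
    return min(routes, key=preference_key)
```

Because Python compares tuples element by element, a later criterion is only consulted when all earlier ones tie, which is the behaviour the numbered list describes.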

I guess what I'm hoping is that the "Per-Neighbor Decisions" and "Decision Factors at the LOC-Rib Level" sections will be cleaned up a bit because it is very messy currently and difficult to read. Daedalus01 21:30, 10 July 2007 (UTC)

As I understand the purpose of a Wiki article, it is not a how-to as would be in a configuration manual. I've tried to walk the line of not going to excessive detail for an encyclopedia article; NANOG has a number of my BGP tutorials online, and one of my books is largely about service provider policies and BGP implementation.
It would help if you would say where you find the decisions hard to read, rather than just giving Cisco material that isn't precisely correct. I can try to find a balance between the two. Sorry, but the RFC is definitive rather than Cisco certification study guides, and indeed some of the Cisco list you give above is incomplete. For example, the peer IP address and the RID can be different, if you don't explicitly set RID or set the RID as the loopback0 address, and the peer interface happens to be the first one initialized. I was a certified Cisco instructor for many years, and I can only say that there can be a right way, a wrong way, and a Cisco way. That isn't meant to be critical of Cisco, which made certain default assumptions before the standard addressed them, and the IETF didn't always decide on doing things the way Cisco had assumed. I speak here as a participant in the IETF Inter-Domain Routing working group, which develops the BGP standard, as well as a coauthor of RFC 4098 on BGP convergence.
For example, Cisco will default to assuming that a missing MED is to be treated as a zero value, while the eventual IETF decision was to treat it as a maximum value. There is a Cisco configuration "knob" that lets you choose the Cisco default or the IETF default behavior.
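
The effect of the two conventions for a missing MED can be shown with a small Python sketch. The flag name `missing_as_worst` is a hypothetical stand-in for the vendor knob, and the sketch takes no position on which convention a given standard or implementation uses.

```python
MED_MAX = 2**32 - 1  # MULTI_EXIT_DISC is a 32-bit unsigned value


def effective_med(med, missing_as_worst):
    """Map a missing MED (None) to either the best (0) or the worst value."""
    if med is None:
        return MED_MAX if missing_as_worst else 0
    return med


def prefer_by_med(routes, missing_as_worst=False):
    """routes maps a label to its MED (or None); return the preferred label."""
    return min(routes, key=lambda name: effective_med(routes[name], missing_as_worst))


# Hypothetical pair of routes: one carries no MED, the other carries MED 50.
routes = {"via_A": None, "via_B": 50}
print(prefer_by_med(routes, missing_as_worst=False))  # via_A
print(prefer_by_med(routes, missing_as_worst=True))   # via_B
```

The same two routes yield opposite winners depending on the convention, which is why the default matters when vendors and the standard disagree.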
Perhaps even more dramatic was that prior to RFC 4271, while AS path length was the principal selection criterion of every BGP implementation of which I was aware, the IETF standard did not directly consider length of the AS_PATH. Cisco, again, put in a knob to turn off AS_PATH length so the decision algorithm could be RFC 1771 compliant.
In the Cisco courseware I've seen, there isn't huge emphasis on the difference between Adj-RIBs and LOC-Rib, and, indeed, there is much simplification of real-world BGP in any of the general courses. There are invitational or private courses for service providers that go into some of those details, as do many NANOG tutorials by Cisco BGP specialists.
This is not meant to be critical of Cisco's BGP implementation, but courseware, and third-party exam preparation material is not at the same level of detail. Howard C. Berkowitz 22:01, 10 July 2007 (UTC)

OpenBGPD... being objective

I have removed the legend "that is much faster and reliable than Zebra or Quagga" in the OpenBGPD link.
We should make a comparison table about routing software.

AS vs. ASN

AS stands for Autonomous System whereas ASN stands for AS Number. I'll replace incorrect uses of ASN with AS.

Finite State Machine

While the state diagram on this page does much to aid understanding, it contains a fundamental flaw. From the "Idle" state it should be possible to move directly to "Connect" or to "Active". See the image talk page for more. 02:49, 28 April 2007 (UTC)

I've edited the original version to include a transition from "Idle" to "Connect", since this is the most common initial transition. Showing all possible transitions would get messy, and probably do little to aid understanding. Lukeritchie (talk) 23:48, 28 February 2008 (UTC)
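
For readers following this thread, a deliberately simplified sketch of the state machine as a transition table may help. The six state names are those used by RFC 4271, but the event names are hypothetical, many transitions and timers from the RFC are omitted, and treating every unlisted event as a fall back to Idle is an assumption of this sketch rather than a statement about the standard.

```python
# (state, event) -> next state. Event names are invented for illustration.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Idle", "start_passive"): "Active",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "retry_timer_expired"): "Connect",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
}


def step(state, event):
    """Apply one event; anything not listed is treated as an error -> Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state


print(run(["start", "tcp_established", "open_received", "keepalive_received"]))
```

Even in this reduced form, Idle has two exits (to Connect and to Active), which is the point raised at the top of this section.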

The short explanation of the FSM does not correspond to what is described later in the bulleted list. Also, that list has inconsistencies, such as first saying that every error puts BGP into IDLE, but then saying that errors in CONNECT change the state to ACTIVE. I don't know enough about the protocol to correct this. —Preceding unsigned comment added by (talk) 23:18, 19 November 2009 (UTC)

MD5 password failures should be attributed to a failure in the connect state. MD5 hashing is a TCP function. — Preceding unsigned comment added by (talk) 02:43, 10 March 2015 (UTC)

Large IBGP Peer Meshes

I've deleted this sentence : "To avoid these problems, it is commonly recommended not to exceed three IBGP peers in a mesh." It is totally untrue and ridiculous. It is because of pitfalls like this one that we fall into some hard-to-resolve problems with BGP today! Sebastien Tandel

It's really not clear what is meant by a mesh. While it's implementation and AS specific, it is true that you do not want to have a huge number of directly meshed iBGP speakers in the same mesh. That won't scale, for a variety of reasons. What is done, in the real world, is to get BGP scalability by various combinations of route reflectors and confederations. You can have hierarchies of route reflectors, if you know exactly what you are doing: see RFC 3345 and the measures needed to prevent persistent oscillation. Hcberkowitz 01:50, 24 May 2007 (UTC)
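
The scaling concern can be put in numbers with a short sketch. It only counts sessions and assumes a single-level route reflector design in which every client peers with every reflector and the reflectors are fully meshed among themselves; both function names are made up for this example.

```python
def full_mesh_sessions(routers):
    """Total iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients, reflectors=1):
    """Sessions when each client peers only with each reflector.

    Assumes the reflectors themselves remain fully meshed.
    """
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (3, 20, 200):
    print(n, full_mesh_sessions(n))

# 200 routers split into 2 reflectors and 198 clients.
print(route_reflector_sessions(198, reflectors=2))
```

In a full mesh each router also holds one session per peer, so a 200-router mesh means 199 sessions on every box, not just a large total.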

I agree with you on the term *huge*, but people have to understand it. Huge is not equal to three. I know that some (really) good Cisco routers can easily handle 200 BGP sessions, so I do not know many ISPs that should need route reflectors. Of course, a Tier-3 won't own the best routers, but it will be far from having 200 BGP routers as well! I think route reflectors should not be introduced into the network by *design*, but *only* when it is known that there will be a performance problem on the router. I fear that many ISPs are unfortunately introducing them for bad reasons. The measures in the paper you mention for avoiding persistent oscillation work fine as long as there is no failure in the network. You can't guarantee anything when one link (or a router) goes down! I won't even mention two links going down ... Don't tell me it does not happen! ;) For me, it is not a robust solution. Furthermore, persistent oscillation is not the only problem that occurs with route reflectors: in some cases, losing a route with route reflectors leads to *lost traffic* during convergence, which would not happen in a full mesh. No ISP I know wants that. Most of the time, whether at a Tier-1 or at a router vendor, people have not mastered route reflectors (or confederations). They are a real headache, and there are probably still a lot of (bad) things to find out about them! :) Regards, Sebastien Tandel —Preceding unsigned comment added by (talk) 13:19, 14 September 2007 (UTC)

Routing Table Size, Router Resources, and Multihoming

While I have substantial experience here, I'm still not getting the formatting for references in Wikipedia. Some references I'd like to cite are my own works, the bulk of which are available freely online, plus one book. Most of the works are presentations or documents from NANOG and the IETF. Here are some examples for which I need to understand how to format references. Don't worry; there will be plenty of references I didn't write.

  • RFC 4098, H. Berkowitz et al., "Terminology for Benchmarking BGP Device Convergence in the Control Plane"
  • H. Berkowitz, "Exterior Routing 201: the Full Picture". North American Network Operators Group, February 2001,
  • H. Berkowitz, Building Service Provider Networks, New York: John Wiley & Sons, 2002.

Hcberkowitz 18:25, 2 June 2007 (UTC)

BGP problems and mitigation section


The recently revised "BGP problems and mitigation" section classes all the problems as scalability issues. This doesn't seem to be entirely true; I'm thinking in particular of instability. Instability obviously affects scalability—the more updates per router, the more total load on routers. But it also causes problems like packet loss and forwarding loops [1]. These problems could happen even in a small network, so they don't fall under scalability. --Nethgirb 06:31, 9 July 2007 (UTC)

We may need to agree on a definition of scalability, and I'm trying to keep some real-world examples at reasonable length. As long as an AS is connected to the global Internet, no matter how small it is, it can affect the scalability of the entire Internet. You are correct that these errors could take place in a small network, but remember we are talking about BGP, for which small networks rarely have a need. If we were talking about IGP's, I'd agree with you that it is primarily an instability issue that could stay isolated. With BGP, it's likely that a small network instability, in an ISP connected to the Internet, or a military base connected to the secure SIPRNET, could cause widespread problems affecting scalability in many networks.
One horrible example was a mid-sized ISP that was multihomed to two upstreams, one being Sprint. Quite appropriately, Sprint sent the ISP both its aggregates and its customer routes. Since many other AS are multihomed to Sprint and some other AS, a downstream can take full routes including multihomed customer routes and pick the best upstream to reach a specific destination.
In this case, the ISP, through misconfiguration or lack of knowledge, imported all Sprint routes into its interior routing protocol, which, surprisingly, didn't crash. It then exported its entire routing table, including the Sprint routes, to the other ISP, which had not implemented some safeguards. To all the customers of the second ISP, the mid-sized ISP, with a T1 connection, appeared one AS closer to many Sprint customer AS than Sprint itself. A significant part of the Internet then rehomed to the small AS, which choked on the bandwidth. Howard C. Berkowitz 16:49, 9 July 2007 (UTC)
Ah, I'm sorry, by "small network" I meant a hypothetical Internet with only a few ASs (rather than a small AS). Let me rephrase... I'm saying that instability causes multiple problems. One of them is driving up load on routers, hence leading to scalability issues. But a second problem, in my example above, would happen even if there were only a few ASs on the Internet and all routers had CPU to spare. So that second problem can't be called a scalability issue, at least not in the sense that I'm used to using the term (having to do with performance as a function of the size of the system in question).
BTW I do like your real-world examples; I think those help a lot --Nethgirb 04:43, 10 July 2007 (UTC)
Also—in addition to the fact that stability causes problems unrelated to scalability—there are scalability problems that are unrelated to stability (e.g., number of prefixes pushing the memory bounds). So while there are connections between stability and scalability, I think those two ideas should be presented separately. --Nethgirb 04:59, 10 July 2007 (UTC)

Black holes

Regarding black holes, does someone have a reference describing how they're used in practice? I don't follow the current example. It says

An AS that has been allocated the address space might only advertise that range, for which it has not assigned all subnets. Assume, for example, that only the first half has been assigned, or This AS will still receive traffic for, but will discard it without withdrawing the unassigned routes and thus forcing neighbors to recompute.

I think the scenario is not fully laid out here. Why wouldn't the AS just advertise to begin with? And is this really a notable issue to discuss? What problem with BGP do black holes solve—or, rather than being a solution, are they merely a side effect of aggregation? --Nethgirb 06:31, 9 July 2007 (UTC)

I'm sure there are examples in NANOG presentations, but I have to see which ones. It's a little embarrassing to say that I've probably presented it, but I have to remember which of my tutorials has the best example. Let me answer briefly here, and try to find some online examples. With due regard for personal interest, there are examples in my book, Building Service Provider Networks.
An AS that has been allocated the address space might only advertise that range, for which it has not assigned all subnets. Assume, for example, that only the first half has been assigned, or This AS will still receive traffic for, but will discard it without withdrawing the unassigned routes and thus forcing neighbors to recompute.
I think the scenario is not fully laid out here. Why wouldn't the AS just advertise to begin with?
Perhaps I oversimplified with a contiguous allocation. In my direct experience, lots of AS do not aggregate at all; I've seen new ISPs given a /19 that promptly advertised the /19 aggregate and every /24 that could be created from it, even though many of the /24s are not used, and the actual assignments include /28's, /25's, /26's. etc. You are correct that we want as much aggregation as possible.
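
The /19 case can be checked with Python's standard `ipaddress` module. The prefix `10.0.0.0/19` is a hypothetical stand-in chosen for illustration, not the block of any real ISP.

```python
import ipaddress

# Hypothetical /19 aggregate used only for this example.
aggregate = ipaddress.ip_network("10.0.0.0/19")

# Every /24 that can be carved out of the aggregate.
more_specifics = list(aggregate.subnets(new_prefix=24))
print(len(more_specifics))  # 32

# What the over-eager ISP announces: the aggregate plus every /24.
announced = [aggregate] + more_specifics
print(len(announced))  # 33

# What those announcements reduce to when nothing is multihomed.
collapsed = list(ipaddress.collapse_addresses(announced))
print(collapsed)
```

So 33 routes enter the global table where one would carry the same reachability, as long as none of the /24s needs to be visible separately.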
By "as possible", consider a multihoming scenario: ISP AS1 has, and ISP AS2 has AS3 is a multihomed customer of both, using the subassigned block, which it advertises to AS1 and AS2.
For multihoming to work, AS2 has to advertise, as well as its own aggregate, from AS1. Things then get subtle, because if AS1 continued to advertise only its own aggregate, the rest of the Internet would see the more-specific from AS2, and no traffic to that would come through AS1. The only way that the multihoming will work is that AS1 can't just advertise its aggregate, but must also advertise all of its multihomed more-specifics. So, AS1 advertises and, while AS2 advertises, while AS2 must advertise and
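
Why the more-specific must be advertised by both providers follows from longest-prefix matching, sketched below. The prefixes and the AS labels are hypothetical, and the `candidates` helper only finds which advertisers tie on the longest match; choosing among them is left to the rest of the BGP decision process.

```python
import ipaddress


def candidates(table, destination):
    """Return the advertisers of the longest prefix covering `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, via) for net, via in table if addr in net]
    if not matches:
        return set()
    longest = max(net.prefixlen for net, _ in matches)
    return {via for net, via in matches if net.prefixlen == longest}


# Hypothetical addressing: AS1 holds the aggregate, AS3 uses a /24 out of it.
AS1_AGGREGATE = ipaddress.ip_network("10.1.0.0/16")
AS3_BLOCK = ipaddress.ip_network("10.1.4.0/24")

# AS1 advertises only its aggregate; AS2 advertises the customer's /24.
only_as2 = [(AS1_AGGREGATE, "AS1"), (AS3_BLOCK, "AS2")]
print(candidates(only_as2, "10.1.4.9"))  # all traffic for AS3 arrives via AS2

# AS1 also advertises the more-specific, so both paths are usable.
both = only_as2 + [(AS3_BLOCK, "AS1")]
print(candidates(both, "10.1.4.9"))
```

Addresses elsewhere in the aggregate still match only the /16, so the rest of AS1's customers are unaffected by the extra announcement.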
Returning to my earlier example, assume that the AS actually sub-assigned, and, and that none of these are multihomed through another AS. Let's say the primary AS's link to goes down.
Is there any point to telling the rest of the Internet that this /18 has gone down, forcing them to recompute their routing tables? Alternatively, if they add, is there any point to announcing that and forcing recomputation? By advertising their entire aggregate, they don't pollute the Internet with ICMP Destination Unreachables if someone sends to an undefined address within their aggregate.
So, blackholing addresses two problems: route recomputation and ICMP floods. Howard C. Berkowitz 16:49, 9 July 2007 (UTC)
And is this really a notable issue to discuss? What problem with BGP do black holes solve—or, rather than being a solution, are they merely a side effect of aggregation?
They are best common practice to minimize the number of routes in the global routing table, and also reduce instability and routing table recomputation when an ISP adds additional single-homed customers to its address space. Did the above make the reason clear? Howard C. Berkowitz 16:49, 9 July 2007 (UTC)
Thanks. Your explanation is clear and it reinforces my opinion that black holes should be presented with an aggregation focus rather than listing them as a separate problem or solution. The goal is to aggregate as much as possible: there are interesting subtleties about when you can and can't do this, including your multihomed example above (which doesn't involve black holes). And I think black holes are another example of where the "aggregate as much as possible" goal naturally leads. --Nethgirb 04:53, 10 July 2007 (UTC)
You're quite welcome. Your comment makes me wonder whether we need articles on several subjects that deal more with the use of BGP in Internet operational engineering than the protocol itself. One of the challenges, which I think we both see, is deciding on the extent of subtleties that belong in Wikipedia. There is a multihoming article. There is an article on Aggregator that is specific to the Web, not IP, apparently making the all-too-common assumption that the WWW is the Internet.
I may or may not be the person to write a short description of multihoming, as I'm more used to writing descriptions at book level or multi-hour tutorials, which still focus on the BGP (or other) behavior rather than the how-to.
Internet growth and stability are issues as well. I was involved in an Internet Research Task Force effort on "Future Domain Routing", which considered when the model of Internet global routing with BGP will be exhausted and new interdomain protocol needed. BGP multihoming by stub AS, variously for high availability and traffic engineering, is getting more common than Internet architects expected; I'll have to see if I can find any of the presentations from an Internet Society panel in which I participated, dealing with the limits of scalability of current techniques.
Coming back to black holes, there are other stability applications. For example, if an ISP detects a denial-of-service attack against a host or address range, the security software can generate a blackhole route for the target. Blackholing the target, of course, denies service to it, but it stops "collateral damage" to other parts of the ISP and the Internet. Either before or after the DoS blackholing, a similar iBGP route to the host or subnet can be sent to a security analysis sinkhole. Intuitively, I suspect these applications of blackholes (and sinkholes, and even honeypots) are beyond the scope of the BGP article. The current DoS article, however, does not mention blackhole and sinkhole routes, which have many subtleties -- these routes stay inside your AS. That may involve routing policy, or the well-known community NO_EXPORT.
Now that I've either braindumped or given a primal scream, is there a place to discuss the meta-issues about articles? Does the computer networking project do such things? Howard C. Berkowitz 15:06, 11 July 2007 (UTC)
I agree on both fronts: a section on black holes would fit well in Denial-of-service attack#Prevention and response, and Wikipedia:WikiProject Computer networking is a good place to discuss organization of material among the articles. --Nethgirb 19:38, 11 July 2007 (UTC)
I'm wondering if there logically would be two blackhole articles. The first deals with aggregation and the stability issue of avoiding ICMP floods. The second would be the DoS defense method of dynamically injecting a blackhole host (or subnet) route into your AS's iBGP. Sinkhole routes (which may be anycast) also fit with the second group. Honeypots are related, although I'll have to think about how much. Howard C. Berkowitz 20:35, 11 July 2007 (UTC)
I don't see how avoiding ICMP messages is a stability issue—seems like efficiency to me. But I agree, it would make sense to discuss black holes both here (for aggregation) and at the DoS article (for blocking traffic). --Nethgirb 21:17, 11 July 2007 (UTC)
ICMP flooding could be considered under denial of service just as well as stability. I tend to prefer the latter, because it can be caused by simple ignorance or error rather than deliberate attack. Consider a situation where a bad address (i.e., one that could be blackholed) gets into wide use, and attempts to reach it start clogging bandwidth. I'm thinking of examples where some home router company hard-coded a NTP server address into the firmware and variously got it right (and swamped the NTP server, which was meant for upper strata only), or got it wrong (and generated lots of ICMP destination unreachable, but kept retrying).
There are other aspects of instability and convergence that may or may not fit here. Indeed, whether they are instability, rather than slow convergence or failure to converge, is debatable. One of the first presentations was "Experimental Measurement of Delayed Convergence"; there have been later ones, but I try to make sure to point first to some of the ones Abha Ahuja did before her untimely death. You also might find some of the work at interesting, especially the "skitter core" measurements that define the effective "core" of the modern Internet, which, as we know, has no "core" in the sense of specific backbone. Howard C. Berkowitz 23:18, 11 July 2007 (UTC)
I think we must be using the word "stability" in very different ways (which would explain some of our previous miscommunications). I use the word to refer to the rate of change of routes between source-destination pairs. This seems to be consistent with usage in the RFD RFC [2]. Since ICMP flooding does not involve changes in routes or affect the rate of change (except indirectly), I can't see how it can fall under the heading of "stability". Do you use the word in a different way? --Nethgirb 23:57, 11 July 2007 (UTC)
Regarding the appropriate level of detail and whether to split off articles about BGP operational experience: personally, having lots of details does not worry me too much, as long as they are important details. What is critical, however, is that those details are built on top of a coherent high-level description of what's going on. (E.g., this is why I am worried about whether blackholing is about stability or about aggregation. Too many Wikipedia articles seem like a collection of unsorted factoids.) And actually, fitting everything into a coherent framework might be slightly easier if all the material is present in a single article. At some point an article just gets too big and has to be partitioned, but I don't think we're at that point yet.
For multihoming specifically, I could see that being separate from this article, since there are issues with multihoming that have nothing to do with the BGP protocol.
Please do point me to your Future Domain Routing routing work, as I have an interest in that topic. Thanks, --Nethgirb 19:38, 11 July 2007 (UTC)
There wound up being two teams and two reports, which was fine as we were looking at requirements rather than being anywhere close to a specific protocol. I was in Team B, and there were cross-comments back and forth between the teams. I think this is the final draft; for some reason I've never understood, IRTF and sometimes IAB workshops/special groups never publish a final RFC: Dmitri Krioukov, at the least, is considering next-generation protocols. Howard C. Berkowitz 20:35, 11 July 2007 (UTC)
Thanks for the link --Nethgirb 21:17, 11 July 2007 (UTC)

I'm by no means an internet nor wikipedia expert, but I noticed there's no coverage of BGP hijacking on Wikipedia. However, there's growing evidence this is being exploited in the wild [3] and [4]. — Preceding unsigned comment added by Erwin glassee (talkcontribs) 22:02, 17 January 2014 (UTC)

The Meaning of Stability[edit]

I'm copying your comment from blackholes, because to count enough colons to indent properly, this is nearing the point at which I would have to take off my shoes. :-)

I think we must be using the word "stability" in very different ways (which would explain some of our previous miscommunications). I use the word to refer to the rate of change of routes between source-destination pairs. This seems to be consistent with usage in the RFD RFC [5]. Since ICMP flooding does not involve changes in routes or affect the rate of change (except indirectly), I can't see how it can fall under the heading of "stability". Do you use the word in a different way?

We are using it differently, which is a little ironic given that I was the one that argued, for RFC 4098, that we needed to look at convergence in the control plane alone rather than declaring things had converged when the router/AS could forward a packet to the new destination.
I am using it here in the combined sense of the control and forwarding planes. Flapping, bad route announcements, inconsistent AS (which really isn't an error, just inelegant), inappropriate deaggregation, slow withdrawals, etc., are stability problems in BGP (i.e., the control plane). My usage of "stability" is more from a NANOG than an IETF viewpoint: an AS is unstable if it becomes erratic in its ability to deliver traffic, with due respect to the principle that datagram delivery is unreliable.
In the way I have been using it, if all the routing tables for all the routers in an ISP AS were completely correct, but the upstream connectivity was totally choked due to a DoS attack (e.g., smurf, ICMP flood, Slammer, random flood to TCP port 179 (BGP), etc.), the AS would be "unstable" in the sense that it could not forward.
Given that this is a BGP article, your usage indeed may be more appropriate. Again, this comes back to the relationship among articles. There's an ill-defined area of "operations" that I addressed at book length, which spans the gamut from good BGP practice, the many forms of multihoming, to being sure that you have electrical power. Apropos of the latter, continuing to operate in disasters is its own topic, but a very important one considering the traditional telephone company obsession with "once up, always up". Occasionally, I collaborate with a network assessment of a provider's layer 2/3 network before they put in a carrier-grade VoIP Class 5/4 switch and SIP gear. Very often, their in-place IP network simply isn't at that level of reliability and it needs to be reengineered before they go to production VoIP. Howard C. Berkowitz 00:34, 12 July 2007 (UTC)
Ah, you are using stability to mean reliability. Or, in the context of the American Heritage Dictionary [6], I was using definition 1a or 1b and you were using 1c. (For the purposes of the article, I think it would be cleanest to reserve stability for 1a and 1b, and reliability for 1c.) --Nethgirb 07:58, 12 July 2007 (UTC)


Wired says that a revision of BGP is in the works. This should probably be mentioned, and most likely get its own page, eventually. But should it get its own page now? superlusertc 2008 August 29, 03:49 (UTC)

Communities ARE transitive[edit]

Communities are transitive, per RFC 1997:

   This document creates the COMMUNITIES path attribute is an optional
   transitive attribute of variable length...

I don't know whether it is true that communities are rarely propagated beyond the peering AS, so I left that sentence.

Bergonz (talk) 08:51, 11 September 2008 (UTC)
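
For anyone who wants to check this against the wire format, here is a minimal sketch of decoding the attribute flags octet. The bit values are the ones defined in RFC 4271, and type code 8 and the NO_EXPORT value come from RFC 1997; the function and constant names are made up for illustration and are not taken from any particular implementation.

```python
# Illustrative names only; bit positions follow the BGP path attribute
# flags octet defined in RFC 4271.
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10

COMMUNITIES_TYPE = 8      # attribute type code from RFC 1997
NO_EXPORT = 0xFFFFFF01    # well-known community from RFC 1997


def decode_flags(flags: int) -> dict:
    """Split the flags octet of a path attribute into named booleans."""
    return {
        "optional": bool(flags & OPTIONAL),
        "transitive": bool(flags & TRANSITIVE),
        "partial": bool(flags & PARTIAL),
        "extended_length": bool(flags & EXTENDED_LENGTH),
    }


def format_community(value: int) -> str:
    """Render a 32-bit community in the usual high:low notation."""
    return f"{value >> 16}:{value & 0xFFFF}"


# A COMMUNITIES attribute is sent with both the optional and transitive bits set.
communities_flags = decode_flags(OPTIONAL | TRANSITIVE)
```

Being transitive only means a router that does not recognise the attribute should pass it along; whether operators strip communities at their borders is a separate policy question, which is the part I could not verify.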

BGP detailed state diagram[edit]

Guys, I'm doing research using BGP. I created this state diagram using only the specification in RFC 4271. You are welcome to use the diagram in Wikipedia if you like. I uploaded it to the following location: File:K.Nevelsteen-BGP-4x.png (it is big, so you might have to scale it a lot; I'm sure you all are more up to date on wikification). The diagram omits the following:

  • TCP resources init and release, connection attempt, listening
  • BGP resources init and release
  • oscillation damping
  • Connection Collision
  • If the value of the autonomous system field is the same as the local Autonomous System number, set the connection status to an internal connection; otherwise it will be external.
  • One reason for an AutomaticStop event is: A BGP receives an UPDATE messages with a number of prefixes for a given peer such that the total prefixes received exceeds the maximum number of prefixes configured. The local system automatically disconnects the peer.
  • Each time the local system sends a KEEPALIVE or UPDATE message, it restarts its KeepaliveTimer, unless the negotiated HoldTime value is zero.

Regards, --K.Nevelsteen (talk) 14:10, 20 January 2009 (UTC)
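
Two of the omitted items are simple enough to sketch in code. The following is a rough illustration, not a complete implementation: it assumes the RFC 4271 rules that the negotiated hold time is the smaller of the two proposed values, that a non-zero hold time must be at least three seconds, and that one third of the hold time is a reasonable keepalive interval. All function names are hypothetical.

```python
from typing import Optional


def negotiate_hold_time(local_hold: int, peer_hold: int) -> int:
    """Use the smaller of the two proposed hold times (in seconds)."""
    hold = min(local_hold, peer_hold)
    if hold != 0 and hold < 3:
        raise ValueError("hold time must be zero or at least 3 seconds")
    return hold


def keepalive_interval(hold_time: int) -> Optional[int]:
    """Return a keepalive interval, or None when keepalives are disabled."""
    if hold_time == 0:
        return None          # a zero hold time means no KeepaliveTimer at all
    return hold_time // 3    # suggested ratio, not a protocol requirement


def session_type(local_as: int, peer_as: int) -> str:
    """Classify the connection from the AS number in the peer's OPEN."""
    return "internal" if local_as == peer_as else "external"
```

For example, a local setting of 180 seconds against a peer proposing 90 gives a hold time of 90 and a keepalive interval of 30.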

It uses 20 bytes per header[edit]

I removed the text

   It uses 20 bytes per header.

because it was in a strange place and it wasn't at all clear what it was talking about. Someone with the knowledge should explain this and put it where it belongs. — Preceding unsigned comment added by Gnebulon (talkcontribs) 16:10, 7 February 2011 (UTC)

Explanation of how Internet routing works[edit]

So there's currently no place on Wikipedia where you can find an explanation of how routing occurs on the modern Internet. I'm not sure it belongs here, but is it more appropriate on Routing or Router (computing)? Please weigh in.

The explanation I'm referring to is the way a router has a list of IP address blocks (like which are accessible via each interface, and sends the packet out the interface matching its destination IP. Also, please correct me if I got that wrong.

--Qwerty0 (talk) 00:54, 21 December 2011 (UTC)

I found some discussion in Routing table and Packet forwarding. Some additional cross linking and summarization would be helpful. --Kvng (talk) 15:36, 22 December 2011 (UTC)
Hmm, I hadn't seen Packet forwarding before. It's interesting, but still doesn't have the specificity I'm looking for.
Think of the example where someone comes to Wikipedia wondering how a random router knows what direction to send a packet addressed to It can't have a listing for every IP address in the internet, so how does it decide what direction to send it?
That's the question I'd like to answer in this article (or maybe in Routing), using the explanation I referred to in my first post. Any objections? Corrections? Suggestions?
--Qwerty0 (talk) 09:37, 24 December 2011 (UTC)
Perhaps you are referring to Route-aggregation? Where traffic is sent using summary routes. If so, take a look at Supernetwork. And (as you asked for it): I believe routing table entries are (have always been) networks, not individual IP numbers. - Snaily (talk) 05:19, 1 February 2012 (UTC)
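
To make the question concrete, here is a small sketch of the lookup being discussed: the table holds address blocks rather than individual addresses, and the most specific matching block wins (longest-prefix match). The prefixes are documentation ranges and the interface names are invented for the example; a real router uses a far more efficient data structure than a linear scan.

```python
import ipaddress


def lookup(table, destination):
    """Return the interface of the most specific prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, interface in table:
        net = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix length seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


# Hypothetical routing table: a default route plus two more specific blocks.
table = [
    ("0.0.0.0/0", "eth0"),
    ("192.0.2.0/24", "eth1"),
    ("192.0.2.128/25", "eth2"),
]
```

With this table, a packet for 192.0.2.200 leaves via eth2, one for 192.0.2.5 via eth1, and anything else falls through to the default route on eth0.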

Cisco-specific lines[edit]

I removed these lines from the beginning of the article:

In the Cisco operating system, IBGP routes have an administrative distance of 200 and that of EBGP is 20; IBGP is thus less preferred than either external BGP or any interior routing protocol. Other router implementations also prefer EBGP to IGPs, and IGPs to IBGP.

I felt like they were much too specific for the introductory paragraph, and didn't fit with the general description of iBGP/eBGP that surrounded them. I tried to find another place for them to go (maybe "Per-neighbor decisions"?) but couldn't decide on where/whether they should go back in. Feedback welcome.

--Phinze (talk) 09:05, 28 December 2012 (UTC)
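
If the lines do go back in somewhere, a short illustration might help readers see what the numbers mean. This is a sketch under assumptions: the values 20 and 200 are from the removed text, while the other entries are the commonly documented Cisco defaults that I have added for context; the function name is invented.

```python
# Lower administrative distance means a more trusted route source.
# Only "ebgp" and "ibgp" come from the removed text; the rest are
# commonly cited Cisco defaults included here as an assumption.
ADMIN_DISTANCE = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "ospf": 110,
    "rip": 120,
    "ibgp": 200,
}


def preferred_source(sources):
    """Pick the route source with the lowest administrative distance."""
    return min(sources, key=ADMIN_DISTANCE.__getitem__)
```

So when the same prefix is learned from OSPF and iBGP, the OSPF route is installed, and an eBGP route beats both.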

Exploits[edit] — Preceding unsigned comment added by (talk) 18:50, 24 November 2013 (UTC)

BGP confederation merge proposal[edit]

I proposed merging BGP confederation into this article. Disavian removed the merge banner from BGP confederation with the edit comment reproduced below. I have restored the merge banner in hopes of having a wider discussion about this proposal. ~KvnG 19:00, 27 February 2014 (UTC)

The BGP article does not clearly define this term, which is linked from other articles - rv merge tag

For symmetry, this merge would also need to involve Route reflector. Instead of merging, I've decided to improve BGP confederation. Once this article is improved a bit, it may be easier to see how or whether to do these merges. ~KvnG 15:41, 16 August 2014 (UTC)

MP-BGP redirection[edit]

MP-BGP redirects here (to BGP). Should redirect to Multiprotocol_BGP. Kjrrp (talk) 15:05, 9 June 2014 (UTC)

Done ~KvnG 13:20, 7 June 2014 (UTC)


Feel free to call at 704 in this wiki for more questions, Thank You! — Preceding unsigned comment added by (talk) 00:06, 6 August 2014 (UTC)

Merged in 512k day[edit]

Based on discussion at Wikipedia:Articles for deletion/512k day, I merged 512k day here, into the Routing table growth section. It might be too much content for the section, so feel free to trim as you see fit. Oiyarbepsy (talk) 15:43, 2 January 2015 (UTC)

State machine transitions[edit]

The text is incorrect in some places, and contradictory in others. Example:

  • Idle State:
    • Refuse all incoming BGP connections


    • Listens for a TCP connection from its peer.

The RFC[1] is a bit chunky, so it looks like whoever wrote this has combined the initial Idle State data with what to do after receiving a Manual/Automatic Start message.

I appreciate the irony of submitting criticism without proposing an alternative :) I do however think it's important to have a section about this here in the talk page, as this part could ultimately be better written.

Samrussellnz (talk) 23:09, 7 November 2015 (UTC)

Potential new version[edit]

In order to make decisions in its operations with peers, a BGP peer uses a simple finite state machine (FSM) that consists of six states: Idle, Connect, Active, OpenSent, OpenConfirm, and Established. For each peer-to-peer session, a BGP implementation maintains a state variable that tracks which of these six states the session is in. BGP defines the messages that each peer should exchange in order to change the session from one state to another.

Idle state[edit]

  • No resources are allocated to the peer
  • All incoming BGP connections are refused

The finite state machine (FSM) will transition to a new state if it receives one of the following events:

  • AutomaticStart/ManualStart:
    • initializes all BGP resources for the peer connection
    • sets the ConnectRetryCounter to zero
    • starts the ConnectRetryTimer with the initial value
    • initiates a TCP connection to the other BGP peer
    • listens for a connection that may be initiated by the remote BGP peer
    • transitions to the Connect state

  • AutomaticStart/ManualStart (with PassiveTcpEstablishment):
    • initializes all BGP resources for the peer connection
    • sets the ConnectRetryCounter to zero
    • starts the ConnectRetryTimer with the initial value
    • listens for a connection that may be initiated by the remote BGP peer
    • transitions to the Active state
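
The two Idle-state cases above can be sketched as a single handler. This is only an illustration of the proposed wording, with invented class, field, and action names; the actions are recorded as strings rather than performed, and everything outside the Idle state is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical per-peer session record."""
    state: str = "Idle"
    connect_retry_counter: int = 0
    actions: list = field(default_factory=list)


def handle_start(session: Session, passive: bool = False) -> str:
    """Apply an AutomaticStart/ManualStart event and return the new state."""
    if session.state != "Idle":
        return session.state          # start events are ignored outside Idle
    session.actions.append("init_resources")
    session.connect_retry_counter = 0
    session.actions.append("start_connect_retry_timer")
    if not passive:
        # Without PassiveTcpEstablishment, the router also dials out.
        session.actions.append("initiate_tcp")
    session.actions.append("listen")
    session.state = "Active" if passive else "Connect"
    return session.state
```

The only difference between the two cases is whether a TCP connection is initiated, which in turn decides whether the session lands in Connect or Active.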

In the "Idle" state, BGP initializes all resources, refuses all inbound BGP connection attempts and initiates a TCP connection to the peer. The second state is "Connect". In the "Connect" state, the router waits for the TCP connection to complete and transitions to the "OpenSent" state if successful. If unsuccessful, it starts the ConnectRetry timer and transitions to the "Active" state upon expiration. In the "Active" state, the router resets the ConnectRetry timer to zero and returns to the "Connect" state. In the "OpenSent" state, the router sends an Open message and waits for one in return in order to transition to the "OpenConfirm" state. Keepalive messages are exchanged and, upon successful receipt, the router is placed into the "Established" state. In the "Established" state, the router can send and receive Keepalive, Update, and Notification messages to and from its peer.

  • Idle State:
    • Refuse all incoming BGP connections
    • Start the initialization of event triggers.
    • Initiates a TCP connection with its configured BGP peer.
    • Listens for a TCP connection from its peer.
    • Changes its state to Connect.
    • If an error occurs at any state of the FSM process, the BGP session is terminated immediately and returned to the Idle state. Some of the reasons why a router does not progress from the Idle state are:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • Peer address configured incorrectly on either router.
      • AS number configured incorrectly on either router.
  • Connect State:
    • Waits for successful TCP negotiation with peer.
    • BGP does not spend much time in this state if the TCP session has been successfully established.
    • Sends Open message to peer and changes state to OpenSent.
    • If an error occurs, BGP moves to the Active state. Some reasons for the error are:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • Peer address configured incorrectly on either router.
      • AS number configured incorrectly on either router.
  • Active State:
    • If the router was unable to establish a successful TCP session, then it ends up in the Active state.
    • BGP FSM tries to restart another TCP session with the peer and, if successful, then it sends an Open message to the peer.
    • If it is unsuccessful again, the FSM is reset to the Idle state.
    • Repeated failures may result in a router cycling between the Idle and Active states. Some of the reasons for this include:
      • TCP port 179 is not open.
      • A random TCP port over 1023 is not open.
      • BGP configuration error.
      • Network congestion.
      • Flapping network interface.
  • OpenSent State:
    • BGP FSM listens for an Open message from its peer.
    • Once the message has been received, the router checks the validity of the Open message.
    • If there is an error it is because one of the fields in the Open message does not match between the peers, e.g., BGP version mismatch, the peering router expects a different My AS, etc. The router then sends a Notification message to the peer indicating why the error occurred.
    • If there is no error, a Keepalive message is sent, various timers are set and the state is changed to OpenConfirm.
  • OpenConfirm State:
    • The peer is listening for a Keepalive message from its peer.
    • If a Keepalive message is received and no timer has expired before reception of the Keepalive, BGP transitions to the Established state.
    • If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.
  • Established State:
    • In this state, the peers send Update messages to exchange information about each route being advertised to the BGP peer.
    • If there is any error in the Update message then a Notification message is sent to the peer, and BGP transitions back to the Idle state.
    • If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.
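
As a companion to the OpenSent bullets, here is a rough sketch of the Open message check. It assumes the fixed-size layout from RFC 4271 (a 19-byte header followed by version, My AS, Hold Time, BGP Identifier, and optional parameter length) and the OPEN Message Error subcodes from the same RFC; optional parameters and header validation are left out, and the function names are invented.

```python
import struct

MARKER = b"\xff" * 16     # header marker, all ones
OPEN = 1                  # message type code for OPEN
OPEN_ERROR = 2            # NOTIFICATION error code "OPEN Message Error"


def build_open(my_as, hold_time, bgp_id, version=4):
    """Build a minimal OPEN message with no optional parameters."""
    body = struct.pack("!BHH4sB", version, my_as, hold_time, bgp_id, 0)
    header = MARKER + struct.pack("!HB", 19 + len(body), OPEN)
    return header + body


def check_open(message, expected_as):
    """Return None if acceptable, else (error code, subcode) for a Notification."""
    version, my_as, hold_time, _bgp_id, _opt_len = struct.unpack(
        "!BHH4sB", message[19:29]
    )
    if version != 4:
        return (OPEN_ERROR, 1)    # Unsupported Version Number
    if my_as != expected_as:
        return (OPEN_ERROR, 2)    # Bad Peer AS
    if hold_time in (1, 2):
        return (OPEN_ERROR, 6)    # Unacceptable Hold Time
    return None                   # no error: send Keepalive, go to OpenConfirm


# Example peer: AS 65001, hold time 90 s, identifier 192.0.2.1 (documentation range).
good = build_open(65001, 90, bytes([192, 0, 2, 1]))
```

A non-None result corresponds to the bullet where a Notification is sent; None corresponds to the bullet where a Keepalive is sent and the state changes to OpenConfirm.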


Seeking historical information including:

  • When was it first written?
  • Who first used it?
  • What did it replace or supersede?

Thanks. -- GreenC 14:42, 1 December 2015 (UTC)

External links modified[edit]

Hello fellow Wikipedians,

I have just modified 2 external links on Border Gateway Protocol. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

You may set the |checked=, on this template, to true or failed to let other editors know you reviewed the change. If you find any errors, please use the tools below to fix them or call an editor by setting |needhelp= to your help request.

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

If you are unable to use these tools, you may set |needhelp=<your help request> on this template to request help from an experienced user. Please include details about your problem, to help other editors.

Cheers.—InternetArchiveBot (Report bug) 03:07, 6 November 2016 (UTC)

  1. ^