How To Choose A Business Internet Provider
By Karl Denninger
Copyright October 2001. All Rights Reserved.
Reproduction or distribution, in whole or part, without express written permission
is prohibited.
The following is a GENERAL treatise on how to evaluate a claim that someone
is a "Backbone" provider of services to the Internet. Please note
that there are some caveats that apply to this set of general statements:
- There is no way that I can provide specific information
related to your requirements without knowing what they are. That is, while
I can provide what I believe the criteria for calling someone a "backbone
provider" are, what I can't do is tell you what increased exposures
you may run into, or whether there are any that apply to you,
if a particular provider does not meet these criteria.
- The below is my considered opinion. Take it for whatever you believe
it to be worth.
- If you need a professional opinion for your specific circumstances
please contact me directly.
What makes a "backbone provider"?
Good question.
Many if not most ISPs selling business connectivity claim to be "directly
connected to the backbone", "part of the backbone", or some similar
language.
The first thing to understand is there is no Internet backbone.
That is, the Internet is a loose association of networks that talk to each other
- there is no central, core network that qualifies you for being "in
the group."
With that said, however, there are questions you can ask that will help
to determine how close to optimum the connectivity is that you will receive
from a given provider, in the general sense. Again, the caveat
is that this advice is general!
I consider the "de minimis" requirements for a provider to call themselves
"backbone connected" to be:
- You have your own address space and can document it externally (I can query
the ARIN whois servers and find blocks you own; your own claims are
irrelevant; what's published and public data is all that counts.) If you flunk
this test then YOUR selection of backbone carrier (or a decision on your part
to change those providers) can potentially force your customers to
renumber their networks. This is unacceptable and marks you as a "poor
second cousin" on the Internet, not a backbone-grade carrier. Worse, using
someone else's space makes your attempt to multihome rather hollow.
The reason is the way that BGP4 works in the core today: when both a larger
supernet and a subnet of it are announced, reaching the subnet depends on nobody
"black holing" routes from "known prefixes" of their peers. Unfortunately, many
providers do exactly that, which makes the "hole punching" capability
of BGP4 moot - and means that the redundancy you thought you had is
frequently not really there when something breaks upstream of you.
- You exchange full routing tables with at least two other carriers
over BGP4, you have your own ASN (one or more), and those carriers
do not filter your announcements. This also means that
you do not point default - Period. If there is a default route in your
core equipment, you fail this test. This means that if I get on your network
and try to go somewhere valid but not currently active I must get back a refusal
to route that network from your core - not from your upstream(s). This
is important for two reasons:
- You can actually take an address owned by someone else and get it on
the net without asking "pretty please" to an upstream provider. If you
flunk this test then not only are you a poor second cousin but your upstreams
also don't trust you! That makes you a poor second cousin and
untrustworthy (in the eyes of the network core.) Bad combination.
- If something goes haywire you can actually use the redundancy
you claim. That is, not only are you load balancing, you're redundantly
connected at the routing level. To claim to be part of the core this is
a must. Unless this criterion is met, any rerouting is dependent on the physical
circuit or protocol failing on the link with the present default
route. But the unfortunate reality is that most network failures do not
affect your specific circuit - which means that with default pointed somewhere
you can emit packets that go into a black hole, and this could go on indefinitely
unless someone manually intervenes. Without full route exchange (not
just announcement) and running "defaultless" you may be load balancing,
but you are not part of the Internet core.
- There can be no single point physical layer failure in your network from
your data center to the internet at large. This means that:
- You may not meet all your upstreams in the same physical facility. This
immediately precludes using the NAP as a sole "meet point" (it doesn't
preclude buying dedicated circuits to TWO NAPs, or a NAP and a MAE, etc.,
but it does preclude pulling a line to one meet point and
doing it all there)
- You must not use the same telco carrier for all of your upstream (or
peer) links. The reason for this is the same as the reason for the previous point, but
at a lower level. If their DACS fails, you're screwed. If they have a
fiber cut, you're screwed. You must be protected against this to call
yourself part of the backbone and that means provisioning over more than
one carrier. The reasons for this should be obvious, particularly in light
of 9/11 of this year, although Hinsdale, Illinois is another good example.
In fact, Hinsdale was far worse for those affected by it - many of those
people were out for months rather than days or weeks.
- No single uplink failure can put your remaining proven (not claimed, purchased
or provisioned) packet carrying capacity over the 95th percentile of aggregate
load on your upstream connections, nor may the 95th percentile be exceeded
on your internal infrastructure to the remaining exit(s). (The 95th
percentile assumes "standard" dedicated circuits; if any leg of the
connectivity to any of your providers is ATM-based, the number is the
80th percentile, since ATM has such horrifyingly bad "fall over" behavior
when overloaded, even transiently. As such you basically must never -
even momentarily - exceed the actual available and usable bandwidth on an
ATM circuit or your performance will go straight in the crapper.) This one
is difficult to meet. It means that while you may have purchased a "burstable
DS-3", if the provider it goes to can only sink or source 20 Mbps to or from
you at any given time, you can't count that as a DS-3 - rather, you can only
count it as a 20 Mbps link. It also requires the statistical data to know
what your 95th percentile load factor is. Unfortunately, the only way to
know what those you connect to can REALLY handle is to test
them, and many providers get real pissed off if/when you do that.
(I never said it was easy to meet the requirements!) For a claim to be accepted
on this I would require the date and time of the test, the method (Treno,
etc) used, the results, and an offer to repeat it at times of my (the
customer's) choosing within a day or two prior to signing up. If you're a real
stickler, it also wouldn't be a bad idea to put this into an SLA. This is important
because it determines if your customers will have acceptable or crappy performance
when (not if) one of your upstream connections dies.
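The first two criteria both come down to how a router picks a route, so a small illustration may help. What follows is a minimal sketch, not a model of any real router: it assumes plain longest-prefix matching, and every prefix and next-hop name in it (10.0.0.0/8, "upstream-A" and so on) is made up for illustration. It shows a remote network that filters your more-specific announcement still sending your traffic toward the aggregate's owner, and a default route turning a local refusal into a hand-off upstream.

```python
import ipaddress

def lookup(table, destination):
    """Return the next hop for destination by longest-prefix match, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None  # no route: this router itself refuses the packet
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def has_default(table):
    """True if the table contains a default (zero-length) route."""
    return any(net.prefixlen == 0 for net in table)

# Hypothetical prefixes and next-hop names, for illustration only.
SUPERNET = ipaddress.ip_network("10.0.0.0/8")     # upstream A's aggregate
CUSTOMER = ipaddress.ip_network("10.20.30.0/24")  # your block, carved from A's space

# A remote network that accepts your more-specific announcement via upstream B.
accepting = {SUPERNET: "upstream-A", CUSTOMER: "upstream-B"}
# A remote network that filters the more-specific and keeps only the aggregate.
filtering = {SUPERNET: "upstream-A"}

print(lookup(accepting, "10.20.30.7"))  # upstream-B: the "hole punch" works
print(lookup(filtering, "10.20.30.7"))  # upstream-A: no help if A is the one that broke

# A core table with and without a default route.
defaultless = {ipaddress.ip_network("192.0.2.0/24"): "peer-1"}
with_default = dict(defaultless)
with_default[ipaddress.ip_network("0.0.0.0/0")] = "upstream-A"

print(lookup(defaultless, "203.0.113.9"))   # None: refused locally
print(lookup(with_default, "203.0.113.9"))  # upstream-A: handed off regardless
```

The last two lines are the "no default" test in miniature: on a defaultless core the refusal comes from your own equipment, while with default pointed somewhere the packet leaves anyway, whether or not anything upstream can deliver it.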
Note that you can meet these requirements as either a peer or as a transit
purchaser (or as a hybrid of both); nowhere in here is your particular business
model addressed.
Nor is the actual speed of your connections addressed - all that is important
is that if one link fails the remainder of transport available exceeds your
actual 95th percentile load AND you can, on your own, reroute
around the damage.
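The capacity half of that test is simple arithmetic once you have the data. The sketch below is one way to do it, under stated assumptions: the percentile uses the nearest-rank convention (other conventions give slightly different numbers), the load samples are invented, and the link names and capacities are hypothetical. The capacities must be the proven figures, so the "burstable DS-3" that only moved 20 Mbps in testing goes in as 20, not 45.

```python
def percentile(samples, pct=95):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[max(rank, 1) - 1]

def survives_single_failure(proven_mbps, load_samples_mbps, pct=95):
    """Check that losing the largest uplink still leaves enough capacity.

    Returns (ok, worst_link, remaining_mbps, required_mbps).
    """
    required = percentile(load_samples_mbps, pct)
    worst = max(proven_mbps, key=proven_mbps.get)  # the most painful link to lose
    remaining = sum(proven_mbps.values()) - proven_mbps[worst]
    return remaining >= required, worst, remaining, required

# Invented aggregate load samples in Mbps (for example, 5-minute averages).
LOAD_SAMPLES = [10, 12, 15, 18, 20, 22, 25, 25, 26, 28,
                30, 30, 31, 32, 33, 35, 36, 38, 40, 90]

# Hypothetical uplinks: a nominal 45 Mbps DS-3 and a "burstable DS-3"
# that only proved out at 20 Mbps under test.
print(survives_single_failure({"upstream-A": 45, "upstream-B": 20}, LOAD_SAMPLES))
# (False, 'upstream-A', 20, 40)

# The same pair if both links really deliver 45 Mbps.
print(survives_single_failure({"upstream-A": 45, "upstream-B": 45}, LOAD_SAMPLES))
# (True, 'upstream-A', 45, 40)
```

This covers only the upstream links. The same comparison has to hold on the internal paths to the surviving exits, and it says nothing about whether you can actually reroute on your own.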
These are "starters"; they're not all-inclusive, but they're a good "A list";
without the above you're simply not there, no matter how fancy the rest of your
installation is.
Without all of them (no fudging either!) you are not a "backbone connected
ISP." You are, in fact, a reseller of some backbone provider, with all their
limitations and warts plus your own!
You cannot determine compliance with this list from a network diagram alone.
You also need either ASNs or network address blocks, along with a view into
the provider's network (either via SNMP or physically on a connection you can
test from), along with significant disclosure on infrastructure that goes WAY
beyond what's typically on someone's published network diagram.
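Once a provider has disclosed where each uplink lands and who carries it, the physical-diversity criterion can at least be screened mechanically. The sketch below is a rough first pass under assumptions of my own: the `Uplink` record and its three fields are hypothetical, and the facility and carrier names are placeholders. Passing it is necessary but not sufficient, since two carriers can still share a conduit or a building entrance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Uplink:
    upstream: str  # carrier you exchange routes with
    facility: str  # building or meet point where the circuit lands
    telco: str     # company carrying the physical circuit

def single_points_of_failure(uplinks):
    """Return a list of obvious shared dependencies across all uplinks."""
    problems = []
    if len({u.upstream for u in uplinks}) < 2:
        problems.append("fewer than two upstream carriers")
    if len({u.facility for u in uplinks}) < 2:
        problems.append("all uplinks meet in one facility")
    if len({u.telco for u in uplinks}) < 2:
        problems.append("all uplinks ride one telco carrier")
    return problems

# Placeholder data: two upstreams, but one meet point and one telco.
one_basket = [
    Uplink("upstream-A", "nap-1", "telco-X"),
    Uplink("upstream-B", "nap-1", "telco-X"),
]
# Placeholder data: separate meet points over separate carriers.
diverse = [
    Uplink("upstream-A", "nap-1", "telco-X"),
    Uplink("upstream-B", "mae-1", "telco-Y"),
]

print(single_points_of_failure(one_basket))
# ['all uplinks meet in one facility', 'all uplinks ride one telco carrier']
print(single_points_of_failure(diverse))
# []
```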
Confused?
Clarification for your specific situation - along with a specific evaluation
of your needs and how they may be impacted by various provider decisions you
may make - can be had by clicking on the below link.
Consulting
assistance, particularly for Internet-related projects, is available! Please
insert the word "advocacy" or "agree" in the subject line
of your message to avoid my spam filters.