In article <telecom24.556.10@telecom-digest.org>, James Carlson
<james.d.carlson@sun.com> wrote:
> bonomi@host122.r-bonomni.com (Robert Bonomi) writes:
>> "Incoming" traffic is an entirely different problem. And
>> load-balancing _that_ traffic cannot be done in anything approaching a
>> satisfactory manner without 'help' from the 'upstream' end.
> Indeed.
>> And it requires that both DSL circuits terminate at the same
>> 'upstream' provider.
> Not necessarily. There are at least two other possibilities here,
> both of which allow for connections to multiple providers:
> - NAT in use, and load balancing on a per-connection basis. This
> automatically balances the return traffic as well, as everyone on
> the net thinks you're actually two separate independent IP nodes.
NO, it does _Not_. You cannot change the NAT translation _during_ a
'session' (a single TCP connection). And if the 'incoming' data
characteristics change radically _during_ that session, the 'balance'
goes out the window.
Consider a scenario where there is -one- durable connection presently
in progress, which is, say, 'streaming audio' to laptop #1, and coming
in over circuit #1.
Now, over the space of a minute, the other 19 laptops each initiate a
web request to a trivial text-only web-page with the Windows XP SP2
update _information_ on it, including a link to a copy of the actual
service pack which resides on that same server. Oh, yeah, those
requests have the HTTP 'keepalive' protocol flag set. Circuit #2 is
'unused' at the moment, so -- based on traffic levels -- _all_ these
HTTP sessions are going to go on circuit #2, using 'source' addresses
that will cause return data to come in over that selfsame circuit #2.
Which _is_ reasonable at this point; the overall traffic from
retrieving the 19 copies of that text web-page is likely less than the
one-minute block of streaming-audio.
*BUT*, now each laptop decides to download the actual service pack.
BAM! usage on circuit #2 goes through the roof. And circuit #1 is
still loafing along at a small fraction of capacity. Yes, this is an
extreme case, but it illustrates the point that there is
_no_PRACTICAL_way_ to balance the incoming load without active
co-operation from the 'upstream' end(s) of the circuits.
You (on the receiving end) *cannot* unilaterally (meaning "without
active cooperation from the remote end[s]") change which circuit those
packets are coming in over. You cannot suddenly switch the IP address
the laptops are using; "keepalive" is in effect, the request goes as
part of the _same_ connection, and of course, once the download
request was sent, the laptop is only doing 'listen and ack'.
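A rough sketch of that scenario, for anyone who wants to play with the
numbers. Everything here is an assumption made for illustration: the
rates (128 kbit/s audio, 5 kbit/s per text-page fetch, 500 kbit/s per
service-pack download) are invented, and "pick the least-loaded circuit
when the connection starts" is just one plausible NAT balancing policy,
not any particular product's behavior.

```python
# Illustrative sketch only: the rates below are made-up numbers, and
# "least loaded at connection start" is an assumed balancing policy.

def pick_circuit(load):
    """Index of the circuit carrying the least traffic right now."""
    return min(range(len(load)), key=lambda i: load[i])

def simulate(connections):
    """connections: (name, rate_at_start, rate_later) in order of initiation.
    The circuit chosen at initiation is fixed for the life of the connection."""
    load_at_start = [0, 0]   # kbit/s on circuit #1, circuit #2
    load_later = [0, 0]
    assignment = {}
    for name, start_rate, later_rate in connections:
        c = pick_circuit(load_at_start)
        assignment[name] = c
        load_at_start[c] += start_rate
        load_later[c] += later_rate
    return assignment, load_at_start, load_later

# One durable audio stream, then 19 keepalive HTTP sessions that start
# out trivial and later turn into large downloads on the same connection.
connections = [("audio", 128, 128)]
connections += [(f"laptop{n}", 5, 500) for n in range(2, 21)]

assignment, load_at_start, load_later = simulate(connections)
print("at start:", load_at_start)
print("later:   ", load_later)
```

With these made-up rates, every HTTP session lands on circuit #2 while
it still looks idle, and nothing in the policy can move them once the
downloads begin.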
> - You're a big company and you can afford to arrange BGP peering
> with the ISPs and inject routes into the backbone.
THAT doesn't solve the "problem", either. Not even 'mostly'.
Again, the original scenario was a 'site' with a maximum of 20
machines (all laptops) at the location. In _that_ situation, "good
luck" in getting a BGP announcement for a /27 (or smaller) propagated
past an immediate upstream. IF _they_ will agree to accept it. To
get a moderately-reliably 'forwardable' announcement, you're going to
have to use "gross overkill" sized blocks. You'll have a _terrible_
time providing adequate 'justification' to get PA space for that
application from an upstream. And you're way too small to qualify for
your own PI space, for _this_ application alone. Now, if you're big
enough that you've got a PI /16, say, *and* can afford to dedicate a
couple of /24s to this inefficient use (at best, approximately
_four_percent_ utilization of the address-space), you _can_ 'play
games' to influence the inbound traffic. *BUT*, to re-route
individual hosts, *without* changing their IP address, you are likely
to have to BGP 'announce' routes for a /32. Over time, this _will_
lead to fracturing of the space, and you _will_ be announcing separate
routes for most of those hosts, *individually*. *OR* you get _really_
wasteful of address-space, and _use_ only one address in each /24 --
thus needing about a /19 to support 20 laptops. With an address-space
utilization of approximately 0.25%. Yeah, this would work, but I
can't imagine that anyone would classify it as a "practical" solution.
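For the record, the utilization figures above can be checked with a few
lines of arithmetic. This is only a sketch: it assumes "a couple of
/24s" means exactly two (512 addresses, the same count as a /23), and
that the one-address-per-/24 case rounds 20 blocks up to the next power
of two. The names are made up for the example.

```python
import ipaddress

HOSTS = 20  # laptops at the site, per the original scenario

def utilization(hosts, prefix_len):
    """Fraction of an IPv4 block of the given prefix length actually in use."""
    block = ipaddress.ip_network(f"0.0.0.0/{prefix_len}")
    return hosts / block.num_addresses

# Case 1: all 20 laptops share two /24s (assumed; same size as one /23).
shared_two_24s = utilization(HOSTS, 23)

# Case 2: one address used per /24, so 20 /24s are needed. Aggregates
# come in powers of two, so round 20 up to 32 /24s, which is a /19.
extra_bits = (HOSTS - 1).bit_length()     # 5 bits cover 20 blocks
covering_prefix = 24 - extra_bits         # 19
one_per_24 = utilization(HOSTS, covering_prefix)

print(f"two /24s: {shared_two_24s:.2%}")
print(f"/{covering_prefix}: {one_per_24:.2%}")
```

That works out to 20/512 and 20/8192 respectively, i.e. roughly four
percent and roughly a quarter of one percent.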
We won't even go into 'what happens' to a pre-existing connection with
an _active_ stream of traffic when a route 'withdrawal' (for an
address presently routed via circuit #1) arrives at an intermediate
router before the 'announcement' for routing via circuit #2 arrives.
Note: if you already announce an 'inferior' grade of route through
circuit #2 for that address, you cannot ensure that inbound traffic
will come through circuit #1 -- it _is_ "guaranteed" that "somebody"
will be using a 'policy' bias in their routing that causes them to
select the carrier supplying circuit #2 over that of circuit #1.
*sigh*
> There are others as well that involve just living with the fact that
> you'll appear to be separate nodes on the net, and remaining
> multihomed -- this is what you'd probably do if you were doing this
> for (say) a web server with multiple A records.
Review the original context -- site supporting "a maximum of 20
_laptops_", wanting to load-balance the two circuits.
How many folks run servers (web, or otherwise)
_with_multiple_A_records_ on laptops? <*grin*>
Absent any specifications of the data flow, a reasonable "first guess"
is that the traffic will be mostly: web-page retrieval, e-mail
reading/sending, possibly some RSS feeds, along with some other
'streaming' incoming data.
Not guaranteed, of course, but absent better data for that scenario,
it is 'betting odds' that at least 95% of the total traffic is
'incoming'. Which means that the 'easy' approaches -- which are for
balancing _outgoing_ traffic -- aren't of much use. Especially since
-- for the aforementioned kinds of traffic -- there is _very_little_
correlation between incoming and outgoing traffic on a
machine-by-machine, or even connection-by-connection basis.
Obviously, the "more you know" about the actual traffic generated, the
better your chances of designing a policy that works effectively for
_that_ traffic mix. Caveat: even a minor change in traffic
characteristics can utterly invalidate a 'carefully tuned/optimized
for one particular scenario' balancing policy.
As soon as you have any form of 'durable' connection involved, any
attempt to balance things based solely on conditions at the time of
connection _initiation_ is doomed to lead to 'far less than
optimal' balancing at a point later in time. 'Things change', and the
circuit assignment for that already established connection cannot be
modified to adapt to the fact that "the world has changed out from
under it".
>> *BUT* the 'standard' routing code _in_the_kernel_ of most operating
>> systems does =not= support multiple equal-priority routes to the same
>> destination, *with* rotating use of those routes on a per-packet
>> basis.
> Doing it on a per-packet basis ("round robin") is a mistake. It
> causes poor performance by reordering packets and often causes trouble
> with various middleboxes. Instead, you want to hash based on flow
> identification, which some systems can do.
THIS depends on what your objectives are. :)
If you're interested in maximizing your link utilization, _without_
regard to impact on QOS (as it were) to the users, the performance
hits due to out-of-order packet reception are "not my problem".
Packet re-ordering _may_ occur in some instances, *BUT* it is not a
guaranteed problem. Furthermore, if the pair of lines are 'bonded'
into a single logical circuit -- which requires cooperation from the
remote end -- then this "possible" issue effectively disappears.
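To make the two policies under discussion concrete, here is a minimal
sketch of each for _outgoing_ packets. It is not how any particular
kernel or router does it: the choice of SHA-256 over the 5-tuple, the
function names, and the two-circuit constant are all assumptions for
illustration.

```python
import hashlib
from itertools import cycle

CIRCUITS = 2  # assumed: two DSL circuits, numbered 0 and 1 here

def by_flow(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Flow hashing: every packet of one connection maps to one circuit,
    so packets within a flow are never reordered across circuits."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % CIRCUITS

def round_robin():
    """Per-packet rotation: successive packets alternate circuits,
    regardless of which connection they belong to."""
    return cycle(range(CIRCUITS))

# Five packets of the same (hypothetical) connection under each policy.
flow = ("192.0.2.10", 40000, "198.51.100.7", 80)
hashed = [by_flow(*flow) for _ in range(5)]
rr = round_robin()
rotated = [next(rr) for _ in range(5)]
print("by flow:    ", hashed)
print("round robin:", rotated)
```

Flow hashing keeps a connection on one circuit at the cost of uneven
utilization when a few flows dominate; rotation evens out utilization
at the cost of possible reordering. Which cost matters is exactly the
"depends on your objectives" point.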
If the circuits go to different end-points, you do get a whole raft of
other possible issues -- including, but not limited to, remote servers
that are multi-homed _on_ both of the networks you are connected to.
They see packets that are part of the same 'connection' arriving on
different interfaces. This _can_ confuse some kinds of systems,
notably 'load balancers' that exist in front of a 'farm' of
"identical" servers.
Also, if the circuits go to different end-points, then you _are_
likely to have issues with 'larger than single packet' communications
to "anycast" servers. They may well be routed to _different_ servers.
Available evidence suggests that this would be a 'vanishingly small'
issue for 'typical' _laptop_-origin traffic.