[nflug] Routing Problem that I would love some input on

Mark Musone mmusone at shatterit.com
Fri Oct 26 13:59:17 EDT 2007


Hrmmm... doing some reading regarding zebra, I found something that might
work: multipath routing (not really part of zebra).

Here's a URL for some more info:

http://www.tipsternet.com/articles/advance%20routing.htm

Specifically:

Connection-based tracking gives you the most control, but it may or may not
require a patch to your kernel. The concept here is to classify packets by
src, dst, port, dev, tcp or udp, etc. and mark them with a flag. You then
create a routing rule to detect the mark and fling that class of packets
out a particular interface. Simple? It is. It's a very powerful way of
manipulating routes.

Tip: Keep your router as a router. Don't start sticking applications and
services on it. Locally generated packets may seem to work the same but they
are exempt from certain routing logic and manipulations.

vi /etc/iproute2/rt_tables
-------------------------------
#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
#1      inr.ruhep
#
# Add the custom lookup tables into the route file
100     Eth0
101     Eth1
102     Eth2
--------------------------------

 

#Assign a default route to each table to send data out the related interface [C]
ip route flush table Eth0
ip route add table Eth0 default via $gwEth0 dev eth0
ip route flush table Eth1
ip route add table Eth1 default via $gwEth1 dev eth1
ip route flush table Eth2
ip route add table Eth2 default via $gwEth2 dev eth2

 

 

#Clear out old rules
ip rule show | grep -Ev '^(0|32766|32767):|iif lo' \
  | while read PRIO NATRULE; do
  ip rule del prio ${PRIO%%:*} $( echo $NATRULE | sed 's|all|0/0|' )
done
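
To see what the cleanup loop would delete before running it for real, the same pipeline can be fed canned input and made to print the commands instead of executing them. This is only a sketch: `sample_rules` and `preview_rule_cleanup` are made-up names, and the sample text is a hypothetical stand-in for `ip rule show` output, which may look different on your system.

```shell
#!/bin/sh
# Dry run of the rule-cleanup loop: prints the delete commands, runs nothing.

# Hypothetical sample of `ip rule show` output (an assumption, not captured
# from a real router).
sample_rules() {
  cat <<'EOF'
0: from all lookup local
32764: from all fwmark 0x66 lookup Eth2
32765: from all fwmark 0x64 lookup Eth0
32766: from all lookup main
32767: from all lookup default
EOF
}

# Same filter and loop as above, with `echo` in place of the real `ip rule del`.
preview_rule_cleanup() {
  grep -Ev '^(0|32766|32767):|iif lo' \
    | while read -r PRIO NATRULE; do
        # ${PRIO%%:*} strips the trailing colon from the priority field;
        # sed rewrites the selector "all" as 0/0.
        echo "ip rule del prio ${PRIO%%:*} $(echo "$NATRULE" | sed 's|all|0/0|')"
      done
}

sample_rules | preview_rule_cleanup
```

On a live box you would pipe `ip rule show` into `preview_rule_cleanup` instead of the sample, and only switch back to the real loop once the printed commands look right.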

 

#We will fling packets based on the mark on the packet (this is the RPDB) [B]
ip rule add fwmark 100 table Eth0
ip rule add fwmark 101 table Eth1
ip rule add fwmark 102 table Eth2

 

#Now we simply mark the packets we want with the mark for the interface [A]
iptables -F PREROUTING   -t mangle
iptables -I PREROUTING 1 -t mangle -m conntrack --ctorigdst $Eth0 -j MARK \
  --set-mark 100 -m mark --mark 0
iptables -I PREROUTING 1 -t mangle -m conntrack --ctorigdst $Eth2 -j MARK \
  --set-mark 102 -m mark --mark 0

 

Going from the bottom up, from the perspective of the packet:

A. Our friendly Linux kernel relates every packet moving through it to a
connection defined by the original source and destination addresses and
ports. Whether it is SNATed, DNATed, or anything in between, if a packet is
part of a connection that was originally destined inbound to Eth0, we mark
it with 100.

B. When it is time to make a routing decision for that packet, the kernel
checks the RPDB (ip rule) and flings packets marked with 100 to lookup
table Eth0.

C. Routing table Eth0 then directs all of those packets out interface eth0.

 

Fine Tuning Multipath Outbound

The route cache is decent at maintaining your "session" out an interface,
but there are times when you want to maintain that routing decision
yourself. In the middle of a connection, a route cache entry may expire due
to inactivity or a random gamma particle hitting your box, leaving any
future packet free to choose one of the two available outbound paths.

When the route cache expires, you have a fifty-fifty chance of going out the
wrong interface. 

To fix this the first thing to do is to return our default path out one
interface. 

ip route replace default via $gwEth0

Then we create three new routing tables so we can get better control of our
routes. 

vi /etc/iproute2/rt_tables
-------------------------------
#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
#1      inr.ruhep
#
100     Eth0
101     Eth1
102     Eth2
# New tables Pref0, Equalize, and Pref2
200     Pref0
201     Equalize
202     Pref2
--------------------------------
 
#Assign multipath routes to each table, preferring a particular interface
ip route flush table Pref0
ip route add table Pref0 default nexthop via $gwEth0 weight 100 \
  nexthop via $gwEth2 weight 1
ip route flush table Equalize
ip route add table Equalize equalize default nexthop via $gwEth0 weight 1 \
  nexthop via $gwEth2 weight 1
ip route flush table Pref2
ip route add table Pref2 default nexthop via $gwEth0 weight 1 \
  nexthop via $gwEth2 weight 100
 
#Add the rules matching marks to lookup tables
ip rule add fwmark 200 table Pref0
ip rule add fwmark 201 table Equalize
ip rule add fwmark 202 table Pref2
 
#Table Pref0 will almost always send traffic marked with 200 out Eth0,
#unless Eth0 is down.
#Table Equalize will send traffic marked with 201 out both interfaces
#equally, as discussed earlier.
#Table Pref2 will almost always send traffic marked with 202 out Eth2,
#unless Eth2 is down.
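
As a rough feel for how strong a 100-to-1 weighting is, the arithmetic below treats each nexthop weight as a relative share of new routing decisions. That proportional model is an assumption about how the kernel spreads multipath lookups, and `share_percent` is a made-up helper name, so take the numbers as an estimate rather than a guarantee.

```shell
#!/bin/sh
# share_percent W_THIS W_OTHER
# Prints the integer percentage of new route lookups expected to pick the
# nexthop with weight W_THIS, assuming selection is proportional to weight.
share_percent() {
  echo $(( 100 * $1 / ($1 + $2) ))
}

echo "preferred nexthop (weight 100 vs 1): $(share_percent 100 1)%"
echo "equalized nexthop (weight 1 vs 1):   $(share_percent 1 1)%"
```

Under that assumption the preferred gateway gets about 99 out of every 100 new decisions, which is why the preference is strong but not absolute.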

Now we simply mark outbound packets the same way we marked inbound packets.
If you want Joe from internal Host 10.1.1.144 to only use ISP2 then you
would add:

#mark joe with 202 and let ip rule (RPDB) send out Pref2
iptables -I PREROUTING 1 -t mangle -s 10.1.1.144 -j MARK --set-mark 202 \
  -m mark --mark 0

If you want to send all mail out ISP1 then you would add:

#mark smtp with 200 and let ip rule (RPDB) send out Pref0
iptables -I PREROUTING 1 -t mangle -p tcp --dport 25 -i eth1 -j MARK \
  --set-mark 200 -m mark --mark 0

This packet fling method allows us to fully control which packets and
traffic types go out which interface. Because classified traffic will be
forced out a particular interface, we do not need to worry about route cache.
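
The two marking rules above differ only in their match criteria and mark value, so one way to keep them consistent is a tiny wrapper. This is a sketch, not part of the article: `mark_rule` is a hypothetical helper, and it only prints the iptables command so you can inspect it before running anything.

```shell
#!/bin/sh
# mark_rule MARK MATCH...
# Prints (does not run) a mangle PREROUTING rule that sets MARK on packets
# matching MATCH, but only when the packet is still unmarked (mark 0).
# Hypothetical helper; the iptables options are the ones used above.
mark_rule() {
  mark=$1
  shift
  echo "iptables -I PREROUTING 1 -t mangle $* -j MARK --set-mark $mark -m mark --mark 0"
}

mark_rule 202 -s 10.1.1.144              # Joe out ISP2 via Pref2
mark_rule 200 -p tcp --dport 25 -i eth1  # SMTP out ISP1 via Pref0
```

Piping the output to `sh` would apply the rules; leaving it as plain output makes it easy to review or paste into a firewall script.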

This leaves us with one routing table that can still cause us problems: the
Equalize table.

How do we influence the equalize route choice so that it picks the same
interface it did prior to the clearing of the route cache?

Equalize forces the system to randomly pick one of the two nexthop choices,
so the second time it is forced to equalize a route it doesn't know about
its previous selection.

To get this to work properly we need to equalize only the initial connection
packet, remember which way it went and use that same route for all new
packets belonging to the connection. The only way to do this is with the
CONNMARK patch. Here's how.

#mark all packets sent out an interface with the proper interface preference
#mark (if it's been equalized)
iptables -I POSTROUTING 1 -t mangle -o eth0 -j MARK --set-mark 200 -m mark \
  --mark 201
iptables -I POSTROUTING 1 -t mangle -o eth2 -j MARK --set-mark 202 -m mark \
  --mark 201

#the last line in POSTROUTING is the magic statement that stores the mark
#associated with the connection.
iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark

#first line in PREROUTING will pull out the existing mark on the connection
#for the packet
iptables -I PREROUTING 1 -t mangle -i eth1 -j CONNMARK --restore-mark

#Equalize HTTP Traffic with mark 201
iptables -A PREROUTING -t mangle -p tcp --dport 80 -i eth1 -j MARK \
  --set-mark 201 -m mark --mark 0

Here's the packet flow. The first HTTP packet from the internal system will
not be affected by the restore. It will hit the PREROUTING mangle criteria
for tcp:80, which sets the mark to 201. The kernel uses the RPDB (ip rule)
to look up the Equalize table and pick a path. Upon exiting POSTROUTING, the
mark is set to match the interface the packet exits from (200 for eth0) and
saved.

Pretend the route cache is flushed.

Any additional HTTP packet from the internal system belonging to the same
connection will hit "restore-mark" and pull back the saved mark (200).
Because the mark is not 0, no other marking will occur. The kernel uses the
RPDB (ip rule) to look up the Pref0 table for mark 200 and sends the packet
out Eth0 without bothering with equalize.
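
The flow above can be traced with a toy model in plain shell. Nothing here touches netfilter or real packets: `CONNMARK` and `MARK` are ordinary shell variables, the function names are made up for illustration, and the choice of eth0 for the first packet is simply assumed (on a real box the Equalize table picks it).

```shell
#!/bin/sh
# Toy model of the mark handling described above, for one simulated connection.

CONNMARK=0   # mark stored on the connection
MARK=0       # mark on the packet currently being processed

# PREROUTING: restore the connection mark, then tag unmarked HTTP with 201.
prerouting() {
  dport=$1
  MARK=$CONNMARK                          # CONNMARK --restore-mark
  if [ "$MARK" -eq 0 ] && [ "$dport" -eq 80 ]; then
    MARK=201                              # MARK --set-mark 201 -m mark --mark 0
  fi
}

# RPDB: map the packet mark to the routing table the kernel would consult.
lookup_table() {
  case $MARK in
    200) echo Pref0 ;;
    201) echo Equalize ;;
    202) echo Pref2 ;;
    *)   echo main ;;
  esac
}

# POSTROUTING: turn an equalized mark into an interface preference, then save.
postrouting() {
  out_if=$1
  if [ "$MARK" -eq 201 ]; then
    case $out_if in
      eth0) MARK=200 ;;
      eth2) MARK=202 ;;
    esac
  fi
  CONNMARK=$MARK                          # CONNMARK --save-mark
}

# First packet of an HTTP connection; assume Equalize happened to pick eth0.
prerouting 80
first_table=$(lookup_table)
postrouting eth0

# A later packet of the same connection, after the route cache is flushed.
prerouting 80
second_table=$(lookup_table)

echo "first packet: table $first_table"
echo "later packet: table $second_table (connmark $CONNMARK)"
```

The first packet is routed through Equalize, and every later packet of that connection lands in Pref0 because the restored mark is already 200 when the tcp:80 rule is reached.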

There you have it. My multipath routing setup in a nutshell.

 

 

 

From: nflug-bounces at nflug.org [mailto:nflug-bounces at nflug.org] On Behalf Of
Joshua Ronne Altemoos
Sent: Friday, October 26, 2007 1:35 PM
To: nflug at nflug.org
Subject: RE: [nflug] Routing Problem that I would love some input on

 

There are advanced routing programs out there like zebra that should be able
to do this for you!

 

From: nflug-bounces at nflug.org [mailto:nflug-bounces at nflug.org] On Behalf Of
Brad Bartram
Sent: Thursday, October 25, 2007 6:29 PM
To: nflug at nflug.org
Subject: Re: [nflug] Routing Problem that I would love some input on

 

Grr, I was hoping there was a way to do it just using the routing tables.
Somehow I had it set in my brain that there was a way, but I just forgot how
to do it.  Oh well, I guess I'll go back to my original idea and just put
the apps on different boxes and not worry about dealing with iptables or any
of that jazz. 

Thanks for the responses.

On 10/25/07, Robert Wolfe <robertwolfe at localnet.com> wrote: 

On Thu, 25 Oct 2007 16:21:41 -0400
"Mark Musone" <mmusone at shatterit.com> wrote:

> Unfortunately, using standard routing tables, you can't :( (I've been down
> this road before)

Same here, I've just been too busy today to answer the original posting :P 

> It's going to go out your default route. (Remember when we had that problem
> with one of your customers and we put them on a different network..the
> inbound packets were fine, but the outbound ones went out the original
> provider)

Yeah, it would be nice if you COULD do what was originally inquired about.
If it were possible in ANY way, I could load balance all my network
traffic without any problems :)  (well, without spending loads of money, 
too <G>).

> The other option is to use iptables..i think you might be able to do some
> dependency routing and send data out specific nic cards (or at least do some
> header mangling)

Ugh!  There's that nasty word again, "iptables."  Almost as nasty as "vi."
:P

--
     Robert Wolfe (robertwolfe at localnet.com) | Systems Administrator 
       LocalNet Corp | Williamsville, NY | http://www.localnet.com
_______________________________________________
nflug mailing list
nflug at nflug.org
http://www.nflug.org/mailman/listinfo/nflug

 
