Here I want to show that my implementation translates not just forwarded packets: packets destined for or originating from the local host are treated equally. This follows from the design of this implementation, which makes NAT an additional layer around the kernel's network functions (see the figure).
We have two hosts, one of which is a Linux PC using the NAT module. Its local IP, the one used to configure the network interface, is 1.1.1.1, but on the network we want to appear as 134.109.192.223 to the other host (IP 134.109.192.123).
Assuming the network (including routes!) has already been configured on both hosts, I only mention the additional steps necessary to translate the local 1.1.1.1 address:
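A single bidirectional rule in the syntax used throughout this chapter should do it (a sketch: the interface name eth0 and binding the translation to the output side are my assumptions, not taken from the original setup):

ipnatadm -O -i -b -W eth0 -S 1.1.1.1/32 -M 134.109.192.223/32

This maps the local source address 1.1.1.1 to 134.109.192.223 on outgoing packets and, because of -b, maps the destination of incoming packets back accordingly.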
The solution, or let us say 'a solution' since there are many (as always!), is to get the 53.0.0.0-addresses anyway and use static NAT on the physical network connection to the corporate Intranet.
There are various ways to use the NAT module in this case. If we choose to bind the translation to the internal interface, the one facing our 138.201-network, we need a route to this interface for network 53.3.0.0, the one we have obtained from our company; if we instead translate on the interface to the company's 53.0.0.0-network, we need a route to 138.201.0.0. We could also do separate translations for packets from our net to the 53-network and for packets coming in from the 53-network, because NAT, being a layer around the kernel, always gives a packet two points in the NAT layer to pass through. It does not hurt to do this, but it is not nice; I would prefer to bind the entire translation, for incoming and outgoing packets, to one point.
The bidirectional NAT rule we need to do all translations on the interface 'wan' is:
ipnatadm -O -i -b -W wan -S 138.201.0.0/16 -M 53.3.0.0/16
or, equivalently,
ipnatadm -I -i -b -W wan -D 53.3.0.0/16 -N 138.201.0.0/16
Again, we could instead use non-bidirectional rules, where we have to take care of packets in the opposite direction by specifying a second rule accordingly. When we omit -b in the above two rules we get exactly the pair of rules that would be needed if the implementation offered no bidirectional rules. As we can see, -b simplifies writing NAT rules a lot.
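Written out, the non-bidirectional pair corresponding to the rules above would be:

ipnatadm -O -i -W wan -S 138.201.0.0/16 -M 53.3.0.0/16
ipnatadm -I -i -W wan -D 53.3.0.0/16 -N 138.201.0.0/16

The first rule rewrites the source address of outgoing packets, the second rewrites the destination of the corresponding return packets.

Now for a different scenario: assume our webserver has to be taken out of service temporarily, and a replacement machine, whose http daemon listens on port 8888, is to answer its requests in the meantime. The following rule, bound to interface eth1, redirects packets for the webserver to the stand-in: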
ipnatadm -O -W eth1 -i -D webserver/32 80 \
-N temp-replacement/32 8888
This will work, since we know exactly which IP and port we have to insert into the return packets: the webserver's IP and port 80. So the rule for the return packets is
ipnatadm -I -W eth1 -i -S temp-replacement/32 8888 \
-M webserver/32 80
This takes care that the clients connecting to the webserver see the expected source address and port in the packets they get back, which must appear to come from the IP and port they sent their packets to. In this example we have done IP address translation as well, not just port translation. Port translation alone makes less sense than IP address translation, but it may still be useful at times.
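To illustrate pure port translation: if, say, the replacement service ran on the webserver machine itself, just on port 8888, only the port would need rewriting. A sketch, assuming the implementation accepts a mapping whose address part stays unchanged:

ipnatadm -O -i -b -W eth1 -D webserver/32 80 -N webserver/32 8888

Here only the destination port of packets going out on eth1 is rewritten from 80 to 8888, and -b takes care of the return direction.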
We have two networks, both using, now and for the foreseeable future, 10.1.0.0 addresses for their internal hosts. Network B will be addressed from network A using 10.3.0.0 addresses, and network A will be addressed from network B using 10.2.0.0 addresses.
Other combinations of rules are possible besides the following ones, but I give the bidirectional rules needed to convert Net B addresses on interface eth1 of the NAT router and Net A addresses on interface eth0, which means that both incoming and outgoing packets for a network are translated on the same interface. Packets coming from a network to the router and destined for the other network are translated when they come in, and return packets from the other network are translated just before they are sent out on the same interface.
These are the rules, one for each network:
ipnatadm -I -b -W eth0 -S 10.1.0.0/16 -M 10.2.0.0/16
ipnatadm -I -b -W eth1 -S 10.1.0.0/16 -M 10.3.0.0/16
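For these rules to work, the router must be able to route the two virtual networks toward the right interfaces. A minimal sketch of the corresponding routes, assuming the classic route(8) syntax of 2.0-era systems:

route add -net 10.2.0.0 netmask 255.255.0.0 dev eth0
route add -net 10.3.0.0 netmask 255.255.0.0 dev eth1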
Now, when a host in Net A contacts a host in Net B, its IP (the source IP of the packet) is converted to a 10.2.0.0-address, so that for the host in Net B it appears to have come from that network. Net B sends its response to this 10.2.0.0-address, which will be routed to interface eth0 on the NAT router (see the routing table in the figure). Just before this answer packet is sent out to the host on Net A, its destination address is changed to the 10.1.0.0-address of the host in Net A, so that this host recognizes the packet as an answer to the packet it had sent earlier. The router's kernel only ever sees the 10.2.0.0- and 10.3.0.0-addresses, never the local 10.1.0.0-addresses, since the NAT layer hides them from the kernel's network functions. The router's routing therefore works on virtual IPs that belong to no real host on either of the two networks the router serves.

This is the point to mention a bug in the implementation. It is not really a bug but rather missing cooperation between the module and the kernel: the NAT router will issue ARP requests for those virtual addresses. Everything works fine nevertheless; there are just some senseless ARP packets on the wire. I have not investigated the problem further after it became clear that, first, everything works, and second, non-trivial changes to the kernel would be necessary. The latter conflicts with my intention not to interfere with other kernel code unless it is absolutely essential for the module to work at all. This is all the more important considering that large parts of the networking code have been rewritten in the 2.1 kernel series, and I do not know how the new kernels behave in this ARP case. The fix is not simply a matter of finding the function call that causes ARP resolution before a packet is sent and placing my NAT function right before it, thereby preventing virtual IPs from being used in such requests; that much I have done already, otherwise NAT would not work at all in this virtual-IP case. The problem seems to be connected to Linux' routing code, which creates an entry in an rt (route) cache that also contains fields for address resolution information. The routing code in particular has been radically redesigned in the 2.1 series, so I see little sense in messing around in the 2.0 kernel code on this point.
The situation is basically the same as above; we only add something: a third network, using 138.20.0.0 addresses in this example, which we will call Net C. The tricky thing is that Net A and Net C have already been connected using the router that is now our NAT router, and all the people in both networks have become used to the other network's IPs, so that these are hardwired not just in the brains of the people but also in lots of code, e.g. in firewall rules of subnetwork firewalls (where using DNS is a bad idea, since DNS can be spoofed) or in /etc/hosts files. To summarize, whatever the reasons may be: Net C wants to continue talking to Net A using the 10.1.0.0 addresses, and Net A's people want to say 138.20.x.y to Net C hosts.
This does not sound that complicated, and indeed it is not, but I want to use it to show two different ways to solve the problem: one uses the packet matching code, the other uses a completely virtual address space. We already had a completely virtual address space in the example above, but here, if Net C connects to Net A using Net A's real IPs, that is simple routing; I will make it more complicated for the example's sake. Note that this is unnecessarily complicated for the real world, but it is an example used for demonstration.
Again, when we look at the routing table in the figure above, there is not a single real-world IP in it. This time the situation differs from the first example, though, since Net A needs to use the real IPs of Net C, while in the (simple) example above all cross-network communication was done using virtual IPs.
First, the rules to create the virtual address space for the NAT router:
ipnatadm -I -b -W eth0 -S 10.1.0.0/16 -M 10.2.0.0/16
ipnatadm -I -b -W eth1 -S 10.1.0.0/16 -M 10.3.0.0/16
ipnatadm -I -b -W eth2 -S 138.20.0.0/16 -M 10.4.0.0/16
And now the rules needed for the 'specials' in this setup:
Net A wants to address Net C using 138.20.0.0, so we convert this destination address to Net C's virtual addresses for routing:
ipnatadm -I -b -W eth0 -D 138.20.0.0/16 -N 10.4.0.0/16
Net C wants to see Net A as 10.1.0.0:
ipnatadm -I -b -W eth2 -D 10.1.0.0/16 -N 10.2.0.0/16
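With these rules in place, the routing table of the NAT router needs only the three virtual networks, matching the observation above that it contains not a single real-world IP. Sketched again in route(8) syntax, with the interface assignment following the rules above:

route add -net 10.2.0.0 netmask 255.255.0.0 dev eth0
route add -net 10.3.0.0 netmask 255.255.0.0 dev eth1
route add -net 10.4.0.0 netmask 255.255.0.0 dev eth2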
RFC 1631, which describes dynamic NAT in detail, also tells us about possible uses. The following is another possible use besides the ones described in the RFC. I did not completely invent the setup: I got the idea because I encountered a similar situation in the company where I worked while writing NAT, so nobody can say it is completely artificial, a mere product of my imagination that could never occur in practice.
Imagine the following setup: there are two departments, each with their own private network (with some connections to the outside). For some reason they work together on a project and therefore connect their networks. However, department B is concerned about security and purchases a firewall, so that department A's access to network B can be controlled. The procedure department A has to follow to get department B's firewall administrator to change or add rules is relatively complicated and slow, one reason being that nobody at department B has much experience with firewalls.
Now, after some time has passed and everything has worked (more or less) well, department A decides it needs to hire more employees and therefore to grow its network. Since the class C network (network A-1) they have used so far does not contain many more free IPs, a new class C network (network A-2) is used. The employees in that new network also want to access department B's servers in network B, but the firewall only lets network A-1 through. In addition, department B's firewall administrator is on vacation and the others don't dare touch the firewall. Luckily, department A employs a bright administrator who knows NAT. He installs a NAT router and establishes a dynamic NAT rule on it, mapping both networks A-1 and A-2 dynamically to network A-1 addresses, thereby cheating the firewall.
This setup is indeed a bit unusual, but it is also a real-life example. Maybe it is unlikely that someone else will be in exactly the same situation, and it is also possible to find other solutions, but the latter in particular is not a good argument, because it is always applicable. The purpose of this example is just that: to give an example, not to tell anyone what to do.
Another, less obvious example would be a redirector. Dynamic NAT could, for instance, be used to redirect all packets for any IP on some fixed port to a single IP. Another way to achieve this with Linux is to use the local redirect feature and have a user space program do the redirecting. The advantage there is that this redirector program also gets to know the original destination, which is essential when using the feature to redirect all port 80 connections to a local web cache, because the web cache must be told which IP to connect to.
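The matching side of such a redirect could look like the following sketch in the syntax used here (hypothetical: I am assuming 0.0.0.0/0 is accepted as a wildcard destination, and 10.1.1.1 port 8080 stands for the redirect target):

ipnatadm -I -i -W eth0 -D 0.0.0.0/0 80 -N 10.1.1.1/32 8080

Note that rewriting the return packets correctly requires the per-connection state of dynamic NAT, since the original destination differs from connection to connection; a static reverse rule cannot reconstruct it.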
Linux masquerading is extremely popular, and many application-specific modules have been written that (among other things) take care of translating IPs transmitted in the data part of IP packets. Unfortunately, I don't know of any serious efforts to combine the various NAT parts that have been developed for Linux, like masquerading, the very basic NAT in the routing code of the 2.1 kernels, and, not to forget, my implementation.
The following rule, which gets the unique rule id (-Y) 10, inserts a virtual server rule. The virtual server IP is 138.201.148.222, and the algorithm used to select a real server is the byte counter, i.e. the server that has delivered fewer bytes than the other(s) is used:
ipnatadm -I -i -Y 10 -D 138.201.148.222/32 80 -X byte
Let us now add (-a) the real servers to the virtual server rule identified by the unique id (-Y) 10 in the chain of NAT rules:
ipnatadm -I -a -Y 10 -D 138.201.148.171/32 8888
ipnatadm -I -a -Y 10 -D 138.201.148.150/32 -w 2
The first server's http daemon listens on port 8888 for some reason we don't care about now. The second server is much bigger than the first, so we assign it a weight of 2, so that it gets used twice as much as server one.
If one of the servers fails it can easily be removed from the virtual server rule, so that the other server(s) continue alone:
ipnatadm -d -Y 10 -D 138.201.148.171/32 8888
This command deletes the first server from the rule, so that the virtual server now consists of just one real server. Here we come to a missing feature, one that can easily be added but illustrates how far we still have to go for a truly reliable implementation: when server one has been repaired and we want to take it back online, we issue the command to append (-a) its IP to the virtual server rule again. What happens then is that the byte counter for this server gets initialized to 0, so the algorithm that determines the real server to use for new connections will pick only the new IP for a while, since the other server(s), having served alone for some time, have a much higher byte count. The same happens when a virtual server is online continuously for a very long time and a byte counter overflows, though this is less likely with an 'unsigned long' counter.
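The re-append command itself is simply the same one used when the rule was first built up:

ipnatadm -I -a -Y 10 -D 138.201.148.171/32 8888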
The procedure was not meant to produce highly accurate numbers; it should just give an impression of the delays we can expect when using NAT. Conditions in networks are generally so complex that I did not see much sense in trying to be more accurate. The numbers obtained are good enough for getting a feeling for what it means to have a NAT router, and that is all they are for.
I measured five different setups:
Table 1: raw numbers from the 'time' command (all values in seconds)

 # |    t1 |    t2 |    t3 |    t4 |    t5 |    t6 | average
---+-------+-------+-------+-------+-------+-------+--------
 1 | 43.43 | 43.27 | 39.05 | 36.94 | 37.88 | 37.90 |   39.75
   | 40.39 | 41.31 | 39.66 | 37.29 | 34.95 | 34.80 |   38.07
 2 | 40.57 | 38.86 | 36.17 | 37.70 | 35.94 | 37.08 |   37.72
   | 40.24 | 41.06 | 40.10 | 35.68 | 35.22 | 34.42 |   37.79
 3 | 45.70 | 44.64 | 42.25 | 38.84 | 39.86 | 38.79 |   41.68
   | 40.54 | 43.03 | 39.43 | 37.42 | 35.85 | 35.44 |   38.62
 4 | 49.27 | 48.96 | 41.85 | 39.97 | 39.20 | 39.94 |   43.20
   | 41.57 | 42.19 | 40.73 | 36.44 | 35.79 | 36.25 |   38.83
 5 | 52.90 | 44.45 | 43.21 | 44.88 | 45.11 | 48.35 |   46.48
   | 45.82 | 45.80 | 42.39 | 40.75 | 41.02 | 41.93 |   42.95
Table 2: throughput calculated from Table 1 (approx. Kbytes/sec)

 # |  t1 |  t2 |  t3 |  t4 |  t5 |  t6 | average
---+-----+-----+-----+-----+-----+-----+--------
 1 | 482 | 484 | 536 | 567 | 553 | 553 |     527
   | 519 | 507 | 528 | 562 | 599 | 602 |     550
 2 | 516 | 539 | 579 | 556 | 583 | 565 |     555
   | 520 | 510 | 522 | 587 | 595 | 608 |     554
 3 | 458 | 469 | 496 | 539 | 525 | 540 |     502
   | 517 | 487 | 531 | 560 | 584 | 591 |     542
 4 | 425 | 428 | 500 | 524 | 534 | 524 |     485
   | 504 | 496 | 514 | 575 | 585 | 578 |     539
 5 | 396 | 471 | 485 | 467 | 464 | 433 |     451
   | 457 | 457 | 494 | 514 | 510 | 499 |     487
It is interesting to note that even on the tiny private and closed network used for the tests there is wide variation in the numbers. The general direction is clear, however. All test transfers started slowly and gradually became faster; here we see TCP's slow-start algorithm at work. I intentionally chose a setup where the hosts, and not the network, are the bottleneck, because otherwise the numbers would obviously be worthless. I did no tests with virtual servers, for three reasons. First, I did not expect any surprises, i.e. completely different numbers. Second, I have implemented the structure that stores the dynamic client data as a list, which should be replaced by a more sophisticated structure like a hash table or a binary tree anyway; the list, however, was chosen in order not to complicate my very first virtual server implementation unnecessarily, and it made bug tracking easier. Third, testing the algorithm used for selecting the real server a client is mapped to is not useful, since this choice is made just once.