Here I want to show that my implementation translates not just forwarded packets: packets destined for or originating from the local host are treated equally. This follows from the design of this implementation, which makes NAT an additional layer around the kernel's network functions (see the figure).
We have two hosts, one of which is a Linux PC using the NAT module. Its local IP, the one used to configure the network interface, is 1.1.1.1, but on the network we want to appear as 134.109.192.223 to the other host (IP 134.109.192.123).
Assuming the network (including routes!) has already been configured on both hosts, I only mention the additional steps necessary to translate the local 1.1.1.1 address:
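A single bidirectional rule in the syntax used throughout this chapter should do it (a sketch: the interface name eth0 and binding the translation to the output side are my assumptions, not taken from the original setup):

ipnatadm -O -i -b -W eth0 -S 1.1.1.1/32 -M 134.109.192.223/32

This maps the local source address 1.1.1.1 to 134.109.192.223 on outgoing packets and, because of -b, maps the destination of incoming packets back accordingly.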
The solution, or let us say 'a solution' since there are many (as always!), is to get the 53.0.0.0-addresses anyway and use static NAT on the physical network connection to the corporate Intranet.
There are various ways to use the NAT module in this case. If we choose to bind the translation to the internal interface, the one facing our 138.201-network, we need a route to this interface for network 53.3.0.0, the one we have obtained from our company; if we instead translate on the interface to the company's 53.0.0.0-network, we need a route to 138.201.0.0. We could also do separate translations for packets from our net to the 53-network and for packets coming in from the 53-network, because NAT, being a layer around the kernel, always gives a packet two points in the NAT layer to pass through. It does not hurt to do this, but it is not nice; I would prefer to bind the entire translation, for incoming and outgoing packets, to one point.
The bidirectional NAT rule we need to do all translations on the interface 'wan' is:
ipnatadm -O -i -b -W wan -S 138.201.0.0/16 -M 53.3.0.0/16
or, equivalently,
ipnatadm -I -i -b -W wan -D 53.3.0.0/16 -N 138.201.0.0/16
Again, we could instead use non-bidirectional rules, where we have to take care of packets in the opposite direction by specifying a second rule accordingly. When we omit -b in the above two rules we get exactly the pair of rules that would be needed if the implementation offered no bidirectional rules. As we can see, -b simplifies writing NAT rules a lot.
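Written out, the non-bidirectional pair corresponding to the rules above would be:

ipnatadm -O -i -W wan -S 138.201.0.0/16 -M 53.3.0.0/16
ipnatadm -I -i -W wan -D 53.3.0.0/16 -N 138.201.0.0/16

The first rule rewrites the source address of outgoing packets, the second rewrites the destination of the corresponding return packets.

Now for a different scenario: assume our webserver has to be taken out of service temporarily, and a replacement machine, whose http daemon listens on port 8888, is to answer its requests in the meantime. The following rule, bound to interface eth1, redirects packets for the webserver to the stand-in: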
ipnatadm -O -W eth1 -i -D webserver/32 80 \
-N temp-replacement/32 8888
This will work, since we know exactly which IP and port we have to insert into the return packets: the webserver's IP and port 80. So the rule for the return packets is
ipnatadm -I -W eth1 -i -S temp-replacement/32 8888 \
-M webserver/32 80
This takes care that the clients connecting to the webserver see the expected source address and port in the packets they get back, which must appear to come from the IP and port they sent their packets to. In this example we have done IP address translation as well, not just port translation. Port translation alone makes less sense than IP address translation, but it may still be useful at times.
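To illustrate pure port translation: if, say, the replacement service ran on the webserver machine itself, just on port 8888, only the port would need rewriting. A sketch, assuming the implementation accepts a mapping whose address part stays unchanged:

ipnatadm -O -i -b -W eth1 -D webserver/32 80 -N webserver/32 8888

Here only the destination port of packets going out on eth1 is rewritten from 80 to 8888, and -b takes care of the return direction.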
We have two networks, both using, now and for the foreseeable future, 10.1.0.0 addresses for their internal hosts. Network B will be addressed from network A using 10.3.0.0 addresses, and network A will be addressed from network B using 10.2.0.0 addresses.
Other combinations of rules are possible besides the following ones, but I give the bidirectional rules needed to convert Net B addresses on interface eth1 of the NAT router and Net A addresses on interface eth0, which means that both incoming and outgoing packets for a network are translated on the same interface. Packets coming from a network to the router and destined for the other network are translated when they come in, and return packets from the other network are translated just before they are sent out on the same interface.
These are the rules, one for each network:
ipnatadm -I -b -W eth0 -S 10.1.0.0/16 -M 10.2.0.0/16
ipnatadm -I -b -W eth1 -S 10.1.0.0/16 -M 10.3.0.0/16
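For these rules to work, the router must be able to route the two virtual networks toward the right interfaces. A minimal sketch of the corresponding routes, assuming the classic route(8) syntax of 2.0-era systems:

route add -net 10.2.0.0 netmask 255.255.0.0 dev eth0
route add -net 10.3.0.0 netmask 255.255.0.0 dev eth1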
Now, when a host in Net A contacts a host in Net B, its IP (the source IP of the packet) is converted to a 10.2.0.0-address, so that for the host in Net B it appears to have come from that network. Net B sends its response to this 10.2.0.0-address, which will be routed to interface eth0 on the NAT router (see the routing table in the figure). Just before this answer packet is sent out to the host on Net A, its destination address is changed to the 10.1.0.0-address of the host in Net A, so that this host recognizes the packet as an answer to the packet it had sent earlier. The router's kernel only ever sees the 10.2.0.0- and 10.3.0.0-addresses, never the local 10.1.0.0-addresses, since the NAT layer hides them from the kernel's network functions. The router's routing therefore works on virtual IPs that belong to no real host on either of the two networks the router serves.

This is the point to mention a bug in the implementation. It is not really a bug but rather missing cooperation between the module and the kernel: the NAT router will issue ARP requests for those virtual addresses. Everything works fine nevertheless; there are just some senseless ARP packets on the wire. I have not investigated the problem further after it became clear that, first, everything works, and second, non-trivial changes to the kernel would be necessary. The latter conflicts with my intention not to interfere with other kernel code unless it is absolutely essential for the module to work at all. This is all the more important considering that large parts of the networking code have been rewritten in the 2.1 kernel series, and I do not know how the new kernels behave in this ARP case. The fix is not simply a matter of finding the function call that causes ARP resolution before a packet is sent and placing my NAT function right before it, thereby preventing virtual IPs from being used in such requests; that much I have done already, otherwise NAT would not work at all in this virtual-IP case. The problem seems to be connected to Linux' routing code, which creates an entry in an rt (route) cache that also contains fields for address resolution information. The routing code in particular has been radically redesigned in the 2.1 series, so I see little sense in messing around in the 2.0 kernel code on this point.
The situation is basically the same as above; we only add something: a third network, using 138.20.0.0 addresses in this example, which we will call Net C. The tricky thing is that Net A and Net C have already been connected using the router that is now our NAT router, and all the people in both networks have become used to the other network's IPs, so that these are hardwired not just in the brains of the people but also in lots of code, e.g. in firewall rules of subnetwork firewalls (where using DNS is a bad idea, since DNS can be spoofed) or in /etc/hosts files. To summarize, whatever the reasons may be: Net C wants to continue talking to Net A using the 10.1.0.0 addresses, and Net A's people want to say 138.20.x.y to Net C hosts.
This does not sound that complicated, and indeed it is not, but I want to use it to show two different ways to solve the problem: one uses the packet matching code, the other uses a completely virtual address space. We already had a completely virtual address space in the example above, but here, if Net C connects to Net A using Net A's real IPs, that is simple routing; I will make it more complicated for the example's sake. Note that this is unnecessarily complicated for the real world, but it is an example used for demonstration.
Again, when we look at the routing table in the figure above, there is not a single real-world IP in it. This time the situation differs from the first example, though, since Net A needs to use the real IPs of Net C, while in the (simple) example above all cross-network communication was done using virtual IPs.
First, the rules to create the virtual address space for the NAT router:
ipnatadm -I -b -W eth0 -S 10.1.0.0/16 -M 10.2.0.0/16
ipnatadm -I -b -W eth1 -S 10.1.0.0/16 -M 10.3.0.0/16
ipnatadm -I -b -W eth2 -S 138.20.0.0/16 -M 10.4.0.0/16
And now the rules needed for the 'specials' in this setup:
Net A wants to address Net C using 138.20.0.0, so we convert this destination address to Net C's virtual addresses for routing:
ipnatadm -I -b -W eth0 -D 138.20.0.0/16 -N 10.4.0.0/16
Net C wants to see Net A as 10.1.0.0:
ipnatadm -I -b -W eth2 -D 10.1.0.0/16 -N 10.2.0.0/16
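With these rules in place, the routing table of the NAT router needs only the three virtual networks, matching the observation above that it contains not a single real-world IP. Sketched again in route(8) syntax, with the interface assignment following the rules above:

route add -net 10.2.0.0 netmask 255.255.0.0 dev eth0
route add -net 10.3.0.0 netmask 255.255.0.0 dev eth1
route add -net 10.4.0.0 netmask 255.255.0.0 dev eth2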
RFC 1631, which describes dynamic NAT in detail, also tells us about possible uses. The following is another possible use besides the ones described in the RFC. I did not completely invent the setup: I got the idea because I encountered a similar situation in the company where I worked while writing NAT, so nobody can say it is completely artificial, a mere product of my imagination that could never occur in practice.
Imagine the following setup: there are two departments, each with their own private network (with some connections to the outside). For some reason they work together on a project and therefore connect their networks. However, department B is concerned about security and purchases a firewall, so that department A's access to network B can be controlled. The procedure department A has to follow to get department B's firewall administrator to change or add rules is relatively complicated and slow, one reason being that nobody at department B has much experience with firewalls.
Now, after some time has passed and everything has worked (more or less) well, department A decides it needs to hire more employees and therefore to grow its network. Since the class C network (network A-1) they have used so far does not contain many more free IPs, a new class C network (network A-2) is used. The employees in that new network also want to access department B's servers in network B, but the firewall only lets network A-1 through. In addition, department B's firewall administrator is on vacation and the others don't dare touch the firewall. Luckily, department A employs a bright administrator who knows NAT. He installs a NAT router and establishes a dynamic NAT rule on it, mapping both networks A-1 and A-2 dynamically to network A-1 addresses, thereby cheating the firewall.
This setup is indeed a bit unusual, but it is also a real-life example. Maybe it is unlikely that someone else will be in exactly the same situation, and it is also possible to find other solutions, but the latter in particular is not a good argument, because it is always applicable. The purpose of this example is just that: to give an example, not to tell anyone what to do.
Another, less obvious example would be a redirector. Dynamic NAT could, for instance, be used to redirect all packets for any IP on some fixed port to a single IP. Another way to achieve this with Linux is to use the local redirect feature and have a user space program do the redirecting. The advantage there is that this redirector program also gets to know the original destination, which is essential when using the feature to redirect all port 80 connections to a local web cache, because the web cache must be told which IP to connect to.
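The matching side of such a redirect could look like the following sketch in the syntax used here (hypothetical: I am assuming 0.0.0.0/0 is accepted as a wildcard destination, and 10.1.1.1 port 8080 stands for the redirect target):

ipnatadm -I -i -W eth0 -D 0.0.0.0/0 80 -N 10.1.1.1/32 8080

Note that rewriting the return packets correctly requires the per-connection state of dynamic NAT, since the original destination differs from connection to connection; a static reverse rule cannot reconstruct it.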
Linux masquerading is extremely popular, and many application-specific modules have been written that (among other things) take care of translating IPs transmitted in the data part of IP packets. Unfortunately, I don't know of any serious efforts to combine the various NAT parts that have been developed for Linux, like masquerading, the very basic NAT in the routing code of the 2.1 kernels, and, not to forget, my implementation.
The following rule, which gets the unique rule id (-Y) 10, inserts a virtual server rule. The virtual server IP is 138.201.148.222, and the algorithm used to select a real server is the byte counter, i.e. the server that has delivered fewer bytes than the other(s) is used:
ipnatadm -I -i -Y 10 -D 138.201.148.222/32 80 -X byte
Let us now add (-a) the real servers to the virtual server rule identified by the unique id (-Y) 10 in the chain of NAT rules:
ipnatadm -I -a -Y 10 -D 138.201.148.171/32 8888
ipnatadm -I -a -Y 10 -D 138.201.148.150/32 -w 2
The first server's http daemon listens on port 8888 for some reason we don't care about now. The second server is much bigger than the first, so we assign it a weight of 2, so that it gets used twice as much as server one.
If one of the servers fails it can easily be removed from the virtual server rule, so that the other server(s) continue alone:
ipnatadm -d -Y 10 -D 138.201.148.171/32 8888
This command deletes the first server from the rule, so that the virtual server now consists of just one real server. Here we come to a missing feature, one that can easily be added but illustrates how far we still have to go for a truly reliable implementation: when server one has been repaired and we want to take it back online, we issue the command to append (-a) its IP to the virtual server rule again. What happens then is that the byte counter for this server gets initialized to 0, so the algorithm that determines the real server to use for new connections will pick only the new IP for a while, since the other server(s), having served alone for some time, have a much higher byte count. The same happens when a virtual server is online continuously for a very long time and a byte counter overflows, though this is less likely with an 'unsigned long' counter.
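The re-append command itself is simply the same one used when the rule was first built up:

ipnatadm -I -a -Y 10 -D 138.201.148.171/32 8888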
The procedure was not meant to produce highly accurate numbers; it should just give an impression of the delays we can expect when using NAT. Conditions in networks are generally so complex that I did not see much sense in trying to be more accurate. The numbers obtained are good enough for getting a feeling for what it means to have a NAT router, and that is all they are for.
I measured five different setups:
Table 1: raw numbers from the 'time' command (all values in seconds)

 # |    t1 |    t2 |    t3 |    t4 |    t5 |    t6 | average
---+-------+-------+-------+-------+-------+-------+--------
 1 | 43.43 | 43.27 | 39.05 | 36.94 | 37.88 | 37.90 |   39.75
   | 40.39 | 41.31 | 39.66 | 37.29 | 34.95 | 34.80 |   38.07
 2 | 40.57 | 38.86 | 36.17 | 37.70 | 35.94 | 37.08 |   37.72
   | 40.24 | 41.06 | 40.10 | 35.68 | 35.22 | 34.42 |   37.79
 3 | 45.70 | 44.64 | 42.25 | 38.84 | 39.86 | 38.79 |   41.68
   | 40.54 | 43.03 | 39.43 | 37.42 | 35.85 | 35.44 |   38.62
 4 | 49.27 | 48.96 | 41.85 | 39.97 | 39.20 | 39.94 |   43.20
   | 41.57 | 42.19 | 40.73 | 36.44 | 35.79 | 36.25 |   38.83
 5 | 52.90 | 44.45 | 43.21 | 44.88 | 45.11 | 48.35 |   46.48
   | 45.82 | 45.80 | 42.39 | 40.75 | 41.02 | 41.93 |   42.95
Table 2: throughput calculated from Table 1 (approx. Kbytes/sec)

 # |  t1 |  t2 |  t3 |  t4 |  t5 |  t6 | average
---+-----+-----+-----+-----+-----+-----+--------
 1 | 482 | 484 | 536 | 567 | 553 | 553 |     527
   | 519 | 507 | 528 | 562 | 599 | 602 |     550
 2 | 516 | 539 | 579 | 556 | 583 | 565 |     555
   | 520 | 510 | 522 | 587 | 595 | 608 |     554
 3 | 458 | 469 | 496 | 539 | 525 | 540 |     502
   | 517 | 487 | 531 | 560 | 584 | 591 |     542
 4 | 425 | 428 | 500 | 524 | 534 | 524 |     485
   | 504 | 496 | 514 | 575 | 585 | 578 |     539
 5 | 396 | 471 | 485 | 467 | 464 | 433 |     451
   | 457 | 457 | 494 | 514 | 510 | 499 |     487
It is interesting to note that even on the tiny private and closed network used for the tests there is wide variation in the numbers. The general direction is clear, however. All test transfers started slowly and gradually became faster; here we see TCP's slow-start algorithm at work. I intentionally chose a setup where the hosts, and not the network, are the bottleneck, because otherwise the numbers would obviously be worthless. I did no tests with virtual servers, for three reasons. First, I did not expect any surprises, i.e. completely different numbers. Second, I have implemented the structure that stores the dynamic client data as a list, which should be replaced by a more sophisticated structure like a hash table or a binary tree anyway; the list, however, was chosen in order not to complicate my very first virtual server implementation unnecessarily, and it made bug tracking easier. Third, testing the algorithm used for selecting the real server a client is mapped to is not useful, since this choice is made just once.