Configuring an Xserve as a Router
Volume Number: 19 (2003)
Issue Number: 5
Column Tag: Sysadmin
A Case Study in System Administration for Digital Janitors
by Richard Patterson
An Xserve user experience
Illusion Arts purchased an Xserve in September 2002 after a very persuasive marketing seminar by Apple. Generally when a new computer product is introduced, some users find it to be a complete plug-and-play dream while others descend into a nether world where everything that should work doesn't. Our experience with the Xserve has definitely been of the surreal variety, and I am tempted to chronicle the whole nightmare so that the reader is forced to submit to the same initiation ritual I endured; but this is not the place for that. What I hope to accomplish here is to harvest the fruit of that experience in a way that spares at least one other person from having to negotiate the maze and permits him or her to get on with life.
The kitchen sink
Illusion Arts is a small visual effects facility with about 20 employees which was spun off from the Universal Studios Matte Department in 1985. We use Macs for painting and compositing and for some 3D work. We also use Windows computers to run 3D, tracking and other image processing tools that do not exist on the Mac. For years we relied on sneakernet to move data through the facility, but in 2000 we centralized our storage with a Sun server. We bought an Xserve for a variety of reasons, and it had to fit into a network containing the following components:
1) A Sun 450 with a 1.5 terabyte RAID serving files to Macs and Windows workstations and twelve Ethernet ports including two fiber and one copper gigabit ports.
2) A 48 port 100 baseT switch with a dual fiber gigabit uplink to the Sun.
3) About 30 Mac workstations running OS 9.2
4) 5 Windows 2000 workstations
5) 8 Macs on a renderfarm
6) 12 Windows 2000 render computers
7) A 24 port 100 baseT switch with a copper gigabit uplink
8) A Cobalt Qube internet gateway, email and webpage server
We ordered a fiber gigabit Ethernet card for the Xserve with the idea that it could connect to the rest of the network via one of the uplinks on the 48-port switch. This could have been replaced by a converter to go from copper to fiber gigabit; but I originally thought the fiber card could be ordered instead of a second copper card, and I had trouble finding the converter. We ordered the Xserve with two copper Ethernet cards just because it was faster to get the off-the-shelf configuration rather than a "custom build." Theoretically the second copper gigabit card in the Xserve is redundant. We have used it for a direct connection to the Sun, which may have some pedagogical value as an example of dealing with network connections. We ended up getting a copper-to-fiber converter as well in order to set up the network the way we finally decided we wanted it. By the time I was done I found myself purchasing a 12-port gigabit switch so that I could reconfigure everything again. The first moral of my story is that it is more economical to know what you want to do and what is possible to do before you start buying components.
Originally both the gigabit uplinks on the 48-port switch had been used to connect to the Sun as a means of balancing the load, but several people had told me that this was probably unnecessary. The bottleneck, if there was one, would not be in the uplink connection to the Sun or in the gigabit card in the Sun.
OS 10.2 Server does not have a graphical user interface for configuring the Xserve as a router or a bridge or a gateway. Perhaps by the time anyone reads this a future release of the system software will provide an easy, intuitive graphical interface for setting up an Xserve to do what we wanted to do. For now, however, it is necessary to pretend that you are a UNIX system administrator to make it work. The best time to do this is when you first get your Xserve and are game to reinstall the system from scratch if anything gets screwed up. In fact I might recommend reinstalling the system a couple of times just for the hell of it so that you feel completely free to mess up anything and everything with an inadvertent command. Don't let a seasoned system administrator put the fear of God in you with stories about how novices typing in the command line can take an entire system down for the count. Just do it.
How much do you need to understand about TCP/IP networks in order to set up an Xserve with more than one Ethernet card? If I describe the steps well enough you may not have to understand anything at all, but I always think an experience is much more enjoyable if I feel I understood something about what happened. So my answer to the question is, "Enough to feel satisfied." I shall try to include the amount of contextual information that it takes to satisfy me and indicate the areas where I am still unsatisfied.
First and foremost it is helpful to understand something about an IP address. I first accepted IP addresses into my life when I had to set up a modem connection to an internet service provider. They gave me IP addresses to enter like a phone number and I tried to write them down somewhere I could remember so that I could set the modem up again every time I re-installed the system. An IP address is very much like a phone number. It is actually a string of 32 bits, which for the sake of convenience are broken up into 4 groups of 8 bits and referred to by the decimal equivalent of each 8 bit group. The notational convention uses a "." to separate each portion as in 184.108.40.206. If all the bits were set to 1 the result would be 255.255.255.255.
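For readers who like to see the arithmetic, here is a minimal sketch of the "32 bits in four groups of 8" idea. It is written in present-day Python rather than anything that shipped with the Xserve, and the address 172.1.5.203 is only an illustrative value; the function names are my own invention.

```python
def dotted_to_int(address):
    """Pack a dotted-quad string into a single 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("not a valid IPv4 address: %r" % address)
    value = 0
    for octet in octets:
        # Shift what we have 8 bits to the left and append the next group.
        value = (value << 8) | octet
    return value


def int_to_dotted(value):
    """Unpack a 32-bit integer back into dotted-quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


example = "172.1.5.203"  # illustrative address only
print(format(dotted_to_int(example), "032b"))  # the same address as 32 bits
print(int_to_dotted(0xFFFFFFFF))               # every bit set to 1
```

The second print statement shows the case mentioned above, where all 32 bits are set.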
Each portion of the IP address has a different significance just like the area code, exchange and number function differently in a phone number. There are conventions and rules governing the use of various ranges of numbers which need not concern us here. I am going to assume that the reader already has some IP addresses in play on his network. The two portions of the IP address that matter to us are the last two numbers. The next to last number can be thought of as comparable to the exchange portion of a telephone number. (I am giving away my age here, perhaps, by referring to the first three numbers in an American phone number as the "exchange." In my day, Kid, phone numbers started with things like Alpine 2, which was dialed as AL2 and eventually just became 252.) In the IP address 220.127.116.11 the 5 can generally be taken as specifying the "subnet."
Actually it is probably the use of the mask which defines the subnet. (Please note that whenever I use a qualifier like "probably" or "generally" we have entered an area where my understanding of things is less than totally satisfying, but I am not letting that stop me.) The mask is usually specified as 255.255.255.0. This indicates a string of bits where the first 24 are 1's and the last 8 are 0's. A mask functions as a sieve and only lets things through where there are holes, i.e. zeroes. It is used to separate the portion of the IP address which defines the individual device to which the message is addressed from the rest of the address, which functions like an area code and exchange to define networks and subnets. With a private network like ours the easiest way to create subnets is by using a mask of 255.255.255.0 and letting the third "octet" specify the subnet as in 172.1.1.x, 172.1.2.x, etc. This permits each subnet to have 254 unique addresses. (Zero and 255 cannot be used for a device as they are reserved for other purposes.) Subnets can also be created by using masks with more than 24 ones, as in 255.255.255.192, which has 26 ones followed by six zeros. This is useful in many cases but requires more effort to determine which subnet a particular address will be in. In this case the two extra ones mean that there are four different possible subnets having 172.1.1 as their first three numbers. The result is that 18.104.22.168 is in a different subnet from 22.214.171.124 and you cannot use the addresses of 126.96.36.199 or 188.8.131.52, because they become the equivalent of 0 and 255. If you really want to understand this I suggest you pursue it later via Microsoft Knowledge Base Article #164015 or any number of other discussions of TCP/IP addressing.
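The counting in the paragraph above can be checked with a short sketch. It assumes a current Python 3 and its standard ipaddress module (nothing Apple supplied), and uses the 172.1.1 range purely as an example.

```python
import ipaddress

# A mask of 255.255.255.0: the third octet picks the subnet.
net24 = ipaddress.ip_network("172.1.1.0/255.255.255.0")
usable = list(net24.hosts())  # leaves out the .0 and .255 addresses
print(len(usable), "usable addresses, from", usable[0], "to", usable[-1])

# A mask of 255.255.255.192 (26 ones) cuts the same range into four pieces.
quarters = list(net24.subnets(new_prefix=26))
for sub in quarters:
    # The first and last address of each piece play the roles of 0 and 255.
    print(sub, "reserved:", sub.network_address, "and", sub.broadcast_address)
```

With the longer mask, each of the four pieces gives up its own first and last address, which is why some addresses in the middle of the range become unusable for devices.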
A mask of 255.255.255.0 applied to an address of 184.108.40.206 tells the system that 203 is the unique part of the address. Another notation used to refer to a mask like this is /24, as in 172.1.5/24. The /24 means the mask has 24 1's and the rest 0's. There are times when this is a more convenient way to specify the mask, but we are not going there right now.
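The dotted mask and the shorthand that simply counts its ones carry the same information. As a hedged illustration, here is a small Python sketch that converts between the two; the function names are hypothetical and the code makes no claim about how OS X does it internally.

```python
def mask_to_prefix(mask):
    """Count the leading 1 bits in a dotted mask, e.g. 255.255.255.0 -> 24."""
    bits = "".join(format(int(octet), "08b") for octet in mask.split("."))
    # A usable mask is 32 bits long and never has a 1 after a 0.
    if len(bits) != 32 or "01" in bits:
        raise ValueError("not a contiguous 32-bit mask: %r" % mask)
    return bits.count("1")


def prefix_to_mask(prefix):
    """Build the dotted mask that has the given number of leading 1 bits."""
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    value = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(mask_to_prefix("255.255.255.0"))
print(prefix_to_mask(26))
```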
We now know enough to encounter our first obstacle (or opportunity for achievement if you prefer). Two Ethernet ports in one computer are not generally happy if they live in the same subnet. There may very well be numerous exceptions to this or ways to avoid it altogether, but for the purposes of this exercise it is being awarded the status of Fact Of Life. We are also going to simplify life by declaring the subnet to be defined by the next to last portion of the IP address. I got used to this idea when the experts who set up our Sun server loaded it up with 12 Ethernet ports and we had 12 different subnets. In the Chooser on a Mac workstation this produced 12 "Zones." With AppleTalk a subnet becomes a Zone, but beware of using the term "zone" when you are discussing your problem with an honest-to-god UNIX or Microsoft System Administrator. He or she will probably not know what the term means and only think you are demonstrating your ignorance or confusion.
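Before typing addresses into each card it can be reassuring to confirm that no two of them land in the same subnet. The following Python sketch does that check; the interface names follow the en0, en1, en2 pattern that OS X uses, but the addresses are hypothetical placeholders, not the ones on our Xserve.

```python
import ipaddress
from itertools import combinations


def overlapping_interfaces(interfaces):
    """Return pairs of interface names whose configured subnets overlap.

    `interfaces` maps a name to an address with its mask length,
    for example {"en0": "172.1.1.5/24"}.
    """
    networks = {
        name: ipaddress.ip_interface(cidr).network
        for name, cidr in interfaces.items()
    }
    return [
        (a, b)
        for a, b in combinations(networks, 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical plan for three cards, one subnet each.
cards = {"en0": "172.1.1.5/24", "en1": "172.1.2.5/24", "en2": "172.1.3.5/24"}
print(overlapping_interfaces(cards))
```

An empty list means every card has a subnet to itself, which is the arrangement this article assumes.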
IP addresses can be static or dynamic. A static IP address is manually assigned to the computer or device and it stays the same until a human being changes it. A dynamic address is assigned by another device "on the fly" such that the human being never needs to care what the current IP address is. As enticing as this sounds, I have had no truck with dynamic IP addresses. Our network is small enough that it is feasible for us to assign and keep track of IP addresses. I have also been under the impression that some of the devices on which we depended, depend in turn on static IP addresses. This discussion is limited to a network based on static IP addresses.
OS 10.2 Server does have a graphical interface for assigning IP addresses to the Ethernet cards installed in your Xserve. Even if you intend to run your Xserve without a monitor and keyboard attached to it, I strongly recommend that you temporarily put a monitor and keyboard on it until you are satisfied that it is functioning properly. I do not even know if it is possible to install (or re-install) the operating system without a monitor and keyboard, and I have never been tempted to find out. I am reasonably certain it is possible to assign IP addresses to Ethernet cards by logging in remotely, but it is certainly easier to do via the graphical interface. The Network system preference panel lets you assign the IP address to each card just as you can assign an IP address to any workstation in System 9 or 10. If you have more than one card installed there should be a pull down menu allowing you to select the card you want to configure.
With our Xserve I initially used IP addresses of 220.127.116.11, 18.104.22.168 and 22.214.171.124 for the three Ethernet cards. The 172.1 is inherited from our existing network configuration. Since this is an internal network which is not visible (we hope) to the internet, we are free to use whatever range of IP addresses feels good. I have no idea why 172.1.x.x was chosen by the vendor who set up our server, but I see no reason to change it. I chose the subnets based on the fact that the majority of our facility had been using IP addresses of 172.1.1.x and 172.1.2.x. The 48 port switch to which most of our workstations were attached was set up with half of the ports in Zone 1 and half in Zone 2. Since I was stealing the second gigabit uplink, my plan was to put all of the workstations in Zone 1 and connect the Xserve to the second gigabit uplink.
I shall describe later how I had to reconfigure the switch, but for now the important point is that putting three Ethernet cards in an Xserve and giving them the IP addresses I did created three sub-nets that did not automatically connect with each other even though they shared the same bed. The Xserve itself could communicate with any device attached to each of the cards, but a computer at 126.96.36.199 (attached to the third card in the Xserve) could not communicate with a computer at 188.8.131.52 (attached to the first card). To make this happen we must configure the Xserve as a router.
The first thing to do is enable IP forwarding. IP forwarding apparently does exactly what it sounds like it should do. IP packets arriving via one Ethernet card are passed along to the networks connected to the other Ethernet card(s). I believe IP forwarding by itself will only forward packets to a card with an address whose subnet matches that of the address to which the packet is being sent. If there are additional subnets branching off downstream somewhere, additional routing configurations are required.
With OS X 10.2 Server IP forwarding is turned off by default. I think in earlier versions it may have been on by default. This would at least explain why some users felt some functionality had gone hiding with the release of 10.2. If IP forwarding is off, nothing attached to one Ethernet card will be able to communicate with anything attached to the second Ethernet card. Enabling IP forwarding is necessary but not always sufficient to make this happen. Depending on the complexity of your network it may be that the only other necessary step is to set the default router on each workstation to the IP address of the Ethernet card in the Xserve to which that workstation is connected.
There is a system file called hostconfig which tells your Xserve whether or not to enable IP forwarding when it starts up. It is a text file but it is normally hidden from view. The easiest way to inspect the file is via the Terminal window. Open the Terminal, and type cat /etc/hostconfig.
[athena:~] admin% cat /etc/hostconfig
# This file is maintained by the system control panels
# Network configuration
...
IPFORWARDING=-NO-
...
Note the line that says IPFORWARDING=-NO-. This is telling the Xserve not to enable IP forwarding and is the default state when the machine is shipped. What you need to do is change that line so that it says IPFORWARDING=-YES-. Unfortunately you can not do this in the display produced by the cat command. Cat (which I believe is short for concatenate) is one of several ways to view the contents of a text file; but it is only a display of the contents and not a means of changing them and saving the result. There are text editors built into UNIX designed for this task, but they are non-intuitive in the extreme. Proficiency in, and a preference for, one of these editors is a sign of a true Initiate. Fortunately for the rest of us who do not have time to play this game, there is BBEdit Lite, the free version of Bare Bones Text Editing software. Version 6.1 which includes an OS-X version can be downloaded from www.barebones.com. It is an invaluable tool, which also helps in moving text files between Mac and Windows operating systems.
With BBEdit you can select Hidden Files from the File menu and you will see all of the system directories and files. The etc directory is on the root level and contains hostconfig. You can open it, edit it and save it just like you would expect to be able to do. It may be wise to save a backup version of it before you save your edited version. Creating a file called hostconfig.backup will not confuse the operating system. If you run into permissions problems when you try to save it, try logging in as root and editing the file.
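Once you have saved the version of hostconfig containing the line IPFORWARDING=-YES-, the computer will always enable IP forwarding when it starts up until you re-edit this file. For the edited file to take effect, you must restart the computer. (It is also possible to switch forwarding on immediately with sudo sysctl -w net.inet.ip.forwarding=1, but a setting made that way does not survive a restart.)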
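If you would rather not make the edit by hand, the same change can be sketched in a few lines of Python. This assumes a Python 3 interpreter is available and that the file contains exactly one IPFORWARDING line; the function names are made up for this illustration, and the file-writing helper needs root privileges and is defined here but never called.

```python
import re
import shutil


def set_ip_forwarding(hostconfig_text, enabled):
    """Return hostconfig text with the IPFORWARDING line set to -YES- or -NO-."""
    value = "-YES-" if enabled else "-NO-"
    new_text, count = re.subn(
        r"(?m)^IPFORWARDING=.*$", "IPFORWARDING=" + value, hostconfig_text
    )
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError("expected exactly one IPFORWARDING line, found %d" % count)
    return new_text


def update_hostconfig(path="/etc/hostconfig"):
    """Save a hostconfig.backup copy, then rewrite the file. Run as root."""
    shutil.copy2(path, path + ".backup")
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(set_ip_forwarding(text, True))


# Hypothetical fragment of the file, used only to show the transformation.
sample = "# Network configuration\nHOSTNAME=-AUTOMATIC-\nIPFORWARDING=-NO-\n"
print(set_ip_forwarding(sample, True))
```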
After you restart you can verify that IP forwarding has in fact been enabled by entering the command sysctl -a in the terminal window. You will get back a long list of variable settings including one line which says net.inet.ip.forwarding: 1. If it says net.inet.ip.forwarding: 0, then IP forwarding has not been enabled for some reason. So long as you use the -a flag with the command, sysctl will not change anything about the way your system is configured.
To avoid having to look through the long list of variables you can use grep to have it show you only what you are interested in. Using the | to "pipe" the output of one command through another is one of the essential tricks to using the command line. The grep command is equally essential. It extracts from a flood of text only the lines containing the word or "character string" you tell it you are looking for.
[athena:~] admin% sysctl -a | grep ip.forwarding
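The same filtering can be done in a script when you want a yes-or-no answer rather than a line to read. This is a minimal Python sketch that assumes the "name: value" layout quoted above; the sample text is a made-up excerpt, not captured output, and the function name is hypothetical.

```python
def ip_forwarding_enabled(sysctl_output):
    """Return True or False for net.inet.ip.forwarding, or None if it is absent."""
    for line in sysctl_output.splitlines():
        name, separator, value = line.partition(":")
        # Compare the whole variable name so similar names are not mistaken for it.
        if separator and name.strip() == "net.inet.ip.forwarding":
            return value.strip() == "1"
    return None


# Made-up excerpt in the same layout as sysctl -a output.
sample_output = """net.inet.ip.ttl: 64
net.inet.ip.forwarding: 1
net.inet.ip.redirect: 1
"""
print(ip_forwarding_enabled(sample_output))
```

Matching on the exact variable name is a small design choice: grep treats the "." in ip.forwarding as "any character," which is harmless here but looser than it looks.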
Once you know IP forwarding is enabled, make sure each computer connected to your Xserve is set up with the IP address of the right Ethernet card in the Xserve as its router. With OS-X this is done in the Network Preferences panel.
Note that the workstation's IP address and the Router IP address must be in the same subnet. In this example they both begin with 172.1.1. If 184.108.40.206 were the address of one of the Ethernet cards in my Xserve, IP forwarding would enable me to connect to any other computer attached to the Xserve, provided that computer is in the same subnet as the Ethernet card to which it is attached. Without IP forwarding in the Xserve I would only be able to see computers in my own 172.1.1 subnet.
If you have turned on IP forwarding and set the router address on your workstations and you discover that everybody can connect to everybody else, declare victory and quit the field.
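The rule that a workstation and its router must share a subnet is easy to test before walking around the building changing settings. Here is a hedged Python sketch using the standard ipaddress module; the addresses are placeholders and the function name is my own.

```python
import ipaddress


def router_is_local(workstation, mask, router):
    """Return True if the router address falls inside the workstation's subnet."""
    subnet = ipaddress.ip_interface(workstation + "/" + mask).network
    return ipaddress.ip_address(router) in subnet


# Placeholder addresses: a workstation in 172.1.1 and two candidate routers.
print(router_is_local("172.1.1.40", "255.255.255.0", "172.1.1.1"))
print(router_is_local("172.1.1.40", "255.255.255.0", "172.1.2.1"))
```

A workstation cannot hand packets to a router it cannot reach directly, so a False result here means the Router field needs a different address.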
If life is not so simple, the first thing to do is go to one of the workstations and see if you can ping a workstation on another subnet. Ping is a command found in UNIX and Windows alike which sends a signal out to the IP address you specify and requests a response. If a response comes back it tells you how long it took; if no response comes back, you still have a problem. OS-X has a nice Network Utility (found in the Utilities sub-directory under Applications) which will ping and also do a traceroute. A traceroute sends out a message to an IP address and reports back the address of each step it has to take along the way. In our example a computer at 220.127.116.11 attempting to connect to a computer at 18.104.22.168 will report that it first went to 22.214.171.124 and then found 126.96.36.199. In other words 188.8.131.52 is connected to one card in the Xserve with an address of 184.108.40.206 and the message is forwarded to the card with the address of 220.127.116.11 which is in turn connected to the computer at 18.104.22.168. The two cards in the Xserve are considered one "hop" along the way.
If you are unable to ping computers on other subnets, it is possible you need to clean up the routing table in the Xserve or to explicitly add routes to it. Using the Network Utility select the Netstat tab and check the Display Routing Table Information button. This is the equivalent of entering the netstat -r command in the terminal window. I have found that the Netstat function in the Network Utility can take a long time to display anything if you have just recently rebooted. If it seems not to be working, use the command line netstat -r which is generally more responsive.
Most of what it shows you is the routing table that has been generated as a result of configuration settings and activity that has actually taken place. The table seems to divide into two sections, and I know nothing about the section labeled Internet6. I do not believe it is relevant to this discussion. Most of the first part of the table is clear enough.
The Destination is the IP address or the range of IP addresses to which a message may be sent. Don't ask me about the 169.254 and don't worry about it. I wasted a fair amount of time worrying about it and even trying to remove it from the routing table until I noticed that it is always there and does nothing to get in my way. Things like that are best left alone.
Localhost is a name for the computer in which this table resides. Since computers have evolved like human beings to the point where they like and even need to talk to themselves, there is a loopback route in which the localhost arrives at itself by using itself as the Gateway.
The Gateway is what you have to be routed through in order to connect with your destination. The three Ethernet cards show up here as link#4, link#5 and link#6 and are listed as the Gateways for the subnets attached to them. In other words if you want to get to any address beginning with 172.1.2 you go through link#4.
The Netif is, I believe, the network interface or the name by which the Xserve knows the physical devices. In this case en0, en1 and en2 are the three Ethernet cards and lo0 is the Xserve itself.
The Flags are shorthand indications of things about the routes and even the definitive book on TCP/IP says they are of only marginal value. It is comforting to know that the U means the route is up and operational, but the chances are good that it would not be in the table at all if it were not. The Refs, Use and Expire information is of no relevance to this discussion.
The reason to look at the routing table is to peruse the destinations and gateways. If a connection has been made to a specific computer or device, the routing table may store its MAC or hardware address instead of its IP address. You can be happy they are connected and move on. What you are looking for is whether there is an appropriate route for each subnet and whether there are contradictory or conflicting routes. There should only be one routing statement for each subnet. The subnet statements are the ones with the Destinations like 172.1.2/24. This means any IP address beginning with 172.1.2. If for some reason there are two routes for a given subnet, then remove one using the route command.
route delete 172.1.1/24 22.214.171.124
This command would delete the route for subnet 172.1.1, but not just any user is allowed to issue commands like this. The surest way to achieve this goal is to log into the server as root, in which case you can be certain the Xserve will attend to your every command. You may also be able to use the sudo command to make the Xserve pay attention. Sudo stands for "superuser do" or "I am the super user, do what I tell you."
sudo route delete 172.1.1/24 126.96.36.199
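Spotting a subnet that has picked up two routes is tedious by eye in a long table. The following Python sketch scans text in the general column layout of netstat -r output and reports any subnet destination listed with more than one gateway. The sample table is entirely hypothetical, the function name is invented, and the parsing assumes Destination and Gateway are the first two columns.

```python
from collections import defaultdict


def duplicate_subnet_routes(netstat_output):
    """Map each subnet destination that appears more than once to its gateways."""
    gateways = defaultdict(list)
    for line in netstat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Internet6:":
            break  # ignore the IPv6 half of the table
        if len(fields) < 2 or fields[0] == "Destination":
            continue
        destination, gateway = fields[0], fields[1]
        if "/" in destination:  # subnet routes look like 172.1.2/24
            gateways[destination].append(gateway)
    return {dest: gws for dest, gws in gateways.items() if len(gws) > 1}


# Hypothetical routing table with a doubled-up route for 172.1.2/24.
sample_table = """Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            172.1.1.254        UGSc        3       12    en0
127.0.0.1          127.0.0.1          UH          9     1024    lo0
172.1.1/24         link#4             UCS         2        0    en0
172.1.2/24         link#5             UCS         1        0    en1
172.1.2/24         172.1.3.9          UGSc        0        0    en2
"""
print(duplicate_subnet_routes(sample_table))
```

Any subnet that shows up in the result is a candidate for the route delete command shown above.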
Any route you remove, you can also restore with the route command.
sudo route add 172.1.1/24 188.8.131.52
This is how I manually added the route to the table listed above. The reason I did so and the reason I also changed the default route by deleting the original default route and adding the revised one is that I have a switch at 184.108.40.206. Most of the computers attached to this switch have addresses beginning 172.1.1 and the Sun server with the RAID is at 220.127.116.11. This switch is also the intermediate connection to the Qube and the DSL modem through which we access the internet. Setting the default route to this switch enables computers attached to the Xserve to find their way to the internet.
The default route is the route assigned to any incoming message with a destination for which there is no explicit routing instruction.
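To make the idea of a default route concrete, here is a small Python sketch of the lookup: pick the most specific route that contains the destination, and fall back to the default when nothing else matches. This illustrates the general principle only, not the code inside OS X, and the table entries and gateway addresses are hypothetical.

```python
import ipaddress


def choose_gateway(destination, routes):
    """Return the gateway of the most specific route containing the destination.

    `routes` is a list of (network, gateway) pairs; the default route is
    written as 0.0.0.0/0, which contains every address.
    """
    target = ipaddress.ip_address(destination)
    best_network, best_gateway = None, None
    for cidr, gateway in routes:
        network = ipaddress.ip_network(cidr)
        if target in network:
            # Prefer the route with the longer mask (more leading ones).
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network, best_gateway = network, gateway
    return best_gateway


# Hypothetical table: two attached subnets plus a default route.
table = [
    ("172.1.1.0/24", "en0"),
    ("172.1.2.0/24", "en1"),
    ("0.0.0.0/0", "172.1.1.254"),
]
print(choose_gateway("172.1.2.17", table))
print(choose_gateway("192.0.2.10", table))
```

The second lookup has no explicit route, so it is handed to the default gateway, which is how computers behind the Xserve find their way to the internet.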
Fortunately for me the switch at 18.104.22.168 is a "managed" switch. That is to say it has software in it which enables it to be configured to function in a variety of ways. One of the ways it can be set up is to have ports assigned to different subnets. The technospeak for this is a virtual local area network or VLAN. Generally a managed switch these days will have a user interface which can be accessed with a web browser via TCP/IP. In the case of the Extreme switch it is a very simple matter to assign ports to VLANs. This enables my switch to function as a router as well as the Xserve. I was able to reassign all the ports that had been connected to computers with IP addresses putting them in Zone 2 to the VLAN for Zone 1 and then change the IP addresses on the workstations from 172.1.2.x to 172.1.1.x. I left only two ports assigned to Zone 2, the gigabit uplink which was now connected to the Xserve and one lonely port for the workstation which is serving a FileMaker Pro database filled with scripts referencing the IP address of the database server. (This is the kind of thing that makes life so exciting for wannabe System Administrators.)
At this point I myself am going to declare victory and quit the field with only a few more parting shots. When I began the odyssey that led to all this wisdom, I had many people tell me that I had to use NAT or IPFW to do what I wanted. Some recommended using a piece of third party software called Brickhouse to make the job less painful. Even the highest level of support at Apple had me convinced that I had to use the routed (pronounced "rout-dee") command to activate the routing daemon. The actual solution turned out to be much simpler than many people thought, and in retrospect I am still puzzled by why I had such a hard time figuring out what I needed to do and making it work. NAT is network address translation. The need for it assumes that there is routing going on, but its real function is to change the IP address that is visible in the message downstream for security reasons. IPFW has to do with setting up and maintaining a firewall. It also assumes routing is going on. Presumably the steps required to implement NAT or IPFW will also make the Xserve function as a router, but they are not the simplest way to make it route. The big breakthrough that occurred when I learned how to activate the routing daemon using routed now appears to me in retrospect to have been an illusion based on the fact that entries in the routing table had really solved the problem. I believe that OS-X Server has built into it the intelligence to use its version of the route daemon whenever IP Forwarding is turned on and there is more than one Ethernet card installed. The only thing I have not figured out is how to script the necessary entries in the routing table so that they are added automatically every time the Xserve restarts. (Contrary to a prevalent myth, we find that OS-X and OS-X Server often reach an impasse that requires restarting the computer.)
Since it is possible to paste text into the command line in the Terminal window, we have a Stickies note containing the necessary commands which we copy and paste after the system restarts.
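One step up from the Stickies note would be a small script that holds the commands and replays them on demand. The Python sketch below is only a starting point under several assumptions: a Python 3 interpreter is installed, the script is run as root, and the subnet and gateway values shown are placeholders to be replaced with your own. It does not address how to have the system launch it at startup.

```python
import subprocess

# Placeholder values; substitute the routes your own network needs.
ROUTES = [("172.1.1/24", "172.1.2.1")]
DEFAULT_GATEWAY = "172.1.2.1"


def build_route_commands(routes, default_gateway=None):
    """Return the route commands as argument lists, in the order to run them."""
    commands = []
    if default_gateway:
        # Replace the default route: delete the old one, then add the new one.
        commands.append(["route", "delete", "default"])
        commands.append(["route", "add", "default", default_gateway])
    for subnet, gateway in routes:
        commands.append(["route", "add", subnet, gateway])
    return commands


def apply_commands(commands, dry_run=True):
    """Print each command; actually run it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=False)


# Dry run by default, so nothing on the machine is changed.
apply_commands(build_route_commands(ROUTES, DEFAULT_GATEWAY))
```

Leaving dry_run set to True prints the same lines we keep in the Stickies note, which makes it easy to confirm the list is right before letting the script run anything.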
Richard Patterson is in charge of digital imaging at Illusion Arts, a visual effects facility in Van Nuys, CA, specializing in matte paintings and bluescreen compositing for movies. You can reach him at firstname.lastname@example.org.