In my last post I went into the reasons why exporting the network hardware from dom0 to an unprivileged driver domain is good for security. This time the “how” is our focus. The documentation out there isn’t perfect and could use a bit of updating, so expect to see a few edits to the relevant Xen wiki page in the near future.
How you configure your Xen system is super important. The remainder of this post assumes you’re running the latest Xen from the unstable mercurial repository (4.0.1) with the latest 2.6.32 paravirt_ops (pvops) kernel from Jeremy Fitzhardinge’s git tree. If you’re running older versions of either Xen or the Linux kernel this may not work, so you should consider updating.
For this post I’ll have 3 virtual machines (VMs).
- the administrative domain (dom0) which is required to boot the system
- an unprivileged domain (domU) that we’ll call “nicdom” which is short for network interface card (NIC) domain. You guessed it, this will become our network driver domain.
- another unprivileged domain (domU or client domain) that will get its virtual network interface from nicdom
I don’t really care how you build your virtual machines. Use whatever method you’re comfortable with. Personally I’m a command line junkie so I’ll be debootstrapping mine on LVM partitions as minimal Debian squeeze/sid systems running the latest pvops kernel. Initially the configuration files used to start up these two domUs will be nearly identical:
kernel="/boot/vmlinuz-2.6.32-xen-amd64"
ramdisk="/boot/initrd.img-2.6.32-xen-amd64"
memory=256
name="nicdom"
disk=["phy:/dev/lvgroup/nicdom_root,xvda,w"]
root="/dev/xvda ro"
extra="console=hvc0 xencons=tty"
kernel="/boot/vmlinuz-2.6.32-xen-amd64"
ramdisk="/boot/initrd.img-2.6.32-xen-amd64"
memory=1024
name="client"
disk=[ "phy:/dev/lvgroup/client_root,xvda,w", "phy:/dev/lvgroup/client_swap,xvdb,w", ]
root="/dev/xvda ro"
extra="console=hvc0 xencons=tty"
I’ve given the client a swap partition and more RAM because I intend to turn it into a desktop. The nicdom (driver domain) has been kept as small as possible since it’s basically a utility that won’t have many logins. Obviously there’s more to it than just loading up these config files, but installing VMs is beyond the scope of this document.
PCI pass through
The first step in configuring the nicdom is passing the network card directly through to it. The xen-pciback driver is the first step in this process. It hides the PCI device from dom0, which later allows us to bind the device to a domU through its configuration file when we boot it.
There are two ways to configure the xen-pciback driver:
- kernel parameters at dom0 boot time
- dynamic configuration using sysfs
xen-pciback kernel parameter
The first is the easiest so we’ll start there. You need to pass the kernel some parameters to tell it which PCI device to pass to the xen-pciback driver. Your grub kernel line should look something like this:
module /vmlinuz-2.6.32-xen-amd64 root=/dev/something ro console=tty0 xen-pciback.hide=(00:19.0) intel_iommu=on
The important part here is the xen-pciback.hide parameter that identifies the PCI device to hide. I’m using a mixed Debian squeeze/sid system so getting used to grub2 is a bit of a task. Automating the configuration through grub is outside the scope of this document so I’ll assume you have a working grub.cfg or a way to build one.
Once you boot up your dom0 you’ll notice that lspci still shows the PCI device. That’s fine: the device is still there, the kernel is just ignoring it. What’s important is that when you issue ip addr you don’t have a network device for this PCI device. On my system all I see is the loopback (lo) device, no eth0.
dynamic configuration with sysfs
If you don’t want to restart your system you can pass the network device to the xen-pciback driver dynamically. First you need to unload all drivers that access the device: modprobe -r e1000e (the e1000e driver in my case).
Next we tell the xen-pciback driver to hide the device by passing it the device address:
echo "0000:00:19.0" | sudo tee /sys/bus/pci/drivers/pciback/new_slot
echo "0000:00:19.0" | sudo tee /sys/bus/pci/drivers/pciback/bind
Some of you may be thinking “what’s a slot” and I’ve got no good answer. If someone reading this knows, leave me something in the comments if you’ve got the time.
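As a sanity check, you can confirm where the device ended up with a quick sysfs walk. This is plain Linux, nothing Xen-specific; after the steps above, your NIC’s address should report pciback as its driver:

```shell
# list_pci_drivers: print "ADDRESS -> DRIVER" for every PCI device by
# walking sysfs. Devices with no bound driver are reported as "(none)".
list_pci_drivers() {
  for dev in /sys/bus/pci/devices/*; do
    [ -d "$dev" ] || continue          # skip if there is no PCI bus at all
    if [ -e "$dev/driver" ]; then
      drv=$(basename "$(readlink "$dev/driver")")
    else
      drv="(none)"                     # device present but no driver bound
    fi
    printf '%s -> %s\n' "$(basename "$dev")" "$drv"
  done
}

list_pci_drivers
```

Running this before and after the new_slot/bind sequence makes it easy to watch the device move from e1000e (or whatever your NIC driver is) to pciback.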
passing pci device to driver domain
Now that dom0 isn’t using the PCI device we can pass it off to our nicdom. We do this by including a line like this in the configuration file for the nicdom:
pci=['00:19.0']
We can pass more than one device to this domain by placing additional addresses between the square brackets like so:
pci=['00:19.0', '00:1a.0']
(00:1a.0 is just an example address here; use the addresses of your own devices).
Also we want to tell Xen that this domain is going to be a network driver domain, and we have to configure the IOMMU:
netif="yes"
extra="console=hvc0 xencons=tty iommu=soft"
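Putting the pieces together, the nicdom configuration file ends up looking something like this (the PCI address and kernel version are from my system; substitute your own):

```
kernel="/boot/vmlinuz-2.6.32-xen-amd64"
ramdisk="/boot/initrd.img-2.6.32-xen-amd64"
memory=256
name="nicdom"
disk=["phy:/dev/lvgroup/nicdom_root,xvda,w"]
root="/dev/xvda ro"
pci=['00:19.0']
netif="yes"
extra="console=hvc0 xencons=tty iommu=soft"
```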
Honestly I’m not sure exactly what these last two configuration lines do. There are a number of mailing list posts giving a number of magic configurations that are required to get PCI passthrough to work right. These ones worked for me so YMMV. If anyone wants to explain please leave a comment.
Now when this domU boots we can run lspci and we’ll see these two devices listed. Their addresses may be the same as in dom0 but this depends on how you’ve configured your kernel. Make sure to read the Xen wiki page for PCIPassthrough as it’s quite complete.
Depending on how you’ve set up your nicdom you may already have some networking configuration in place. I’m partial to debootstrapping my installs on an LVM partition so I end up doing the network configuration by hand. I’ll dedicate a whole post to configuring the networking in the nicdom later. For now just get it working however you know how.
the driver domain
As much as we want to just jump in and make the driver domain work, there are still a few configuration steps we need to run through first.
Xen split drivers
Xen split drivers exist in two halves: the backend of the driver is located in the domain that owns the physical device, and each client domain serviced by the backend has a frontend driver that exposes a virtual device to that client.
The xen networking drivers exist in two halves. For our nicdom to serve its purpose we need to load the xen-netback driver along with xen-evtchn and xenfs. We’ve already discussed what the xen-netback driver does, so let’s talk about what the others are.
The xenfs driver exposes some xen-specific stuff from the kernel to user space through the /proc file system. Exactly what this “stuff” is I’m still figuring out. If you dig into the code for the xen tools (xenstored and the various xenstore-* utilities) you’ll see a number of references to files in /proc. From my preliminary reading this is where a lot of the xenstore data is exposed to domUs.
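On my system the xenfs file system gets mounted at /proc/xen, and the xenstore utilities we’ll need later expect to find it there. If your init scripts don’t mount it automatically, an fstab entry in the nicdom takes care of it (a sketch, assuming the standard mount point):

```
# /etc/fstab entry in the nicdom
xenfs  /proc/xen  xenfs  defaults  0  0
```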
xen-evtchn is a bit more mysterious to me at the moment. The name makes me think it’s responsible for the events used for communication between backend and frontend drivers but that’s just a guess.
So long story short, we need these modules loaded in nicdom:
modprobe -i xenfs xen-evtchn xen-netback
In the client we need the xen-evtchn and xen-netfront modules loaded.
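To have these modules loaded automatically on every boot, their names can go in /etc/modules (a Debian convention; adjust for your distribution):

```
# /etc/modules in the nicdom
xenfs
xen-evtchn
xen-netback

# /etc/modules in the client
xen-evtchn
xen-netfront
```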
Xen scripts and udev rules
Just like the Xen wiki says, we need to install the udev rules and the associated networking scripts. If you’re like me you like to know exactly what’s happening though, so you may want to trigger the backend / frontend and see the events coming from udev before you just blindly copy these files over.
To do this you need both the nicdom and the client VM up and running with no networking configured (see configs above). Once they’re both up, start udevadm monitor --kernel --udev in each VM. Then try to create the network frontend and backend using xm. This is done from dom0 with a command like:
xm network-attach client mac=XX:XX:XX:XX:XX:XX,backend=nicdom
I’ll let the man page for xm explain the parameters 🙂
In the nicdom you should see the udev events creating the backend vif:
KERNEL[timestamp] online /devices/vif-4-0 (xen-backend)
UDEV_LOG=3
ACTION=online
DEVPATH=/devices/vif-4-0
SUBSYSTEM=xen-backend
XENBUS_TYPE=vif
XENBUS_PATH=backend/vif/4/0
XENBUS_BASE_PATH=backend
script=/etc/xen/scripts/vif-nat
vif=vif4.0
There are actually quite a few events but this one is the most important, mostly because of the script and vif values: script is how the udev rule configures the network interface in the driver domain, and vif tells us the new interface name.
Really we don’t care what udev events happened in the client since the kernel will just magically create an eth0 device like any other. You can configure it using /etc/network/interfaces or any other method. If you’re interested in which events are triggered in the client I recommend recreating this experiment for yourself.
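Since the frontend shows up in the client as a plain eth0, a minimal /etc/network/interfaces stanza is enough to bring it up (a sketch assuming DHCP; use a static stanza if that suits your network better):

```
# /etc/network/interfaces in the client
auto eth0
iface eth0 inet dhcp
```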
Without any udev rules and scripts in place the xm network-attach command should fail after a time-out period. If you’re into reading network scripts or xend log files you’ll see that xend is waiting for the nicdom to report the status of the network-attach in a xenstore variable:
DEBUG (DevController:144) Waiting for 0.
DEBUG (DevController:628) hotplugStatusCallback /local/domain/1/backend/vif/3/0/hotplug-status
installing rules, scripts and tools
Now that we’ve seen the udev events we want to install the rules for Xen that will wait for the right event and then trigger the necessary script. From the udevadm output above we’ve seen that dom0 passes the script name through the udev event. This script name is actually configured in the xend-config.sxp file in dom0 with a line like (mine is set to vif-nat):
(vif-script vif-nat)
You can use whatever xen networking script you want (bridge is likely the easiest).
So how to install the udev rules and the scripts? Well, you could just copy them over manually (mount the nicdom partition in dom0 and literally cp them into place). This method got me in trouble though, and this detail is omitted from the relevant Xen wiki page. What I didn’t know is the info I just supplied above: that dom0 waits for the driver domain to report its status through the xenstore. The networking scripts that get run in nicdom report this status, but they require some xenstore-* utilities that aren’t installed in a client domain by default.
Worse yet, I couldn’t see any logging output from the script indicating that it was trying to execute xenstore-write and failing because there wasn’t an executable by that name on its path. Once I tracked down this problem (literally two weeks of code reading and bugging people on mailing lists) it was smooth sailing. You can install these utilities by hand to keep your nicdom as minimal as possible. What I did was copy the whole xen-unstable source tree to my home directory on nicdom with the make tools target already built. Then I just ran make -C tools install to install all of the tools.
This is a bit heavy handed since it installs xenstored, which we don’t need; not a big deal IMHO at this point. That’s pretty much it. If you want your vif to be created when your client VM is created, just add a vif line to its configuration, something like:
vif=['backend=nicdom']
In short, the Xen DriverDomain wiki page has nearly all the information you need to get a driver domain up and running. What it’s missing are the little configuration tweaks that likely change from time to time, and the fact that the xenstore-* tools need to be installed in the driver domain. This last bit really stumped me since there seems to be virtually no debug info that comes out of the networking scripts.
If anyone out there tries to follow this leave me some feedback. There’s a lot of info here and I’m sure I forgot something. I’m interested in any way I can make this better / more clear so let me know what you think.
14 thoughts on “Xen Network Driver Domain: How”
Great post! There is quite a bit of chatter in the literature and on blogs about Device Driver Domains, but this is the first actual set of instructions I’ve encountered.
I was able to reproduce your results, and the system has remained stable through several days of varying network load. Thanks for putting this guide together.
Really great! I tried to follow this, setting up a 3-domU server with Xen and Debian, but I’m still having some trouble.
I’ve followed the guide up to “Xen split drivers”; I cannot load those modules, but I guess they are included in the backend-domU kernel.
>cat /boot/config-2.6.26-2-xen-amd64 | grep -i xen
# CONFIG_X86_XEN is not set
# CONFIG_XEN_PCIDEV_FE_DEBUG is not set
# CONFIG_TCG_XEN is not set
# CONFIG_XEN_UNPRIVILEGED_GUEST is not set
# CONFIG_XEN_NETDEV_PIPELINED_TRANSMITTER is not set
# CONFIG_XEN_PCIDEV_BACKEND_PASS is not set
# CONFIG_XEN_PCIDEV_BACKEND_SLOT is not set
# CONFIG_XEN_PCIDEV_BACKEND_CONTROLLER is not set
# CONFIG_XEN_PCIDEV_BE_DEBUG is not set
# CONFIG_XEN_TPMDEV_BACKEND is not set
# CONFIG_XEN_DISABLE_SERIAL is not set
# CONFIG_XEN_COMPAT_030002_AND_LATER is not set
# CONFIG_XEN_COMPAT_030004_AND_LATER is not set
# CONFIG_XEN_COMPAT_LATEST_ONLY is not set
So I’ve grabbed the udev events from the backend-domU:
UDEV [1285695535.992731] add /devices/xen-backend/vif-0-3 (xen-backend)
UEVENT[1285695535.999024] add /class/net/vif0.3 (net)
UEVENT[1285695535.999046] online /devices/xen-backend/vif-0-3 (xen-backend)
UDEV [1285695536.003037] online /devices/xen-backend/vif-0-3 (xen-backend)
UDEV [1285695536.029058] add /class/net/vif0.3 (net)
Then I simply launched:
> apt-get install xen-linux-system-2.6.26-2-xen-amd64
in the same domU to get the scripts installed.
Now when I try to connect another domU to the backend-domU I get this error:
> Device * could not be connected. Could not find bridge and none was specified.
Am I missing something?
And another question: is it possible to connect dom0 to the backend-domU in the same way?
Sorry for the bad English.
In your driver domain you need the “backend” portion of the network driver. The kernel config you show above has this built in: CONFIG_XEN_NETDEV_BACKEND=y
The error message you’re seeing is from the xen bridged network script. It’s telling you that it can’t find a bridge device to configure. I’ve found that the error messages that come out of these scripts aren’t always accurate but if I were you I’d start by going through this script and doing some debugging.
I tried to follow this, setting up 3 virtual machines (dom0, nicdom, client domain in HVM) with Xen and Debian, but I’m still having some trouble.
I’ve followed the guide up to “udev events”, and when I run “xm network-attach client mac=XX:XX:XX:XX:XX:XX,backend=nicdom” I get this error:
Error: Device 0 (vif) could not be connected. Hotplug scripts not working.
so I looked at /etc/xen/xen-hotplug.log; the error is “can’t add vif1.0 to bridge eth0: Operation not supported”.
Am I missing something?
The following are my udev rules and scripts:
Thanks Allen, glad this was useful. First off, I’ve never tried this with an HVM guest. I don’t think this will work unless you have PV network drivers in your client, as the driver VM (nicdom) will only offer network services through the netback driver.
The error you’re seeing in the hotplug log seems to be from attempting to add the virtual interface to the bridge. I’d guess that ‘eth0’ isn’t the name of your bridge, likely that’s the physical ethernet device. This may not be the case though because I remember the xen networking scripts doing some crazy stuff renaming interfaces and bridges so it may be that the script stole the name of the ethernet interface and gave it to the bridge. Regardless I’d check to see that when the script is running eth0 is in fact the device you think it is (the bridge).
Good luck and let me know how your experiment turns out. If you’re using a more recent version of Xen let me know. I’d love to update this howto for the latest Xen version soon.
Thanks for your attention, Philip; glad to be communicating with you about xen.
I thought about what you said, but after exporting the network hardware from dom0 to nicdom, when I run “ifconfig -a” in domain0 there are three links: lo, pan0, and tmpbridge.
But when starting xend, the “/etc/xen/scripts/network-bridge” script looks for the device eth0 by default, and at that point there is no eth0 in domain0, so I’m confused by this.
Also, I cannot find the /etc/sysconfig/network-scripts/ifcfg-eth0 configuration file on my system.
Well once you pass the network hardware through to the nicdom it won’t show up in dom0 any longer. The networking scripts you’re looking at in dom0 shouldn’t get run (throw some echo’s in there to prove it to yourself). You’ll need those same scripts in the nicdom though because that’s where they’ll be run by the udev rules that you’ll need to install in the nicdom as well.
I am very thankful to you, Philip. I read your blog again, and there is a problem that has confused me all along: in your article you said that when you hide the PCI device successfully and issue ip addr, you only see the loopback (lo) device; but in my system there are three devices: lo, tmpbridge, and pan0. I did nothing special, and I don’t know why my system has two devices that yours doesn’t.
On a multi-NIC system, unloading the network driver could stop more than one NIC. To hide exactly one NIC from dom0, while leaving others active, use this sequence:
echo -n 0000:00:19.0 > /sys/bus/pci/drivers/e1000e/unbind
echo -n 0000:00:19.0 > /sys/bus/pci/drivers/pciback/new_slot
echo -n 0000:00:19.0 > /sys/bus/pci/drivers/pciback/bind
This works whether xen-pciback is compiled in a monolithic kernel, or compiled as a module (Debian default).
I have a question about xen driver development. Currently I want to update the xen-netback driver, but in order to deploy a new version into the kernel I have to restart my host machine, which is a time-consuming operation. Is there any way to update xen drivers on the fly? Also, I’ve heard about hotplug scripts in xen; what is their purpose?