Pages - Menu

Showing posts with label libvirt. Show all posts
Showing posts with label libvirt. Show all posts

Wednesday, January 16, 2013

how to network jumbo frames to a kvm guest

Getting a jumbo frame to a KVM guest is not something which works out of the box, but can be configured to work quite easily.

One way get jumbo frames to a KVM guest is to directly pass the guest a physical NIC through PCI Passthrough, or a Virtual Function through SR-IOV.

Another way is to use the traditional libvirt bridge. Create a bridge on the host, and bridge the guest's NICs into this. The reason this doesn't work by default is all the interfaces are created with the default Ethernet MTU of 1500 bytes.

You can set MTU of the physical NIC, the bridge, and the guest interface with the network-scripts, but you cannot automatically set the MTU of the tap interface with the network-scripts. The tap interface is how the virtual network interface connects to the "real world". You'll see these tap devices with names like "vnet0" in your bridge.

So how do we do this?

First, some requirements:
  • A physical interface which supports jumbo frames.
    The Linux bridge code will only allow the bridge to change to the smallest MTU of all its interfaces.
  • A bridge with the larger required MTU
    Technically a jumbo frame covers MTU from 1501 to 9000, but almost everyone just wants to set 9000
  • virtio-net or Intel e1000 network interface inside the guest
    The real-life Realtek 8139 does not support an MTU of 9000, therefore there is no driver support for large MTU in the emulated version either.
Our first step is to make all tap interfaces on the host have an MTU of 9000. We do this by adding a udev rule to the top of the network device creation as follows:

/etc/udev/rules.d/70-persistent-net.rules (on host)
SUBSYSTEM=="net", ACTION=="add", KERNEL=="vnet*", ATTR{mtu}="9000"
Then set your physical interface MTU to 9000:

/etc/sysconfig/network-scripts/ifcfg-eth0 (on host):
DEVICE=eth0
HWADDR=[physical MAC address]
TYPE=Ethernet
ONBOOT=yes
BRIDGE=br0
MTU=9000
And do the same inside the bridge:

/etc/sysconfig/network-scripts/ifcfg-br0 (on host)
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
DELAY=0
MTU=9000
Now inside your guest, set your virtio or e1000 interface to also have the larger MTU:

/etc/sysconfig/network-scripts/ifcfg-eth0 (in guest)
DEVICE=eth0
HWADDR=[guest MAC address]
TYPE=Ethernet
ONBOOT=yes
MTU=9000
You'll need to perform a "service network restart" on the host, or at least bring the bridge and eth interfaces down and up. Start your guest and confirm all interfaces have been made with the correct MTU:
# ip link
2: eth0: mtu 9000 qdisc pfifo_fast state UP qlen 1000
    link/ether 01:23:45:67:89:0A brd ff:ff:ff:ff:ff:ff
3: br0: mtu 9000 qdisc noqueue state UNKNOWN
    link/ether 01:23:45:67:89:0A brd ff:ff:ff:ff:ff:ff
4: vnet0: mtu 9000 qdisc pfifo_fast state UNKNOWN qlen 500
    link/ether fe:54:00:01:23:45 brd ff:ff:ff:ff:ff:ff
This should be enough to send a "Do Not Fragment" 9000 byte frame over the network. Let's test with a ping:
# ping -c 4 -s 8972 -M do 172.16.0.2
PING 172.16.0.2 (172.16.0.2) 8972(9000) bytes of data.
8980 bytes from 172.16.0.2: icmp_seq=1 ttl=64 time=1.27 ms
8980 bytes from 172.16.0.2: icmp_seq=2 ttl=64 time=0.284 ms
8980 bytes from 172.16.0.2: icmp_seq=3 ttl=64 time=0.202 ms
8980 bytes from 172.16.0.2: icmp_seq=4 ttl=64 time=0.260 ms
Success!

Saturday, August 4, 2012

kvm virtual machine won't start

Recently I brought my KVM host down for an upgrade. I shut down the guests within their OSes, confirmed they were powered off with a virsh list --all, and everything went well with the upgrade.

However, upon returning the guests to service, one of them would not start up. Typing virsh start I got the helpful error message

Error restoring domain: cannot send monitor command
Connection reset by peer

Why was this one VM broken but the others were fine?

I played around with virsh autostart and virsh autostart --disable but that had no effect, nor did a reboot of the hypervisor.

After some searching around, it turns out libvirt has the capability to keep "managed save states" of guests, kinda like a sleep mode or snapshot, to save you fully powering a guest OS off.

For some reason, a managed save for this one guest had been created, perhaps it had not shut down properly, or perhaps there's an errorr in libvirt. I could view the saved state with

# virsh list --all --managed-save
 Id Name                 State
----------------------------------
  - guest1               shut off
  - guest2               saved


Now a virsh managedsave-remove guest2 returned it to the "shut off" state, and I could start it properly with virsh start as per usual.