Sunday, June 1, 2014

kernel panic scsi_wait_scan on elrepo kernel-ml

Browsing around ELRepo, you can see they have the elrepo-kernel repository which contains the long-life kernel-lt (currently 3.10) and mainline kernel-ml (currently 3.14) packages.

I'd always been curious about these, so I spun up a test VM and did a CentOS 6 Basic Server install, then added ELRepo and installed them.

kernel-lt works fine first time.

However, kernel-ml will not boot, and simply repeats:

FATAL: Module scsi_wait_scan not found.

This was discussed on the upstream kernel bug tracker:

Bug 60758 - module scsi_wait_scan not found kernel panic on boot

As noted in that bug, the problem can be permanently resolved by adding a file like:

echo 'add_drivers+="virtio_blk"' >/etc/dracut.conf.d/force-virtio_blk-to-ensure-boot.conf

Then back up and rebuild the initramfs:

cp /boot/initramfs-3.14.4-1.el6.elrepo.x86_64.img /boot/initramfs-3.14.4-1.el6.elrepo.x86_64.img.bak
dracut -f /boot/initramfs-3.14.4-1.el6.elrepo.x86_64.img 3.14.4-1.el6.elrepo.x86_64

The root cause is that the virtio disk driver changed from using blk_init_queue() in the EL6 kernel to blk_mq_init_queue() in kernel 3.13 onwards. The older EL6 dracut does not treat the latter symbol as a sign of a storage driver, so the initramfs is built without the virtio_blk driver.

This was fixed in Fedora 20 under Bug 1067669, and has been cloned to RHEL6 as Bug 1103455.

Sunday, March 16, 2014

hosting http and https domains with lighttpd

I have two domains, one with an SSL certificate and one without. I wanted to host these on the same lighttpd server which only has one IP address.

There are several forum threads and lighty wiki articles about this, but none really give a concise example of how to do it. Many parts of the documentation are confusing and non-obvious. The situation is not helped by the fact that you cannot do name-based SSL hosting, so what you'd expect to be the logical configuration doesn't work.

After staring at these for a few hours, and a lot of trial and error, here's the config I ended up using:

$HTTP["scheme"] == "http" {

  $HTTP["host"] =~ "(^|\.)" {
    server.name = ""
    server.document-root = "/var/www/"
  }
  else $HTTP["host"] =~ "(^|\.)" {
    url.redirect = (".*" => "$0")
  }
}

$HTTP["scheme"] == "https" {

  $SERVER["socket"] == "X.X.X.X:443" {
    server.name = ""
    ssl.engine  = "enable"
    ssl.ca-file = "/etc/lighttpd/certs/ca-cert-class1.crt"
    ssl.pemfile = "/etc/lighttpd/certs/"

    server.document-root = "/var/www/"

    # mitigate BEAST attack
    ssl.cipher-list = "ECDHE-RSA-AES256-SHA384:AES256-SHA256:RC4-SHA:RC4:HIGH:!MD5:!aNULL:!EDH:!AESGCM"
    # mitigate CVE-2009-3555
    ssl.disable-client-renegotiation = "enable"
  }
}

The "X.X.X.X" needs to be your server's IP address, the same IP which the domain in your SSL cert resolves to.
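To make the host matching in the http block a bit more concrete, here's a rough Python sketch of the decision it makes. This is only an illustration, not anything lighttpd runs. The domains example.com and example.org are hypothetical stand-ins, the pattern is assumed to have the common (^|\.)domain$ shape, and the redirect target (same host and path over https) is also an assumption.

```python
import re

# Hypothetical domains, for illustration only.
PLAIN_DOMAIN = "example.com"  # the domain without a certificate
SSL_DOMAIN = "example.org"    # the domain with a certificate


def host_regex(domain):
    # Same shape as the config's host match: start of string or a dot,
    # then the domain, anchored at the end (assumed form).
    return re.compile(r"(^|\.)" + re.escape(domain) + "$")


def route_http(host, path):
    """Decide what the plain-http block would do with a request."""
    if host_regex(PLAIN_DOMAIN).search(host):
        return ("serve", path)
    if host_regex(SSL_DOMAIN).search(host):
        # Assumed redirect target: same host and path, but over https.
        return ("redirect", "https://" + host + path)
    return ("no match", path)


print(route_http("www.example.com", "/index.html"))
print(route_http("example.org", "/login"))
```

A host such as notexample.com falls through to "no match", because the pattern needs either the start of the string or a dot directly before the domain.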

I haven't tested, but you could probably scale this up to host multiple non-SSL domains on the same host as the SSL domain. Hosting multiple SSL domains is well-covered in the lighty documentation.

I did the SSL cert and setup by following "Switch to HTTPS Now, For Free" over at Eric Mill's blog.

When using lighttpd, you'll also need these instructions to make a unified certificate.

Tuesday, February 4, 2014

realtek RTL8188CUS slow on Raspberry Pi

I recently had an adventure troubleshooting slow wifi on one of my Raspberry Pi systems. No matter what I did, I could not get more than 57 kilobytes per second transfer speed to it.

First I tried different transfer methods (Samba and SSH) with no change, so it wasn't the transfer method.

Then I tested the storage. I changed to a fast USB hard drive and moved it around the USB hub in case of power/throughput issues, but was eventually satisfied it wasn't storage when I got 22 MB/sec throughput from hdparm -t. I should have just done that first.

Next I suspected either USB or wifi.

A lot of forum posts suggest trying dwc_otg.speed=1 in cmdline.txt, but that reduces the chipset to USB 1.1 full speed (12 Mbit/sec), which wasn't a compromise I was willing to make. There was also mention of something called FIQ from 2011/2012, but those improvements are included in the latest (2014-01-07) Raspbian image, so there's no need to tinker with FIQ anymore.

I wondered if it was the wifi signal, so I moved the Pi right next to the router, but no change. I plugged in an ethernet cable and the speed improved immensely. So it was either USB or wifi.

Searching around, I read many reports of people having problems with these Realtek RTL8188CUS (driver 8192cu) wifi dongles. There were several suggestions to make a file like /etc/modprobe.d/8192cu.conf and turn off the adaptor's power management features with the contents:

options 8192cu rtw_power_mgnt=0 rtw_enusbss=0

I tried this but still had no luck. Transfers still sat at 57 kB/sec.

At this point I remembered my other Raspberry Pi was using a different USB wifi adaptor, a RaLink RT5370 (driver rt2800usb). I tried this dongle and the speed instantly improved. We can now rule out USB and wifi signal, and place the blame on the wifi adaptor.

I have two of these Realtek adaptors, a black one with EDUP on it, and a white one with COMFAST on it. Both produced the slow transfer speed, so it wasn't unique to this one adaptor.

As best I can figure out, either Realtek's driver or the implementation of the wifi hardware is rubbish, and there's no way it can be fixed.

I've ordered another RaLink dongle off eBay.

Thursday, November 21, 2013

yum grouplist doesn't display all groups

I've noticed in Fedora that the yum grouplist command doesn't actually display all the groups:

# yum grouplist | grep Virt
# yum grouplist | grep Virt | wc -l
0

However, you can still see a group with groupinfo:

# yum groupinfo Virtualization
Loaded plugins: langpacks, list-data, presto, refresh-packagekit
Group: Virtualization
 Group-Id: virtualization
 Description: These packages provide a virtualization environment.

Apparently there can be a "hidden" tag on groups, as reported in this Bugzilla entry:

Bug 986531 - unhide all groups from yum grouplist

You can see all the groups with the yum grouplist hidden command:

# yum grouplist | wc -l
# yum grouplist hidden | wc -l

Wednesday, November 13, 2013

how does mke2fs decide automatic check mount count?

Whenever you create an ext3 or ext4 filesystem with mkfs or mke2fs, a message is printed saying "This filesystem will be automatically checked every X mounts or Y days, whichever comes first".

However, the number seems to vary depending on the filesystem. How is this value calculated? I used the source of e2fsprogs-1.42.8 and the cscope source code tool to find this out.

To find the origin of the message, we can search for a short part of the string like "mounts or", which shows it's printed by this function:

    void print_check_message(int mnt, unsigned int check)
    {
        if (mnt < 0)
            mnt = 0;
        if (!mnt && !check)
            return;
        printf(_("This filesystem will be automatically "
             "checked every %d mounts or\n"
             "%g days, whichever comes first.  "
             "Use tune2fs -c or -i to override.\n"),
               mnt, ((double) check) / (3600 * 24));
    }

Which is called here:

    /*
     * Add a journal to the filesystem.
     */
    static int add_journal(ext2_filsys fs)
    {
        ...
        print_check_message(fs->super->s_max_mnt_count,
                    fs->super->s_checkinterval);

So now we need to find what puts the value into s_max_mnt_count. Hunting for uses of that symbol, we find the assignment here:

    if (get_bool_from_profile(fs_types, "enable_periodic_fsck", 0)) {
        fs->super->s_checkinterval = EXT2_DFL_CHECKINTERVAL;
        fs->super->s_max_mnt_count = EXT2_DFL_MAX_MNT_COUNT;
        /*
         * Add "jitter" to the superblock's check interval so that we
         * don't check all the filesystems at the same time.  We use a
         * kludgy hack of using the UUID to derive a random jitter value
         */
        for (i = 0, val = 0 ; i < sizeof(fs->super->s_uuid); i++)
            val += fs->super->s_uuid[i];
        fs->super->s_max_mnt_count += val % EXT2_DFL_MAX_MNT_COUNT;

And we can see that EXT2_DFL_MAX_MNT_COUNT is just a preprocessor definition:

    /*
     * Maximal mount counts between two filesystem checks
     */
    #define EXT2_DFL_MAX_MNT_COUNT      20  /* Allow 20 mounts */

So we take 20, add up the 16 raw bytes of the UUID, take that sum modulo 20, then add the result to the original 20. It's just maths from here: 20+(x%20) gives an effective range of 20 to 39 mounts.
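The same arithmetic can be sketched in Python. This is my own illustration rather than anything from e2fsprogs. It assumes the loop sums the 16 raw bytes of s_uuid, and the UUID used here is a made-up value chosen so the sum is easy to check by hand.

```python
import uuid

EXT2_DFL_MAX_MNT_COUNT = 20  # value from the header excerpt


def max_mount_count(fs_uuid):
    """Return 20 plus the UUID-derived jitter, mirroring the mke2fs loop."""
    val = sum(fs_uuid.bytes)  # add up the 16 raw UUID bytes
    return EXT2_DFL_MAX_MNT_COUNT + val % EXT2_DFL_MAX_MNT_COUNT


# Hypothetical UUID whose bytes sum to 7, so the result is 20 + 7.
example = uuid.UUID("00000000-0000-0000-0000-000000000007")
print(max_mount_count(example))  # prints 27
```

Feeding it the UUID of a real filesystem (as shown by blkid or tune2fs -l) should reproduce the mount count printed at mkfs time, provided periodic fsck was enabled when the filesystem was created.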

The comment in mke2fs.c also tells us the reason behind this randomness: the jitter keeps all the filesystems from being checked at the same time.