FibreChannel and Linux at CDF

Background

The CDF project uses FibreChannel (FC) extensively, both for moving data from the detector and for subsequent access to it. We have recently begun testing FC with Linux to see what level of support and performance we can attain.

Note that I've moved the links for vendors, software and so forth to the Resources page.

Current Status

As of 2/01 or so, we have settled on the 2.4.2 kernel, and are quite happy with the system. We are still experimenting with hot-plug on the FC loop, but other than that, things are fast and stable.

We still have not had the opportunity to test the Emulex cards, so all of our experience has been with the QLogic cards.

Hardware

  • The main test PC is an SGI 1450 (quad Xeon 700s, 4GB memory, very spiffy.)
  • For FibreChannel cards, we are using the QLogic 2200 and 2100 cards purchased from TeamExcess.
  • We have some Emulex cards (LP7000 and LP8000) that now have released Linux drivers; we have not yet tested them.
  • Disks are assorted: plain FC disks, plus arrays behind Chaparral and Adaptec FC RAID controllers.

Software

  • Linux drivers are included with 2.2 and newer kernels as the "qlogicfc" module in the SCSI category. We have used kernels 2.4.0test8 through 2.4.1.
  • Distributions are RedHat 6.2 and Debian 2.2.
  • The scsiadd program is useful for re-scanning the FC loop if something is added after the driver is loaded. By default, Linux does not add any disks that weren't present at driver load. Most annoying, but scsiadd does an OK job.
  • The bonnie benchmark from Tim Bray is useful for performance testing; we run it with a 2GB test file size: 'bonnie -s 2047'. See the notes below for discussion of the results.
  • The revised bonnie++ benchmark has many improvements, including removal of the 2GB limit. We'll start posting results using the updated version as time permits.
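
As an illustration of the re-scan step mentioned above: on 2.2 and 2.4 kernels a device can usually be added after driver load by writing an "add-single-device" command to /proc/scsi/scsi, and as far as we know that is the mechanism scsiadd wraps. The sketch below is a minimal Python version under that assumption. The function names and the host, channel, target and lun values are hypothetical, not taken from our setup. By default it only builds the command strings and touches nothing.

```python
from pathlib import Path

# Assumed control file for the SCSI mid-layer on 2.2/2.4 kernels.
PROC_SCSI = Path("/proc/scsi/scsi")


def add_device_command(host: int, channel: int, target: int, lun: int) -> str:
    """Build the command string that asks the kernel to probe one device."""
    for name, value in (("host", host), ("channel", channel),
                        ("target", target), ("lun", lun)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return f"scsi add-single-device {host} {channel} {target} {lun}"


def rescan_targets(host: int, targets, channel: int = 0, lun: int = 0,
                   dry_run: bool = True):
    """Probe each target ID in turn; with dry_run, only return the commands."""
    commands = [add_device_command(host, channel, t, lun) for t in targets]
    if not dry_run:
        for command in commands:
            # Needs root. Each command is sent as its own write.
            PROC_SCSI.write_text(command + "\n")
    return commands


if __name__ == "__main__":
    # Hypothetical example: loop IDs 0-3 on adapter 1, printed but not sent.
    for line in rescan_targets(host=1, targets=range(4)):
        print(line)
```

Walking a range of target IDs is a crude substitute for a real loop probe, which is part of why we describe scsiadd as only an OK job.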


    Caveats and Discussion

    Memory Size Effects and the 2G Limit

    There is a problem with testing on the 1450 - it has 4GB of memory, more than enough to cache the test file in memory, rendering the results worthless. Normally, one just specifies a file larger than the memory and that ensures that the cache will overflow. However, in this case, the current file size limit is 2GB, mostly due to the limitations of routines like fseek().
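
The arithmetic behind this can be checked with a short sketch. It assumes the 2GB limit comes from a signed 32-bit file offset (the largest offset is 2^31 - 1 bytes), which is our reading of the fseek() limitation; the function name and result keys are made up for illustration.

```python
MB = 1024 * 1024

# Largest byte offset a signed 32-bit file offset can represent.
MAX_OFFSET_32BIT = 2**31 - 1


def check_test_size(file_mb: int, ram_mb: int) -> dict:
    """Report whether a benchmark file is usable and whether it defeats the cache."""
    file_bytes = file_mb * MB
    return {
        # True if every byte of the file is reachable with a 32-bit offset.
        "fits_32bit_offset": file_bytes <= MAX_OFFSET_32BIT,
        # True if the file cannot be held entirely in memory.
        "exceeds_ram": file_mb > ram_mb,
        # Upper bound on the share of the file that could sit in the cache.
        "cacheable_fraction": min(1.0, ram_mb / file_mb),
    }


if __name__ == "__main__":
    # The SGI 1450 case: 'bonnie -s 2047' on a machine with 4GB of memory.
    print(check_test_size(file_mb=2047, ram_mb=4096))
    # One megabyte more no longer fits under the 2GB limit.
    print(check_test_size(file_mb=2048, ram_mb=4096))
```

With 4096MB of memory, a 2047MB file fits under the offset limit but is entirely cacheable, while any file large enough to overflow the cache is too large to address.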

    Due to this, the bonnie results should be taken skeptically. We hope that iozone will have a workaround for the problem. Short of physically reducing the amount of memory present, or rewriting our benchmarks, we've not found a simple workaround.

    Note that, as of 4/02, I have email from the iozone author that this has been fixed in newer versions of iozone. See the URL below to get them.

    Bonnie++ (an updated version of Bonnie with a new maintainer) is addressing these problems; however, we don't yet know how its results compare to those of the original Bonnie.

    Driver Issues, Logistics and Utilities

    In general, the QLogic driver for Linux seems to work well, but breaking the FC loop causes problems that range from new devices not being found to complete machine crashes. For example, if you power up a new device while writing a filesystem on a different disk, the loop reinitializes and the kernel panics. Given that one of the advantages of FC is hot-pluggability, we are looking into the difficulty of modifying the driver to work around the problem.

    SGI Irix boxes have a useful program called 'scsiha' that can be used to reset and probe a SCSI or FC bus; the 'scsiadd' program is useful but nowhere near as complete. On Solaris, the 'devfsadm' program will re-probe all devices and rebuild the /dev and /devices entries; quite convenient. Both of these are more polished than the Linux equivalent; how much this affects operations depends on how dynamic the FC loop or fabric is. For static configurations, it is a non-issue.

    Problems

    We had a problem with the FC loop hanging repeatably; this was fixed by a patch to the Chaparral firmware. If it recurs we'll investigate further.

    Performance Results

    For reference purposes, here are the results of running the bonnie benchmark from Tim Bray on one of the internal IBM 9GB SCSI disks, with a 1900MB test size:
                  -------Sequential Output-------- ---Sequential Input-- --Random--
                  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
    Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
             1900 13156 89.1 16934 10.1 15135 10.4 14926 100.0 373872 100.1 58656.2 293.3
    

    (This is the SGI 1450, quad Xeon 700s, 4G memory, Adaptec 7899 onboard controller, RedHat 6.2, with kernel 2.4.0test8 with the sd.c module patch.)
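
To make the caching effect described under Caveats easier to spot in a row like the one above, here is a small Python sketch that splits a bonnie result line into named fields and flags any throughput figure above a ceiling. The field names are our own labels for the columns, and the 40 MB/sec ceiling is a hypothetical figure chosen only for illustration, not a measured limit of this disk.

```python
# Our own labels for the thirteen numeric columns of a bonnie result row.
FIELDS = [
    "size_mb",
    "out_char_kps", "out_char_cpu",
    "out_block_kps", "out_block_cpu",
    "rewrite_kps", "rewrite_cpu",
    "in_char_kps", "in_char_cpu",
    "in_block_kps", "in_block_cpu",
    "seeks_per_sec", "seeks_cpu",
]


def parse_bonnie_row(line: str) -> dict:
    """Map the last thirteen tokens of a row to names (machine name is optional)."""
    tokens = line.split()
    if len(tokens) < len(FIELDS):
        raise ValueError(f"expected at least {len(FIELDS)} columns, got {len(tokens)}")
    values = tokens[-len(FIELDS):]
    return {name: float(value) for name, value in zip(FIELDS, values)}


def suspicious_rates(row: dict, max_mb_per_sec: float = 40.0) -> list:
    """Return throughput fields above the ceiling (an assumed, hypothetical value)."""
    return [name for name, value in row.items()
            if name.endswith("_kps") and value / 1024 > max_mb_per_sec]


if __name__ == "__main__":
    row = parse_bonnie_row(
        "1900 13156 89.1 16934 10.1 15135 10.4 14926 100.0 "
        "373872 100.1 58656.2 293.3"
    )
    print(suspicious_rates(row))
    print(round(row["in_block_kps"] / 1024, 1), "MB/sec block input")
```

For the row above, only the block input column is flagged: it works out to roughly 365 MB/sec against about 16.5 MB/sec for block output, which is what we would expect if the reads were served from memory rather than from the disk.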

    We are also looking at the iozone benchmark to get a more complete picture. We also have internally developed RAID benchmarks that do large concurrent sequential reads and writes to measure scalability for CDF-type applications; results from those are on the way.

