Lustre / LU-1384

MDS Kernel Panic when trying to mount the lustre file system

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.3.0
    • Lustre 2.2.0, Lustre 2.3.0
    • None
    • 3
    • 4605

    Description

      After running mkfs for the whole file system, I was able to mount it and create a few files with a simple 'dd'. Once I mounted it on 12 clients running Lustre 1.8.4 and tried to run the IOR benchmark, using 2 nodes for a total of 12 cores, the file system immediately hung and MDS01 had a kernel panic, as follows:
      Message from syslogd@mds01 at May 8 12:00:59 ...
      kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) ASSERTION( ma->ma_lmm_size > 0 ) failed:

      Message from syslogd@mds01 at May 8 12:00:59 ...
      kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) LBUG
      Write failed: Broken pipe

      Heartbeat tried to take over, but the second MDS immediately had a kernel panic too:

      Message from syslogd@mds02 at May 8 12:04:05 ...
      kernel:LustreError: 3657:0:(mdd_object.c:635:mdd_big_lmm_get()) ASSERTION( ma->ma_lmm_size > 0 ) failed:

      Message from syslogd@mds02 at May 8 12:04:05 ...
      kernel:LustreError: 3657:0:(mdd_object.c:635:mdd_big_lmm_get()) LBUG
      Write failed: Broken pipe
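
      When comparing the panics on the two servers, it can help to pull the PID, source file, line, and function out of each LustreError line. The following is a minimal sketch, not a Lustre tool: the names parse_lustre_errors and LUSTRE_ERROR are hypothetical, and the regular expression is inferred only from the console lines quoted in this report, so other message layouts may not match.

```python
import re

# Pattern inferred from the console lines quoted in this report (an assumption):
#   kernel:LustreError: <pid>:<n>:(<file>:<line>:<function>()) <message>
LUSTRE_ERROR = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)"
)


def parse_lustre_errors(text):
    """Return one dict per LustreError line found in text (hypothetical helper)."""
    events = []
    for raw in text.splitlines():
        match = LUSTRE_ERROR.search(raw)
        if match is None:
            continue  # skip syslogd banners, blank lines, ssh output, etc.
        event = match.groupdict()
        event["pid"] = int(event["pid"])
        event["line"] = int(event["line"])
        # A bare "LBUG" message marks the line that follows the failed assertion.
        event["is_lbug"] = event["msg"].strip() == "LBUG"
        events.append(event)
    return events


# Sample input copied from the mds01 messages in this report.
SAMPLE = """\
Message from syslogd@mds01 at May 8 12:00:59 ...
kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) ASSERTION( ma->ma_lmm_size > 0 ) failed:
kernel:LustreError: 3523:0:(mdd_object.c:635:mdd_big_lmm_get()) LBUG
"""

if __name__ == "__main__":
    for event in parse_lustre_errors(SAMPLE):
        print(event["pid"], event["file"], event["line"], event["func"], event["msg"])
```

      Running the same helper over the attached messages file from each MDS would show whether both nodes fail at the same file, line, and function, as the excerpts here suggest.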

      I created the file system as shown in the attached file weisshorn_mkfs.sh.

      The SSD LUN is built on an LSI SSD controller with RAID10.

      Are there any suggestions or inputs that I can try to fix the problem?
      I have also attached /var/log/messages, which contains the kernel messages.

      Attachments

        1. error5.png (11 kB)
        2. kernel_panic.png (11 kB)
        3. kernel_panic2.png (17 kB)
        4. kernel_panic3.png (28 kB)
        5. lfs_check_servers.log (6 kB)
        6. messages1 (515 kB)
        7. weisshorn_mkfs.sh (12 kB)

          People

            bobijam Zhenyu Xu
            fverzell Fabio Verzelloni