Lustre / LU-4557

Negative used block number of OST after OSS crashes and reboots


Details


    Description

      During active I/O on the OSS (e.g. IOR from a client), if the OSS is reset (not unmounted, but something like a forced reset), then when the OSS comes back up and mounts all the OSTs, df shows strange OST sizes like the following.

      [root@noss01 mount]# df -h -t lustre
      Filesystem          Size  Used Avail Use% Mounted on
      /dev/mapper/OST00    22T  -17G   22T   0% /mnt/lustre/OST00
      /dev/mapper/OST01    22T  -19G   22T   0% /mnt/lustre/OST01
      /dev/mapper/OST02    22T  -17G   22T   0% /mnt/lustre/OST02
      /dev/mapper/OST03    22T  -19G   22T   0% /mnt/lustre/OST03
      /dev/mapper/OST04    22T  -17G   22T   0% /mnt/lustre/OST04

      It is easy to reproduce the problem. The script "run.sh" reproduces the problem on a server named "server1" and a virtual machine named "vm1"; the reproduction steps are sketched below.
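
      The actual run.sh is not reproduced here; the following is only a rough sketch of the kind of steps involved, based on the description above. The IOR invocation, device names, and mount points are illustrative assumptions, not taken from the real script.
      ===============================================================================
      # On a client: keep I/O running against the file system (illustrative IOR call)
      mpirun -np 8 ior -w -t 1m -b 4g -o /mnt/lustre/ior.out &

      # On the OSS (server1 or vm1): force an immediate reset while the I/O is still
      # active, without unmounting the OSTs (assumes sysrq is enabled; a power reset
      # of the node works as well)
      echo b > /proc/sysrq-trigger

      # After the OSS reboots: mount the OSTs again and check the reported usage
      mount -t lustre /dev/mapper/OST00 /mnt/lustre/OST00
      df -h -t lustre
      ===============================================================================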

      After some investigation, we found some facts about this problem. First, after the problem happens, the OST file system is corrupted. The following is the fsck result.
      ===============================================================================

      # fsck -y /dev/sdb3
        fsck from util-linux-ng 2.17.2
        e2fsck 1.42.7.wc1 (12-Apr-2013)
        server1-OST0002 contains a file system with errors, check forced.
        Pass 1: Checking inodes, blocks, and sizes
        Pass 2: Checking directory structure
        Pass 3: Checking directory connectivity
        Pass 4: Checking reference counts
        Pass 5: Checking group summary information
        Free blocks count wrong (560315, counted=490939).
        Fix? yes

      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1220608, 253) != expected (0, 32)
      Update quota info for quota type 0? yes

      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1220608, 253) != expected (0, 32)
      Update quota info for quota type 1? yes

      server1-OST0002: ***** FILE SYSTEM WAS MODIFIED *****
      server1-OST0002: 262/131648 files (0.4% non-contiguous), 35189/526128 blocks
      ===============================================================================
      Second, after the OSS crashes and before the OST is mounted again, fsck shows that the free inode/block counts in the superblock are wrong. That by itself is not a big problem, since fsck is able to fix it easily. Somehow Lustre makes the problem much bigger if this small inconsistency is not fixed.
      ===============================================================================
      [root@vm1 ~]# fsck -n /dev/sdb3
      fsck from util-linux-ng 2.17.2
      e2fsck 1.42.7.wc1 (12-Apr-2013)
      Warning: skipping journal recovery because doing a read-only filesystem check.
      server1-OST0002: clean, 13/131648 files, 34900/526128 blocks
      [root@vm1 ~]# fsck /dev/sdb3
      fsck from util-linux-ng 2.17.2
      e2fsck 1.42.7.wc1 (12-Apr-2013)
      server1-OST0002: recovering journal
      Setting free inodes count to 131387 (was 131635)
      Setting free blocks count to 420283 (was 491228)
      server1-OST0002: clean, 261/131648 files, 105845/526128 blocks
      ===============================================================================
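
      For reference, the stale superblock counters can also be inspected without modifying anything by using dumpe2fs; this is just an illustrative check, not part of the original report, and the values shown are the pre-recovery ones from the fsck output above.
      ===============================================================================
      # Read-only look at the superblock counters after the crash (illustrative)
      dumpe2fs -h /dev/sdb3 | grep -i free
      # Free blocks:              491228
      # Free inodes:              131635
      ===============================================================================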
      What's more, after the OSS crashes and before the OST is mounted again, we have two ways to prevent the problem from happening: running fsck on that OST, or mounting/unmounting that OST as ldiskfs.
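      In other words, either of the following, run on the OSS before the Lustre mount, avoids the problem (device and mount point names are only examples):
      ===============================================================================
      # Workaround 1: run e2fsck on the OST device, which replays the journal
      # and corrects the superblock free counts
      fsck -fy /dev/mapper/OST00

      # Workaround 2: mount and unmount the OST as plain ldiskfs before starting
      # the OST service; this also replays the journal
      mkdir -p /mnt/tmp
      mount -t ldiskfs /dev/mapper/OST00 /mnt/tmp
      umount /mnt/tmp
      ===============================================================================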
      We also found that this problem is not reproducible on Lustre versions before commit 6a6561972406043efe41ae43b64fd278f360a4b9, simply because versions before that commit do a pre-mount/umount before starting the OST service.

      Attachments

        Activity

          People

            Assignee:
            hongchao.zhang Hongchao Zhang
            Reporter:
            lixi Li Xi (Inactive)
            Votes: 0
            Watchers: 7

            Dates

              Created:
              Updated:
              Resolved: