Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.4.2
- 3
- 12446
Description
During active I/O on the OSS (e.g. IOR from a client), if the OSS is reset (not unmounted, but something like a forced reset), then after the OSS comes back up and all OSTs are mounted, df shows strange OST sizes like below.
[root@noss01 mount]# df -h -t lustre
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/OST00 22T -17G 22T 0% /mnt/lustre/OST00
/dev/mapper/OST01 22T -19G 22T 0% /mnt/lustre/OST01
/dev/mapper/OST02 22T -17G 22T 0% /mnt/lustre/OST02
/dev/mapper/OST03 22T -19G 22T 0% /mnt/lustre/OST03
/dev/mapper/OST04 22T -17G 22T 0% /mnt/lustre/OST04
The problem is easy to reproduce. The script "run.sh" reproduces it on a server named "server1" and a virtual machine named "vm1".
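The script itself is not included here; the steps it performs are roughly the following sketch (the client hostname, device, mount point, and the dd workload are illustrative assumptions, not the actual contents of run.sh):
===============================================================================
# On the client: generate sustained write I/O against the file system
# (the description uses IOR; plain dd is a simple stand-in here).
[root@client ~]# dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=10000 &
# On the OSS (vm1): simulate a crash while the I/O is in flight, without
# unmounting, by forcing an immediate reboot via sysrq.
[root@vm1 ~]# echo b > /proc/sysrq-trigger
# After the OSS reboots: mount the OST again and check the reported usage.
[root@vm1 ~]# mount -t lustre /dev/sdb3 /mnt/lustre/OST02
[root@vm1 ~]# df -h -t lustre
===============================================================================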
After some investigation, we found several facts about this problem. First, after the problem happens, the OST file system is corrupted. The fsck result follows.
===============================================================================
# fsck -y /dev/sdb3
fsck from util-linux-ng 2.17.2
e2fsck 1.42.7.wc1 (12-Apr-2013)
server1-OST0002 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (560315, counted=490939).
Fix? yes
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1220608, 253) != expected (0, 32)
Update quota info for quota type 0? yes
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1220608, 253) != expected (0, 32)
Update quota info for quota type 1? yes
server1-OST0002: ***** FILE SYSTEM WAS MODIFIED *****
server1-OST0002: 262/131648 files (0.4% non-contiguous), 35189/526128 blocks
===============================================================================
Second, after the OSS crashes and before the OST is mounted again, fsck shows that the free inode/block counts in the superblock are wrong. That by itself is not a big problem, since fsck can fix it easily. However, if this small inconsistency is not fixed, Lustre somehow makes the problem bigger.
===============================================================================
[root@vm1 ~]# fsck -n /dev/sdb3
fsck from util-linux-ng 2.17.2
e2fsck 1.42.7.wc1 (12-Apr-2013)
Warning: skipping journal recovery because doing a read-only filesystem check.
server1-OST0002: clean, 13/131648 files, 34900/526128 blocks
[root@vm1 ~]# fsck /dev/sdb3
fsck from util-linux-ng 2.17.2
e2fsck 1.42.7.wc1 (12-Apr-2013)
server1-OST0002: recovering journal
Setting free inodes count to 131387 (was 131635)
Setting free blocks count to 420283 (was 491228)
server1-OST0002: clean, 261/131648 files, 105845/526128 blocks
===============================================================================
What's more, after the OSS crashes and before the OST is mounted again, there are two ways to prevent the problem from happening: run fsck on that OST, or mount/umount that OST as ldiskfs.
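For reference, a minimal sketch of these two workarounds, assuming the vm1 device used above and a scratch mount point (/mnt/ldiskfs is just an example):
===============================================================================
# Option a: run fsck so the journal is replayed and the superblock
# free inode/block counts are corrected before Lustre mounts the OST.
[root@vm1 ~]# fsck /dev/sdb3
# Option b: mount and unmount the OST as ldiskfs; the kernel replays
# the journal during the mount, which has the same effect.
[root@vm1 ~]# mount -t ldiskfs /dev/sdb3 /mnt/ldiskfs
[root@vm1 ~]# umount /mnt/ldiskfs
===============================================================================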
We also found that this problem is not reproducible on Lustre versions before commit 6a6561972406043efe41ae43b64fd278f360a4b9, simply because versions before that commit do a pre-mount/umount before starting the OST service.