Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.11.0, Lustre 2.12.0
- None
- 3
Description
An attempt to run racer.sh on a DNE system with rpms built from wc master branch:
[root@cslmodev100 racer]# sh racer.sh -t 180 -T 7 -f 20 -c -d /mnt/testfs/racer-dir/
Directory: /mnt/testfs/racer-dir/
Time Limit: 180
Lustre Tests: 1
Max Files: 20
Threads: 7
Running Tests: lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote
MDS Count: 3
Running racer.sh for 180 seconds. CTRL-C to exit
file_create: FILE=/mnt/testfs/racer-dir//10 SIZE=207136
file_create: FILE=/mnt/testfs/racer-dir//17 SIZE=4800
file_create: FILE=/mnt/testfs/racer-dir//5 SIZE=73416
file_create: FILE=/mnt/testfs/racer-dir//19 SIZE=116024
...
file_create: FILE=/mnt/testfs/racer-dir//0 SIZE=234400
file_create: FILE=/mnt/testfs/racer-dir//2 SIZE=136432
file_create: FILE=/mnt/testfs/racer-dir//8 SIZE=53296
file_create: FILE=/mnt/testfs/racer-dir//12 SIZE=233528
racer cleanup
sleeping 5 sec ...
lustre_file_create.sh: no process found
dir_create.sh: no process found
file_rm.sh: no process found
file_rename.sh: no process found
file_link.sh: no process found
file_symlink.sh: no process found
file_list.sh: no process found
file_concat.sh: no process found
file_exec.sh: no process found
dir_remote.sh: no process found
there should be NO racer processes:
root 201964 0.0 0.0 112660 988 pts/24 S+ 11:03 0:00 grep -E lustre_file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat|file_exec|dir_remote
Filesystem 1K-blocks Used Available Use% Mounted on
172.18.1.3@o2ib1,172.18.1.4@o2ib1:/testfs 240559470792 4289196 238129624412 1% /mnt/testfs
We survived racer.sh for 180 seconds.
e2fsck on the MDT0 device:
[root@cslmodev103 ~]# umount /dev/md66
[root@cslmodev103 ~]# e2fsck -fvn /dev/md66
e2fsck 1.42.13.x6 (01-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 2061541938 ref count is 3, should be 2. Fix? no
Unattached inode 2061541970
Connect to /lost+found? no
Inode 2061541986 ref count is 2, should be 1. Fix? no
Unattached inode 2061542181
Connect to /lost+found? no
Inode 2061542575 ref count is 10, should be 9. Fix? no
Inode 2061542583 ref count is 6, should be 5. Fix? no
Pass 5: Checking group summary information
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
Update quota info for quota type 0? no
[QUOTA WARNING] Usage inconsistent for ID 0:actual (1248931840, 1295) != expected (1248919552, 1295)
Update quota info for quota type 1? no
testfs-MDT0000: ********** WARNING: Filesystem still has errors **********
1304 inodes used (0.00%, out of 3042005760)
85 non-contiguous files (6.5%)
2 non-contiguous directories (0.2%)
# of inodes with ind/dind/tind blocks: 72/64/0
381833025 blocks used (25.10%, out of 1520996090)
0 bad blocks
2 large files
555 regular files
491 directories
0 character device files
0 block device files
0 fifos
107 links
249 symbolic links (249 fast symbolic links)
0 sockets
------------
1400 files
[root@cslmodev103 ~]#
[root@cslmodev103 ~]# dumpe2fs -h /dev/md66 | grep -i state
dumpe2fs 1.42.13.x6 (01-Mar-2018)
Filesystem state: clean
[root@cslmodev103 ~]#
Invalid symlink inodes are due to LU-11130 and wrong nlinks are due to LU-11446; the unattached inodes are what this ticket is about.
The racer test script doesn't use migrate or striped dirs, just "lustre_file_create dir_create file_rm file_rename file_link file_symlink file_list file_concat file_exec dir_remote". Also, there were no failovers.
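To separate the unattached inodes tracked here from the nlink mismatches attributed to LU-11446, the Pass 4 messages in the e2fsck output above can be sorted mechanically. The following is a minimal sketch in Python, not part of the Lustre test suite: the function name classify_pass4 is hypothetical, and the regular expressions assume the message wording shown in the transcript above (e2fsck 1.42.13.x6); other e2fsprogs versions may word these messages differently.

```python
import re

# Patterns assume the Pass 4 wording seen in the e2fsck transcript above.
REF_COUNT = re.compile(r"Inode (\d+) ref count is (\d+), should be (\d+)\.")
UNATTACHED = re.compile(r"Unattached inode (\d+)")


def classify_pass4(output):
    """Split e2fsck text output into unattached inodes and nlink mismatches.

    Returns (unattached, nlink_mismatch), where unattached is a list of
    inode numbers and nlink_mismatch maps inode -> (found, expected).
    """
    unattached = [int(m.group(1)) for m in UNATTACHED.finditer(output)]
    nlink_mismatch = {
        int(m.group(1)): (int(m.group(2)), int(m.group(3)))
        for m in REF_COUNT.finditer(output)
    }
    return unattached, nlink_mismatch


# Excerpt copied from the e2fsck run on MDT0 shown above.
sample = """\
Pass 4: Checking reference counts
Inode 2061541938 ref count is 3, should be 2. Fix? no
Unattached inode 2061541970
Connect to /lost+found? no
Inode 2061541986 ref count is 2, should be 1. Fix? no
Unattached inode 2061542181
Connect to /lost+found? no
"""

unattached, nlink_mismatch = classify_pass4(sample)
print("unattached:", unattached)
print("nlink mismatch:", nlink_mismatch)
```

Because the patterns are applied with finditer over the whole text, the sketch does not depend on how the captured output was line-wrapped.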
Lustre is built from the tip of the wc master branch:
$ git log --oneline wc/master
fe7c13bd48 (wc/master) LU-11329 utils: create tests maintainers list
70a01a6c9c LU-11276 ldlm: don't apply ELC to converting and DOM locks
72372486a5 LU-11347 osd: do not use pagecache for I/O
8b9105d828 LU-11199 mdt: Attempt lookup lock on open
697e8fe6f3 LU-11473 doc: add lfs-getsom man page
ed0c19d250 LU-1095 misc: quiet console messages at startup
....
[root@cslmodev103 ~]# rpm -q lustre_ib
lustre_ib-2.11.56_16_gfe7c13b-1.el7.centos.x86_64
[root@cslmodev103 ~]#
Issue Links
- is related to
  - LU-12848 Add test case for LU-11549 (Resolved)
  - LU-13346 Fix link and rename race on zfs odb (Resolved)
  - LU-11706 create a lustre tunable to enable/disable experimental features (Resolved)
- is related to
  - LU-11446 ldiskfs inodes nlink mismatch with DNE (Open)
  - LU-11130 cross-target rename creates invalid symlink inodes (Resolved)
  - LU-3537 allow cross-MDT for all metadata operations (Resolved)
The fix for this problem has landed for 2.13.0, so this ticket should be closed so that it can be tracked properly for the release. Since the test patch is not currently passing testing, I've opened a separate ticket to track that landing.