Lustre / LU-17397

mdtest failed (Lustre became read-only) under high stress

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.15.0, Lustre 2.15.3
    • Labels: None
    • Environment: client/server: CentOS-8.5.2111 + Lustre 2.15.3
      Linux 4.18.0-348.2.1.el8_lustre.x86_64 #1 SMP Fri Jun 17 00:10:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

    Description

      We test metadata performance in a simple Lustre environment with two servers (#server01, #server02), both connected to a SAN storage:

      • #server01 mounts the MGT, one MDT, and four OSTs
      • #server02 mounts one MDT and four OSTs

      Here, the MDS and OSS run on the same server, and the filesystem comprises two MDTs and eight OSTs.

      [root@client02 lustre]# lfs df -h
      UUID                       bytes        Used   Available Use% Mounted on
      l_lfs-MDT0000_UUID          1.8T       39.2G        1.6T   3% /lustre[MDT:0] 
      l_lfs-MDT0001_UUID          1.8T       39.4G        1.6T   3% /lustre[MDT:1] 
      l_lfs-OST0000_UUID         11.9T        3.5T        7.8T  31% /lustre[OST:0] 
      l_lfs-OST0001_UUID         11.9T        3.6T        7.7T  32% /lustre[OST:1] 
      l_lfs-OST0002_UUID         11.9T        3.6T        7.7T  32% /lustre[OST:2] 
      l_lfs-OST0003_UUID         11.9T        3.6T        7.7T  32% /lustre[OST:3] 
      l_lfs-OST0004_UUID         11.9T        3.8T        7.5T  34% /lustre[OST:4] 
      l_lfs-OST0005_UUID         11.9T        3.5T        7.8T  32% /lustre[OST:5] 
      l_lfs-OST0006_UUID         11.9T        3.5T        7.8T  31% /lustre[OST:6] 
      l_lfs-OST0007_UUID         11.9T        3.6T        7.7T  32% /lustre[OST:7] 

      filesystem_summary:        95.1T       28.6T       61.8T  32% /lustre

       

      We use mdtest with mpirun across the two clients to test metadata performance under the configuration above; the test command is as follows:

      • $> mpirun --allow-run-as-root --oversubscribe -mca btl ^openib --mca btl_tcp_if_include 40.40.22.0/24 -np 64 -host client01:32,client02:32 --map-by node mdtest -L -z 3 -b 2 -I 160000 -i 1 -d /lustre/mdtest_demo | tee 2client_64np_3z_2b_160000I.log
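As a rough sanity check on the workload size, the flags above imply the following file count (a sketch assuming mdtest's usual semantics: -b is the branching factor, -z the tree depth, -I the items created per directory per rank, and -L restricts file creation to leaf directories):

```python
# Back-of-the-envelope file count for the mdtest command above
# (assumed flag semantics; variable names are ours, not mdtest's).
np_ranks = 64      # -np 64: MPI ranks across the two clients
branch = 2         # -b 2:  directory branching factor
depth = 3          # -z 3:  directory tree depth
items = 160000     # -I 160000: items per directory per rank

leaves = branch ** depth          # leaf directories per rank: 2^3 = 8
total = np_ranks * items * leaves # files created at the leaf level (-L)
print(f"{total:,}")               # 81,920,000 -- roughly 82M files
```

This matches the ~82M-file scale discussed later in the thread.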

       

      After running stably for around 15 minutes, Lustre becomes read-only (blocking the whole test) and generates the following syslog messages:

      [Fri Jan  5 17:29:36 2024] Lustre: l_lfs-OST0001: deleting orphan objects from 0x440000400:26730785 to 0x440000400:26744321
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] Aborting journal on device ultrapatha-8.
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs (ultrapatha): Remounting filesystem read-only
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): ldiskfs_journal_check_start:61: Detected aborted journal
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LustreError: 61165:0:(osd_handler.c:1790:osd_trans_commit_cb()) transaction @0x0000000082b2d9d3 commit error: 2
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt08_003: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt08_003: directory leaf block found instead of index block
      [Fri Jan  5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt05_001: directory leaf block found instead of index block
      [Fri Jan  5 17:43:20 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
      [Fri Jan  5 17:43:20 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt18_002: directory leaf block found instead of index block
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error: 355 callbacks suppressed
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block
      [Fri Jan  5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
      [Fri Jan  5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
      [Fri Jan  5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt18_001: directory leaf block found instead of index block
      [Fri Jan  5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
      [Fri Jan  5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block

       

      We repeated the test many times and still got similar results (i.e., the LDISKFS-fs error on MDT0 or MDT1); the workload scale is as follows:

       

      [root@client01 lustre]# lfs quota -u root /lustre/
      Disk quotas for usr root (uid 0):
           Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
             /lustre/ 30805903960       0       0       - 96453928       0       0       -

       

      We originally hit this issue with 2.15.0 and upgraded to 2.15.3, but the issue persists and blocks our testing.

      Attachments

        Activity

          [LU-17397] mdtest failed (Lustre became read-only) under high stress
          yzr95924 Zuoru Yang added a comment -

          @Andreas Dilger Sure, we have evaluated the same test case on AlmaLinux 8.8 + Lustre 2.15.3 with the newer kernel (4.18.0-477.10.1.el8_lustre.x86_64), and the issue no longer occurs. Thanks again!

          adilger Andreas Dilger added a comment -

          Time to upgrade your server kernel and rebuild in that case.
          yzr95924 Zuoru Yang added a comment -

          @Andreas Dilger Hi Andreas, thanks for your insights. We double-checked the Linux kernel in our environment (we installed the kernel package from the Whamcloud 2.15.0 repo, later upgrading the Lustre server to 2.15.3: https://downloads.whamcloud.com/public/lustre/lustre-2.15.0-ib/MOFED-5.6-1.0.3.3/el8.5.2111/server/RPMS/x86_64/), and we confirm that the kernel from that link does not include the patch.
          adilger Andreas Dilger added a comment - edited

          yzr95924, thank you for your Launchpad reference. Indeed, that bug looks like it could be related. The patch is reported as included in upstream kernel 5.14 and in the 5.11 stable series, fixing a bug originally introduced in kernel 5.11 (which was also backported to the RHEL kernel):

          commit 877ba3f729fd3d8ef0e29bc2a55e57cfa54b2e43
          Author:     Theodore Ts'o <tytso@mit.edu>
          AuthorDate: Wed Aug 4 14:23:55 2021 -0400
          
              ext4: fix potential htree corruption when growing large_dir directories
              
              Commit b5776e7524af ("ext4: fix potential htree index checksum
              corruption) removed a required restart when multiple levels of index
              nodes need to be split.  Fix this to avoid directory htree corruptions
              when using the large_dir feature.
              
              Cc: stable@kernel.org # v5.11
              Cc: Artem Blagodarenko <artem.blagodarenko@gmail.com>
              Fixes: b5776e7524af ("ext4: fix potential htree index checksum corruption)
              Reported-by: Denis <denis@voxelsoft.com>
              Signed-off-by: Theodore Ts'o <tytso@mit.edu>
          

          I can confirm that the patch is applied in 4.18.0-425.13.1.el8_7.x86_64 in fs/ext4/namei.c:

                                  if (err)
                                          goto journal_error;
                                  err = ext4_handle_dirty_dx_node(handle, dir,
                                                                  frame->bh);
                                  if (restart || err)
                                          goto journal_error;
          

          but I'm not sure whether it is applied in your kernel 4.18.0-348.2.1.el8_lustre.x86_64.
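A quick way to automate that check might be a simple string heuristic against fs/ext4/namei.c from the kernel source in question (the helper name and the string test are ours, not part of any tool; commit 877ba3f729fd changes the plain `if (err)` before the journal_error jump into `if (restart || err)`):

```python
def has_htree_restart_fix(namei_c_source: str) -> bool:
    """Heuristic: commit 877ba3f729fd adds an 'if (restart || err)' check
    before the journal_error jump in fs/ext4/namei.c. Presence of that
    exact condition suggests the fix is applied."""
    return "if (restart || err)" in namei_c_source

# Example against the two variants quoted above:
patched = (
    "err = ext4_handle_dirty_dx_node(handle, dir, frame->bh);\n"
    "if (restart || err)\n\tgoto journal_error;"
)
unpatched = (
    "err = ext4_handle_dirty_dx_node(handle, dir, frame->bh);\n"
    "if (err)\n\tgoto journal_error;"
)
print(has_htree_restart_fix(patched), has_htree_restart_fix(unpatched))  # True False
```

In practice one would read the file from an unpacked kernel source or SRPM; this only sketches the comparison.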

          yzr95924 Zuoru Yang added a comment -

          @Andreas Dilger BTW, the reason I initially suspected this issue is related to large_dir is this link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1933074

          which also reports "directory leaf block found instead of index block" when there are millions of files on ext4. In any case, we will test this issue with a newer kernel (e.g., AlmaLinux 8.8 + 2.15.3).
          yzr95924 Zuoru Yang added a comment -

          @Andreas Dilger Thanks Andreas! We will follow this direction and try the same test with a newer kernel.


          adilger Andreas Dilger added a comment -

          Also, have you tried updating to a newer kernel? It is possible that the ext4 in the kernel (and the ldiskfs that is generated from it) has a bug that has since been fixed.

          adilger Andreas Dilger added a comment -

          Lustre does not modify the on-disk data structures of ldiskfs directly, although it accesses the filesystem somewhat differently than a regular ext4 mount does. I don't think the issue is with large_dir, but more likely with parallel directory locking and updates. There would need to be some kind of bug in ext4 or the ldiskfs patches applied. It is not possible for the clients to corrupt the server filesystem directly.

          That said, it appears from the e2fsck output that the on-disk data structures are not corrupted, so it seems like this is some kind of in-memory corruption? The free blocks/inodes count and quota usage messages are normal for a filesystem that is in use.

          There is a tunable parameter to disable the parallel directory locking and updates: "lctl set_param osd-ldiskfs.lustre-MDT*.pdo=0" on the MDS nodes. Note that this path is not normally tested and could potentially have issues of its own, beyond being much slower, but it would be useful to see whether it avoids the problem.
          yzr95924 Zuoru Yang added a comment -

          @Andreas Dilger Sorry for my late reply; we spent some time checking our RAID to ensure this is not caused by the storage backend. We now suspect it might be a bug in ext4.

          Yes, there were some files in the filesystem from previous experiments; we removed them and reran the same test command. The issue still occurs, and the info is as follows (the counts are partial, since the failure prevents all files from being created):

          [root@client02 ~]# lfs quota -u root /lustre/
          Disk quotas for usr root (uid 0):
               Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
                 /lustre/ 185332124       0       0       - 45025839       0       0       -

          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt05_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] Aborting journal on device ultrapathb-8.
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs (ultrapathb): Remounting filesystem read-only
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): ldiskfs_journal_check_start:61: Detected aborted journal
          [Tue Jan  9 20:51:22 2024] LustreError: 260307:0:(osd_handler.c:1790:osd_trans_commit_cb()) transaction @0x0000000069cf0f59 commit error: 2
          [Tue Jan  9 20:51:22 2024] LustreError: 260307:0:(osd_handler.c:1790:osd_trans_commit_cb()) Skipped 52 previous similar messages
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt21_003: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt18_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt10_001: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt05_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt05_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt18_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt07_001: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt19_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:22 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt18_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error: 180 callbacks suppressed
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt20_004: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt05_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt18_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt18_002: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt10_003: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt20_004: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt07_003: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt19_000: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt10_003: directory leaf block found instead of index block
          [Tue Jan  9 20:51:27 2024] LDISKFS-fs error (device ultrapathb): dx_probe:1169: inode #104316384: block 149479: comm mdt05_002: directory leaf block found instead of index block

           

          Note that device ultrapathb is the backend of MDT1; the following is the session record of running e2fsck on device ultrapathb:

           

          Script started on 2024-01-09 21:09:49+08:00
          [root@server02 ~]# e2fsck -f /dev/ultrapathb
          e2fsck 1.46.6-wc1 (10-Jan-2023)
          MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
          l_lfs-MDT0001: recovering journal
          Pass 1: Checking inodes, blocks, and sizes
          Pass 2: Checking directory structure
          Pass 3: Checking directory connectivity
          Pass 4: Checking reference counts
          Pass 5: Checking group summary information
          Free blocks count wrong (142139109, counted=142167445).
          Fix<y>? yes
          Free inodes count wrong (412219565, counted=412247109).
          Fix<y>? yes
          [QUOTA WARNING] Usage inconsistent for ID 0:actual (72814075904, 17302001) != expected (72899485696, 17302001)
          Update quota info for quota type 0<y>? yes
          [QUOTA WARNING] Usage inconsistent for ID 0:actual (72814075904, 17302001) != expected (72899485696, 17302001)
          Update quota info for quota type 1<y>? yes
          [QUOTA WARNING] Usage inconsistent for ID 0:actual (72814075904, 17302001) != expected (72899485696, 17302001)
          Update quota info for quota type 2<y>? yes

          l_lfs-MDT0001: ***** FILE SYSTEM WAS MODIFIED *****
          l_lfs-MDT0001: 17302011/429549120 files (0.0% non-contiguous), 126261419/268428864 blocks
          [root@server02 ~]# exit
          exit

          Script done on 2024-01-09 21:16:33+08:00

           

          Could this possibly be an issue with ext4's large_dir feature?


          adilger Andreas Dilger added a comment -

          Just to confirm the test being run: each rank is creating 160000 files in a separate subdirectory from the other ranks, and there are 2^3 leaf subdirectories (branching factor 2, depth 3)? That would create about 82M files, but it looks like there are some existing files in the filesystem.

          What does e2fsck show when run on the corrupt MDT?

          People

            Assignee: wc-triage WC Triage
            Reporter: yzr95924 Zuoru Yang
            Votes: 0
            Watchers: 4