Details
-
Bug
-
Resolution: Duplicate
-
Blocker
-
None
-
Lustre 2.15.0, Lustre 2.15.3
-
None
-
client/server: CentOS-8.5.2111 + Lustre 2.15.3
Linux 4.18.0-348.2.1.el8_lustre.x86_64 #1 SMP Fri Jun 17 00:10:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
-
3
Description
We are testing metadata performance in a simple Lustre environment with two servers (#server01 and #server02), both connected to SAN storage:
- On #server01 we mount an MGT, an MDT, and four OSTs
- On #server02 we mount an MDT and four OSTs
The MDS and OSS therefore run on the same server, and the filesystem has two MDTs and eight OSTs in total.
[root@client02 lustre]# lfs df -h
UUID bytes Used Available Use% Mounted on
l_lfs-MDT0000_UUID 1.8T 39.2G 1.6T 3% /lustre[MDT:0]
l_lfs-MDT0001_UUID 1.8T 39.4G 1.6T 3% /lustre[MDT:1]
l_lfs-OST0000_UUID 11.9T 3.5T 7.8T 31% /lustre[OST:0]
l_lfs-OST0001_UUID 11.9T 3.6T 7.7T 32% /lustre[OST:1]
l_lfs-OST0002_UUID 11.9T 3.6T 7.7T 32% /lustre[OST:2]
l_lfs-OST0003_UUID 11.9T 3.6T 7.7T 32% /lustre[OST:3]
l_lfs-OST0004_UUID 11.9T 3.8T 7.5T 34% /lustre[OST:4]
l_lfs-OST0005_UUID 11.9T 3.5T 7.8T 32% /lustre[OST:5]
l_lfs-OST0006_UUID 11.9T 3.5T 7.8T 31% /lustre[OST:6]
l_lfs-OST0007_UUID 11.9T 3.6T 7.7T 32% /lustre[OST:7]
filesystem_summary: 95.1T 28.6T 61.8T 32% /lustre
We run mdtest via mpirun from two clients to test metadata performance under the configuration above. The test command is as follows:
- $> mpirun --allow-run-as-root --oversubscribe -mca btl ^openib --mca btl_tcp_if_include 40.40.22.0/24 -np 64 -host client01:32,client02:32 --map-by node mdtest -L -z 3 -b 2 -I 160000 -i 1 -d /lustre/mdtest_demo | tee 2client_64np_3z_2b_160000I.log
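To give a sense of the directory sizes this command produces, here is a rough sketch of the arithmetic. It assumes that with -L mdtest places items only in the leaf directories, that -I is the item count per leaf directory per rank, and that without -u all ranks share the same tree. The function name and dictionary keys are made up for illustration and are not part of mdtest.

```python
def mdtest_leaf_only_workload(depth, branch, items_per_dir, ranks):
    """Estimate item counts for a leaf-only (-L) mdtest run on a shared tree.

    Assumptions (not confirmed by the report): -I counts items per leaf
    directory per rank, and all ranks work in the same directory tree.
    """
    leaf_dirs = branch ** depth  # directories at the deepest level
    tree_dirs = sum(branch ** level for level in range(depth + 1))
    items_per_rank = items_per_dir * leaf_dirs
    return {
        "leaf_dirs": leaf_dirs,
        "tree_dirs": tree_dirs,
        "items_per_rank": items_per_rank,
        "total_items": items_per_rank * ranks,
        "entries_per_leaf_dir": items_per_dir * ranks,
    }


# Parameters taken from the command above: -z 3 -b 2 -I 160000, -np 64
workload = mdtest_leaf_only_workload(depth=3, branch=2, items_per_dir=160000, ranks=64)
for name, value in workload.items():
    print(f"{name}: {value:,}")
```

Under these assumptions each of the 8 leaf directories ends up holding on the order of ten million entries, which is the kind of very large directory where the ldiskfs htree index grows deep.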
After running stably for about 15 minutes, the MDT backing filesystem is remounted read-only, which blocks the whole test, and the server logs the following messages:
[Fri Jan 5 17:29:36 2024] Lustre: l_lfs-OST0001: deleting orphan objects from 0x440000400:26730785 to 0x440000400:26744321
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] Aborting journal on device ultrapatha-8.
[Fri Jan 5 17:43:19 2024] LDISKFS-fs (ultrapatha): Remounting filesystem read-only
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): ldiskfs_journal_check_start:61: Detected aborted journal
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LustreError: 61165:0:(osd_handler.c:1790:osd_trans_commit_cb()) transaction @0x0000000082b2d9d3 commit error: 2
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt08_003: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt08_003: directory leaf block found instead of index block
[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt05_001: directory leaf block found instead of index block
[Fri Jan 5 17:43:20 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
[Fri Jan 5 17:43:20 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt18_002: directory leaf block found instead of index block
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error: 355 callbacks suppressed
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt21_000: directory leaf block found instead of index block
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block
[Fri Jan 5 17:43:24 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
[Fri Jan 5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block
[Fri Jan 5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt18_001: directory leaf block found instead of index block
[Fri Jan 5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt07_004: directory leaf block found instead of index block
[Fri Jan 5 17:43:25 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: inode #61343829: block 151386: comm mdt20_000: directory leaf block found instead of index block
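Since the log is highly repetitive, a small script can help confirm whether every dx_probe error points at the same directory inode and block. The following is a minimal sketch that assumes the message layout shown above. The regular expression and function name are illustrative, and the sample lines are copied from this log.

```python
import re
from collections import Counter

# Matches the dx_probe error lines shown above; other kernel messages are ignored.
DX_PROBE_RE = re.compile(
    r"LDISKFS-fs error \(device (?P<device>[^)]+)\): dx_probe:\d+: "
    r"inode #(?P<inode>\d+): block (?P<block>\d+): comm (?P<comm>\S+):"
)


def summarize_dx_probe_errors(lines):
    """Count dx_probe errors per (device, inode, block) and per service thread."""
    locations = Counter()
    threads = Counter()
    for line in lines:
        match = DX_PROBE_RE.search(line)
        if match is None:
            continue
        key = (match["device"], int(match["inode"]), int(match["block"]))
        locations[key] += 1
        threads[match["comm"]] += 1
    return locations, threads


sample = [
    "[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: "
    "inode #61343829: block 151386: comm mdt19_001: directory leaf block found instead of index block",
    "[Fri Jan 5 17:43:19 2024] LDISKFS-fs error (device ultrapatha): dx_probe:1169: "
    "inode #61343829: block 151386: comm mdt20_004: directory leaf block found instead of index block",
    "[Fri Jan 5 17:43:24 2024] LDISKFS-fs error: 355 callbacks suppressed",
]

locations, threads = summarize_dx_probe_errors(sample)
for (device, inode, block), count in locations.items():
    print(f"{device} inode #{inode} block {block}: {count} errors")
print("threads:", dict(threads))
```

In the excerpt above, all dx_probe errors refer to inode #61343829, block 151386 on device ultrapatha, reported by many different MDT service threads, so a single directory appears to be affected.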
We have repeated the test many times and always get a similar result (i.e., the LDISKFS-fs error on MDT0 or MDT1). The scale of the workload is as follows:
[root@client01 lustre]# lfs quota -u root /lustre/
Disk quotas for usr root (uid 0):
Filesystem kbytes quota limit grace files quota limit grace
/lustre/ 30805903960 0 0 - 96453928 0 0 -
We originally found this issue with 2.15.0 and tried upgrading to 2.15.3, but the issue still exists and blocks our testing.
yzr95924, thank you for your launchpad reference. Indeed, that bug looks like it could be related. That patch is reportedly included in upstream kernel 5.14 and stable kernel 5.11, and fixes a bug that originated in kernel 5.11 (but was also backported to the RHEL kernel):
I can confirm that the patch is applied in 4.18.0-425.13.1.el8_7.x86_64 in fs/ext4/namei.c:
but I'm not sure whether it is applied in your kernel 4.18.0-348.2.1.el8_lustre.x86_64.
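As a first hint only, the two kernel release strings can be compared numerically. The sketch below assumes the RHEL-style `version-build.elN...` layout seen in both strings, and the helper name is made up for illustration. An older build number does not prove the patch is missing, since a vendor or Lustre-patched kernel may carry extra backports; checking fs/ext4/namei.c in the source of the exact kernel build is the reliable test.

```python
def kernel_release_key(release):
    """Split a RHEL-style release such as '4.18.0-348.2.1.el8_lustre.x86_64'
    into comparable tuples: ((4, 18, 0), (348, 2, 1))."""
    version, _, rest = release.partition("-")
    build = rest.split(".el", 1)[0]  # drop the '.el8_...' distribution suffix
    return (
        tuple(int(part) for part in version.split(".")),
        tuple(int(part) for part in build.split(".")),
    )


reporter_kernel = "4.18.0-348.2.1.el8_lustre.x86_64"
patched_kernel = "4.18.0-425.13.1.el8_7.x86_64"

if kernel_release_key(reporter_kernel) < kernel_release_key(patched_kernel):
    print(f"{reporter_kernel} is older than {patched_kernel}; the patch may be missing")
else:
    print(f"{reporter_kernel} is not older than {patched_kernel}")
```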