Details
- Bug
- Resolution: Fixed
- Minor
- None
- None
- openEuler 22.03 kernel: 5.10.0-60.79.0.103.oe2203.aarch64
- 3
- 9223372036854775807
Description
Directory corruption when running io500 test on openEuler 22.03:
Client side log
[openeuler@oe2203-test io500]$ sudo /io500.sh config-minimal.ini
IO500 version io500-sc22_v2 (standard)
[RESULT] ior-easy-write 0.105593 GiB/s : time 338.211 seconds
ERROR: open64("/mnt/lustre/datafiles/2023.02.14-10.12.17/mdtest-easy/test-dir.0-0/mdtest_tree.0.0/file.mdtest.1.85", 66, 0664) failed. Error: Read-only file system, (aiori-POSIX.c:569)
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode -1.
Server side log
[ 9962.007724] LDISKFS-fs error (device dm-0): ldiskfs_find_dest_de:2412: inode #5767170: block 3771253: comm mdt00_000: bad entry in directory: rec_len is smaller than minimal - offset=0, inode=0, rec_len=8, name_len=0, size=4096
[ 9962.051171] Aborting journal on device dm-0-8.
[ 9962.058456] LDISKFS-fs (dm-0): Remounting filesystem read-only
[ 9962.059877] LDISKFS-fs error (device dm-0) in iam_txn_add:547: Journal has aborted
[ 9962.064365] LustreError: 11366:0:(osd_io.c:2222:osd_ldiskfs_write_record()) journal_get_write_access() returned error -30
[ 9962.066805] LustreError: 11366:0:(llog_cat.c:592:llog_cat_add_rec()) llog_write_rec -30: lh=00000000c04e4ff3
[ 9962.069137] LustreError: 11366:0:(tgt_lastrcvd.c:1326:tgt_add_reply_data()) lustre-MDT0000: can't update reply_data file: rc = -30
[ 9962.071742] LustreError: 11366:0:(osd_handler.c:2089:osd_trans_stop()) lustre-MDT0000: failed in transaction hook: rc = -30
[ 9962.074184] LustreError: 11366:0:(osd_handler.c:2099:osd_trans_stop()) lustre-MDT0000: failed to stop transaction: rc = -30
[ 9962.074274] LustreError: 11348:0:(osd_handler.c:1789:osd_trans_commit_cb()) transaction @0x00000000c73ec34c commit error: 2
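For readers decoding the first server-side message: the "rec_len is smaller than minimal" check compares the entry's rec_len against the smallest possible directory entry, which is an 8-byte header plus a 1-byte name, rounded up to 4 bytes. The sketch below reproduces that arithmetic, assuming ldiskfs keeps the stock ext4 EXT4_DIR_REC_LEN() definition for this kernel; the names dir_rec_len and rec_len_below_minimal are hypothetical and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror ext4's EXT4_DIR_REC_LEN(): an 8-byte fixed header
 * (inode, rec_len, name_len, file_type) plus the name, rounded up to a
 * multiple of 4 bytes. */
#define DIR_HEADER_LEN 8u
#define DIR_ROUND      3u

static inline uint32_t dir_rec_len(uint32_t name_len)
{
    return (name_len + DIR_HEADER_LEN + DIR_ROUND) & ~DIR_ROUND;
}

/* Hypothetical helper: true when rec_len cannot hold even a 1-byte name,
 * which is the condition behind "rec_len is smaller than minimal". */
static inline bool rec_len_below_minimal(uint32_t rec_len)
{
    return rec_len < dir_rec_len(1);
}
```

Under these assumptions the minimum is 12 bytes, so the logged rec_len=8 fails the check.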
Issue Links
- is related to
  - LU-12268 LDISKFS-fs error: ldiskfs_find_dest_de:2066: bad entry in directory: rec_len is smaller than minimal - offset=0( 0), inode=201, rec_len=0, name_len=0 (Resolved)
I was just going to add that, after going through ext4-pdirop.patch again,
the rhel9.1 version does have the code block introduced by ext4-pdirop.patch before the block
from 3ba733f879c2 ("ext4: avoid cycles in directory h-tree"), and previous comments mentioned that the rhel9.1
kernel doesn't have the problem.
So the ordering of the code blocks is irrelevant; the real fix is, as Xinliang suggested, moving the "de2 = dx_move_dirents(...)" call.
Having said that, could it still be a good idea to check for cycles in tree blocks before we do anything?
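To make the last question concrete, here is a minimal sketch of what such a cycle check could look like, in the spirit of 3ba733f879c2: remember the block numbers visited on the way down from the h-tree root and refuse to descend into one that was already seen. The function name and signature are hypothetical and are not taken from ext4 or ldiskfs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if `next` already appears among the
 * `depth` blocks visited on the path from the h-tree root. A caller would
 * treat a true result as corruption and stop before splitting or moving
 * any directory entries. */
static bool htree_block_seen(const uint32_t *visited, size_t depth,
                             uint32_t next)
{
    for (size_t i = 0; i < depth; i++) {
        if (visited[i] == next)
            return true;
    }
    return false;
}
```

A linear scan should be enough here because the h-tree is only a few levels deep.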