Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.4.0
- 3
- 7263
Description
This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/43161786-8f2c-11e2-92ff-52540035b04c.
The sub-test test_17b failed with the following error in the MDS dmesg log:
Lustre: DEBUG MARKER: == sanity test 17a: symlinks: create, remove (real) ==================== 16:19:07 (1363475947)
attempt to access beyond end of device
dm-0: rw=0, want=855117409799080, limit=4194304
attempt to access beyond end of device
dm-0: rw=0, want=855117409799080, limit=4194304
attempt to access beyond end of device
dm-0: rw=0, want=855117409799080, limit=4194304
attempt to access beyond end of device
dm-0: rw=0, want=855117409799080, limit=4194304
LDISKFS-fs error (device dm-0): ldiskfs_xattr_delete_inode: inode 524372: block 106889676224884 read error
Aborting journal on device dm-0-8.
LDISKFS-fs (dm-0): Remounting filesystem read-only
LDISKFS-fs error (device dm-0) in ldiskfs_free_inode: Journal has aborted
LustreError: 2764:0:(osd_handler.c:635:osd_trans_commit_cb()) transaction @0xffff88007cdd41c0 commit error: 2
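As a sanity check on the log above, the `want` sector in the "beyond end of device" lines appears to be derived from the same bad block number reported by ldiskfs_xattr_delete_inode. The sketch below assumes 4 KiB filesystem blocks and 512-byte sectors. Neither is stated in this report, and the variable names are illustrative only.

```python
# Assumptions (not stated in the report): 4 KiB ldiskfs blocks, 512-byte sectors.
BLOCK_SIZE = 4096
SECTOR_SIZE = 512
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE  # 8 sectors per block

block = 106889676224884  # from the ldiskfs_xattr_delete_inode error
want = 855117409799080   # from the "attempt to access beyond end of device" lines
limit = 4194304          # device size in sectors, from the same lines

# A one-block read starts at block * 8 and ends 8 sectors later.
start_sector = block * SECTORS_PER_BLOCK
end_sector = start_sector + SECTORS_PER_BLOCK

print(end_sector == want)             # does the end of the read match "want"?
print(limit * SECTOR_SIZE // 2**30)   # device size in GiB
```

Under these assumptions the end sector of the read matches `want` exactly, and the device is only 2 GiB, so the block number is far outside any valid range rather than slightly off.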
It seems that test_17a is somehow corrupting the filesystem: the bad block number (106889676224884) is the same in the few MDS dmesg logs I looked at, and it looks like ASCII text from the test_17a() run.
(gdb) p /x 106889676224884
$1 = 0x6137312e7974
This is the ASCII string "ty.17a<NUL><NUL>" when the value is read as a little-endian 64-bit integer, which might be a fragment of $tdir or a similar name, e.g. "sani[ty.17a]".
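The same decoding can be reproduced without gdb. This is a minimal sketch that assumes the block number is laid out as a little-endian 64-bit value (as on x86_64); the variable names are illustrative only.

```python
import struct

block = 106889676224884  # bad block number from the LDISKFS-fs error

# Pack as an unsigned 64-bit little-endian integer to see the bytes in memory order.
raw = struct.pack("<Q", block)

# Drop the trailing NUL padding and decode the rest as ASCII.
text = raw.rstrip(b"\x00").decode("ascii")

print(hex(block))
print(raw)
print(text)
```

Only six of the eight bytes are non-zero, which is why the string ends in two NULs.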
Info required for matching: sanity 17b