Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
An aborted mdtest job leaves unattached inodes:
Job Script: command started at Mon Mar 2 16:40:31 CST 2020
-- started at 03/02/2020 16:40:32 --
mdtest-1.9.3 was launched with 32 total task(s) on 1 node(s)
Command line used: /cray/css/ostest/binaries/xt/rel.70.aries.cray/xtcnl/ostest/ROOT.latest/tests/gold/ioperf/mdtest/mdtest -f 32 -l 32 -n10000 -i1 -d /lus/snx11281/disk/ostest.vers/alsorun.20200302130005.752.saturn-p4/CL_mdtest_4s_fo.4.tjgw4U.1583188829/CL_mdtest_4s_fo -tuv
V-1: main: Setting create/stat/read/remove_only to True
V-1: Entering valid_tests...
barriers : True
collective_creates : False
create_only : True
dirpath(s):
/lus/snx11281/disk/ostest.vers/alsorun.20200302130005.752.saturn-p4/CL_mdtest_4s_fo.4.tjgw4U.1583188829/CL_mdtest_4s_fo
dirs_only : True
read_bytes : 0
read_only : True
first : 32
files_only : True
iterations : 1
items_per_dir : 0
last : 32
leaf_only : False
items : 10000
nstride : 0
pre_delay : 0
remove_only : False
random_seed : 0
stride : 1
shared_file : False
time_unique_dir_overhead: True
stat_only : True
unique_dir_per_task : True
write_bytes : 0
sync_file : False
depth : 0
V-1: Entering display_freespace...
V-1: Entering show_file_system_size...
Path: /lus/snx11281/disk/ostest.vers/alsorun.20200302130005.752.saturn-p4/CL_mdtest_4s_fo.4.tjgw4U.1583188829
FS: 483.4 TiB Used FS: 1.5% Inodes: 487.0 Mi Used Inodes: 0.1%
32 tasks, 320000 files/directories
Operation Duration Rate
--------- -------- ----
V-1: main: * iteration 1 *
V-1: Entering create_remove_directory_tree, currDepth = 0...
V-1: Entering create_remove_directory_tree, currDepth = 1...
V-1: main: Tree creation : 0.011 sec, 88.658 ops/sec
V-1: Entering directory_test...
V-1: Entering unique_dir_access...
V-1: Entering create_remove_items, currDepth = 0...
V-1: Entering create_remove_items_helper...
V-1: Entering unique_dir_access...
V-1: Entering mdtest_stat...
V-1: Entering unique_dir_access...
V-1: Entering unique_dir_access...
V-1: Entering create_remove_items, currDepth = 0...
V-1: Entering create_remove_items_helper...
V-1: Entering unique_dir_access...
V-1: Directory creation: 25.644 sec, 12478.741 ops/sec
V-1: Directory stat : 4.194 sec, 76299.371 ops/sec
V-1: Directory removal : 8.918 sec, 35883.016 ops/sec
V-1: Entering file_test...
V-1: Entering unique_dir_access...
V-1: Entering create_remove_items, currDepth = 0...
V-1: Entering create_remove_items_helper...
aprun: Apid 5441901: Caught signal Terminated, sending to application
_pmiu_daemon(SIGCHLD): [NID 00545] [c0-1c2s8n1] [Mon Mar 2 18:15:52 2020] PE RANK 25 exit signal Terminated
Application 5441901 exit codes: 143
Application 5441901 resources: utime ~0s, stime ~6s, Rss ~11768, inblocks ~0, outblocks ~0
Job Script: command stopped at Mon Mar 2 18:16:38 CST 2020
Job Script: command runtime was 5767 seconds
e2fsck logs:
[root@snx11281n000 ~]# grep Unattached /home/admin/e2fsck.*
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056499
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056500
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056501
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056507
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056509
/home/admin/e2fsck.snx11281n002.3.4-010.81.202003091157.out:Unattached inode 3367056510
[root@snx11281n000 ~]#
unattached inode stat:
debugfs: stat <3367056499>
Inode: 3367056499 Type: regular Mode: 0644 Flags: 0x0
Generation: 3271840411 Version: 0x000001ee:00bad136
User: 1356 Group: 11121 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
atime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
mtime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
crtime: 0x5e5da1e3:c1b98cf8 -- Mon Mar 2 18:16:35 2020
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 6e 11 10 00 02 00 00 00 84 46 01 00 00 00 00 00
  lma: fid=[0x20010116e:0x14684:0x0] compat=0 incompat=0
  trusted.lov (144)
  trusted.link (60)
BLOCKS:

debugfs: stat <3367056500>
Inode: 3367056500 Type: regular Mode: 0644 Flags: 0x0
Generation: 3271840414 Version: 0x000001ee:00bad13a
User: 1356 Group: 11121 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
atime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
mtime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
crtime: 0x5e5da1e3:c1b98cf8 -- Mon Mar 2 18:16:35 2020
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 6e 11 10 00 02 00 00 00 85 46 01 00 00 00 00 00
  lma: fid=[0x20010116e:0x14685:0x0] compat=0 incompat=0
  trusted.lov (144)
  trusted.link (61)
BLOCKS:

debugfs: stat <3367056501>
Inode: 3367056501 Type: regular Mode: 0644 Flags: 0x0
Generation: 3271840413 Version: 0x000001ee:00bad143
User: 1356 Group: 11121 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
atime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
mtime: 0x5e5d9f91:00000000 -- Mon Mar 2 18:06:41 2020
crtime: 0x5e5da1e3:c1b98cf8 -- Mon Mar 2 18:16:35 2020
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 6e 11 10 00 02 00 00 00 83 46 01 00 00 00 00 00
  lma: fid=[0x20010116e:0x14683:0x0] compat=0 incompat=0
  trusted.lov (96)
  trusted.link (61)
BLOCKS:

debugfs: stat <3367056507>
Inode: 3367056507 Type: regular Mode: 0644 Flags: 0x0
Generation: 3271840428 Version: 0x000001ee:00bad146
User: 1356 Group: 11121 Project: 0 Size: 0
File ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5e5d9f96:00000000 -- Mon Mar 2 18:06:46 2020
atime: 0x5e5d9f96:00000000 -- Mon Mar 2 18:06:46 2020
mtime: 0x5e5d9f96:00000000 -- Mon Mar 2 18:06:46 2020
crtime: 0x5e5da1e3:c1f69600 -- Mon Mar 2 18:16:35 2020
Size of extra inode fields: 32
Extended attributes:
  trusted.lma (24) = 00 00 00 00 00 00 00 00 6e 11 10 00 02 00 00 00 98 46 01 00 00 00 00 00
  lma: fid=[0x20010116e:0x14698:0x0] compat=0 incompat=0
  trusted.lov (144)
  trusted.link (61)
BLOCKS:
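For anyone cross-checking the FIDs by hand: the 24 raw trusted.lma bytes in the stat output can be decoded into the same fid that debugfs prints on the "lma:" line. The sketch below assumes the usual little-endian layout of the LMA xattr (32-bit compat, 32-bit incompat, then a FID made of a 64-bit sequence, 32-bit object id and 32-bit version). The helper name decode_lma is made up for illustration and is not part of any Lustre tool.

```python
import struct


def decode_lma(hex_bytes: str) -> dict:
    """Decode a 24-byte trusted.lma value given as space-separated hex.

    Assumed layout (little-endian): u32 compat, u32 incompat,
    u64 fid sequence, u32 fid object id, u32 fid version.
    """
    raw = bytes.fromhex(hex_bytes)  # fromhex ignores the spaces between bytes
    compat, incompat, seq, oid, ver = struct.unpack("<IIQII", raw)
    return {
        "compat": compat,
        "incompat": incompat,
        "fid": f"[{seq:#x}:{oid:#x}:{ver:#x}]",
    }


# trusted.lma bytes of inode 3367056499, copied from the debugfs output
lma = decode_lma(
    "00 00 00 00 00 00 00 00 6e 11 10 00 02 00 00 00 "
    "84 46 01 00 00 00 00 00"
)
print(lma["fid"])
```

For inode 3367056499 this yields [0x20010116e:0x14684:0x0], matching the fid that debugfs reports for that inode.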
dmesg contains the following Lustre error message, logged at the inodes' creation time (crtime, Mon Mar 2 18:16:35):
Mar 2 18:16:35 snx11281n002 kernel: LustreError: 2640:0:(osd_handler.c:2009:osd_trans_stop()) snx11281-MDT0000: failed in transaction hook: rc = -114
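As a reading aid for the "rc = -114" above: Lustre return codes are negated errno values, and on Linux errno 114 is EALREADY. The snippet below is only a generic lookup sketch (decode_rc is a hypothetical helper name); it does not establish why the transaction hook in osd_trans_stop() returned that code.

```python
import errno
import os


def decode_rc(rc: int) -> tuple:
    """Map a negative kernel-style return code to (errno name, message).

    The lookup uses the errno table of the host running the script,
    so it should be run on Linux to match the server's numbering.
    """
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


print(decode_rc(-114))
```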
LFSCK was able to successfully re-attach the inodes:
00100000:10000000:1.0:1583880053.552740:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14682:0x0] with the name file.mdtest.17.4708 and type 100000 to the parent [0x200101166:0x1731:0x0]: rc = 1
00100000:10000000:1.0:1583880053.626588:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14683:0x0] with the name file.mdtest.11.4748 and type 100000 to the parent [0x200101166:0x1727:0x0]: rc = 1
00100000:10000000:9.0:1583880053.699071:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14684:0x0] with the name file.mdtest.0.4749 and type 100000 to the parent [0x200101166:0x1730:0x0]: rc = 1
00100000:10000000:6.0F:1583880053.764322:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14685:0x0] with the name file.mdtest.25.4668 and type 100000 to the parent [0x200101166:0x1721:0x0]: rc = 1
00100000:10000000:7.0:1583880053.830815:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14686:0x0] with the name file.mdtest.26.4699 and type 100000 to the parent [0x200101166:0x1725:0x0]: rc = 1
00100000:10000000:9.0:1583880053.963978:0:31921:0:(lfsck_namespace.c:1243:lfsck_namespace_insert_normal()) snx11281-MDT0000-osd: namespace LFSCK insert object [0x20010116e:0x14698:0x0] with the name file.mdtest.22.4696 and type 100000 to the parent [0x200101166:0x1716:0x0]: rc = 1
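To match the LFSCK re-attachments against the FIDs seen in debugfs, the insert messages above can be reduced to (fid, name, parent, rc) records. This is a rough sketch written against the message wording shown in this ticket only; the regular expression and the parse_lfsck_inserts name are assumptions, not an LFSCK interface, and other Lustre versions may word the message differently.

```python
import re

# Pattern written against the "namespace LFSCK insert object ..." lines
# quoted in this ticket.
LFSCK_INSERT = re.compile(
    r"insert object (\[[^\]]+\]) with the name (\S+) and type \w+ "
    r"to the parent (\[[^\]]+\]): rc = (-?\d+)"
)


def parse_lfsck_inserts(log_text: str) -> list:
    """Return one dict per LFSCK insert message found in log_text."""
    return [
        {
            "fid": m.group(1),
            "name": m.group(2),
            "parent": m.group(3),
            "rc": int(m.group(4)),
        }
        for m in LFSCK_INSERT.finditer(log_text)
    ]


sample = (
    "snx11281-MDT0000-osd: namespace LFSCK insert object "
    "[0x20010116e:0x14684:0x0] with the name file.mdtest.0.4749 "
    "and type 100000 to the parent [0x200101166:0x1730:0x0]: rc = 1"
)
for rec in parse_lfsck_inserts(sample):
    print(rec["fid"], rec["name"], rec["parent"], rec["rc"])
```

Run over the full log excerpt, this gives six records. The four FIDs from the debugfs output (0x14683, 0x14684, 0x14685, 0x14698) all appear among them.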