Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.12.4
-
None
-
CentOS 7.6
-
2
Description
Following several server crashes (e.g. LU-13511) when running lfs migrate, we decided to run lfsck on Fir (Lustre 2.12.4). Today, users are reporting that some of their files have been truncated to 128MB. Strangely, this matches the size of the first component in our new default PFL layout.
What led to this situation is likely the following scenario:
- files were created originally using DoM + PFL (default setup)
- we changed our default layout to PFL with the first OST component set to 128MB (stripe count 1) to avoid new DoM files
- because of issues with DoM, we restriped most of the existing DoM files using lfs migrate -c 1 (DoM/PFL to plain layout); this was done several months ago without problems
- two days ago, we started to run lfsck namespace + layout
- today, users are reporting truncated files, but only files with a plain layout that were larger than 128MB
I'm wondering if this could be related to LU-13426. We consider this issue at least Sev 2, as lfsck is likely corrupting files that have been migrated to plain layout.
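Since the affected files all end up at exactly 128MB, one rough way to enumerate candidates is to look for regular files whose size is exactly 134217728 bytes. The sketch below is only an illustration: the function name list_suspect_files is hypothetical, it relies on GNU find's -size Nc (size in bytes), and a full walk of a large filesystem this way may be slow. A match is only a candidate, since a file can legitimately be exactly 128MB; it would still need to be checked against a previously recorded size.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list regular files under a directory whose size
# is exactly 128 MiB (134217728 bytes), the size the truncated files show.
list_suspect_files() {
    # $1 = directory to scan; -size 134217728c matches exactly that many bytes
    find "$1" -type f -size 134217728c -print
}
```

For example, `list_suspect_files /fir/groups/astraigh` would print one path per candidate file.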
More information below.
Example with file:
/fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
[root@fir-rbh01 ~]# stat /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
  File: ‘/fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa’
  Size: 134217728  Blocks: 262152     IO Block: 4194304 regular file
Device: e64e03a8h/3863872424d  Inode: 144119811155193635  Links: 1
Access: (0644/-rw-r--r--)  Uid: (65488/ mgebala)   Gid: (52067/astraigh)
Access: 2020-05-07 11:18:32.000000000 -0700
Modify: 2020-04-08 23:24:19.000000000 -0700
Change: 2020-04-29 11:26:53.000000000 -0700
 Birth: -
[root@fir-rbh01 ~]# lfs getstripe /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
/fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
lmm_stripe_count:  1
lmm_stripe_size:   4194304
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 80
    obdidx       objid       objid          group
        80    17475505   0x10aa7b1   0x1700000402
FID is: [0x200043465:0x6f23:0x0]
[root@fir-rbh01 ~]# lfs path2fid /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
[0x200043465:0x6f23:0x0]
Thanks to Robinhood, we know that the file size was ~132MB and not 128MB.
MariaDB [robinhood_fir]> select * from ENTRIES where id='0x200043465:0x6f23:0x0';
+------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
| id                     | uid     | gid      | size      | blocks | creation_time | last_access | last_mod   | last_mdchange | type | mode | nlink | md_update  | invalid | fileclass | class_update | alert_status | checkdv_status | alert_lstchk | alert_lstalrt | checkdv_lstchk | checkdv_lstsuc | checkdv_out            |
+------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
| 0x200043465:0x6f23:0x0 | mgebala | astraigh | 138323718 | 270176 |    1586607743 |  1586413465 | 1586413459 |    1588184813 | file |  420 |     1 | 1588185083 |       0 | +groups+  |   1588185083 |              | ok             |            0 |             0 |     1588184813 |     1588184813 | 60239190574:1586607743 |
+------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
1 row in set (0.00 sec)
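To make the size comparison explicit, here is a small arithmetic sketch using the two byte counts from the stat and Robinhood output above. The variable names are ours, chosen for illustration only.

```shell
#!/usr/bin/env bash
# Sizes in bytes, copied from the stat output and the Robinhood ENTRIES row.
current_size=134217728     # size reported by stat today
recorded_size=138323718    # size recorded by Robinhood before the problem
mib=$((1024 * 1024))

# Whole MiB in the current size, and the remainder (0 means an exact multiple).
current_mib=$((current_size / mib))
current_rem=$((current_size % mib))
# Bytes past the current end of file according to the recorded size.
missing_bytes=$((recorded_size - current_size))

echo "current size: ${current_mib} MiB (remainder ${current_rem} bytes)"
echo "missing: ${missing_bytes} bytes"
```

The current size is exactly 128 MiB with no remainder, and the recorded size is 4105990 bytes (a little under 4 MiB) larger, which is consistent with the file having been cut at the 128MB boundary.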
Also the original data_version was 60239190574 but now it's:
[root@fir-rbh01 ~]# lfs data_version /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
30120416758
This file is on MDT0 and lfsck logs show that something was fixed for this FID 0x200043465:0x6f23:0x0:
[root@fir-rbh01 ~]# grep 0x200043465:0x6f23:0x0 lfsck.fir-md1-s1.log
00100000:10000000:24.0:1588797550.743684:0:126810:0:(lfsck_layout.c:4033:lfsck_layout_repair_owner()) fir-MDT0000-osd: layout LFSCK assistant repaired inconsistent file owner for: parent [0x200043465:0x6f23:0x0], child [0x1340000401:0x10bc4c3:0x0], OST-index 65, stripe-index 1, old owner 0/0, new owner 65488/52067: rc = 1
Robinhood also shows that the file was previously striped on two OSTs, but Robinhood doesn't support DoM or migration, so this is the original striping info:
MariaDB [robinhood_fir]> select * from STRIPE_ITEMS where id='0x200043465:0x6f23:0x0';
+------------------------+--------------+--------+----------------------+
| id                     | stripe_index | ostidx | details              |
+------------------------+--------------+--------+----------------------+
|43465:0x6f23:0x0 | 0 | 64 | ??
| 0x200043465:0x6f23:0x0 |            1 |     65 | @ ??                 |
+------------------------+--------------+--------+----------------------+
2 rows in set (0.00 sec)
LFSCK layout has fixed many files like that:
[root@fir-hn01 sthiell.root]# clush -w@mds -R exec -bL 'tgt=$(printf fir-MDT%%04x %n); ssh %h lctl get_param -n mdd.$tgt.lfsck_layout' | grep status
fir-md1-s[1-4]: status: completed
[root@fir-hn01 sthiell.root]# clush -w@mds -R exec -bL 'tgt=$(printf fir-MDT%%04x %n); ssh %h lctl get_param -n mdd.$tgt.lfsck_layout' | grep repaired
fir-md1-s[1,4]: repaired_dangling: 0
fir-md1-s[2-3]: repaired_dangling: 1
fir-md1-s[1-4]: repaired_unmatched_pair: 0
fir-md1-s[1-4]: repaired_multiple_referenced: 0
fir-md1-s[1-4]: repaired_orphan: 0
fir-md1-s1: repaired_inconsistent_owner: 10494922
fir-md1-s2: repaired_inconsistent_owner: 26336224
fir-md1-s3: repaired_inconsistent_owner: 36300505
fir-md1-s4: repaired_inconsistent_owner: 15102845
fir-md1-s1: repaired_others: 429814
fir-md1-s2: repaired_others: 46955127
fir-md1-s3: repaired_others: 0
fir-md1-s4: repaired_others: 1716650
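To put the counters above in perspective, the sketch below totals one counter across the per-server lines. The function name sum_counter is hypothetical, and it assumes one line per server; a folded node set such as fir-md1-s[2-3] is counted once, so the total is only meaningful for counters reported per server, as repaired_inconsistent_owner and repaired_others are here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum one lfsck counter over clush-style lines
# of the form "<server>: <counter>: <value>" read from stdin.
sum_counter() {
    # $1 = counter name, e.g. repaired_inconsistent_owner
    awk -v name="$1:" '$2 == name { total += $3 } END { printf "%d\n", total + 0 }'
}

# Example with the repaired_inconsistent_owner lines reported above.
sum_counter repaired_inconsistent_owner <<'EOF'
fir-md1-s1: repaired_inconsistent_owner: 10494922
fir-md1-s2: repaired_inconsistent_owner: 26336224
fir-md1-s3: repaired_inconsistent_owner: 36300505
fir-md1-s4: repaired_inconsistent_owner: 15102845
EOF
# prints 88234496
```

Applied the same way to the repaired_others lines, the total is 49101591.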
Can you confirm whether this could be due to LFSCK? I'm not sure why an "inconsistent file owner" repair would corrupt files, but this is the only pointer that we have for now. If that's the case, do you think there is a way to repair what LFSCK has "fixed"?