Lustre / LU-13535

Files truncated/corruption due to lfsck

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.5
    • Affects Version/s: Lustre 2.12.4
    • Labels: None
    • Environment: CentOS 7.6
    • Severity: 2
    • Rank: 9223372036854775807

    Description

      Following several server crashes (e.g. LU-13511) while running lfs migrate, we decided to run lfsck on Fir (Lustre 2.12.4). Today, users are reporting that some of their files have been truncated to 128MB. Strangely, this size matches the first component of our new default PFL layout.

      What led to this situation is likely the following scenario:

      • files were created originally using DoM + PFL (default setup)
      • we changed our default layout to PFL with the first OST component set to 128MB (stripe count 1) to avoid new DoM files
      • because of issues with DoM, we restriped most of the existing DoM files using lfs migrate -c 1 (DoM/PFL to plain layout); this was done several months ago without problems
      • two days ago, we started to run lfsck namespace + layout
      • today, users are reporting truncated files, but only files with a plain layout that were larger than 128MB

      I'm wondering if this could be related to LU-13426. We consider this issue at least Sev 2, as lfsck is likely corrupting files that have been migrated to plain layout.

      More information below.

      Example with file:

      /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
      
      [root@fir-rbh01 ~]# stat /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
        File: ‘/fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa’
        Size: 134217728 	Blocks: 262152     IO Block: 4194304 regular file
      Device: e64e03a8h/3863872424d	Inode: 144119811155193635  Links: 1
      Access: (0644/-rw-r--r--)  Uid: (65488/ mgebala)   Gid: (52067/astraigh)
      Access: 2020-05-07 11:18:32.000000000 -0700
      Modify: 2020-04-08 23:24:19.000000000 -0700
      Change: 2020-04-29 11:26:53.000000000 -0700
       Birth: -
      
      [root@fir-rbh01 ~]# lfs getstripe /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
      /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
      lmm_stripe_count:  1
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 80
      	obdidx		 objid		 objid		 group
      	    80	      17475505	    0x10aa7b1	  0x1700000402
      

      FID is: [0x200043465:0x6f23:0x0]

      [root@fir-rbh01 ~]# lfs path2fid /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
      [0x200043465:0x6f23:0x0]
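
lfs path2fid prints the FID in brackets, while the Robinhood queries further down use the same value without brackets. As a minimal sketch, the conversion and the split into sequence, object ID, and version could look like this. The function names fid_strip and fid_fields are hypothetical helpers, not part of lfs or Robinhood:

```shell
#!/bin/bash
# Hypothetical helpers (not part of Lustre or Robinhood).

# Strip the surrounding brackets from a FID as printed by `lfs path2fid`.
fid_strip() {
    local fid=$1
    fid=${fid#\[}
    fid=${fid%\]}
    printf '%s\n' "$fid"
}

# Split a FID into its three colon-separated fields: "seq oid ver".
fid_fields() {
    local fid seq oid ver
    fid=$(fid_strip "$1")
    IFS=: read -r seq oid ver <<< "$fid"
    printf '%s %s %s\n' "$seq" "$oid" "$ver"
}

fid_strip '[0x200043465:0x6f23:0x0]'    # prints 0x200043465:0x6f23:0x0
fid_fields '[0x200043465:0x6f23:0x0]'   # prints 0x200043465 0x6f23 0x0
```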
      

      Thanks to Robinhood, we know that the file size was ~132MB and not 128MB.

      MariaDB [robinhood_fir]> select * from ENTRIES where id='0x200043465:0x6f23:0x0';
      +------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
      | id                     | uid     | gid      | size      | blocks | creation_time | last_access | last_mod   | last_mdchange | type | mode | nlink | md_update  | invalid | fileclass | class_update | alert_status | checkdv_status | alert_lstchk | alert_lstalrt | checkdv_lstchk | checkdv_lstsuc | checkdv_out            |
      +------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
      | 0x200043465:0x6f23:0x0 | mgebala | astraigh | 138323718 | 270176 |    1586607743 |  1586413465 | 1586413459 |    1588184813 | file |  420 |     1 | 1588185083 |       0 | +groups+  |   1588185083 |              | ok             |            0 |             0 |     1588184813 |     1588184813 | 60239190574:1586607743 |
      +------------------------+---------+----------+-----------+--------+---------------+-------------+------------+---------------+------+------+-------+------------+---------+-----------+--------------+--------------+----------------+--------------+---------------+----------------+----------------+------------------------+
      1 row in set (0.00 sec)
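
To make the size comparison explicit, here is a small sketch using only the two byte counts shown above (stat and Robinhood ENTRIES.size). The looks_truncated function is a hypothetical check written for this illustration, and the 128 MiB limit is taken from the first component of our default PFL layout:

```shell
#!/bin/bash
# Values copied from the stat and Robinhood output above.
current_size=134217728     # bytes, from stat
recorded_size=138323718    # bytes, from Robinhood ENTRIES.size
mib=$((1024 * 1024))

# Current size in MiB (exact when the size is a multiple of 1 MiB).
current_mib=$((current_size / mib))
# Bytes missing compared with the size Robinhood recorded.
missing=$((recorded_size - current_size))

echo "current size:  ${current_mib} MiB (remainder $((current_size % mib)) bytes)"
echo "recorded size: $(awk -v b="$recorded_size" -v m="$mib" 'BEGIN { printf "%.1f", b / m }') MiB"
echo "missing bytes: ${missing}"

# Hypothetical check: a file looks affected when it is now exactly 128 MiB
# but a previously recorded size was larger.
looks_truncated() {
    local current=$1 recorded=$2 limit=$((128 * 1024 * 1024))
    [ "$current" -eq "$limit" ] && [ "$recorded" -gt "$limit" ]
}

if looks_truncated "$current_size" "$recorded_size"; then
    echo "file matches the truncation pattern"
fi
```

For this file the current size is exactly 128 MiB, the recorded size is about 131.9 MiB, and 4105990 bytes are missing.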
      

      Also, the original data_version (recorded by Robinhood in checkdv_out) was 60239190574, but it is now:

      [root@fir-rbh01 ~]# lfs data_version /fir/groups/astraigh/Magda/fachinettie19/fachinetti_CC_DLD/extracted/fachinetti_CCr1oCAr1.k25.ci10.madx5.r1.singleline.fa
      30120416758
      

      This file is on MDT0 and lfsck logs show that something was fixed for this FID 0x200043465:0x6f23:0x0:

      [root@fir-rbh01 ~]# grep 0x200043465:0x6f23:0x0 lfsck.fir-md1-s1.log 
      00100000:10000000:24.0:1588797550.743684:0:126810:0:(lfsck_layout.c:4033:lfsck_layout_repair_owner()) fir-MDT0000-osd: layout LFSCK assistant repaired inconsistent file owner for: parent [0x200043465:0x6f23:0x0], child [0x1340000401:0x10bc4c3:0x0], OST-index 65, stripe-index 1, old owner 0/0, new owner 65488/52067: rc = 1
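
Since the lfsck debug log contains many such lines, a sketch of how the relevant fields could be extracted may help when cross-checking them against other sources. The function parse_repair_owner is a hypothetical helper, and the pattern is only based on the single log line shown above:

```shell
#!/bin/bash
# Hypothetical helper: for each lfsck_layout_repair_owner() line on stdin,
# print "parent_fid child_fid ost_index stripe_index".
parse_repair_owner() {
    sed -nE 's/.*lfsck_layout_repair_owner\(\)\).*parent \[([^]]+)\], child \[([^]]+)\], OST-index ([0-9]+), stripe-index ([0-9]+),.*/\1 \2 \3 \4/p'
}

# The log line from above, unchanged.
line='00100000:10000000:24.0:1588797550.743684:0:126810:0:(lfsck_layout.c:4033:lfsck_layout_repair_owner()) fir-MDT0000-osd: layout LFSCK assistant repaired inconsistent file owner for: parent [0x200043465:0x6f23:0x0], child [0x1340000401:0x10bc4c3:0x0], OST-index 65, stripe-index 1, old owner 0/0, new owner 65488/52067: rc = 1'

printf '%s\n' "$line" | parse_repair_owner
# prints 0x200043465:0x6f23:0x0 0x1340000401:0x10bc4c3:0x0 65 1
```

Note that the extracted OST index (65) and stripe index (1) differ from the current layout reported by lfs getstripe, which shows a single stripe on OST 80.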
      

      Robinhood also shows that the file was previously striped across two OSTs. Robinhood doesn't support DoM or migration, so this is the original striping info:

      MariaDB [robinhood_fir]> select * from STRIPE_ITEMS where id='0x200043465:0x6f23:0x0';
      +------------------------+--------------+--------+----------------------+
      | id                     | stripe_index | ostidx | details              |
      +------------------------+--------------+--------+----------------------+
      | 0x200043465:0x6f23:0x0 |            0 |     64 | (binary data)        |
      | 0x200043465:0x6f23:0x0 |            1 |     65 | (binary data)        |
      +------------------------+--------------+--------+----------------------+
      2 rows in set (0.00 sec)
      

      LFSCK layout has fixed many files like that:

      [root@fir-hn01 sthiell.root]# clush -w@mds -R exec -bL 'tgt=$(printf fir-MDT%%04x %n); ssh %h lctl get_param -n mdd.$tgt.lfsck_layout' | grep status
      fir-md1-s[1-4]: status: completed
      [root@fir-hn01 sthiell.root]# clush -w@mds -R exec -bL 'tgt=$(printf fir-MDT%%04x %n); ssh %h lctl get_param -n mdd.$tgt.lfsck_layout' | grep repaired
      fir-md1-s[1,4]: repaired_dangling: 0
      fir-md1-s[2-3]: repaired_dangling: 1
      fir-md1-s[1-4]: repaired_unmatched_pair: 0
      fir-md1-s[1-4]: repaired_multiple_referenced: 0
      fir-md1-s[1-4]: repaired_orphan: 0
      fir-md1-s1: repaired_inconsistent_owner: 10494922
      fir-md1-s2: repaired_inconsistent_owner: 26336224
      fir-md1-s3: repaired_inconsistent_owner: 36300505
      fir-md1-s4: repaired_inconsistent_owner: 15102845
      fir-md1-s1: repaired_others: 429814
      fir-md1-s2: repaired_others: 46955127
      fir-md1-s3: repaired_others: 0
      fir-md1-s4: repaired_others: 1716650
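
To get a sense of the overall scale, the per-MDT counters above can be summed. This is a minimal sketch: sum_counter is a hypothetical helper, and it only gives correct totals for lines that clush reports per node (a gathered line such as fir-md1-s[2-3] stands for several MDTs and would be counted once):

```shell
#!/bin/bash
# Hypothetical helper: sum one lfsck counter over lines of the form
# "<node>: <counter>: <value>" read from stdin.
# Caveat: lines gathered by clush for a node set are counted only once.
sum_counter() {
    awk -v name="$1" '$2 == name ":" { total += $3 } END { printf "%d\n", total }'
}

# Per-node counters copied from the output above.
lfsck_output='fir-md1-s1: repaired_inconsistent_owner: 10494922
fir-md1-s2: repaired_inconsistent_owner: 26336224
fir-md1-s3: repaired_inconsistent_owner: 36300505
fir-md1-s4: repaired_inconsistent_owner: 15102845
fir-md1-s1: repaired_others: 429814
fir-md1-s2: repaired_others: 46955127
fir-md1-s3: repaired_others: 0
fir-md1-s4: repaired_others: 1716650'

printf '%s\n' "$lfsck_output" | sum_counter repaired_inconsistent_owner   # prints 88234496
printf '%s\n' "$lfsck_output" | sum_counter repaired_others               # prints 49101591
```

Across the four MDTs, that is about 88 million "inconsistent owner" repairs and about 49 million "others" repairs.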
      

      Can you confirm whether this could be due to LFSCK? I'm not sure why repairing an "inconsistent file owner" would corrupt files, but this is the only pointer that we have for now. If that's the case, do you think there is a way to repair what LFSCK has "fixed"?

            People

              Assignee: tappro Mikhail Pershin
              Reporter: sthiell Stephane Thiell