Lustre / LU-16689

upgrade to 2.15.2 lost several top level directories

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Severity: 2

    Description

      After upgrading the filesystem from 2.12 to 2.15.2, several top level directories got corrupted.

      [root@nbp11-srv1 ~]# ls -l /nobackupp11/
      ls: cannot access '/nobackupp11/ylin4': No such file or directory
      ls: cannot access '/nobackupp11/mbarad': No such file or directory
      ls: cannot access '/nobackupp11/ldgrant': No such file or directory
      ls: cannot access '/nobackupp11/kknizhni': No such file or directory
      ls: cannot access '/nobackupp11/mzhao4': No such file or directory
      ls: cannot access '/nobackupp11/afahad': No such file or directory
      ls: cannot access '/nobackupp11/jliu7': No such file or directory
      ls: cannot access '/nobackupp11/jswest': No such file or directory
      ls: cannot access '/nobackupp11/hsp': No such file or directory
      ls: cannot access '/nobackupp11/vjespos1': No such file or directory
      ls: cannot access '/nobackupp11/ssepka': No such file or directory
      ls: cannot access '/nobackupp11/cjang1': No such file or directory

       

      debugfs:  stat ylin4
      Inode: 43051102   Type: directory    Mode:  0000   Flags: 0x80000
      Generation: 503057142    Version: 0x00000000:00000000
      User:     0   Group:     0   Project:     0   Size: 4096
      File ACL: 0
      Links: 2   Blockcount: 8
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x63dd8e2f:22c83c08 -- Fri Feb  3 14:43:59 2023
       atime: 0x63dd8e2f:22c83c08 -- Fri Feb  3 14:43:59 2023
       mtime: 0x63dd8e2f:22c83c08 -- Fri Feb  3 14:43:59 2023
      crtime: 0x63dd8e2f:22c83c08 -- Fri Feb  3 14:43:59 2023
      Size of extra inode fields: 32
      Extended attributes:
        lma: fid=[0x280015902:0x2:0x0] compat=0 incompat=2
      EXTENTS:
      (0):671099035
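
      For context, a dump like the one above can be captured read-only with debugfs against the MDT block device; a minimal sketch, assuming the top-level directories live under /ROOT on the ldiskfs MDT, with the device name being illustrative:

      mds# debugfs -c -R 'stat /ROOT/ylin4' /dev/mdt0   # -c opens the device in read-only catastrophic mode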

       

      Without thinking, I deleted these via ldiskfs. The data is still there; how can we recover the directory data?

      lfs quota -u ylin4 /nobackupp11
        Disk quotas for usr ylin4 (uid 11560):
             Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
           /nobackupp11 11337707848* 1073741824 2147483648       -  208359  500000  600000       -

       

       

    Activity

            pjones Peter Jones added a comment - - edited

            Closing this as a duplicate of LU-16655

            pjones Peter Jones added a comment -

            Mahmoud

            I'm just checking in on this one. Presumably you have the LU-16655 fix in place on the NASA distribution now (and it is already merged for the upcoming 2.15.3 release), but has the restoration of the OI files for the impacted system been completed?

            Peter


            adilger Andreas Dilger added a comment -

            Mahmoud, do you have any logs from the mount after the upgrade that indicate OI Scrub has been run/completed on the MDTs? It would be worthwhile to check the state of the OI files on the MDTs to confirm that they are correct:

            mds# lctl get_param osd-ldiskfs.*.oi_scrub
            osd-ldiskfs.testfs-MDT0000.oi_scrub=
            name: OI_scrub
            magic: 0x4c5fd252
            oi_files: 64
            status: completed
            flags:
            param:
            time_since_last_completed: 6 seconds
            time_since_latest_start: 6 seconds
            time_since_last_checkpoint: 6 seconds
            
            :
            

            The important information here is that it reports oi_files: 64 and not some other number (which is what the LU-16655 bug broke). The OI files map the Lustre FIDs to local inode numbers, so without these most of the by-FID lookups will be broken. The OI Scrub can rebuild the OI files from the FID xattr stored in each inode.
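
            As a point of reference, that per-inode FID xattr is the trusted.lma attribute visible in the debugfs output in the description; a minimal sketch of reading it directly, with the device and path names being illustrative:

            mds# mount -t ldiskfs -o ro /dev/mdt0 /mnt/ldiskfs
            mds# getfattr -n trusted.lma -e hex /mnt/ldiskfs/ROOT/ylin4   # raw LMA bytes with the FID encoded inside
            mds# umount /mnt/ldiskfs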

            If this is showing "oi_files: 1" or 2 or similar, my recommendation would be to mount the MDTs with "-o resetoi" to force a rebuild of the OI files, or alternatively mount the MDTs with ldiskfs and move the "oi.16.X" files out of the filesystem and then remount as Lustre and it should rebuild them automatically at mount (this will take a few minutes). Having a small number of OI files will cause scalability/performance issues.
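
            A minimal sketch of those two recovery options, assuming the MDT device is /dev/mdt0 and the mount points shown (all names illustrative):

            # Option 1: force an OI rebuild at mount time
            mds# mount -t lustre -o resetoi /dev/mdt0 /mnt/mdt0

            # Option 2: move the OI files aside under ldiskfs, then remount as Lustre
            mds# mount -t ldiskfs /dev/mdt0 /mnt/ldiskfs
            mds# mkdir -p /root/oi.backup
            mds# mv /mnt/ldiskfs/oi.16.* /root/oi.backup/   # keep copies rather than deleting them
            mds# umount /mnt/ldiskfs
            mds# mount -t lustre /dev/mdt0 /mnt/mdt0        # OI files are rebuilt automatically at mount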


            mhanafi Mahmoud Hanafi added a comment -

            I used debugfs to dump all FIDs in /REMOTE_DIR on each MDT. Then I did a fid2path lookup to match the directories that were missing. I then cd'd into /fs/.lustre/fid/fidnum and moved all contents to their new location.
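
            A minimal sketch of the fid2path and by-FID access part of that flow, with the FID taken from the debugfs output in the description and all other names illustrative:

            # on a client, map a FID back to its path to identify a missing directory
            client# lfs fid2path /nobackupp11 [0x280015902:0x2:0x0]

            # access the directory by FID and move its contents to a new home
            client# mkdir /nobackupp11/ylin4.new
            client# cd '/nobackupp11/.lustre/fid/[0x280015902:0x2:0x0]'
            client# mv -- * /nobackupp11/ylin4.new/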

            The dry-run lfsck is still running and finding lots of these:
            Mar 30 12:53:39 nbp11-srv5 kernel: Lustre: nbp11-MDT0002-osd: layout LFSCK master found bad lmm_oi for [0x2400ecb98:0x8467:0x0]: rc = 56
            Mar 30 12:53:39 nbp11-srv5 kernel: Lustre: nbp11-MDT0002-osd: layout LFSCK master found bad lmm_oi for [0x2400ecb98:0x8468:0x0]: rc = 56

            These are files under the directories that got corrupted.


            adilger Andreas Dilger added a comment -

            It may be that mounting the MDT with "-o resetoi" would have rebuilt the OI files without having to move them from lost+found, in case someone finds this ticket in the future.


            adilger Andreas Dilger added a comment -

            This looks like LU-16655, which was caused by a bad code change breaking the on-disk file format for OI Scrub. If Scrub has been run on a filesystem prior to upgrade, then it will incorrectly read the fields from this file. The patch https://review.whamcloud.com/50455 "LU-16655 scrub: upgrade scrub_file from 2.12 format" fixes this issue and LU-16655 describes the details (though it is too late to avoid this bug for your system).

            dongyang Dongyang Li added a comment -

            Hi Mahmoud, two questions:
            1. What does stat look like on nobackupp11 via debugfs?
            2. How did you find ylin4 in debugfs?


            mhanafi Mahmoud Hanafi added a comment -

            I recovered the files.

            I found the parent FID and cd'd into /fs/.lustre/fid/fidnum, then just moved all contents to a newly created directory.

            I would still like to understand what caused the corruption.


            mhanafi Mahmoud Hanafi added a comment -

            I started an lfsck dry-run.

            On MDT0 I am getting a lot of these errors, which are for files with hard links:

            Mar 30 12:32:28 nbp11-srv1 kernel: ret_from_fork+0x1f/0x40
            Mar 30 12:32:28 nbp11-srv1 kernel: Lustre: nbp11-MDT0000-osd: namespace LFSCK add flags for [0x20004ca8c:0x8986:0x0] in the trace file, flags 1, old 0, new 1: rc = -22
            Mar 30 12:32:28 nbp11-srv1 kernel: CPU: 34 PID: 1520983 Comm: lfsck Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-425.3.1.el8_lustre.x86_64 #1
            Mar 30 12:32:28 nbp11-srv1 kernel: Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 04/21/2022
            Mar 30 12:32:28 nbp11-srv1 kernel: Call Trace:
            Mar 30 12:32:28 nbp11-srv1 kernel: dump_stack+0x41/0x60
            Mar 30 12:32:28 nbp11-srv1 kernel: lfsck_trans_create.part.58+0x63/0x70 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: lfsck_namespace_trace_update+0x972/0x980 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: lfsck_namespace_exec_oit+0x87d/0x970 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: lfsck_master_oit_engine+0xc56/0x1360 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: lfsck_master_engine+0x512/0xcd0 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: ? __schedule+0x2d9/0x860
            Mar 30 12:32:28 nbp11-srv1 kernel: ? finish_wait+0x80/0x80
            Mar 30 12:32:28 nbp11-srv1 kernel: ? lfsck_master_oit_engine+0x1360/0x1360 [lfsck]
            Mar 30 12:32:28 nbp11-srv1 kernel: kthread+0x10a/0x120
            Mar 30 12:32:28 nbp11-srv1 kernel: ? set_kthread_struct+0x50/0x50
            Mar 30 12:32:28 nbp11-srv1 kernel: ret_from_fork+0x1f/0x40

            On MDT2 I am getting these errors:

            Mar 30 12:33:43 nbp11-srv5 kernel: Lustre: nbp11-MDT0002-osd: layout LFSCK master found bad lmm_oi for [0x2400ecb78:0x1e8bb:0x0]: rc = 56
            Mar 30 12:33:43 nbp11-srv5 kernel: Lustre: nbp11-MDT0002-osd: layout LFSCK master found bad lmm_oi for [0x2400ecb78:0x1e8bc:0x0]: rc = 56

            These are the files for the bad directories.
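
            For reference, a dry-run LFSCK like this can be started and monitored along these lines; a minimal sketch, with the MDT name taken from the logs above:

            mds# lctl lfsck_start -M nbp11-MDT0000 --dryrun on
            mds# lctl get_param mdd.nbp11-MDT0000.lfsck_namespace   # status/progress of the namespace check
            mds# lctl get_param mdd.nbp11-MDT0000.lfsck_layout      # status/progress of the layout check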


            People

              Assignee: adilger Andreas Dilger
              Reporter: mhanafi Mahmoud Hanafi
              Votes: 0
              Watchers: 4