Lustre / LU-17164

Old files not accessible anymore with lma incompat=2 and no lov


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Affects Version/s: Lustre 2.12.8
    • Environment: 2.12.8+patches, CentOS 7.9, ldiskfs
    • Severity: 3

    Description

      Hello!
      On our Oak filesystem, currently running 2.12.8+patches (very close to 2.12.9), a few old files last modified in March 2020 cannot be accessed anymore. From a client:

      [root@oak-cli01 ~]# ls -l /oak/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup
      ls: cannot access '/oak/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup/pseudogenome.fasta': No such file or directory
      ls: cannot access '/oak/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup/repnames.bed': No such file or directory
      total 0
      -????????? ? ? ? ?            ? pseudogenome.fasta
      -????????? ? ? ? ?            ? repnames.bed
      

      We found them with no trusted.lov xattr, just a trusted.lma and ACLs (system.posix_acl_access), owned by root/root with 0000 permissions (note that I have since changed the ownership/permissions, which is reflected in the debugfs output below, so the ctime has been updated too):

      oak-MDT0000> debugfs:  stat ROOT/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup/pseudogenome.fasta
      Inode: 745295211   Type: regular    Mode:  0440   Flags: 0x0
      Generation: 392585436    Version: 0x00000000:00000000
      User:     0   Group:     0   Project:     0   Size: 0
      File ACL: 0
      Links: 1   Blockcount: 0
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x651b1517:a256cacc -- Mon Oct  2 12:08:07 2023
       atime: 0x649be9dc:d980bf08 -- Wed Jun 28 01:05:48 2023
       mtime: 0x5e7ae8a2:437ca450 -- Tue Mar 24 22:14:10 2020
      crtime: 0x649be9dc:d980bf08 -- Wed Jun 28 01:05:48 2023
      Size of extra inode fields: 32
      Extended attributes:
        lma: fid=[0x2f800028cf:0x944c:0x0] compat=0 incompat=2
        system.posix_acl_access:
          user::r--
          group::rwx
          group:3352:rwx
          mask::r--
          other::---
      BLOCKS:
      
      oak-MDT0000> debugfs:  stat ROOT/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup/repnames.bed
      Inode: 745295212   Type: regular    Mode:  0440   Flags: 0x0
      Generation: 392585437    Version: 0x00000000:00000000
      User:     0   Group:     0   Project:     0   Size: 0
      File ACL: 0
      Links: 1   Blockcount: 0
      Fragment:  Address: 0    Number: 0    Size: 0
       ctime: 0x651b1517:a256cacc -- Mon Oct  2 12:08:07 2023
       atime: 0x649be9dc:d980bf08 -- Wed Jun 28 01:05:48 2023
       mtime: 0x5e7ae8ad:07654c1c -- Tue Mar 24 22:14:21 2020
      crtime: 0x649be9dc:d980bf08 -- Wed Jun 28 01:05:48 2023
      Size of extra inode fields: 32
      Extended attributes:
        lma: fid=[0x2f800028cf:0x953d:0x0] compat=0 incompat=2
        system.posix_acl_access:
          user::r--
          group::rwx
          group:3352:rwx
          mask::r--
          other::---
      BLOCKS:
      

      Note also that the crtime is recent because we migrated this MDT (MDT0000) to new hardware in June 2023 using a backup/restore method, but we verified yesterday that these files were already like this before the MDT migration (we still have access to the old storage array). So we know it's not something we introduced during the migration, just in case you notice the crtime and ask.
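
      In case it helps with interpreting incompat=2: trusted.lma is a packed struct lustre_mdt_attrs (u32 lma_compat, u32 lma_incompat, then the 16-byte self FID), and as we read enum lma_incompat in lustre_idl.h, 0x1 is LMAI_RELEASED, 0x2 is LMAI_AGENT and 0x4 is LMAI_REMOTE_PARENT, so incompat=2 would make these agent inodes; please correct us if our reading of the headers is wrong. A quick way to double-check the raw bytes, assuming a copy of the MDT can be mounted read-only as ldiskfs (the device name below is made up for the example):

      # mount a snapshot/standby copy of the MDT as plain ldiskfs, read-only
      mount -t ldiskfs -o ro /dev/mapper/oak-mdt0 /mnt/mdt0
      # dump trusted.lma as hex; layout (little-endian):
      #   u32 lma_compat | u32 lma_incompat | u64 fid_seq | u32 fid_oid | u32 fid_ver
      getfattr -e hex -n trusted.lma \
        /mnt/mdt0/ROOT/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup/pseudogenome.fasta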

      Timeline as we understand it:

      • March 2020: these files were likely created, or at least last modified; we were running Lustre 2.10.8 at the time.
      • October 2020: we upgraded from 2.10.8 to 2.12.5.
      • June 2022: we recorded SATTR changelog events for those FIDs, but on oak-MDT0002; we don't know why, as the files are stored on MDT0000.
      • June 2023: we performed an MDT backup/restore to new hardware, but we confirmed this didn't introduce the problem.
      • October 2023: our users noticed and reported the problem.

      Changelog events on those FIDs (we log them to Splunk):

      2022-06-08T13:15:14.793547861-0700 mdt=oak-MDT0002 id=9054081490 type=SATTR flags=0x44 uid=0 gid=0 target=[0x2f800028cf:0x944c:0x0]
      2022-06-08T13:15:14.795309940-0700 mdt=oak-MDT0002 id=9054081491 type=SATTR flags=0x44 uid=0 gid=0 target=[0x2f800028cf:0x953d:0x0]
      

      It's really curious to see those coming from oak-MDT0002!?
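
      To rule out a stale copy in Splunk, the same records can be re-read straight from the changelog on that MDT (this assumes a changelog reader is still registered there), e.g.:

      # replay oak-MDT0002's changelog and keep only our two FIDs
      lfs changelog oak-MDT0002 | grep -E '0x944c|0x953d'
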
      We have also noticed these errors in the logs:

      Oct 02 11:35:12 oak-md1-s1 kernel: LustreError: 59611:0:(mdt_open.c:1227:mdt_cross_open()) oak-MDT0002: [0x2f800028cf:0x944c:0x0] doesn't exist!: rc = -14
      Oct 02 11:35:37 oak-md1-s1 kernel: LustreError: 59615:0:(mdt_open.c:1227:mdt_cross_open()) oak-MDT0002: [0x2f800028cf:0x944c:0x0] doesn't exist!: rc = -14
      

      Could Lustre be confused about which MDT is supposed to serve these FIDs because of corrupted metadata? Why on earth would oak-MDT0002 be involved here?
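
      One more data point we can collect from a client: asking the MDSes to resolve the FIDs back to paths, which should show whether any MDT still claims them:

      # prints the path(s) the FID resolves to, or an error
      # if no MDT claims the FID
      lfs fid2path /oak '[0x2f800028cf:0x944c:0x0]'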

      Parent FID:

      [root@oak-cli01 ~]# lfs path2fid /oak/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup
      [0x200033e88:0x114:0x0]
      [root@oak-cli01 ~]# lfs getdirstripe /oak/stanford/groups/khavari/users/dfporter/before_2021_projects/genome/fastRepEnrich_hg38/fastRepEnrich/fastRE_Setup
      lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
      

       

      We tried to run lfsck namespace, but it crashed our MDS, likely due to LU-14105, which is only fixed in 2.14:

            KERNEL: /usr/lib/debug/lib/modules/3.10.0-1160.83.1.el7_lustre.pl1.x86_64/vmlinux
          DUMPFILE: vmcore  [PARTIAL DUMP]
              CPUS: 64
              DATE: Mon Oct  2 22:55:53 2023
            UPTIME: 48 days, 16:17:05
      LOAD AVERAGE: 2.94, 3.39, 3.52
             TASKS: 3287
          NODENAME: oak-md1-s2
           RELEASE: 3.10.0-1160.83.1.el7_lustre.pl1.x86_64
           VERSION: #1 SMP Sun Feb 19 18:38:37 PST 2023
           MACHINE: x86_64  (3493 Mhz)
            MEMORY: 255.6 GB
             PANIC: "Kernel panic - not syncing: LBUG"
               PID: 24913
           COMMAND: "lfsck_namespace"
              TASK: ffff8e62979fa100  [THREAD_INFO: ffff8e5f41a48000]
               CPU: 8
             STATE: TASK_RUNNING (PANIC)
      
      crash> bt
      PID: 24913  TASK: ffff8e62979fa100  CPU: 8   COMMAND: "lfsck_namespace"
       #0 [ffff8e5f41a4baa8] machine_kexec at ffffffffaac69514
       #1 [ffff8e5f41a4bb08] __crash_kexec at ffffffffaad29d72
       #2 [ffff8e5f41a4bbd8] panic at ffffffffab3ab713
       #3 [ffff8e5f41a4bc58] lbug_with_loc at ffffffffc06538eb [libcfs]
       #4 [ffff8e5f41a4bc78] lfsck_namespace_assistant_handler_p1 at ffffffffc1793e68 [lfsck]
       #5 [ffff8e5f41a4bd80] lfsck_assistant_engine at ffffffffc177604e [lfsck]
       #6 [ffff8e5f41a4bec8] kthread at ffffffffaaccb511
       #7 [ffff8e5f41a4bf50] ret_from_fork_nospec_begin at ffffffffab3c51dd
      

      According to Robinhood, these files' stripe count is likely 1, so we're going to try to find their object IDs on the OSTs.
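
      Here is a rough sketch of what we have in mind, in case it's sensible: each OST object carries a trusted.fid xattr pointing back at its parent MDT FID, and ll_decode_filter_fid (shipped with Lustre) prints it, so the orphaned objects should be findable by brute force with an OST mounted read-only as ldiskfs (the device name is made up; a full scan of a large OST will be slow):

      mount -t ldiskfs -o ro /dev/mapper/oak-ost0 /mnt/ost0
      # print the parent FID stored in each object's trusted.fid xattr
      # and keep the ones pointing at our two file FIDs
      find /mnt/ost0/O -type f -print0 \
        | xargs -0 ll_decode_filter_fid 2>/dev/null \
        | grep -E '0x2f800028cf:0x(944c|953d)'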

      Do you have any idea how to resolve this without running lfsck? How can we find and reattach the objects?

      Thanks!


          People

            Assignee: WC Triage (wc-triage)
            Reporter: Stephane Thiell (sthiell)