Lustre / LU-6066

lfsck_namespace_repair_nlink() ASSERTION( (((lfsck_object_type(obj)) & 00170000) == 0100000) ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.7.0
    • Environment: Single node test system (MDTx2, OSTx3, client), RHEL 2.6.32-431.29.2.el6 kernel, Lustre master v2_6_91_0-49-ge0ece89
    • Severity: 3
    • Rank: 16885

    Description

      I was testing out some filesystem corruption (mounted MDT as type ldiskfs, copied MDT file and all xattrs from hosts to hosts.clone, then modified LMA FID and LOV ostid f_oid=0x1 to f_oid=0x2) so that they would share the same OST object but have different FIDs.

      When remounting the MDT as type lustre and listing the files, it detected OI corruption due to the missing FID and started OI scrub:

      Lustre: testfs-MDT0000: trigger OI scrub by RPC for [0x2c00059f0:0x2:0x0], rc = 0 [2]
      

      which appeared to be successful since I could list all the files.

      I deleted the hosts.clone file, and then observed (as expected) that ls returned an error because the referenced OST objects no longer existed. However, I was unable to unlink the original filename, even when using munlink which should ignore any errors. This was apparently because I had (accidentally) made the cloned file share the same FID f_oid=0x2 as a third file hosts2, and figured that the duplication of the MDT FID was causing problems since it couldn't find this FID in the OI anymore.

      I tried running lctl lfsck_start -M testfs-MDT0000 -A to rebuild the OI to contain the original f_oid=0x2 inode (which still existed in the hosts2 LMA), but immediately hit the assertions below on two different LFSCK threads:

      LustreError: 20102:0:(lfsck_namespace.c:2921:lfsck_namespace_repair_nlink()) ASSERTION( (((lfsck_object_type(obj)) & 00170000) == 0100000) ) failed:
      LustreError: 20102:0:(lfsck_namespace.c:2921:lfsck_namespace_repair_nlink()) LBUG
      Pid: 20102, comm: lfsck_namespace
      Call Trace:
       [<ffffffffa0812895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0812e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0f06bf1>] lfsck_namespace_repair_nlink+0x6b1/0xa60 [lfsck]
       [<ffffffffa0f1b9bf>] lfsck_namespace_double_scan_one+0x23f/0x1410 [lfsck]
       [<ffffffffa0f1d899>] lfsck_namespace_assistant_handler_p2+0xd09/0x11b0 [lfsck]
       [<ffffffffa0eff399>] lfsck_assistant_engine+0x14e9/0x1e00 [lfsck]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
      
      LustreError: 20097:0:(lfsck_namespace.c:2921:lfsck_namespace_repair_nlink()) ASSERTION( (((lfsck_object_type(obj)) & 00170000) == 0100000) ) failed:
      Pid: 20097, comm: lfsck_namespace
      Call Trace:
       [<ffffffffa0812895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0812e97>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0f06bf1>] lfsck_namespace_repair_nlink+0x6b1/0xa60 [lfsck]
       [<ffffffffa0f1b9bf>] lfsck_namespace_double_scan_one+0x23f/0x1410 [lfsck]
       [<ffffffffa0f1d899>] lfsck_namespace_assistant_handler_p2+0xd09/0x11b0 [lfsck]
       [<ffffffffa0eff399>] lfsck_assistant_engine+0x14e9/0x1e00 [lfsck]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
      LustreError: dumping log to /tmp/lustre-log.1419280935.20097
      

      We definitely shouldn't be LASSERTing on data from the filesystem.

          People

            Assignee: yong.fan nasf (Inactive)
            Reporter: adilger Andreas Dilger