[LU-10922] sanity-lfsck test_23b: (9) Fail to repair dangling name entry: 0

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.13.0, Lustre 2.14.0
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for nasf <fan.yong@intel.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d32d2bc0-428c-11e8-b45c-52540065bddc

      Inject failure stub on MDT0 to simulate dangling name entry
      fail_val=130
      fail_loc=0x1621
      fail_val=0
      fail_loc=0
       - unlinked 0 (time 1573179943 ; total 0 ; last 0)
      total: 10 unlinks in 0 seconds: inf unlinks/second
      'ls' should fail because of dangling name entry
      Trigger namespace LFSCK to find out dangling name entry
      Started LFSCK on the device lustre-MDT0000: scrub namespace
       sanity-lfsck test_23b: @@@@@@ FAIL: (9) Fail to repair dangling name entry: 0 
      

          Activity

            adilger Andreas Dilger added a comment - During the past 4 weeks 8 of 391 runs failed (~2% failure rate), and all of the failures were on ZFS filesystems.
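The quoted failure rate can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the helper name `failure_rate` is hypothetical and the inputs are the two counts given in the comment above.

```python
def failure_rate(failures: int, runs: int) -> float:
    """Return the fraction of runs that failed."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return failures / runs


# Counts taken from the comment: 8 failures in 391 runs.
rate = failure_rate(8, 391)
print(f"{rate:.1%}")  # 2.0%
```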
            yujian Jian Yu added a comment - +1 on master branch: https://testing.whamcloud.com/test_sets/90443a02-bf1c-11e9-98c8-52540065bddc
            hornc Chris Horn added a comment - +1 on master: https://testing.whamcloud.com/test_sets/5506db70-a455-11e9-8fc1-52540065bddc
            yujian Jian Yu added a comment - +1 on master: https://testing.whamcloud.com/test_sets/c8b4d914-8d58-11e9-abe3-52540065bddc
            mdiep Minh Diep added a comment - +1 on b2_12: https://testing.whamcloud.com/test_sets/5405e0be-4c6b-11e9-9646-52540065bddc
            utopiabound Nathaniel Clark added a comment - This is still happening on master: https://testing.whamcloud.com/test_sets/e7766cbe-d8c7-11e8-975a-52540065bddc https://testing.whamcloud.com/test_sets/2bd3bfe4-ed02-11e8-815b-52540065bddc https://testing.whamcloud.com/test_sets/fb9eabfc-e14b-11e8-9210-52540065bddc
            pjones Peter Jones added a comment - Landed for 2.12

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32042/
            Subject: LU-10922 osd-zfs: return -ENOENT if object destoryed
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8d2655309a80a55a066011a66e7c95f7a0087e9c
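The two patch subjects ("return -ENOENT if object destoryed" and "skip destroyed object") suggest a simple contract: the storage layer reports a destroyed object as -ENOENT, and the scanner treats that code as "nothing to check" rather than as a failure. The sketch below illustrates that contract only. It is not the Lustre patch; `ToyObjectStore`, `handle_entry`, and the returned status strings are hypothetical names invented for illustration.

```python
import errno


class ToyObjectStore:
    """Toy stand-in for a storage layer; NOT Lustre or osd-zfs code."""

    def __init__(self):
        self._objects = {}       # fid -> attributes
        self._destroyed = set()  # fids destroyed but still referenced

    def create(self, fid, attrs):
        self._objects[fid] = attrs
        self._destroyed.discard(fid)

    def destroy(self, fid):
        self._objects.pop(fid, None)
        self._destroyed.add(fid)

    def lookup(self, fid):
        """Return (rc, attrs), using negative errno values like kernel code."""
        if fid in self._destroyed or fid not in self._objects:
            # Assumed contract: a destroyed object always maps to -ENOENT,
            # never to some other error code.
            return -errno.ENOENT, None
        return 0, self._objects[fid]


def handle_entry(store, fid):
    """Hypothetical scanner step: skip destroyed objects, fail on other errors."""
    rc, _attrs = store.lookup(fid)
    if rc == -errno.ENOENT:
        return "skipped"
    if rc < 0:
        return "failed"
    return "checked"


store = ToyObjectStore()
store.create("[0x200003ab2:0x89:0x0]", {"name": "foo"})
store.destroy("[0x200003ab2:0x89:0x0]")
print(handle_entry(store, "[0x200003ab2:0x89:0x0]"))  # skipped
```

The point of the design is that the caller needs one well-known code to distinguish "object is gone" from a genuine error, so the translation belongs in the lower layer.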

            gerrit Gerrit Updater added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/32042
            Subject: LU-10922 lfsck: skip destroyed object
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7618d59588d231ee87e18ec6ae8aca09215e205a

            yong.fan nasf (Inactive) added a comment -

            The MDS logs show that:

            00100000:10000000:1.0:1523995128.651061:0:23636:0:(lfsck_namespace.c:5706:lfsck_namespace_assistant_handler_p1()) lustre-MDT0000-osd: namespace LFSCK assistant fail to handle the entry: [0x200003ab2:0x89:0x0], parent [0x200003ab2:0x7d:0x0], name foo: rc = -61

            The object [0x200003ab2:0x89:0x0] had been removed just before the LFSCK ran, but the LFSCK logic was not aware of that.
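To make the log line above easier to read, the following sketch pulls out the entry FID, parent FID, name, and return code, and names the errno. The regular expression is hypothetical and was written only against the single line quoted in this ticket. The errno table assumes a Linux MDS, where ENOENT is 2 and ENODATA is 61.

```python
import re

# Assumption: Linux errno numbering (ENOENT = 2, ENODATA = 61).
LINUX_ERRNO = {2: "ENOENT", 61: "ENODATA"}

# Hypothetical pattern, matched only against the one log line in this ticket.
ENTRY_RE = re.compile(
    r"fail to handle the entry: (?P<fid>\[[^\]]+\]), "
    r"parent (?P<parent>\[[^\]]+\]), "
    r"name (?P<name>.+?): rc = (?P<rc>-?\d+)"
)


def parse_lfsck_failure(line):
    """Return a dict describing the failed entry, or None if no match."""
    m = ENTRY_RE.search(line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    return {
        "fid": m.group("fid"),
        "parent": m.group("parent"),
        "name": m.group("name"),
        "rc": rc,
        "errno": LINUX_ERRNO.get(-rc, "unknown"),
    }


LOG_LINE = (
    "00100000:10000000:1.0:1523995128.651061:0:23636:0:"
    "(lfsck_namespace.c:5706:lfsck_namespace_assistant_handler_p1()) "
    "lustre-MDT0000-osd: namespace LFSCK assistant fail to handle the entry: "
    "[0x200003ab2:0x89:0x0], parent [0x200003ab2:0x7d:0x0], name foo: rc = -61"
)

print(parse_lfsck_failure(LOG_LINE))
```

Under the Linux numbering assumed here, rc = -61 decodes to ENODATA rather than ENOENT, which is consistent with the later patch subject about returning -ENOENT for a destroyed object.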

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 14
