post-fsck MDS LBUG during recovery due to missing FID

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.2.0, Lustre 2.1.1
    • Affects Version/s: Lustre 2.0.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 4274

    Description

      We have already hit this LBUG twice:
      ================================
      ASSERTION(mdd_object_exists(obj)) failed: FID is [0x20001a604:0xc9:0x0]
      LustreError: 34372:0:(mdd_object.c:91:mdd_la_get()) LBUG
      ================================
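
      The bracketed value in the assertion message follows the usual Lustre FID notation, [sequence:object id:version], all in hexadecimal. As a minimal sketch for anyone scripting over such log lines, the helper below splits that text into its three fields. The names demo_fid and demo_fid_parse are hypothetical and are not part of Lustre; the struct only mirrors the three printed fields.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a FID; NOT the Lustre structure itself. */
struct demo_fid {
    uint64_t seq; /* sequence number */
    uint32_t oid; /* object id within the sequence */
    uint32_t ver; /* version */
};

/*
 * Parse text of the form "[0xSEQ:0xOID:0xVER]".
 * Returns 0 on success and -1 if the text is malformed.
 */
int demo_fid_parse(const char *text, struct demo_fid *fid)
{
    char closing = '\0';

    assert(text != NULL && fid != NULL);

    /* %x conversions accept an optional "0x" prefix. */
    if (sscanf(text, "[%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%c",
               &fid->seq, &fid->oid, &fid->ver, &closing) != 4)
        return -1;

    return closing == ']' ? 0 : -1;
}
```

      Passing "[0x20001a604:0xc9:0x0]" from the message above fills seq with 0x20001a604, oid with 0xc9 and ver with 0.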

      It always occurred after an MDS crash, followed by a shine fsck and a shine start, and during the client recovery window.
      Each time we managed to restart the MDT/MDS using abort_recovery.

      Suppose the missing FID was destroyed by the fsck run on the MDT after the MDS crashed for some unrelated reason, and suppose no scenario other than such an "external" action can lead to this situation (that is my opinion, but what do you think?). In that case, could this assertion/LBUG be replaced with a simple warning message, at least during the client recovery phase?
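
      The request above can be illustrated with a small standalone sketch. It is not the Lustre code and not the fix that was eventually applied; demo_object, demo_attr_get and the in_recovery flag are hypothetical names chosen for illustration. The idea is that a missing object seen during recovery produces a warning and an -ENOENT return, while the strict assertion is kept outside recovery.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified object: only records whether it exists on disk. */
struct demo_object {
    bool exists;
};

/*
 * Attribute lookup with recovery-aware handling of a missing object.
 * Returns 0 if the object exists, -ENOENT if it is missing while
 * in_recovery is true. Outside recovery a missing object still trips
 * the assertion, mirroring the strict behaviour described in the report.
 */
int demo_attr_get(const struct demo_object *obj, bool in_recovery)
{
    if (!obj->exists && in_recovery) {
        /* Assumed policy: the object may have been removed by fsck. */
        fprintf(stderr, "warning: object missing during recovery, skipping\n");
        return -ENOENT;
    }

    assert(obj->exists);
    return 0;
}
```

      With this shape the caller decides what to do with -ENOENT, for example failing only the replayed request instead of taking down the whole server.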

          Activity

            [LU-966] post-fsck MDS LBUG during recovery due to missing FID
            joshua Joshua Kugler (Inactive) made changes -
            Reporter Original: Bruno Faccini [ bfaccini ] New: Alexandre Louvet [ louveta ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.1.1 [ 10101 ]
            Fix Version/s Original: Lustre 2.1.2 [ 10111 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Reopened [ 4 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.1.2 [ 10111 ]
            Fix Version/s Original: Lustre 2.1.1 [ 10101 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.1.1 [ 10101 ]
            tappro Mikhail Pershin made changes -
            Resolution Original: Fixed [ 1 ]
            Status Original: Resolved [ 5 ] New: Reopened [ 4 ]
            tappro Mikhail Pershin made changes -
            Link New: This issue is related to LU-1060 [ LU-1060 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is duplicated by LU-1098 [ LU-1098 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.2.0 [ 10082 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Zhenyu Xu [ bobijam ]
            bfaccini Bruno Faccini (Inactive) created issue -

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: louveta Alexandre Louvet (Inactive)
              Votes: 0
              Watchers: 6
