Lustre / LU-11419

lfsck does not complete phase2

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.12.0
    • Lustre 2.10.4
    • None
    • x86_64, zfs, 3 MDTs, all on 1 MDS, 2.10.4 + many patches.
    • 3

    Description

      Hi,

      I presume this is related to LU-11111 and LU-10888.

      A dry run (-n) completed OK:

      lctl lfsck_start -M dagg-MDT0000 -t namespace -A -n

      The same scan with repairs enabled completed on mdt1 and mdt2 but got stuck on mdt0:

      lctl lfsck_start -M dagg-MDT0000 -t namespace -A

      This is the summary of repairs; mdt0 did not progress from here:

      [warble2]root: lctl get_param -n mdd.dagg-MDT000*.lfsck_namespace | egrep 'status:|repaired|checked_'  | grep -v ' 0$'
      status: scanning-phase2
      checked_phase1: 33226737
      checked_phase2: 10901477
      dangling_repaired: 28
      striped_shards_repaired: 102
      name_hash_repaired: 51
      status: completed
      checked_phase1: 32652269
      checked_phase2: 12379442
      dangling_repaired: 28
      striped_shards_repaired: 125
      status: completed
      checked_phase1: 32662678
      checked_phase2: 12378342
      unmatched_pairs_repaired: 1
      dangling_repaired: 11
      striped_shards_repaired: 96
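
The output above concatenates the three MDTs without naming them. A minimal sketch of a filter that labels each block, assuming every per-MDT block begins with a "status:" line and that the blocks arrive in MDT index order (both are assumptions about this output, not documented guarantees). The function name label_lfsck_blocks is hypothetical.

```shell
# Hypothetical helper: prefix each line of concatenated lfsck_namespace
# output (read from stdin) with an MDT label.
# Assumption: a new block starts at every "status:" line, and blocks
# appear in MDT index order (MDT0000, MDT0001, ...).
label_lfsck_blocks() {
    awk '
        /^[[:space:]]*status:/ { idx++ }   # each status line opens a new block
        NF == 0 { next }                   # skip blank lines
        idx > 0 {
            sub(/^[[:space:]]+/, "")       # strip leading indentation
            printf "MDT%04x %s\n", idx - 1, $0
        }
    '
}

# Possible usage on the MDS (same pipeline as above, with the filter appended):
#   lctl get_param -n mdd.dagg-MDT000*.lfsck_namespace \
#     | egrep 'status:|repaired|checked_' | grep -v ' 0$' | label_lfsck_blocks
```

Running lctl get_param without -n is another option, since it prints each parameter name ahead of its value.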
      

      lfsck_namespace was using 100% of a CPU, but the checked_phase2 counter wasn't going up.
      kill -9 on lfsck_namespace didn't work.
      I didn't try lctl lfsck_stop this time.
      mdt0 wouldn't umount, so I had to reset the MDS.
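
The stall described above (status stays at scanning-phase2 while checked_phase2 stops advancing) could be detected by comparing two snapshots of the lfsck_namespace output. This is only a sketch: the helper names lfsck_field and phase2_stalled are hypothetical, and the 60 second interval in the usage comment is an arbitrary choice.

```shell
# Hypothetical helper: print the value of one "name: value" field
# from lfsck_namespace output read on stdin.
lfsck_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Hypothetical helper: phase2_stalled BEFORE AFTER
# BEFORE and AFTER are two text snapshots of lfsck_namespace output.
# Succeeds (exit 0) only if AFTER is still in scanning-phase2 and
# checked_phase2 is unchanged between the snapshots.
# The body runs in a subshell so its variables do not leak.
phase2_stalled() (
    before=$(printf '%s\n' "$1" | lfsck_field checked_phase2)
    after=$(printf '%s\n' "$2" | lfsck_field checked_phase2)
    status=$(printf '%s\n' "$2" | lfsck_field status)
    [ "$status" = "scanning-phase2" ] && [ -n "$before" ] && [ "$before" = "$after" ]
)

# Possible usage on the MDS (interval chosen arbitrarily):
#   a=$(lctl get_param -n mdd.dagg-MDT0000.lfsck_namespace)
#   sleep 60
#   b=$(lctl get_param -n mdd.dagg-MDT0000.lfsck_namespace)
#   phase2_stalled "$a" "$b" && echo "checked_phase2 is not advancing"
```

A single unchanged interval does not prove a hang, so in practice several consecutive stalled intervals would be a safer signal.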

      I did a sysrq 't' and 'w' before resetting the MDS and those start at
      Sep 23 00:18:42
      in the attached messages file.

      hopefully that might help.
      please let us know if there's something else we can help with.

      cheers,
      robin

            People

              Assignee: laisiyao Lai Siyao
              Reporter: scadmin SC Admin