[LU-8569] Sharded DNE directory full of files that don't exist

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.0

    Description

      On our DNE testbed, one of our sharded directories seems to contain files that are all in a broken state. Currently both servers and clients are running 2.8.0_0.0.llnlpreview.40 (see the lustre-release-fe-llnl repo).

      We can get a directory listing, but nothing listed is actually accessible. Here is an excerpt from running ls -l:

      # pwd
      /p/lquake/casses1/opal-jet/simul_2
      # ls -l
      ls: cannot access simul_link.2243: No such file or directory
      ls: cannot access simul_link.3161: No such file or directory
      ls: cannot access simul_link.3129: No such file or directory
      ls: cannot access simul_link.3893: No such file or directory
      ls: cannot access simul_link.691: No such file or directory
      ls: cannot access simul_link.3233: No such file or directory
      ls: cannot access simul_link.235: No such file or directory
      ls: cannot access simul_link.1653: No such file or directory
      ls: cannot access simul_link.3167: No such file or directory
      ls: cannot access simul_link.681: No such file or directory
      ls: cannot access simul_link.835: No such file or directory
      ls: cannot access simul_link.3857: No such file or directory
      ls: cannot access simul_link.1591: No such file or directory
      ls: cannot access simul_link.1175: No such file or directory
      [cut]
      -????????? ? ? ? ?            ? simul_link.937
      -????????? ? ? ? ?            ? simul_link.94
      -????????? ? ? ? ?            ? simul_link.940
      -????????? ? ? ? ?            ? simul_link.941
      -????????? ? ? ? ?            ? simul_link.942
      -????????? ? ? ? ?            ? simul_link.943
      -????????? ? ? ? ?            ? simul_link.944
      -????????? ? ? ? ?            ? simul_link.947
      [cut]
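
The symptom above (names returned by the directory listing that then fail to stat) can be counted with a small helper. This is only a sketch: `count_unstatable` is a hypothetical name, and it assumes bash pathname expansion reads names from the directory without stat-ing each one, so entries like the ones shown are still visited by the loop.

```shell
#!/bin/bash
# Hypothetical helper: print how many names in a directory are listed
# by readdir but cannot be stat-ed (the "-?????????" entries in ls -l).
# The function body runs in a subshell so the shopt changes do not leak.
count_unstatable() (
    dir=$1
    broken=0
    shopt -s nullglob dotglob      # empty dir -> no iterations; include dotfiles
    for path in "$dir"/*; do
        # -e follows symlinks, -L catches dangling symlinks, so only
        # names with no reachable inode at all are counted.
        if [[ ! -e $path && ! -L $path ]]; then
            broken=$((broken + 1))
        fi
    done
    echo "$broken"
)
```

For example, `count_unstatable /p/lquake/casses1/opal-jet/simul_2` would be run from a client mount; on a healthy directory it prints 0.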
      

      Here is the striping information:

      # lfs getdirstripe .
      .
      lmv_stripe_count: 16 lmv_stripe_offset: 12
      mdtidx           FID[seq:oid:ver]
          12           [0x50000996c:0x14fed:0x0]
          13           [0x54000919d:0x14fed:0x0]
          14           [0x58000a086:0x14fed:0x0]
          15           [0x5c000996b:0x14fed:0x0]
           0           [0x200006b03:0x14fed:0x0]
           1           [0x3000089cc:0x14fed:0x0]
           2           [0x38000996d:0x14fed:0x0]
           3           [0x4c000b0df:0x14fed:0x0]
           4           [0x2c000a142:0xec09:0x0]
           5           [0x3c000b8b2:0xec09:0x0]
           6           [0x34000a143:0xec09:0x0]
           7           [0x40000a143:0xec09:0x0]
           8           [0x44000a142:0xec09:0x0]
           9           [0x24000a143:0xec09:0x0]
          10           [0x2800091a4:0xec09:0x0]
          11           [0x4800091a3:0xec09:0x0]
      

      I ran lfsck on all services (at least those started by the "--all" option), but that did not address this situation.

      The problem files cannot be unlinked:

      # rm simul_link.999
      rm: cannot remove 'simul_link.999': No such file or directory
      

      Attachments

        1. getstripelogs.tar.gz
          0.2 kB
        2. jet-link-logs-part1.tar.gz
          0.2 kB
        3. jet-link-logs-part2.tar.gz
          0.2 kB
        4. jet-link-logs-part3.tar.gz
          0.2 kB
        5. jet-link-logs-part4.tar.gz
          0.2 kB
        6. lfsck_namespace_state-9-28-2016.log
          24 kB

          Activity

            dinatale2 Giuseppe Di Natale (Inactive) added a comment -

            Before this closes, can these patches also be ported to the 2.8FE branch?

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23500/
            Subject: LU-8569 linkea: linkEA size limitation
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e760042016bb5b12f9b21568304c02711930720f

            gerrit Gerrit Updater added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23741
            Subject: LU-8569 lfsck: handle linkEA overflow
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 94f5d2fec9edb6e1e5359ceebea9882cb5bb2719

            gerrit Gerrit Updater added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23500
            Subject: LU-8569 linkea: linkEA size limitation
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0d8fe108f7b7f267fa790320954fc55e996af964

            adilger Andreas Dilger added a comment -

            Yes, I think it is reasonable to limit linkEA size in this case. The Linux kernel xattr API is also similarly limited by the size of individual xattrs, and ldiskfs has a 4KB limit for xattrs, so the Lustre code is already expecting that not all links will be stored for a given file.

            di.wang Di Wang (Inactive) added a comment -

            I just ran some tests on ZFS, and it looks like the problem is that the linkEA on ZFS grows beyond the llog chunk size (32768 bytes), which our current update llog system cannot handle; i.e., the size of one update operation (the update op plus its parameters) cannot exceed the llog chunk size (32KB).

            So is it OK to limit the linkEA size here?
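
To get a feel for how quickly a linkEA can pass the 32768-byte llog chunk size mentioned above, here is a rough arithmetic sketch. The 24-byte header and the 18-byte per-entry overhead (a 2-byte record length plus a 16-byte parent FID) are assumptions about the linkEA layout rather than figures from this ticket, and the comparison ignores the extra space the update op itself takes in the record, so treat the results as estimates only. The function names are hypothetical.

```shell
#!/bin/bash
# Assumed sizes (not taken from this ticket):
LINKEA_HEADER=24      # assumed linkEA header size in bytes
ENTRY_OVERHEAD=18     # assumed per-link overhead: 2-byte reclen + 16-byte parent FID
LLOG_CHUNK=32768      # llog chunk size quoted in the comment above

# linkea_size NAMELEN NLINKS: estimated linkEA size in bytes when every
# link name has the same length.
linkea_size() {
    local namelen=$1 nlinks=$2
    echo $(( LINKEA_HEADER + nlinks * (ENTRY_OVERHEAD + namelen) ))
}

# links_until_overflow NAMELEN: smallest link count whose estimated
# linkEA size is strictly larger than the llog chunk size.
links_until_overflow() {
    local namelen=$1
    echo $(( (LLOG_CHUNK - LINKEA_HEADER) / (ENTRY_OVERHEAD + namelen) + 1 ))
}
```

Under these assumptions, `links_until_overflow 46` prints 512: with 46-character names, roughly five hundred hard links to one file would already exceed a single chunk.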
            di.wang Di Wang (Inactive) added a comment - - edited

            I just looked at the debug log, and it looks like the update log is too long, which does not seem right.

            .............
            0x23:47025: 200000020:00000040:9.0:1476399235.972447:0:154190:0:(update_trans.c:93:top_multiple_thandle_dump())  cookie 0x23:47025: 1
            

            There are too many log cookies (> 1k) for this transaction, and each cookie can hold 32 KB of update records, so I do not understand why a link can generate such a large record. Even though the linkEA size might be big in your test (do we limit linkEA size for ZFS?), the problem might be in sub_updates_write and related to this patch: http://review.whamcloud.com/21334. I will check.

            I suspect this test might reproduce the problem. Unfortunately, I do not have a ZFS environment here.

            diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh
            index c61e3bc..0a3a82c 100755
            --- a/lustre/tests/sanity.sh
            +++ b/lustre/tests/sanity.sh
            @@ -15196,6 +15196,29 @@ test_300q() {
             }
             run_test 300q "create remote directory under orphan directory"
             
            +test_300r() {
            +       [ $PARALLEL == "yes" ] && skip "skip parallel run" && return
            +       [ $(lustre_version_code $SINGLEMDS) -lt $(version_code 2.7.55) ] &&
            +               skip "Need MDS version at least 2.7.55" && return
            +       [ $MDSCOUNT -lt 2 ] && skip "needs >= 2 MDTs" && return
            +       local stripe_count
            +       local file
            +
            +       mkdir $DIR/$tdir
            +
            +       $LFS setdirstripe -i1 -c3 $DIR/$tdir/remote_dir ||
            +               error "set striped dir error"
            +
            +       touch $DIR/$tdir/$tfile
            +       for ((i = 0; i < 50000; i++)); do
            +               ln $DIR/$tdir/$tfile $DIR/$tdir/remote_dir/fffffffffffffffffffffffffffffffffffffffff-$i ||
            +                       error "ln remote file fails"
            +       done
            +
            +       return 0
            +}
            +run_test 300r "test remote ln under striped directory"
            +
             prepare_remote_file() {
                    mkdir $DIR/$tdir/src_dir ||
                            error "create remote source failed"
            
            
            pjones Peter Jones added a comment -

            Got it. For future reference it is possible to make adjustments to git commit messages when landing, so it would have been possible to use the correct JIRA reference without delaying things.

            yong.fan nasf (Inactive) added a comment -

            Peter,

            As you can see in the comment history, to keep the original LU-8569 issue clear, the new LFSCK test failure was split out of the LU-8569 description into a new ticket, LU-8647. The patch http://review.whamcloud.com/22723/ resolves the LU-8647 issue, but because it was pushed to Gerrit before LU-8647 was created, it still uses the old ticket number.

            So we can close ticket LU-8647 as resolved. There is still some work to be done for LU-8569. I am investigating the huge logs.
            pjones Peter Jones added a comment -

            So LU-8647 was fixed by http://git.whamcloud.com/fs/lustre-release.git/commit/445da16c2ac0475b1c1077c822800b68cdbb7ce3 even though it used the LU-8569 JIRA reference in the commit message?

            dinatale2 Giuseppe Di Natale (Inactive) added a comment -

            Peter,

            There is still work being tracked under this ticket. The logs I posted last week are to help find a resolution to this issue.

            The patch that landed was for LU-8647.

            People

              yong.fan nasf (Inactive)
              morrone Christopher Morrone (Inactive)