LU-8967: directory entries for non-existent files


    Description

      We have several directories with entries for non-existent files. For example:

      [root@quartz2311:~]# ls -l /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0                                                                                 
      ls: cannot access /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003: No such file or directory
      total 3154
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.000
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.001
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.002
      -????????? ? ?       ?             ?            ? filler.003
      drwx------ 2 casses1 casses1   25600 Dec 21 16:43 ~dmtmp
      

      The directory itself is a remote directory on one MDT:

      [root@quartz2311:~]# lfs getdirstripe -d /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0
      lmv_stripe_count: 0 lmv_stripe_offset: 3
      

      We are able to get striping information for this file:

      [root@quartz2311:~]# lfs getstripe /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003
      /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003
      lmm_stripe_count:   1
      lmm_stripe_size:    1048576
      lmm_pattern:        1
      lmm_layout_gen:     0
      lmm_stripe_offset:  27
              obdidx           objid           objid           group
                  27        20538776      0x1396598      0xcc0000402
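
      For reference (a small sketch, not captured from the system above): lmm_stripe_offset 27 is 0x1b in hex, so the object lives on lsh-OST001b, the same OST named in the orphan-deletion console messages below.

      # 27 decimal is 0x1b hex, i.e. OST index 001b
      printf 'lsh-OST%04x\n' 27              # prints lsh-OST001b
      lfs osts /p/lscratchh | grep -i 001b   # confirm index 27 maps to lsh-OST001b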
      

      It looks like the OSS serving that OST was rebooted and the OST went through recovery around the time the missing file was created. In particular, we note that the object number falls in the range of orphan objects that were deleted:

      [root@zinci:~]# grep 0xcc0000402 /var/log/conman/console.zinc*
      /var/log/conman/console.zinc43:2016-12-21 16:30:56 [189484.767900] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538706 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:33:30 [189639.110247] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:35:41 [189769.704490] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:40:19 [190047.449320] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:44:45 [190313.751155] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538820 to 0xcc0000402:20541649
      /var/log/conman/console.zinc44:2016-12-21 16:49:27 [  159.838420] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538820 to 0xcc0000402:20541649
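
      The check can be scripted, for example like this (a sketch only; the object id and sequence 0xcc0000402 are the ones from the lfs getstripe output above):

      objid=20538776        # objid of filler.003 from lfs getstripe
      grep 'deleting orphan objects from 0xcc0000402' /var/log/conman/console.zinc* |
        sed 's/.*from 0xcc0000402:\([0-9]*\) to 0xcc0000402:\([0-9]*\).*/\1 \2/' |
        awk -v id="$objid" '$1+0 <= id+0 && id+0 <= $2+0 { print "objid " id " falls in deleted range " $1 "-" $2 }'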
      

      I will attach server console logs separately.


          Activity

            pjones Peter Jones added a comment -

            AFAIK items tracked under this ticket are complete

            pjones Peter Jones added a comment -

            This is now confirmed as a duplicate of LU-8562, so you should proceed with using Ned's ports of those patches to 2.8 FE. In addition, it is recommended that you pick up the fix for LU-8367.


            bzzz Alex Zhuravlev added a comment -

            A prototype is under testing; I'm going to pass it through Maloo a few more times.

            bzzz Alex Zhuravlev added a comment -

            With the LU-8562 patch I still see the same symptoms, though rarely. There is another patch addressing the same issue; it does a bit better, but the problem can still be reproduced within a few hours. I've been looking for the root cause.
            nedbass Ned Bass (Inactive) added a comment - edited

            Ned, so this issue is solved by LU-8562 in general, but the patch itself contains a defect. I checked your patch; does it solve your problem, or is more work required in that area?

            I have tested the LU-8562 patch and my one-line follow-on patch https://review.whamcloud.com/24758 on a single-node test setup. I am no longer able to reproduce the data loss bug with those patches applied. Without the patches I can reproduce it almost immediately using the LU-8562 test case.

            The remaining work to do in that area is as follows:

            • An explanation is needed for why conf-sanity test_101 is still failing as per LU-8972 (a rerun sketch follows this list). The ongoing test case failure suggests the data loss bug is not completely resolved by the LU-8562 patch. We need high confidence that this bug is resolved before putting user data on Lustre 2.8 FE.
            • Patch https://review.whamcloud.com/24758 needs review by someone who understands the osp_precreate_thread state machine better than me.
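
            As referenced above, a way to rerun just that test in isolation (a sketch; it assumes a standard lustre/tests checkout with a configured test environment, and variable handling may differ between branches):

            # rerun only conf-sanity test_101
            cd lustre/tests
            ONLY=101 sh conf-sanity.sh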


            Interesting that LU-8562 itself is quite a recent change and we didn't observe many issues similar to LU-8967 without it. I wonder what changed in your system when you started seeing it. Was it just a software update, or hardware as well?


            My best guess as to why we started seeing LU-8562 is that we made changes to our pacemaker/corosync HA system. We recently optimized the configuration so Lustre services are started with much less delay than before. This makes it very likely that OST orphan cleanup will be interrupted by the HA partner coming up and failing back the OST. As I understand it, that is the race window for LU-8562 to occur. Before, it took a long time for services to start up, so orphan cleanup was almost always done by the time the partner failed back the OST.
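
            For what it's worth, the window can be approximated by hand, without pacemaker (a rough sketch; the device and mount point names below are placeholders, not from our configuration):

            # on the primary OSS: simulate a failure of the OST
            umount /mnt/lustre-ost1b
            # on the HA partner: take over the OST; recovery runs and the MDS
            # begins deleting orphan objects on it
            mount -t lustre /dev/mapper/ost1b /mnt/lustre-ost1b
            # fail back almost immediately, before orphan cleanup has finished
            umount /mnt/lustre-ost1b
            # back on the primary OSS
            mount -t lustre /dev/mapper/ost1b /mnt/lustre-ost1b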


            tappro Mikhail Pershin added a comment -

            Ned, so this issue is solved by LU-8562 in general, but the patch itself contains a defect. I checked your patch; does it solve your problem, or is more work required in that area?

            Interesting that LU-8562 itself is quite a recent change and we didn't observe many issues similar to LU-8967 without it. I wonder what changed in your system when you started seeing it. Was it just a software update, or hardware as well?

            nedbass Ned Bass (Inactive) added a comment -

            Hi Mikhail. Each occurrence that I've investigated happened immediately after the OST completed recovery. The object numbers of the missing files all fall at the beginning of the range of deleted orphans. It does not continue to occur when all OSTs are up.

            I can remove the files as root. The rm command fails for an unprivileged user because stat() returns ENOENT and rm treats that as fatal unless you're root.
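
            To illustrate (the path is the one from the listing in the description):

            cd /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0
            stat filler.003     # fails with ENOENT even though the entry is listed
            rm -f filler.003    # succeeds when run as root and removes the dangling entry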

            I have confirmed that I can reproduce LU-8562 on our system using the test case from that patch, and it looks just like this issue. I tested https://review.whamcloud.com/#/c/22211/ on a single-node setup and wasn't able to reproduce the bug. However, I ran into a defect with that patch that causes the osp_precreate thread to hang, as I described in LU-8562.



            tappro Mikhail Pershin added a comment -

            Ned, did these entries occur only once, when the OST was failed over, or do they still continue to occur? Is it possible to remove them?

            I am checking the patches you've mentioned.

            nedbass Ned Bass (Inactive) added a comment -

            I suspect this is related to LU-8562.
            pjones Peter Jones added a comment -

            Mike

            Could you please assist with this issue?

            Thanks

            Peter


            People

              Assignee: tappro Mikhail Pershin
              Reporter: nedbass Ned Bass (Inactive)
              Votes: 0
              Watchers: 9
