LU-4584: Lock revocation process fails consistently

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical

    Description

      Some users have reported to us that the "rm" command is taking a long time. Some investigation revealed that at least the first "rm" in a directory takes just over 100 seconds, which of course sounds like OBD_TIMEOUT_DEFAULT.
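
      As a quick sanity check (a hedged sketch; 100 seconds is only the stock default and may be tuned differently on a given site), the configured obd timeout can be read with lctl on a client or server:

          # Print the configured obd timeout; 100 is the stock OBD_TIMEOUT_DEFAULT
          lctl get_param timeout
          # timeout=100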

      This isn't necessarily the simplest reproducer, but the following steps reproduce the problem completely consistently (a shell sketch follows the list):

      1. set the directory's default stripe count to 48
      2. touch a file on client A
      3. rm file on client B
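
      A command-level sketch of the reproducer; the mount point and file names are placeholders, assuming the same /mnt/lustre mount on both clients:

          # Client A: set the directory's default stripe count to 48, then create a file
          lfs setstripe -c 48 /mnt/lustre/testdir
          touch /mnt/lustre/testdir/victim

          # Client B: remove the file; before the fix this first rm stalls for ~100 seconds
          time rm /mnt/lustre/testdir/victim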

      The clients are running 2.4.0-19chaos and the servers are at 2.4.0-21chaos. The servers use ZFS as the backend.

      I have some Lustre logs that I will share and discuss in additional posts to this ticket. But essentially it looks like the server always times out on an AST to client A (explaining the 100-second delay). It is not yet clear to me why that happens, because client A appears to be completely responsive. My current suspicion is that the MDT is to blame.
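
      One hedged way to confirm that the server is timing out a blocking AST (rather than client A misbehaving in some other fashion) is to watch the MDS console log for the message ldlm prints when its lock callback timer fires; the exact grep below is an assumption about how the site collects kernel logs:

          # On the MDS: look for blocking-AST timeouts and the resulting evictions
          dmesg | grep -i "lock callback timer expired"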

      Attachments

        1. 172.16.66.4@tcp.log.bz2
          40 kB
        2. 172.16.66.5@tcp.log.bz2
          53 kB
        3. 172.20.20.201@o2ib500.log.bz2
          8.52 MB
        4. client_log_20140206.txt
          375 kB
        5. inflames.log
          2.40 MB


          Activity

            [LU-4584] Lock revocation process fails consistently

            simmonsja James A Simmons added a comment -

            I have been testing with the LU-4584 patch and I'm still seeing client evictions. Could it be possible to get the LU-2827 patch working on 2.4?

            simmonsja James A Simmons added a comment -

            It was my bad: the last test shot used our 2.4 production file system, which didn't have the patch from here, so the breakage above is expected. We are in the process of testing this at larger scale on a 500-node production machine. Yes, ORNL has created a public git tree:

            https://github.com/ORNL-TechInt/lustre

            so people can examine our special sauce.

            morrone Christopher Morrone (Inactive) added a comment -

            James, can you share the patch stack you are using? That might help us figure out if you are reporting the same issue or something else. And if it isn't exactly the same issue, we really need to get you to report it in another ticket.

            simmonsja James A Simmons added a comment -

            Just finished a test shot with Cray 2.5 clients to see if the client evictions stopped. Their default client, which is some 2.5 version with many, many patches, lacked the LU-2827 and LU-4861 patches that I found helped with 2.5.2. So I applied the patches from LU-2827 and LU-4861 but still had client evictions. I collected the logs from the server side and have placed them here:

            ftp.whamcloud.com/uploads/LU-4584/atlas2_testshot_Jul_29_2014_debug_logs.tar.gz

            bfaccini Bruno Faccini (Inactive) added a comment -

            BTW, I forgot to indicate here that my b2_4 patch/back-port for LU-2827 (http://review.whamcloud.com/10902) still has some problems and needs some rework, because the MDS bombs with "(ldlm_lock.c:851:ldlm_lock_decref_internal_nolock()) ASSERTION( lock->l_readers > 0 ) failed" when running the LLNL reproducer from LU-4584 or recovery-small/test_53 in auto-tests.
            More to come; the crash dump is under investigation, but we can still use http://review.whamcloud.com/9488 as a fix for b2_4.
            bfaccini Bruno Faccini (Inactive) added a comment - edited

            A merged b2_4 backport of both the #5978 and #10378 master changes from LU-2827 is at http://review.whamcloud.com/10902.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Because the client's reply buffer was not big enough to receive the first server reply, which includes the LVB/layout, due to the default/large striping.
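
            Since the oversized reply stems from the wide default layout, one hedged way to confirm what the directory hands out (the path is a placeholder from the reproducer) is:

                # Show only the directory's default layout; a 48-stripe default means the
                # layout/LVB carried in the reply is correspondingly large
                lfs getstripe -d /mnt/lustre/testdir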

            morrone Christopher Morrone (Inactive) added a comment -

            Why are messages being resent?

            bfaccini Bruno Faccini (Inactive) added a comment -

            Prior to the patch from this ticket and/or LU-2827, there was a bug during server handling of resent requests (the new lock, created early, was found during lookup instead of the first one), causing the first/old lock to become orphaned and replicated.
            It has been decided that the patch(es) from LU-2827 will be used to fix this issue, because they are more generic and handle all cases, mainly by detecting the resent case earlier and avoiding the unnecessary new lock creation.
            I am presently porting and testing a b2_4 backport of the LU-2827 patches and will provide updates ASAP.

            morrone Christopher Morrone (Inactive) added a comment -

                So now that the unrelated issue being tracked for ORNL has moved to LU-5225 and the patches from LU-2827 have landed to master, can this issue be marked as a duplicate of LU-2827?

            First, can we get an explanation of how that fixes this problem, and then a clear list of the patch(es) I need to apply to b2_4?
            pjones Peter Jones added a comment -

            So now that the unrelated issue being tracked for ORNL has moved to LU-5225 and the patches from LU-2827 have landed to master, can this issue be marked as a duplicate of LU-2827?

            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: morrone Christopher Morrone (Inactive)
              Votes: 1
              Watchers: 29
