Lustre / LU-7020

OST_DESTROY message times out on MDS repeatedly, indefinitely

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major

    Description

      We are seeing a single message on one of our MDS nodes time out over and over with no apparent end in sight. The message is destined for an OST, so each time it times out the connection with the OST is assumed dead, and the MDS reconnects. Here is a sample of the timeout:

      00000100:00000100:15.0:1437780834.157231:0:12825:0:(client.c:1976:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1437780188/real 1437780188]  req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:Xru/2/ffffffff rc -11/-1
      00000100:00000200:15.0:1437780834.157246:0:12825:0:(events.c:99:reply_in_callback()) @@@ type 6, status 0  req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:Xru/2/ffffffff rc -11/-1
      00000100:00000200:15.0:1437780834.157250:0:12825:0:(events.c:120:reply_in_callback()) @@@ unlink  req@ffff880c55ff7400 x1506778035514940/t0(0) o6->lsd-OST003a-osc-MDT0000@172.19.2.163@o2ib100:28/4 lens 664/432 e 27 to 1 dl 1437780833 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
      00000400:00000200:15.0:1437780834.157254:0:12825:0:(lib-md.c:73:lnet_md_unlink()) Unlinking md ffff88072424e1c0
      00000100:02000400:15.0:1437780834.157258:0:12825:0:(import.c:179:ptlrpc_set_import_discon()) lsd-OST003a-osc-MDT0000: Connection to lsd-OST003a (at 172.19.2.163@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
      

      Thread 12825 (ptlrpcd_14) is repeatedly timing out an opcode 6 (OST_DESTROY) request to OST lsd-OST003a. Interestingly, the request carries the same XID (x1506778035514940) each time. I cannot, however, see any resend of the request, so the OSS never really gets an opportunity to reply. It is especially strange that the send time is updated on each expiry even though the message does not appear to be resent.
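
      For context on the "sent .../dl ..." fields above, here is a minimal sketch (not the actual Lustre code; the struct and function names below are invented for illustration) of the deadline check that ptlrpc_expire_one_request() reports, using the numbers from the first debug line:

      /*
       * Illustrative only: a request carries a send time and a deadline
       * ("sent 1437780188 ... dl 1437780833" in the log), and once the wall
       * clock passes the deadline the request is declared timed out and the
       * connection to the OST is treated as lost.
       */
      #include <stdbool.h>
      #include <stdio.h>
      #include <time.h>

      struct toy_request {              /* not struct ptlrpc_request */
          unsigned long long xid;       /* "x1506778035514940" */
          time_t sent;                  /* "sent 1437780188" */
          time_t deadline;              /* "dl 1437780833" */
      };

      static bool toy_request_expired(const struct toy_request *req, time_t now)
      {
          return now > req->deadline;
      }

      int main(void)
      {
          /* Values taken from the first debug line above. */
          struct toy_request req = {
              .xid      = 1506778035514940ULL,
              .sent     = 1437780188,
              .deadline = 1437780833,
          };
          time_t now = 1437780834;      /* timestamp of the expiry message */

          printf("x%llu waited %lld s before expiring\n",
                 req.xid, (long long)(req.deadline - req.sent));   /* 645 s */
          if (toy_request_expired(&req, now))
              printf("timed out -> connection to the OST marked lost\n");
          return 0;
      }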

      This problem has persisted through reboots of both the MDS and OSTs. The sysadmins tried failing over the OST, and the problem followed the OST to the failover OSS (that part probably isn't terribly surprising).

      This has persisted for weeks now.

      Currently, the servers are running Lustre version 2.5.4-4chaos.

      Attachments

        1. debug_log.pilsner59.LU-7020.2015.10.15.txt.gz
          837 kB
          D. Marc Stearman
        2. debug_log.pilsner-mds1.LU-7020.2015.10.15.txt.gz
          0.3 kB
          D. Marc Stearman
        3. pilsner59.stack.out.03Nov2015.gz
          89 kB
          Cameron Harr
        4. pilsner59.stack.out.gz
          82 kB
          Cameron Harr

        Issue Links

          duplicates LU-5242

          Activity

            [LU-7020] OST_DESTROY message times out on MDS repeatedly, indefinitely
            pjones Peter Jones added a comment -

            Thanks Cameron


            charr Cameron Harr added a comment -

            Yes. It can be closed.

            pjones Peter Jones added a comment -

            That's good news Cameron. So, can we close this ticket as a duplicate of LU-5242?

            charr Cameron Harr added a comment -

            Hongchao,
            These messages were symptomatic of the reported issue, following the same format and period. However, looking again, the messages stopped about 10 minutes after I copied them. It appears the problem did resolve itself with the patched code; I should have waited about 22 more minutes before responding.


            hongchao.zhang Hongchao Zhang added a comment -

            Hi Cameron,

            The logs above look normal. The precreated objects on the OST are deleted by the MDT when the MDT recovers its connection to the OST; the "deleting orphan objects from xxx to xxx ..." messages correspond to exactly those cases, and the request opcode involved, OST_CREATE (5), is different from the opcode in this issue (OST_DESTROY, 6).

            Are there any other logs indicating that the OST_DESTROY request is resent periodically?

            Thanks

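            As a quick reference for the opcode numbers discussed here, a minimal excerpt (consistent with Lustre's enum ost_cmd and with the values quoted in this thread) showing why "o5" and "o6" in the debug lines refer to different operations:

            #include <stdio.h>

            /* Abridged excerpt: the orphan cleanup done after a reconnect is sent
             * as OST_CREATE ("o5" in the debug lines), while the request stuck in
             * this ticket is OST_DESTROY ("o6"). */
            enum ost_cmd_excerpt {
                OST_CREATE  = 5,
                OST_DESTROY = 6,
            };

            int main(void)
            {
                printf("OST_CREATE=%d OST_DESTROY=%d\n", OST_CREATE, OST_DESTROY);
                return 0;
            }
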
            charr Cameron Harr added a comment -

            I updated to Lustre 2.5.5-3chaos today on the system and there appears to be no change in the behavior. Following is output taken after the system has been updated:

            Jan 14 09:34:34 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection to lsd-OST003a (at 172.19.2.162@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
            Jan 14 09:34:34 pilsner59 kernel: Lustre: lsd-OST003a: Client lsd-MDT0000-mdtlov_UUID (at 172.19.2.102@o2ib100) reconnecting
            Jan 14 09:34:34 pilsner59 kernel: Lustre: lsd-OST003a: deleting orphan objects from 0x0:26258401 to 0x0:26258561
            Jan 14 09:34:34 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection restored to lsd-OST003a (at 172.19.2.162@o2ib100)
            Jan 14 09:34:34 pilsner-mds1 kernel: Lustre: Skipped 1 previous similar message
            Jan 14 09:45:20 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection to lsd-OST003a (at 172.19.2.162@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
            Jan 14 09:45:20 pilsner59 kernel: Lustre: lsd-OST003a: Client lsd-MDT0000-mdtlov_UUID (at 172.19.2.102@o2ib100) reconnecting
            Jan 14 09:45:20 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection restored to lsd-OST003a (at 172.19.2.162@o2ib100)
            Jan 14 09:45:39 pilsner59 kernel: Lustre: lsd-OST003a: deleting orphan objects from 0x0:26260525 to 0x0:26260961
            Jan 14 09:56:05 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection to lsd-OST003a (at 172.19.2.162@o2ib100) was lost; in progress operations using this service will wait for recovery to complete
            Jan 14 09:56:05 pilsner59 kernel: Lustre: lsd-OST003a: Client lsd-MDT0000-mdtlov_UUID (at 172.19.2.102@o2ib100) reconnecting
            Jan 14 09:56:05 pilsner59 kernel: Lustre: lsd-OST003a: deleting orphan objects from 0x0:26262647 to 0x0:26262977
            Jan 14 09:56:05 pilsner-mds1 kernel: Lustre: lsd-OST003a-osc-MDT0000: Connection restored to lsd-OST003a (at 172.19.2.162@o2ib100)
            
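            For reference, the reconnect cycle in the messages above repeats roughly every 645-646 seconds, which matches the sent-to-deadline window in the original debug log (1437780833 - 1437780188 = 645 s), suggesting the same request is expiring once per timeout interval. A trivial check of the arithmetic:

            #include <stdio.h>

            /* Gaps between the three reconnect timestamps quoted above
             * (09:34:34, 09:45:20, 09:56:05), expressed as seconds since midnight. */
            int main(void)
            {
                int t[] = { 9*3600 + 34*60 + 34,
                            9*3600 + 45*60 + 20,
                            9*3600 + 56*60 +  5 };

                for (int i = 1; i < 3; i++)
                    printf("cycle %d: %d s\n", i, t[i] - t[i-1]);   /* 646 s, 645 s */
                return 0;
            }
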
            hongchao.zhang Hongchao Zhang added a comment - - edited

            The patch back-ported from http://review.whamcloud.com/#/c/13630/ in LU-5242 is tracked at http://review.whamcloud.com/#/c/17154/

            morrone Christopher Morrone (Inactive) added a comment -

            Yes, it certainly sounds like LU-5242 is a likely candidate for fixing this issue. It looks like something that fell through the cracks at Intel when it came time to land the backport.

            charr Cameron Harr added a comment -

            This is late notice, but we have planned OS upgrades on this system this morning (08:30-10:00 PST) and I will have the system offline for a small amount of time. It looks like we have some pointers to chase down, but let me know if you'd like me to do anything with that downtime.

            bzzz Alex Zhuravlev added a comment - - edited

            Object size is not equal to transaction size. The transaction size includes the blocks the transaction modifies, but the blocks used for the user's data aren't modified, so they are not part of the transaction. IOW, it's only the metadata blocks (dnode, tree).


            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: morrone Christopher Morrone (Inactive)
              Votes: 0
              Watchers: 10
