Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Affects Version/s: Lustre 2.12.2
    • Environment: RHEL7.6, lustre-2.12.2, ZFS-0.8.1-1
    • Severity: 2
    • 9223372036854775807

    Description

      We are hitting an issue after upgrading to lustre-2.12.2 last weekend: the MDS servers start to hang, and the lustre_health check reports NOT HEALTHY because of long-waiting requests. We think this is related to LU-10250, but wanted to create a separate ticket to push the priority up. It is taking an MDS down about once a day, causing production outages.
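
      For reference, the NOT HEALTHY state is what Lustre exports through its health procfs file; a minimal userspace probe of the kind a lustre_health-style check boils down to (a sketch only; the site's actual script is not part of this ticket, and on newer releases the file may live under /sys/fs/lustre instead):

          #include <stdio.h>

          /* Sketch: Lustre reports "healthy" or "NOT HEALTHY" (plus the
           * devices with long waiting requests) through this file. The
           * path is assumed from the 2.12-era procfs layout. */
          int main(void)
          {
                  char line[256];
                  FILE *f = fopen("/proc/fs/lustre/health_check", "r");

                  if (f == NULL) {
                          perror("health_check");
                          return 1;
                  }
                  while (fgets(line, sizeof(line), f) != NULL)
                          fputs(line, stdout);
                  fclose(f);
                  return 0;
          }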

      2019-07-05T02:16:36.434782-04:00 f2-mds2.ncrc.gov kernel: Lustre: f2-OST0035-osc-MDT0001: Connection to f2-OST0035 (at 10.10.33.50@o2ib2) was lost; in progress operations using this service will wait for recovery to complete
      2019-07-05T02:17:26.605024-04:00 f2-mds2.ncrc.gov kernel: Lustre: f2-OST0035-osc-MDT0001: Connection restored to 10.10.33.50@o2ib2 (at 10.10.33.50@o2ib2)

      2019-07-05T02:28:56.360456-04:00 f2-mds2.ncrc.gov kernel: INFO: task txg_quiesce:40218 blocked for more than 120 seconds.
      2019-07-05T02:28:56.360500-04:00 f2-mds2.ncrc.gov kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      2019-07-05T02:28:56.371558-04:00 f2-mds2.ncrc.gov kernel: txg_quiesce     D ffff99aec932a080     0 40218      2 0x00000000
      2019-07-05T02:28:56.371580-04:00 f2-mds2.ncrc.gov kernel: Call Trace:
      2019-07-05T02:28:56.371599-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff83967c49>] schedule+0x29/0x70
      2019-07-05T02:28:56.384261-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0995325>] cv_wait_common+0x125/0x150 [spl]
      2019-07-05T02:28:56.384280-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff832c2d40>] ? wake_up_atomic_t+0x30/0x30
      2019-07-05T02:28:56.397218-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0995365>] __cv_wait+0x15/0x20 [spl]
      2019-07-05T02:28:56.397238-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0bc863b>] txg_quiesce_thread+0x2cb/0x3c0 [zfs]
      2019-07-05T02:28:56.411080-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0bc8370>] ? txg_init+0x2b0/0x2b0 [zfs]
      2019-07-05T02:28:56.411101-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc099cb93>] thread_generic_wrapper+0x73/0x80 [spl]
      2019-07-05T02:28:56.425331-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc099cb20>] ? __thread_exit+0x20/0x20 [spl]
      2019-07-05T02:28:56.425350-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff832c1c71>] kthread+0xd1/0xe0
      2019-07-05T02:28:56.430937-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff832c1ba0>] ? insert_kthread_work+0x40/0x40
      2019-07-05T02:28:56.437770-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff83974c1d>] ret_from_fork_nospec_begin+0x7/0x21
      2019-07-05T02:28:56.444951-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff832c1ba0>] ? insert_kthread_work+0x40/0x40

      2019-07-05T02:28:56.459337-04:00 f2-mds2.ncrc.gov kernel: INFO: task mdt04_000:42947 blocked for more than 120 seconds.
      2019-07-05T02:28:56.459357-04:00 f2-mds2.ncrc.gov kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      2019-07-05T02:28:56.478994-04:00 f2-mds2.ncrc.gov kernel: mdt04_000       D ffff99aecc7ae180     0 42947      2 0x00000000
      2019-07-05T02:28:56.479015-04:00 f2-mds2.ncrc.gov kernel: Call Trace:
      2019-07-05T02:28:56.479037-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff83967c49>] schedule+0x29/0x70
      2019-07-05T02:28:56.491665-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0995325>] cv_wait_common+0x125/0x150 [spl]
      2019-07-05T02:28:56.491685-04:00 f2-mds2.ncrc.gov kernel: [<ffffffff832c2d40>] ? wake_up_atomic_t+0x30/0x30
      2019-07-05T02:28:56.504560-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0995365>] __cv_wait+0x15/0x20 [spl]
      2019-07-05T02:28:56.504580-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0b675fb>] dmu_tx_wait+0x20b/0x3b0 [zfs]
      2019-07-05T02:28:56.517986-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc0b67831>] dmu_tx_assign+0x91/0x490 [zfs]
      2019-07-05T02:28:56.518007-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc176d049>] osd_trans_start+0x199/0x440 [osd_zfs]
      2019-07-05T02:28:56.532397-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc18ce837>] mdt_empty_transno+0xf7/0x850 [mdt]
      2019-07-05T02:28:56.532416-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc18d1eee>] mdt_mfd_open+0x8de/0xe70 [mdt]
      2019-07-05T02:28:56.546398-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc18a4ba2>] ? mdt_pack_acl2body+0x1c2/0x9f0 [mdt]
      2019-07-05T02:28:56.546419-04:00 f2-mds2.ncrc.gov kernel: [<ffffffffc18d2acb>] mdt_finish_open+0x64b/0x760 [mdt]
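
      Both stacks point at ZFS transaction assignment: txg_quiesce cannot quiesce the open txg while MDT service threads sleep in dmu_tx_wait() waiting to join one. A minimal sketch of the DMU pattern involved (illustrative only, not the actual osd-zfs code):

          #include <sys/dmu.h>
          #include <sys/txg.h>

          /* Sketch of the pattern behind the mdt04_000 stack above:
           * osd_trans_start() ends up in dmu_tx_assign(tx, TXG_WAIT),
           * which sleeps in dmu_tx_wait()/cv_wait() until the open txg
           * can accept the tx. If txg quiescing stalls, every thread
           * starting a transaction stalls with it. */
          static int
          start_tx_sketch(objset_t *os, uint64_t object)
          {
                  dmu_tx_t *tx = dmu_tx_create(os);
                  int err;

                  dmu_tx_hold_bonus(tx, object);     /* declare what will be dirtied */
                  err = dmu_tx_assign(tx, TXG_WAIT); /* <-- the cv_wait in the trace */
                  if (err != 0) {
                          dmu_tx_abort(tx);
                          return (err);
                  }
                  /* ... modify the declared buffers ... */
                  dmu_tx_commit(tx);
                  return (0);
          }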

    Activity

    [LU-12510] mds server hangs cv_wait_common

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/35707
            Subject: LU-12510 osd: osd-zfs to release zrlock quickly
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: 678e24e01ea4427b84d6c4b2a4d679edf077d3cc


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35600/
            Subject: LU-12510 osd: osd-zfs to release zrlock quickly
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: ec6b9a60c87767236804865da51a5c1c425db01c

            pjones Peter Jones added a comment -

            In that case, let's create another JIRA ticket to track that work, leave this one as resolved, and get the fix onto the LTS branch too (I think this is already in flight).


            bzzz Alex Zhuravlev added a comment -

            I'll be working on a patch re-implementing this using the new DMU API, but this can wait a bit, IMO.
            The landed patch is fully functional; it just keeps using a less-than-ideal approach to access DMU functionality.
            It was done this way as part of the performance improvements, because of missing bits in the DMU API.
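
            For context, the landed patches ("osd-zfs to release zrlock quickly") follow the usual shrink-the-critical-section pattern; an illustrative sketch of the idea (not the landed change itself) is releasing a dnode-handle zrlock before any blocking work:

                #include <sys/zrlock.h>
                #include <sys/dnode.h>

                /* Illustrative only -- not the actual patch. zrl_add() on a
                 * dnode handle can spin while a writer holds the lock, so the
                 * fix pattern is: copy what is needed under the zrlock, drop
                 * it, and only then do slow or blocking work (such as starting
                 * a DMU transaction). */
                static void
                use_dnode_briefly(zrl_t *zrl, dnode_t *dn)
                {
                        uint64_t obj;

                        zrl_add(zrl);        /* pin the handle */
                        obj = dn->dn_object; /* take only what is needed */
                        zrl_remove(zrl);     /* release before anything that can block */

                        /* ... slow path runs here using 'obj', zrlock free ... */
                        (void) obj;
                }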
            pjones Peter Jones added a comment -

            James

            Do you mean that further Lustre changes will be needed to adjust to the ZFS changes in 0.8.2?

            Peter


            simmonsja James A Simmons added a comment -

            The above patch is only a workaround. A more complete solution is needed with ZFS 0.8.2. We should either open another ticket or reopen this one.
            pjones Peter Jones added a comment -

            Landed for 2.13


            James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/35600
            Subject: LU-12510 osd: osd-zfs to release zrlock quickly
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: d844b66bf8d58708237547598d016b356a07a01d


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/35524/
            Subject: LU-12510 osd: osd-zfs to release zrlock quickly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 88b329ac2ab568a25251f3f7c3a7e0c7367cb36f


            bzzz Alex Zhuravlev added a comment -

            pfarrell: hard to say, but I'd say not very likely; a few of the reported processes weren't running but were stuck on I/O:

            [16250.115937] Pid: 2829, comm: ll_ost00_029 3.10.0-957.21.3.el7_lustre.x86_64 #1 SMP Tue Jul 16 17:30:03 UTC 2019
            [16250.117636] Call Trace:
            [16250.118155]  [<ffffffffc02ee262>] cv_wait_common+0xb2/0x150 [spl]
            [16250.119447]  [<ffffffffc02ee338>] __cv_wait_io+0x18/0x20 [spl]
            [16250.120491]  [<ffffffffc04aca3b>] zio_wait+0x11b/0x1c0 [zfs]
            [16250.121621]  [<ffffffffc03ebfce>] dbuf_read+0x67e/0x9e0 [zfs]
            [16250.122689]  [<ffffffffc03f8060>] dmu_buf_hold_by_dnode+0x50/0x80 [zfs]
            [16250.123910]  [<ffffffffc046b6c2>] zap_get_leaf_byblk.isra.11+0x92/0x260 [zfs]
            [16250.125193]  [<ffffffffc046b978>] zap_deref_leaf+0xe8/0x100 [zfs]
            [16250.126336]  [<ffffffffc046c9b2>] fzap_lookup+0x62/0x180 [zfs]
            [16250.127455]  [<ffffffffc04712b1>] zap_lookup_impl+0xd1/0x200 [zfs]
            [16250.128629]  [<ffffffffc0471c2e>] zap_lookup_norm+0x8e/0xd0 [zfs]
            [16250.129773]  [<ffffffffc0471ca3>] zap_lookup+0x33/0x40 [zfs]
            [16250.130823]  [<ffffffffc113d4c1>] osd_fid_lookup+0x3e1/0x4c0 [osd_zfs]
            [16250.132040]  [<ffffffffc1135669>] osd_object_init+0x389/0xca0 [osd_zfs]
            [16250.133221]  [<ffffffffc0c17b25>] lu_object_alloc+0xe5/0x320 [obdclass]
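
            That trace shows the thread sleeping on disk I/O rather than a lock: zap_lookup() on the object index had to fetch a ZAP leaf block (dbuf_read -> zio_wait). A sketch of the lookup involved (a hypothetical helper; osd_fid_lookup()'s real arguments differ):

                #include <sys/zap.h>

                /* Hypothetical helper mirroring the osd_fid_lookup() step in
                 * the stack above: look up one 8-byte object number keyed by
                 * a FID string in an OI ZAP. zap_lookup() may block in
                 * zio_wait() if the leaf block is not cached -- the I/O sleep
                 * seen in ll_ost00_029. */
                static int
                fid_to_object_sketch(objset_t *os, uint64_t oi_zapobj,
                                     const char *fid_str, uint64_t *objp)
                {
                        return (zap_lookup(os, oi_zapobj, fid_str,
                                           sizeof (uint64_t), 1, objp));
                }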
            

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: dustb100 Dustin Leverman
              Votes: 0
              Watchers: 17
