Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: Lustre 2.8.0

    Description

      Here are all results with 200K files.

      # lctl set_param ldlm.namespaces.*.lru_size=1000
      ldlm.namespaces.MGC10.0.10.153@o2ib.lru_size=1000
      ldlm.namespaces.lustre-MDT0000-mdc-ffff881fccd75800.lru_size=1000
      ldlm.namespaces.lustre-OST0000-osc-ffff881fccd75800.lru_size=1000
      ldlm.namespaces.lustre-OST0001-osc-ffff881fccd75800.lru_size=1000
      ldlm.namespaces.lustre-OST0002-osc-ffff881fccd75800.lru_size=1000
      ldlm.namespaces.lustre-OST0003-osc-ffff881fccd75800.lru_size=1000
      
      # ls -lR /lustre
      # lctl get_param ldlm.namespaces.*.lock_count
      ldlm.namespaces.MGC10.0.10.153@o2ib.lock_count=4
      ldlm.namespaces.lustre-MDT0000-mdc-ffff881fccd75800.lock_count=1002
      ldlm.namespaces.lustre-OST0000-osc-ffff881fccd75800.lock_count=50003
      ldlm.namespaces.lustre-OST0001-osc-ffff881fccd75800.lock_count=50002
      ldlm.namespaces.lustre-OST0002-osc-ffff881fccd75800.lock_count=50003
      ldlm.namespaces.lustre-OST0003-osc-ffff881fccd75800.lock_count=50004
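
      The OSC namespaces end up holding roughly 50,000 locks each even though lru_size was set to 1000, while the MDC stays at its limit. As a cross-check, a minimal sketch (assuming the same client and namespaces) for comparing the limit with the actual count and then draining the unused locks by hand:

      # lctl get_param ldlm.namespaces.*osc*.lru_size ldlm.namespaces.*osc*.lock_count
      # lctl set_param ldlm.namespaces.*osc*.lru_size=clear
      # lctl get_param ldlm.namespaces.*osc*.lock_count

      Writing "clear" to lru_size cancels all unused locks in that namespace, so lock_count should drop back toward zero; if it climbs well past 1000 again during another ls -lR, the static limit is still not being honored on the OSCs.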
      

      Attachments

        1. bt.all
          316 kB
          Ann Koehler
        2. bt.uniq
          10 kB
          Ann Koehler
        3. dmesg
          483 kB
          Ann Koehler

        Issue Links

          Activity

            [LU-6390] lru_size on the OSC is not honored
            pjones Peter Jones added a comment -

            Landed for 2.8


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14342/
            Subject: LU-6390 ldlm: restore the ELC for enqueue
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ac5abd46e95edd97316ff0e9563288636e7c42bc


            amk Ann Koehler (Inactive) added a comment -

            Sorry, forgot to mention, the client is running 2.5.1 on CentOS 2.6.32-431.20.3.el6.x86_64. (Same system as in LELUS-294.)

            So are you saying that the LU-5781 patch is needed in addition to LU-6390? And that the 2 together will fix the problems with too many locks in the LRU? Certainly would explain why we didn't see the LU-4300-like hang in our internal testing, since the versions we typically run have both patches.

            vitaly_fertman Vitaly Fertman added a comment - edited

            cl_lock_mutex_get does not exist in 2.8, since the CLIO simplification, so it was not 2.8 Lustre that you tested.

            The patch by itself is supposed to be correct. I think the problem is related to the issue raised in LU-5781: cancel_lru_policy is not called atomically with setting CBPENDING, so some dirty pages could be added in between.
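
            A minimal shell sketch of a load that exercises that window (illustrative only; /mnt/lustre/racefile is a hypothetical path) is to keep dirty pages appearing on an OSC object while the LRU is drained in a loop. Run the first loop in one shell and the second in another:

            # while true; do dd if=/dev/zero of=/mnt/lustre/racefile bs=1M count=4 conv=notrunc; done
            # while true; do lctl set_param ldlm.namespaces.*osc*.lru_size=clear; done

            If dirty pages really can slip in between cancel_lru_policy picking a lock and CBPENDING being set on it, the cancel side then has to flush those pages, which would be consistent with the osc_cache_writeback_range stacks in the hang report below.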


            amk Ann Koehler (Inactive) added a comment -

            dmesg file from the data mover node. Shows partial output from
            echo t > /proc/sysrq-trigger

            The forced stack trace dump was done at least 30 minutes before the /proc/pid/stack output was captured. So comparisons between dmesg and bt.all show that the threads are indeed hung.


            amk Ann Koehler (Inactive) added a comment -

            Unique stack traces for Lustre processes extracted from bt.all


            amk Ann Koehler (Inactive) added a comment -

            ps output followed by
            for pid in /proc/[0-9]*; do cat "$pid"/stack; done


            amk Ann Koehler (Inactive) added a comment -

            Customer site installed http://review.whamcloud.com/14342 on their clients. After ~6 hours of running, a data mover client hung. The node is not configured to take memory dumps, so we captured stack traces from /proc. The stack traces are reminiscent of LU-4300. 11 out of 32 ptlrpcd threads and 47 out of 60 ldlm_bl threads are waiting in cl_lock_mutex_get, and 2 ll_agl threads are stuck in osc_extent_wait:

            10730 ll_agl_21508
            11228 ll_agl_21487
            [<ffffffffa0f4eb90>] osc_extent_wait+0x420/0x670 [osc]
            [<ffffffffa0f4f0af>] osc_cache_wait_range+0x2cf/0x890 [osc]
            [<ffffffffa0f50281>] osc_cache_writeback_range+0xc11/0xfb0 [osc]
            [<ffffffffa0f3b6f4>] osc_lock_flush+0x84/0x280 [osc]
            [<ffffffffa0f3b9d7>] osc_lock_cancel+0xe7/0x1c0 [osc]
            [<ffffffffa0b4cbf5>] cl_lock_cancel0+0x75/0x160 [obdclass]
            [<ffffffffa0b4d7ab>] cl_lock_cancel+0x13b/0x140 [obdclass]
            [<ffffffffa0f3cf1a>] osc_ldlm_blocking_ast+0x13a/0x350 [osc]
            [<ffffffffa0cf703c>] ldlm_cancel_callback+0x6c/0x1a0 [ptlrpc]
            [<ffffffffa0d06eaa>] ldlm_cli_cancel_local+0x8a/0x470 [ptlrpc]
            [<ffffffffa0d0a1ae>] ldlm_cli_cancel_list_local+0xee/0x290 [ptlrpc]
            [<ffffffffa0d0b055>] ldlm_cancel_lru_local+0x35/0x40 [ptlrpc]
            [<ffffffffa0d0c4cc>] ldlm_prep_elc_req+0x3ec/0x4b0 [ptlrpc]
            [<ffffffffa0d0c5b8>] ldlm_prep_enqueue_req+0x28/0x30 [ptlrpc]
            [<ffffffffa0f205d9>] osc_enqueue_base+0x109/0x5a0 [osc]
            [<ffffffffa0f3c5cd>] osc_lock_enqueue+0x1ed/0x890 [osc]
            [<ffffffffa0b50c2c>] cl_enqueue_try+0xfc/0x300 [obdclass]
            [<ffffffffa0fce64a>] lov_lock_enqueue+0x21a/0xf10 [lov]
            [<ffffffffa0b50c2c>] cl_enqueue_try+0xfc/0x300 [obdclass]
            [<ffffffffa0b51b4f>] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
            [<ffffffffa0b5279e>] cl_lock_request+0x7e/0x270 [obdclass]
            [<ffffffffa109d000>] cl_glimpse_lock+0x180/0x490 [lustre]
            [<ffffffffa109d875>] cl_glimpse_size0+0x1a5/0x1d0 [lustre]
            [<ffffffffa1095ffb>] ll_agl_trigger+0x1db/0x4b0 [lustre]
            [<ffffffffa1096e6e>] ll_agl_thread+0x15e/0x490 [lustre]
            [<ffffffff8109abf6>] kthread+0x96/0xa0
            [<ffffffff8100c20a>] child_rip+0xa/0x20
            [<ffffffffffffffff>] 0xffffffffffffffff
            
            6143 ptlrpcd_0 + 10 more ptlrpcd threads (out of 32)
            [<ffffffffa0b4e6df>] cl_lock_mutex_get+0x6f/0xd0 [obdclass]
            [<ffffffffa0fd5b19>] lovsub_parent_lock+0x49/0x120 [lov]
            [<ffffffffa0fd6c4f>] lovsub_lock_modify+0x7f/0x1e0 [lov]
            [<ffffffffa0b4e108>] cl_lock_modify+0x98/0x310 [obdclass]
            [<ffffffffa0f3de32>] osc_lock_granted+0x1e2/0x2b0 [osc]
            [<ffffffffa0f3e308>] osc_lock_upcall+0x408/0x600 [osc]
            [<ffffffffa0f1e7a6>] osc_enqueue_fini+0x106/0x240 [osc]
            [<ffffffffa0f23272>] osc_enqueue_interpret+0xe2/0x1e0 [osc]
            [<ffffffffa0d2487c>] ptlrpc_check_set+0x2bc/0x1b50 [ptlrpc]
            [<ffffffffa0d500cb>] ptlrpcd_check+0x53b/0x560 [ptlrpc]
            [<ffffffffa0d5071b>] ptlrpcd+0x33b/0x3f0 [ptlrpc]
            [<ffffffff8109abf6>] kthread+0x96/0xa0
            [<ffffffff8100c20a>] child_rip+0xa/0x20
            [<ffffffffffffffff>] 0xffffffffffffffff
            
            6209  ldlm_bl_00 + 46 other ldlm_bl threads (out of 60)
            [<ffffffffa0b4e6df>] cl_lock_mutex_get+0x6f/0xd0 [obdclass]
            [<ffffffffa0f3ce5a>] osc_ldlm_blocking_ast+0x7a/0x350 [osc]
            [<ffffffffa0d0f0c0>] ldlm_handle_bl_callback+0x130/0x400 [ptlrpc]
            [<ffffffffa0d0f5f1>] ldlm_bl_thread_main+0x261/0x3c0 [ptlrpc]
            [<ffffffff8109abf6>] kthread+0x96/0xa0
            [<ffffffff8100c20a>] child_rip+0xa/0x20
            [<ffffffffffffffff>] 0xffffffffffffffff
            

            I'll attach the complete list of stack traces to this ticket.

            Let me know whether you need a dump and I'll see if we can reproduce the bug on a test system.


            gerrit Gerrit Updater added a comment -

            Vitaly Fertman (vitaly_fertman@xyratex.com) uploaded a new patch: http://review.whamcloud.com/14342
            Subject: LU-6390 ldlm: restore the ELC for enqueue
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3f9b5d1aea04d30a76f874edc5048689cd98308a


            People

              Assignee: jay Jinshan Xiong (Inactive)
              Reporter: jay Jinshan Xiong (Inactive)
              Votes: 0
              Watchers: 11

              Dates

                Created:
                Updated:
                Resolved: