[LU-5787] ptlrpcd_rcv loop in osc_ldlm_weigh_ast Created: 22/Oct/14  Updated: 07/Jun/16

Status: Reopened
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Antoine Percher Assignee: Zhenyu Xu
Resolution: Unresolved Votes: 1
Labels: None
Environment:

kernel 2.6.32-431.23.3 + bull fix
lustre 2.5.3 + bull fix


Issue Links:
Related
is related to LU-5781 endless loop in osc_lock_weight() Resolved
Epic/Theme: OSC, Tera100
Severity: 3
Rank (Obsolete): 16240

 Description   

On the CEA T100 system we have an issue similar to LU-5781: after a server node failover, some client nodes have the ptlrpcd_rcv thread using 100% of one CPU. With perf we can see:

      0.55%      ptlrpcd_rcv  [obdclass]               [k] cl_page_at_trusted
                |
                --- cl_page_at_trusted
                   |
                   |--97.59%-- cl_page_gang_lookup
                   |          osc_ldlm_weigh_ast
                   |          osc_cancel_for_recovery
                   |          ldlm_cancel_no_wait_policy
                   |          ldlm_prepare_lru_list
                   |          ldlm_cancel_lru_local
                   |          ldlm_replay_locks
                   |          ptlrpc_import_recovery_state_machine
                   |          ptlrpc_connect_interpret
                   |          ptlrpc_check_set
                   |          ptlrpcd_check
                   |          ptlrpcd
                   |          kthread
                   |          child_rip
                   |
                    --2.41%-- osc_ldlm_weigh_ast
                              osc_cancel_for_recovery
                              ldlm_cancel_no_wait_policy
                              ldlm_prepare_lru_list
                              ldlm_cancel_lru_local
                              ldlm_replay_locks
                              ptlrpc_import_recovery_state_machine
                              ptlrpc_connect_interpret
                              ptlrpc_check_set
                              ptlrpcd_check
                              ptlrpcd
                              kthread
                              child_rip  

Some OSCs are in the following states:

    /proc/fs/lustre/osc/ptmp2-OST0021-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST0022-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
    /proc/fs/lustre/osc/ptmp2-OST0023-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
    /proc/fs/lustre/osc/ptmp2-OST0024-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST0025-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
    /proc/fs/lustre/osc/ptmp2-OST0026-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST0027-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST0028-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST0029-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
    /proc/fs/lustre/osc/ptmp2-OST002a-osc-ffff8801ff8b3800/state:current_state: CONNECTING
    /proc/fs/lustre/osc/ptmp2-OST002b-osc-ffff8801ff8b3800/state:current_state: REPLAY_WAIT
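
For monitoring, the per-import states above can be summarized with a small script. This is only an illustrative sketch: on a live client the input would come from `lctl get_param osc.*.state` (or the `/proc` files shown above); here it uses the ticket's own sample listing as input.

```shell
#!/bin/sh
# Sketch: count osc imports per recovery state. On a live client, replace the
# sample below with:  lctl get_param osc.*.state | grep current_state
sample='/proc/fs/lustre/osc/ptmp2-OST0021-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST0022-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
/proc/fs/lustre/osc/ptmp2-OST0023-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
/proc/fs/lustre/osc/ptmp2-OST0024-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST0025-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
/proc/fs/lustre/osc/ptmp2-OST0026-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST0027-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST0028-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST0029-osc-ffff8801ff8b3800/state:current_state: REPLAY_LOCKS
/proc/fs/lustre/osc/ptmp2-OST002a-osc-ffff8801ff8b3800/state:current_state: CONNECTING
/proc/fs/lustre/osc/ptmp2-OST002b-osc-ffff8801ff8b3800/state:current_state: REPLAY_WAIT'

# The last ": "-separated field is the state name; tally one count per state.
summary=$(printf '%s\n' "$sample" \
    | awk -F': ' '{count[$NF]++} END {for (s in count) print s, count[s]}' \
    | sort)
echo "$summary"
```

Running this against the sample shows how many imports remain stuck in CONNECTING versus REPLAY_LOCKS, which is useful when deciding whether the workaround below actually moved them back to FULL.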

Also, in some cases, it is possible to get the state back to FULL after running:

  1. lctl set_param ldlm.namespaces.*.lru_size=clear
    or
  2. echo 1 > /proc/sys/vm/drop_caches

After an NMI, the ptlrpcd_rcv stack was:

     crash> bt 15400
    PID: 15400  TASK: ffff880c7c5a0100  CPU: 14  COMMAND: "ptlrpcd_rcv"
     #0 [ffff88088e4c7e90] crash_nmi_callback at ffffffff81030096
     #1 [ffff88088e4c7ea0] notifier_call_chain at ffffffff8152f9d5
     #2 [ffff88088e4c7ee0] atomic_notifier_call_chain at ffffffff8152fa3a
     #3 [ffff88088e4c7ef0] notify_die at ffffffff810a056e
     #4 [ffff88088e4c7f20] do_nmi at ffffffff8152d69b
     #5 [ffff88088e4c7f50] nmi at ffffffff8152cf60
        [exception RIP: cl_page_gang_lookup+292]
        RIP: ffffffffa04f18b4  RSP: ffff880c7cabb990  RFLAGS: 00000206
        RAX: 000000000000000a  RBX: 000000000000000b  RCX: 0000000000000000
        RDX: ffff880660a63da8  RSI: ffffffffa0af8740  RDI: ffff88065b15ae00
        RBP: ffff880c7cabba30   R8: 000000000000000e   R9: ffff880c7cabb950
        R10: 0000000000002362  R11: ffff88087a09e5d0  R12: ffff88065b15a800
        R13: ffff880660a63df8  R14: 000000000000000b  R15: 000000000000000e
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    --- <NMI exception stack> ---
     #6 [ffff880c7cabb990] cl_page_gang_lookup at ffffffffa04f18b4 [obdclass]
     #7 [ffff880c7cabba38] osc_ldlm_weigh_ast at ffffffffa095e9b7 [osc]
     #8 [ffff880c7cabbab8] osc_cancel_for_recovery at ffffffffa094305d [osc]
     #9 [ffff880c7cabbac8] ldlm_cancel_no_wait_policy at ffffffffa0637711 [ptlrpc]
    #10 [ffff880c7cabbae8] ldlm_prepare_lru_list at ffffffffa063b61b [ptlrpc]
    #11 [ffff880c7cabbb68] ldlm_cancel_lru_local at ffffffffa063ba34 [ptlrpc]
    #12 [ffff880c7cabbb88] ldlm_replay_locks at ffffffffa063bbbc [ptlrpc]
    #13 [ffff880c7cabbc08] ptlrpc_import_recovery_state_machine at ffffffffa06844f7 [ptlrpc]
    #14 [ffff880c7cabbc68] ptlrpc_connect_interpret at ffffffffa0685659 [ptlrpc]
    #15 [ffff880c7cabbd08] ptlrpc_check_set at ffffffffa065bbc1 [ptlrpc]
    #16 [ffff880c7cabbda8] ptlrpcd_check at ffffffffa0687f9b [ptlrpc]
    #17 [ffff880c7cabbe08] ptlrpcd at ffffffffa06884bb [ptlrpc]
    #18 [ffff880c7cabbee8] kthread at ffffffff81099f56
    #19 [ffff880c7cabbf48] kernel_thread at ffffffff8100c20a

Once the root cause is understood in LU-5781, we will need a patch version for Lustre 2.5.3.



 Comments   
Comment by Bruno Faccini (Inactive) [ 22/Oct/14 ]

Hello Antoine!
It looks like the problem has already been well identified in LU-5781, so a patch will come up soon for master, and I presume it will likely be easy to back-port to b2_5.

Comment by Peter Jones [ 22/Oct/14 ]

Bobijam

Could you please look into this issue? Jinshan agrees that this looks like a duplicate of LU-5781.

Thanks

Peter

Comment by Zhenyu Xu [ 24/Oct/14 ]

Duplicate of LU-5781.

Comment by Peter Jones [ 21/Nov/14 ]

Bobijam

I have reopened this ticket because it is proving confusing to separate the approach needed for b2_5 (which does not contain LU-3321) and master (which does). Could you please advise what patches Bull require on b2_5?

Thanks

Peter

Comment by Zhenyu Xu [ 22/Nov/14 ]

b2_5 does contain LU-3321 patch (git commit is 0a6c6fcd46a4e2eb289eff72402e34d329a63d91, which is a combination of commit 154fb1f7 from LU-3321 and commit bfae5a4e from LU-4300).

Comment by Peter Jones [ 22/Nov/14 ]

Ah ok. So what do you advise for Bull to use on b2_5?

Comment by Zhenyu Xu [ 22/Nov/14 ]

Use backports of #12362 and #12603.

Comment by Sebastien Buisson (Inactive) [ 26/Nov/14 ]

Bobijam,

Do we also have to revert 0a6c6fcd46a4e2eb289eff72402e34d329a63d91 from b2_5?
BTW, it seems #12362 and #12603 do not apply cleanly on b2_5.

TIA,
Sebastien.

Comment by Zhenyu Xu [ 26/Nov/14 ]

No, you don't need to revert it. #12362 is the cure for the loop (#12603 is for another issue; you don't need it here), and the b2_5 port of #12362 is at http://review.whamcloud.com/12859
