[LU-8125] ASSERTION( strncmp(infos[pol_idx].pi_arg, tmp.pi_arg, sizeof(tmp.pi_arg)) == 0 ) failed: Created: 10/May/16  Updated: 14/Jun/18  Resolved: 06/Jul/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.9.0

Type: Bug
Priority: Major
Reporter: Mahmoud Hanafi
Assignee: Emoly Liu
Resolution: Fixed
Votes: 0
Labels: cea
Environment:

Lustre 2.7.1


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Setting TBF policies during testing triggered an LBUG.

<0>LustreError: 62827:0:(lproc_ptlrpc.c:547:ptlrpc_lprocfs_nrs_seq_show()) ASSERTION( strncmp(infos[pol_idx].pi_arg, tmp.pi_arg, sizeof(tmp.pi_arg)) == 0 ) failed: 
<0>LustreError: 62827:0:(lproc_ptlrpc.c:547:ptlrpc_lprocfs_nrs_seq_show()) LBUG
<4>Pid: 62827, comm: lctl
<4>
<4>Call Trace:
<4> [<ffffffffa0492895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0492e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa083d8b4>] ptlrpc_lprocfs_nrs_seq_show+0x6c4/0x930 [ptlrpc]
<4> [<ffffffff811ae422>] seq_read+0xf2/0x400
<4> [<ffffffff811f4dbe>] proc_reg_read+0x7e/0xc0
<4> [<ffffffff81188fe5>] vfs_read+0xb5/0x1a0
<4> [<ffffffff81189121>] sys_read+0x51/0x90
<4> [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 62827, comm: lctl Tainted: G           ---------------  T 2.6.32-504.30.3.el6.20151008.x86_64.lustre271 #1
<4>Call Trace:
<4> [<ffffffff81561679>] ? panic+0xa7/0x190
<4> [<ffffffffa0492eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa083d8b4>] ? ptlrpc_lprocfs_nrs_seq_show+0x6c4/0x930 [ptlrpc]
<4> [<ffffffff811ae422>] ? seq_read+0xf2/0x400
<4> [<ffffffff811f4dbe>] ? proc_reg_read+0x7e/0xc0
<4> [<ffffffff81188fe5>] ? vfs_read+0xb5/0x1a0
<4> [<ffffffff81189121>] ? sys_read+0x51/0x90
<4> [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
[1]kdb> 

Commands I ran:

nbp1-oss6 ~ # lctl get_param ldlm.services.ldlm_canceld.nrs_policies       
ldlm.services.ldlm_canceld.nrs_policies=

regular_requests:
  - name: fifo
    state: started
    fallback: yes
    queued: 0                   
    active: 0                   

  - name: crrn
    state: stopped
    fallback: no
    queued: 0                   
    active: 0                   

  - name: tbf nid
    state: stopped
    fallback: no
    queued: 0                   
    active: 0                   

high_priority_requests:
  - name: fifo
    state: started
    fallback: yes
    queued: 0                   
    active: 0                   

  - name: crrn
    state: stopped
    fallback: no
    queued: 0                   
    active: 0                   

  - name: tbf nid
    state: stopped
    fallback: no
    queued: 0                   
    active: 0                   

nbp1-oss6 ~ # lctl set_param ldlm.services.ldlm_canceld.nrs_policies="tbf req nid"
ldlm.services.ldlm_canceld.nrs_policies=tbf req nid
error: set_param: setting /proc/fs/lustre/ldlm/services/ldlm_canceld/nrs_policies=tbf req nid: Unknown error 524
nbp1-oss6 ~ # lctl set_param ldlm.services.ldlm_canceld.nrs_policies="tbf reg nid"
ldlm.services.ldlm_canceld.nrs_policies=tbf reg nid
nbp1-oss6 ~ # lctl set_param ldlm.services.ldlm_canceld.nrs_policies="tbf gp nid"
ldlm.services.ldlm_canceld.nrs_policies=tbf gp nid
error: set_param: setting /proc/fs/lustre/ldlm/services/ldlm_canceld/nrs_policies=tbf gp nid: Unknown error 524
nbp1-oss6 ~ # lctl set_param ldlm.services.ldlm_canceld.nrs_policies="tbf hp nid"
ldlm.services.ldlm_canceld.nrs_policies=tbf hp nid
nbp1-oss6 ~ # lctl get_param ldlm.services.ldlm_canceld.nrs_policies


 Comments   
Comment by Peter Jones [ 10/May/16 ]

Emoly

Could you please investigate this issue?

Thanks

Peter

Comment by Emoly Liu [ 11/May/16 ]

OK, I will have a look.

Comment by Gerrit Updater [ 13/May/16 ]

Emoly Liu (emoly.liu@intel.com) uploaded a new patch: http://review.whamcloud.com/20164
Subject: LU-8125 nrs: pol_arg should be copied after the policy starts
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 4d4455e20310744825a3af7e9e3cb587cbab77ec
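
The patch subject suggests an ordering problem: the policy argument was stored before the policy start had succeeded. The following is a minimal user-space sketch of how that ordering could leave per-partition arguments inconsistent and trip a strncmp() check like the one in the assertion. It is not the Lustre code. The struct, the start hook, the two-partition loop, and every name prefixed with sketch_ are hypothetical, and the assumption that a rejected argument is what left pol_arg stale is an inference from the "Unknown error 524" results in the description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ARG_LEN  16
#define SKETCH_ENOTSUPP 524	/* value printed as "Unknown error 524" */

/* Hypothetical stand-in for one per-partition policy instance. */
struct sketch_policy {
	char pol_arg[SKETCH_ARG_LEN];
	int  started;
};

/* Hypothetical start hook: accepts only "nid" or "jobid". */
static int sketch_start_hook(const char *arg)
{
	assert(arg != NULL);
	if (strcmp(arg, "nid") == 0 || strcmp(arg, "jobid") == 0)
		return 0;
	return -SKETCH_ENOTSUPP;
}

/* Copy-first ordering: a failed start leaves the new, rejected arg behind. */
static int start_copy_first(struct sketch_policy *p, const char *arg)
{
	int rc;

	snprintf(p->pol_arg, sizeof(p->pol_arg), "%s", arg);
	rc = sketch_start_hook(arg);
	if (rc != 0)
		return rc;
	p->started = 1;
	return 0;
}

/* Copy-after ordering: pol_arg changes only once the start has succeeded. */
static int start_copy_after(struct sketch_policy *p, const char *arg)
{
	int rc = sketch_start_hook(arg);

	if (rc != 0)
		return rc;
	snprintf(p->pol_arg, sizeof(p->pol_arg), "%s", arg);
	p->started = 1;
	return 0;
}

typedef int (*sketch_start_fn)(struct sketch_policy *, const char *);

/*
 * Start the policy on two partitions, stopping at the first failure, then
 * compare the stored arguments the way the failing assertion does.
 * Returns 1 if the arguments match, 0 if they diverged.
 */
static int args_consistent_after(sketch_start_fn start, const char *arg)
{
	struct sketch_policy parts[2] = { { "nid", 0 }, { "nid", 0 } };
	int i;

	for (i = 0; i < 2; i++)
		if (start(&parts[i], arg) != 0)
			break;

	return strncmp(parts[0].pol_arg, parts[1].pol_arg,
		       sizeof(parts[0].pol_arg)) == 0;
}
```

In this sketch, args_consistent_after(start_copy_first, "bogus") returns 0 because the first partition keeps the rejected string while the second still holds "nid". With start_copy_after the same call returns 1, since a failed start no longer touches pol_arg.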

Comment by Gerrit Updater [ 05/Jul/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20164/
Subject: LU-8125 nrs: pol_arg should be copied after the policy starts
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d6f305f601fdcd032657f1ff8d107fc175b22be6

Comment by Emoly Liu [ 06/Jul/16 ]

Landed on master for 2.9.

Comment by Jean-Baptiste Riaux (Inactive) [ 07/Jul/16 ]

Backported to FE 2.7 (http://review.whamcloud.com/#/c/21180/).
