Details
-
Bug
-
Resolution: Duplicate
-
Minor
Description
I got this telltale complaint in sanityn test 77e:
.781807] BUG: spinlock wrong CPU on CPU#1, lctl/29799 (Not tainted)
.782638] lock: ffff880055277d08, .magic: dead4ead, .owner: lctl/29799, .owner_c
.784053] Pid: 29799, comm: lctl Not tainted 2.6.32-rhe6.7-debug #1
.784828] Call Trace:
.785635] [<ffffffff812a06fa>] ? spin_bug+0xaa/0x100
.786402] [<ffffffff812a07c6>] ? _raw_spin_unlock+0x76/0xa0
.787184] [<ffffffff81530afe>] ? _spin_unlock+0xe/0x10
.787989] [<ffffffffa153faa4>] ? nrs_policy_ctl+0xd4/0x2e0 [ptlrpc]
.788835] [<ffffffffa15414f2>] ? ptlrpc_nrs_policy_control+0xe2/0x2a0 [ptlrpc]
.790282] [<ffffffffa1522876>] ? ptlrpc_lprocfs_nrs_seq_write+0x3e6/0x600 [ptlrp
.791762] [<ffffffffa1522490>] ? ptlrpc_lprocfs_nrs_seq_write+0x0/0x600 [ptlrpc]
.794101] [<ffffffff811ff945>] ? proc_reg_write+0x85/0xc0
.794872] [<ffffffff81192f48>] ? vfs_write+0xb8/0x1a0
.795616] [<ffffffff811943f6>] ? fget_light_pos+0x16/0x50
.796356] [<ffffffff81193881>] ? sys_write+0x51/0xb0
.797114] [<ffffffff815312ee>] ? do_device_not_available+0xe/0x10
.797878] [<ffffffff8100b112>] ? system_call_fastpath+0x16/0x1b
This means that something in nrs_policy_ctl() slept while holding nrs->nrs_lock, and by the time it woke up it had been rescheduled on a different CPU.
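To make the failure mode concrete, here is a rough user-space model of the kind of check that produces a "wrong CPU" report. It is only a sketch, not the kernel's spinlock debugging code and not the ptlrpc code path: every name in it (model_spinlock, model_task, model_sleep_and_migrate and so on) is hypothetical, and the "sleep" is simulated by simply changing the CPU number the task claims to run on. The assumption is that the lock records the owner's CPU when it is taken and compares it against the current CPU when it is released.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified user-space model. NOT kernel code; all names are hypothetical. */

/* Same value that appears as ".magic" in the log above. */
#define MODEL_SPINLOCK_MAGIC 0xdead4eadu

struct model_spinlock {
    unsigned int magic;
    int locked;      /* 1 while held */
    int owner_pid;   /* -1 when not held */
    int owner_cpu;   /* CPU recorded at lock time, -1 when not held */
};

struct model_task {
    int pid;
    int cpu;         /* CPU the task is currently running on */
};

/* Take the lock and remember who took it and on which CPU. */
void model_spin_lock(struct model_spinlock *lock, const struct model_task *t)
{
    assert(!lock->locked);   /* single-threaded model: no contention */
    lock->locked = 1;
    lock->owner_pid = t->pid;
    lock->owner_cpu = t->cpu;
}

/* Release the lock. Returns 1 if a "wrong CPU" condition was reported. */
int model_spin_unlock(struct model_spinlock *lock, const struct model_task *t)
{
    int bug = 0;

    if (lock->owner_cpu != t->cpu) {
        printf("BUG: spinlock wrong CPU on CPU#%d, pid %d (locked on CPU#%d)\n",
               t->cpu, t->pid, lock->owner_cpu);
        bug = 1;
    }
    lock->locked = 0;
    lock->owner_pid = -1;
    lock->owner_cpu = -1;
    return bug;
}

/* Stand-in for "slept and was rescheduled elsewhere". */
void model_sleep_and_migrate(struct model_task *t, int new_cpu)
{
    t->cpu = new_cpu;
}

/* Buggy pattern: sleeping between lock and unlock. */
int demo_sleep_under_lock(void)
{
    struct model_spinlock lock = { MODEL_SPINLOCK_MAGIC, 0, -1, -1 };
    struct model_task task = { 29799, 0 };

    model_spin_lock(&lock, &task);
    model_sleep_and_migrate(&task, 1);   /* wakes up on CPU 1 */
    return model_spin_unlock(&lock, &task);
}

/* Correct pattern: nothing sleeps while the lock is held. */
int demo_no_sleep(void)
{
    struct model_spinlock lock = { MODEL_SPINLOCK_MAGIC, 0, -1, -1 };
    struct model_task task = { 29799, 0 };

    model_spin_lock(&lock, &task);
    return model_spin_unlock(&lock, &task);
}
```

Calling demo_sleep_under_lock() from a main() reports the mismatch and returns 1, while demo_no_sleep() returns 0. In the real case the check only fires if the task happens to wake up on a different CPU, which would explain why such a bug can stay hidden for a long time.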
I am only running RHEL6 at the moment, so this is the best I have, and I do not see any obvious culprit right away.
This probably needs a rerun on RHEL7, where the kernel catches such offenders much better.
Issue Links
- is related to LU-5717 Dead lock of nrs_tbf_timer_cb (Resolved)