[LU-2981] sanity.sh test_17m test_77i: oops in ptlrpc_server_hpreq_fini Created: 18/Mar/13 Updated: 01/Apr/13 Resolved: 01/Apr/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | LB | ||
| Issue Links: |
|
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 7264 | ||||||||
| Description |
|
This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/1000ccce-8f3b-11e2-aa82-52540035b04c. The sub-test test_77i failed with the following error:
Info required for matching: sanity 77i |
| Comments |
| Comment by Andreas Dilger [ 18/Mar/13 ] |
|
Another hit in ptlrpc_unregister_service(), though this time from an unmount: https://maloo.whamcloud.com/test_sets/018d9cec-8d58-11e2-bb99-52540035b04c
19:19:04:RIP: 0010:[<ffffffffa07edc27>] [<ffffffffa07edc27>] ptlrpc_server_hpreq_fini+0x27/0x160 [ptlrpc]
19:19:05:Process umount (pid: 11123, threadinfo ffff880069806000, task ffff88005
19:19:05:Call Trace:
19:19:05: [<ffffffffa07f0d79>] ptlrpc_unregister_service+0x4a9/0x10b0 [ptlrpc]
19:19:05: [<ffffffff81052223>] ? __wake_up+0x53/0x70
19:19:05: [<ffffffffa0de49fe>] mgs_device_fini+0xee/0x5a0 [mgs]
19:19:06: [<ffffffffa06489c7>] class_cleanup+0x577/0xda0 [obdclass]
19:19:06: [<ffffffffa061dd36>] ? class_name2dev+0x56/0xe0 [obdclass]
19:19:06: [<ffffffffa064a2ac>] class_process_config+0x10bc/0x1c80 [obdclass]
19:19:06: [<ffffffffa0643ad3>] ? lustre_cfg_new+0x353/0x7e0 [obdclass]
19:19:06: [<ffffffffa064afe9>] class_manual_cleanup+0x179/0x6f0 [obdclass]
19:19:06: [<ffffffffa061dd36>] ? class_name2dev+0x56/0xe0 [obdclass]
19:19:06: [<ffffffffa0657a3d>] server_put_super+0x46d/0xf00 [obdclass]
19:19:06: [<ffffffff811785ab>] generic_shutdown_super+0x5b/0xe0
19:19:06: [<ffffffff81178696>] kill_anon_super+0x16/0x60
19:19:07: [<ffffffffa064ce46>] lustre_kill_super+0x36/0x60 [obdclass]
19:19:07: [<ffffffff81179670>] deactivate_super+0x70/0x90
19:19:07: [<ffffffff811955cf>] mntput_no_expire+0xbf/0x110
19:19:07: [<ffffffff81195f2b>] sys_umount+0x7b/0x3a0
19:19:07: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b |
| Comment by Andreas Dilger [ 18/Mar/13 ] |
|
sanity.sh test_77i has failed 10 times and test_17m once in the past 4 weeks, with all of these failures occurring since 2013-03-12. |
| Comment by Andreas Dilger [ 18/Mar/13 ] |
|
Could this relate to NRS? The patch http://review.whamcloud.com/5665 was landed on the 12th. |
| Comment by Nikitas Angelinas [ 18/Mar/13 ] |
|
I think this must be due to the NRS framework follow-up patch itself, as the version that fired those bugs had some important parts missing. I have just updated that patch and this new version should address this ticket. |
| Comment by Peter Jones [ 22/Mar/13 ] |
|
Thanks Nikitas! As it seems that you are not around to do so (I appreciate it is late on a Friday in the UK) Bob is going to rebase this patch to avoid the |
| Comment by Nikitas Angelinas [ 22/Mar/13 ] |
|
Hi Peter, please proceed to rebase the patch if you want to get a clean test run, though I was planning to refresh the patch at the end of the day today (so in 5-7 hours from now), once I have included some additional changes. |
| Comment by Peter Jones [ 22/Mar/13 ] |
|
Nikitas,
Ah great - that timeframe works fine to get testing completed over the weekend. Obviously we want the version with the latest changes. I guess that I underestimated the end of your work day.
Peter |
| Comment by Alexander Boyko [ 29/Mar/13 ] |
|
Xyratex has a patch for this issue; I will probably submit it to master in a few days. |
| Comment by Peter Jones [ 01/Apr/13 ] |
|
ok so the extra tidy up from |