[LU-8510] ASSERTION( dt->do_ops->do_invalidate ) failed Created: 17/Aug/16 Updated: 19/Mar/19 Resolved: 08/Sep/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Giuseppe Di Natale (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | soak | ||
| Environment: |
CentOS Linux 7/x86_64 |
||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
The following LBUG and call stack occurred during autotesting on Maloo for http://review.whamcloud.com/#/c/20546/. My new test stands up 3 MDTs with non-consecutive indices and a couple of OSTs. The method I am using to start the "custom" filesystem seems to be consistent with how other tests start their "custom" filesystems. Link to the Maloo test session results is https://testing.hpdd.intel.com/test_sessions/4599d8d8-6108-11e6-906c-5254006e85c2. The LBUG is preventing the filesystem from coming up. Any suggestions?

LustreError: 21374:0:(dt_object.h:2633:dt_invalidate()) ASSERTION( dt->do_ops->do_invalidate ) failed:
LustreError: 21374:0:(dt_object.h:2633:dt_invalidate()) LBUG
Pid: 21374, comm: mdt00_002

Call Trace:
[<ffffffffa05e67d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[<ffffffffa05e6d75>] lbug_with_loc+0x45/0xc0 [libcfs]
[<ffffffffa0ea8fcf>] lod_object_unlock+0x39f/0x440 [lod]
[<ffffffffa0f11e1b>] mdd_object_unlock+0x3b/0xd0 [mdd]
[<ffffffffa0ddbb62>] mdt_unlock_slaves+0x1a2/0x3c0 [mdt]
[<ffffffffa0de3c72>] mdt_md_create+0xb52/0xba0 [mdt]
[<ffffffffa0de3e2b>] mdt_reint_create+0x16b/0x350 [mdt]
[<ffffffffa0de5330>] mdt_reint_rec+0x80/0x210 [mdt]
[<ffffffffa0dc7d62>] mdt_reint_internal+0x5b2/0x9b0 [mdt]
[<ffffffffa0dd3077>] mdt_reint+0x67/0x140 [mdt]
[<ffffffffa0a69aa5>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[<ffffffffa0a15c5b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[<ffffffffa0a13818>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[<ffffffffa05f1957>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[<ffffffffa0a19d10>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[<ffffffffa0a19270>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc]
[<ffffffff810a5aef>] kthread+0xcf/0xe0
[<ffffffff810a5a20>] ? kthread+0x0/0xe0
[<ffffffff816469d8>] ret_from_fork+0x58/0x90
[<ffffffff810a5a20>] ? kthread+0x0/0xe0
Kernel panic - not syncing: LBUG |
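For anyone comparing this trace against other occurrences (for example a vmcore-dmesg.txt from a different node), the frames can be pulled apart programmatically. The following is only a minimal sketch, assuming the frame layout seen in the trace in this description, `[<address>] [? ]symbol+0xoffset/0xsize [module]`. The names `FRAME_RE`, `parse_frames` and `SAMPLE` are hypothetical helpers introduced for illustration and are not part of Lustre or its test framework.

```python
import re

# Frame layout assumed from the trace in this ticket:
#   [<address>] [? ]symbol+0xoffset/0xsize [module]
# A leading "?" marks a frame the kernel unwinder could not verify.
# The trailing [module] is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace):
    """Return one dict per stack frame found in `trace` (hypothetical helper)."""
    frames = []
    for match in FRAME_RE.finditer(trace):
        frames.append({
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            # Frames without a [module] suffix are labelled "kernel" here.
            "module": match.group("module") or "kernel",
            "reliable": match.group("unreliable") is None,
        })
    return frames


# A few frames copied from the trace in this ticket, flattened onto one line
# the way they appear in the exported description.
SAMPLE = (
    "[<ffffffffa05e6d75>] lbug_with_loc+0x45/0xc0 [libcfs] "
    "[<ffffffffa0ea8fcf>] lod_object_unlock+0x39f/0x440 [lod] "
    "[<ffffffffa0a13818>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] "
    "[<ffffffff810a5aef>] kthread+0xcf/0xe0"
)

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        marker = " " if frame["reliable"] else "?"
        print(f'{marker} {frame["module"]:>8}  {frame["symbol"]}+{frame["offset"]:#x}')
```

Comparing only the `symbol` and `module` fields of two traces ignores the load addresses, which is the part that differs between nodes.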
| Comments |
| Comment by Peter Jones [ 18/Aug/16 ] |
|
Bobijam, could you please assist with this issue? Thanks, Peter |
| Comment by Gerrit Updater [ 19/Aug/16 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/22017 |
| Comment by Frank Heckes (Inactive) [ 06/Sep/16 ] |
|
The same error also happened during soak testing of '20160902' (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20160902). The error message is the same apart from the addresses (see attached file vmcore-dmesg.txt). Sequence of events
Attached files: messages, console and vmcore-dmesg.txt of the affected node lola-8, and a debug log (mask -1) covering the time interval in which the mount command specified above was executed. |
| Comment by Frank Heckes (Inactive) [ 06/Sep/16 ] |
|
The soak test was executed with the el6.7 build (https://build.hpdd.intel.com/job/lustre-master/3431/, tag 2.8.57) |
| Comment by Gerrit Updater [ 08/Sep/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22017/ |
| Comment by Peter Jones [ 08/Sep/16 ] |
|
Landed for 2.9 |