[LU-4097] ptlrpc_main()) ASSERTION( svcpt->scp_nthrs_starting == 1 ) failed: Created: 12/Oct/13 Updated: 15/Oct/13 Resolved: 15/Oct/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Cliff White (Inactive) | Assignee: | Oleg Drokin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Hyperion |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 11006 |
| Description |
|
Starting IOR ssf, server crashes/wedged:
2013-10-12 11:13:33 LustreError: 6291:0:(service.c:2864:ptlrpc_start_thread()) cannot start thread 'll_ost01_009': rc -2816
2013-10-12 11:13:33 LustreError: 6321:0:(service.c:2467:ptlrpc_main()) ASSERTION( svcpt->scp_nthrs_starting == 1 ) failed:
2013-10-12 11:13:33 LustreError: 6321:0:(service.c:2467:ptlrpc_main()) LBUG
2013-10-12 11:13:33 Pid: 6321, comm: ll_ost01_010
2013-10-12 11:13:33
2013-10-12 11:13:33 Call Trace:
2013-10-12 11:13:33 [<ffffffffa06bf895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2013-10-12 11:13:33 [<ffffffffa06bfe97>] lbug_with_loc+0x47/0xb0 [libcfs]
2013-10-12 11:13:33 [<ffffffffa0a49bdc>] ptlrpc_main+0x153c/0x1740 [ptlrpc]
2013-10-12 11:13:33 [<ffffffffa0a486a0>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
2013-10-12 11:13:33 [<ffffffff81096a36>] kthread+0x96/0xa0
2013-10-12 11:13:33 [<ffffffff8100c0ca>] child_rip+0xa/0x20
2013-10-12 11:13:33 [<ffffffff810969a0>] ? kthread+0x0/0xa0
2013-10-12 11:13:33 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
2013-10-12 11:13:33
|
| Comments |
| Comment by Cliff White (Inactive) [ 12/Oct/13 ] |
|
Console log from hyperion-agb20 (dead OSS) |
| Comment by Cliff White (Inactive) [ 12/Oct/13 ] |
|
The system had started iorfpp; the node had repeated watchdogs after the LBUG. |
| Comment by Peter Jones [ 12/Oct/13 ] |
|
Oleg, what do you suggest? |
| Comment by Oleg Drokin [ 14/Oct/13 ] |
|
Cliff reran the test and it did not reproduce. The error message itself does not make much sense, and we do not have any extra debugging info, so I do not think we can do anything about this. It could be a one-time fluke too. |
| Comment by Jodi Levi (Inactive) [ 15/Oct/13 ] |
|
Cliff has rerun the tests and was unable to reproduce this issue. |