[LU-5464] Hung ll_ost01 Created: 08/Aug/14 Updated: 16/Oct/15 Resolved: 16/Oct/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Mahmoud Hanafi | Assignee: | Zhenyu Xu |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Environment: |
clients: 2.1.5/2.4.3 |
||
| Severity: | 3 |
| Rank (Obsolete): | 15222 |
| Description |
|
The OSS is getting several hung ll_ost threads.

LNet: 2842:0:(o2iblnd_cb.c:2348:kiblnd_passive_connect()) Skipped 1 previous similar message
LNet: Service thread pid 11968 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
LNet: Skipped 1 previous similar message
Pid: 11968, comm: ll_ost01_089
Call Trace:
[<ffffffff815404c2>] schedule_timeout+0x192/0x2e0
[<ffffffff81080610>] ? process_timeout+0x0/0x10
[<ffffffffa04886d1>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
[<ffffffffa0744ffd>] ldlm_completion_ast+0x4ed/0x960 [ptlrpc]
[<ffffffffa0740790>] ? ldlm_expired_completion_wait+0x0/0x390 [ptlrpc]
[<ffffffff81063be0>] ? default_wake_function+0x0/0x20
[<ffffffffa0744738>] ldlm_cli_enqueue_local+0x1f8/0x5d0 [ptlrpc]
[<ffffffffa0744b10>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
[<ffffffffa07434b0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
[<ffffffffa0e303a1>] ofd_destroy_by_fid+0x321/0x710 [ofd]
[<ffffffffa07434b0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
[<ffffffffa0744b10>] ? ldlm_completion_ast+0x0/0x960 [ptlrpc]
[<ffffffffa076d125>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
[<ffffffffa0e34fd7>] ofd_destroy+0x1a7/0x8b0 [ofd]
[<ffffffffa0771430>] ? lustre_swab_ost_body+0x0/0x10 [ptlrpc]
[<ffffffffa0e078a9>] ost_handle+0x4349/0x48e0 [ost]
[<ffffffffa0494124>] ? libcfs_id2str+0x74/0xb0 [libcfs]
[<ffffffffa077e3b8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
[<ffffffffa04885de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[<ffffffffa0499d6f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
[<ffffffffa0775719>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
[<ffffffff81055813>] ? __wake_up+0x53/0x70
[<ffffffffa077f74e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
[<ffffffffa077ec80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffffa077ec80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffffa077ec80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20

We also see client hangs and OST disconnects from the MDS. |
| Comments |
| Comment by Peter Jones [ 08/Aug/14 ] |
|
Bobijam, could you please advise on this issue? Thanks, Peter |
| Comment by Zhenyu Xu [ 11/Aug/14 ] |
|
Do you have OST and client debug logs of this issue? |
| Comment by Mahmoud Hanafi [ 15/Oct/15 ] |
|
This can be closed |
| Comment by Peter Jones [ 16/Oct/15 ] |
|
OK, Mahmoud |