[LU-2541] Interop 2.1.3<->2.4 failure on test suite sanity-quota test_6: OST hang during destroy Created: 27/Dec/12 Updated: 13/May/13 Resolved: 13/May/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | James Nunez (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | zfs | ||
| Environment: |
server: lustre-master with zfs |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 5962 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/3a2dabd6-507d-11e2-8617-52540035b04c.
The sub-test test_6 failed with the following error:
LNet: Service thread pid 13772 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 13772, comm: ll_ost00_020
Call Trace:
[<ffffffff814ff1c2>] schedule_timeout+0x192/0x2e0
[<ffffffff8107e1c0>] ? process_timeout+0x0/0x10
[<ffffffffa0619751>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
[<ffffffffa090c69d>] ldlm_completion_ast+0x4ed/0x980 [ptlrpc]
[<ffffffffa09080b0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc]
[<ffffffff810602c0>] ? default_wake_function+0x0/0x20
[<ffffffffa090bdb8>] ldlm_cli_enqueue_local+0x1f8/0x5f0 [ptlrpc]
[<ffffffffa090c1b0>] ? ldlm_completion_ast+0x0/0x980 [ptlrpc]
[<ffffffffa090acd0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
[<ffffffffa0e68909>] ofd_destroy_by_fid+0x179/0x3f0 [ofd]
[<ffffffffa090acd0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
[<ffffffffa090c1b0>] ? ldlm_completion_ast+0x0/0x980 [ptlrpc]
[<ffffffffa093440d>] ? lustre_msg_buf+0x5d/0x60 [ptlrpc]
[<ffffffffa0e6adb7>] ofd_destroy+0x187/0x6a0 [ofd]
[<ffffffffa0629591>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa0e4ce22>] ost_handle+0x4232/0x46a0 [ost]
[<ffffffffa0625394>] ? libcfs_id2str+0x74/0xb0 [libcfs]
[<ffffffffa0945a0c>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
[<ffffffffa093cdf9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
[<ffffffffa0629591>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffff81053463>] ? __wake_up+0x53/0x70
[<ffffffffa0946fa5>] ptlrpc_main+0xbb5/0x1970 [ptlrpc]
[<ffffffffa09463f0>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffffa09463f0>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
[<ffffffffa09463f0>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
[<ffffffff8100c140>] ? child_rip+0x0/0x20
Info required for matching: sanity-quota 6 |
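A note for readers of the trace above: frames marked with "?" are addresses the kernel found on the stack without being sure they belong to the live call chain, so the unmarked frames give the more trustworthy picture. Read that way, the thread appears to be inside ofd_destroy_by_fid, waiting in ldlm_completion_ast for a local LDLM lock enqueue to complete. The following is a minimal sketch, not part of the original report, of how the unmarked frames could be pulled out of such a dump; the names FRAME_RE, reliable_frames and SAMPLE are hypothetical, and SAMPLE is only the first ten frames of the trace in this ticket.

```python
import re

# One frame looks like "[<ffffffffa0e68909>] ofd_destroy_by_fid+0x179/0x3f0 [ofd]".
# An optional "?" after the address marks a frame the unwinder was unsure about.
# The trailing "[module]" is absent for symbols built into the kernel image.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(trace):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    return [
        (m.group("symbol"), m.group("module"))
        for m in FRAME_RE.finditer(trace)
        if m.group("unreliable") is None
    ]


# Excerpt: the first ten frames of the trace in this ticket.
SAMPLE = (
    "[<ffffffff814ff1c2>] schedule_timeout+0x192/0x2e0 "
    "[<ffffffff8107e1c0>] ? process_timeout+0x0/0x10 "
    "[<ffffffffa0619751>] cfs_waitq_timedwait+0x11/0x20 [libcfs] "
    "[<ffffffffa090c69d>] ldlm_completion_ast+0x4ed/0x980 [ptlrpc] "
    "[<ffffffffa09080b0>] ? ldlm_expired_completion_wait+0x0/0x260 [ptlrpc] "
    "[<ffffffff810602c0>] ? default_wake_function+0x0/0x20 "
    "[<ffffffffa090bdb8>] ldlm_cli_enqueue_local+0x1f8/0x5f0 [ptlrpc] "
    "[<ffffffffa090c1b0>] ? ldlm_completion_ast+0x0/0x980 [ptlrpc] "
    "[<ffffffffa090acd0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc] "
    "[<ffffffffa0e68909>] ofd_destroy_by_fid+0x179/0x3f0 [ofd]"
)

if __name__ == "__main__":
    # Print the innermost frame first, one per line.
    for symbol, module in reliable_frames(SAMPLE):
        print(symbol, f"[{module}]" if module else "")
```

The regular expression works on the dump whether it is on a single line or one frame per line, because it matches each frame independently.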
| Comments |
| Comment by Sarah Liu [ 23/Jan/13 ] |
|
Another instance: https://maloo.whamcloud.com/test_sets/0426a83a-618a-11e2-be04-52540035b04c |
| Comment by Nathaniel Clark [ 07/Feb/13 ] |
|
https://maloo.whamcloud.com/test_sets/dab78f6c-70f9-11e2-9241-52540035b04c No stack dump this time, but the client was also on master and had timed-out RPCs. |
| Comment by James Nunez (Inactive) [ 13/May/13 ] |
|
The last test run added to this ticket on February 7, https://maloo.whamcloud.com/test_sets/dab78f6c-70f9-11e2-9241-52540035b04c, is now associated with |
| Comment by James Nunez (Inactive) [ 13/May/13 ] |
|
Duplicate of |