Details
- Bug
- Resolution: Won't Fix
- Minor
- Lustre 2.1.0
- 3
- 4759
Description
When I rebooted two OSS nodes to apply a patch for bug LU-874 on the servers, quite a few of the clients appear to have deadlocked in recovery. Here's a backtrace of ptlrpcd-rcv on one client:
crash> bt 5077
PID: 5077  TASK: ffff88082da834c0  CPU: 8  COMMAND: "ptlrpcd-rcv"
 #0 [ffff88082da85430] schedule at ffffffff814ee3b2
 #1 [ffff88082da854f8] io_schedule at ffffffff814eeba3
 #2 [ffff88082da85518] sync_page at ffffffff81110fbd
 #3 [ffff88082da85528] __wait_on_bit_lock at ffffffff814ef40a
 #4 [ffff88082da85578] __lock_page at ffffffff81110f57
 #5 [ffff88082da855d8] vvp_page_own at ffffffffa093bf6a [lustre]
 #6 [ffff88082da855f8] cl_page_own0 at ffffffffa0601d3b [obdclass]
 #7 [ffff88082da85678] cl_page_own at ffffffffa0601fa0 [obdclass]
 #8 [ffff88082da85688] cl_page_gang_lookup at ffffffffa0603bb7 [obdclass]
 #9 [ffff88082da85758] cl_lock_page_out at ffffffffa06096fc [obdclass]
#10 [ffff88082da85808] osc_lock_flush at ffffffffa0858e8f [osc]
#11 [ffff88082da85858] osc_lock_cancel at ffffffffa0858f2a [osc]
#12 [ffff88082da858d8] cl_lock_cancel0 at ffffffffa0604665 [obdclass]
#13 [ffff88082da85928] cl_lock_cancel at ffffffffa06051ab [obdclass]
#14 [ffff88082da85968] osc_ldlm_blocking_ast at ffffffffa0859cf8 [osc]
#15 [ffff88082da859f8] ldlm_cancel_callback at ffffffffa06a1ba3 [ptlrpc]
#16 [ffff88082da85a18] ldlm_lock_cancel at ffffffffa06a1c89 [ptlrpc]
#17 [ffff88082da85a58] ldlm_cli_cancel_list_local at ffffffffa06bede8 [ptlrpc]
#18 [ffff88082da85ae8] ldlm_cancel_lru_local at ffffffffa06bf255 [ptlrpc]
#19 [ffff88082da85b08] ldlm_replay_locks at ffffffffa06bf385 [ptlrpc]
#20 [ffff88082da85bb8] ptlrpc_import_recovery_state_machine at ffffffffa070ceea [ptlrpc]
#21 [ffff88082da85c38] ptlrpc_connect_interpret at ffffffffa070db38 [ptlrpc]
#22 [ffff88082da85d08] ptlrpc_check_set at ffffffffa06dd870 [ptlrpc]
#23 [ffff88082da85de8] ptlrpcd_check at ffffffffa07113b8 [ptlrpc]
#24 [ffff88082da85e48] ptlrpcd at ffffffffa071175b [ptlrpc]
#25 [ffff88082da85f48] kernel_thread at ffffffff8100c14a
I will need to do more investigation, but that's a start.
Issue Links
- is duplicated by
  - LU-1066 Test failure on test suite replay-single test 89 (Resolved)
- is related to
  - LU-1059 vvp_page_unmap()) ASSERTION(PageLocked(vmpage)) (Resolved)
- Trackbacks
  - Changelog 2.1: Changes from version 2.1.2 to version 2.1.3. Server support for kernels: 2.6.18-308.13.1.el5 (RHEL5), 2.6.32-279.2.1.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-308.13.1.el5 (RHEL5), 2.6.32-279.2.1....
  - Changelog 2.2: version 2.2.0. Support for networks: o2iblnd OFED 1.5.4. Server support for kernels: 2.6.32-220.4.2.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-274.18.1.el5 (RHEL5), 2.6.32-220.4.2.el6 (RHEL6), 2.6.32.360....