Details
-
Bug
-
Resolution: Won't Fix
-
Major
-
None
-
Lustre 2.1.5
-
None
-
3
-
9404
Description
Hi,
I am afraid we are again hitting the issue described in LU-2683 and LU-1690. This time, however, we are running Lustre 2.1.5, which includes the 4 patches from LU-874. We also backported patch http://review.whamcloud.com/5208 from LU-2683 into our sources.
So those 5 patches might not be enough to fix this problem.
Here is the information collected from the crash:
crash> dmesg
...
LustreError: 65257:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 65257:0:(cl_io.c:967:cl_io_cancel()) Canceling ongoing page trasmission
...
crash> ps | grep 65257
65257 2 5 ffff880fe2ac27d0 IN 0.0 0 0 [ldlm_bl_62]
crash> bt 65257
PID: 65257 TASK: ffff880fe2ac27d0 CPU: 5 COMMAND: "ldlm_bl_62"
#0 [ffff880fe32a7ae0] schedule at ffffffff81484c15
#1 [ffff880fe32a7ba8] cfs_waitq_wait at ffffffffa055a6de [libcfs]
#2 [ffff880fe32a7bb8] cl_sync_io_wait at ffffffffa067f3cb [obdclass]
#3 [ffff880fe32a7c58] cl_io_submit_sync at ffffffffa067f643 [obdclass]
#4 [ffff880fe32a7cb8] cl_lock_page_out at ffffffffa0676997 [obdclass]
#5 [ffff880fe32a7d28] osc_lock_flush at ffffffffa0a6abaf [osc]
#6 [ffff880fe32a7d78] osc_lock_cancel at ffffffffa0a6acbf [osc]
#7 [ffff880fe32a7dc8] cl_lock_cancel0 at ffffffffa0675575 [obdclass]
#8 [ffff880fe32a7df8] cl_lock_cancel at ffffffffa067639b [obdclass]
#9 [ffff880fe32a7e18] osc_ldlm_blocking_ast at ffffffffa0a6bd9a [osc]
#10 [ffff880fe32a7e88] ldlm_handle_bl_callback at ffffffffa07a0293 [ptlrpc]
#11 [ffff880fe32a7eb8] ldlm_bl_thread_main at ffffffffa07a06d1 [ptlrpc]
#12 [ffff880fe32a7f48] kernel_thread at ffffffff8100412a
crash> dmesg | grep 'SYNC IO'
LustreError: 3140:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 63611:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 65257:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 65316:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 65235:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 65277:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
LustreError: 63605:0:(cl_io.c:1702:cl_sync_io_wait()) SYNC IO failed with error: -110, try to cancel 1 remaining pages
Sebastien.