Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.9.0
-
Interop: 2.8.0 EL7.2 Server/EL7.2 Client
Server: b2_8, build# 12
Client: master, build# 3431
-
3
-
9223372036854775807
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a3f680b4-7276-11e6-8afd-5254006e85c2.
The sub-test test_255a failed with the following error:
test failed to respond and timed out
Client dmesg:
[ 7909.928568] Lustre: DEBUG MARKER: == sanity test 255a: check 'lfs ladvise -a willread' ================================================= 19:40:57 (1472870457) [ 7916.983080] Lustre: 13272:0:(client.c:2113:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1472870457/real 1472870457] req@ffff880043d21b00 x1544415137983184/t0(0) o21->lustre-OST0003-osc-ffff880000b5c800@10.2.4.226@tcp:6/4 lens 504/432 e 0 to 1 dl 1472870464 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [ 7916.993670] Lustre: lustre-OST0003-osc-ffff880000b5c800: Connection to lustre-OST0003 (at 10.2.4.226@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 7917.002143] Lustre: lustre-OST0003-osc-ffff880000b5c800: Connection restored to 10.2.4.226@tcp (at 10.2.4.226@tcp) [ 7917.004970] Lustre: Skipped 1 previous similar message [ 7938.019143] Lustre: lustre-OST0003-osc-ffff880000b5c800: Connection to lustre-OST0003 (at 10.2.4.226@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 7938.030107] Lustre: Skipped 2 previous similar messages [ 7973.060145] Lustre: lustre-OST0003-osc-ffff880000b5c800: Connection to lustre-OST0003 (at 10.2.4.226@tcp) was lost; in progress operations using this service will wait for recovery to complete [ 7973.070547] Lustre: Skipped 10 previous similar messages [ 7987.075859] Lustre: lustre-OST0003-osc-ffff880000b5c800: Connection restored to 10.2.4.226@tcp (at 10.2.4.226@tcp) [ 7987.082573] Lustre: Skipped 15 previous similar messages [ 8040.574146] INFO: task lfs:24449 blocked for more than 120 seconds. [ 8040.579853] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 8040.582026] lfs D ffff88003cb47b10 0 24449 24265 0x00000080 [ 8040.584167] ffff88005462fbe0 0000000000000082 ffff880036479700 ffff88005462ffd8 [ 8040.586392] ffff88005462ffd8 ffff88005462ffd8 ffff880036479700 ffff8800378a5790 [ 8040.588540] ffff8800378a5798 7fffffffffffffff ffff880036479700 ffff88003cb47b10 [ 8040.590683] Call Trace: [ 8040.592422] [<ffffffff8163bc59>] schedule+0x29/0x70 [ 8040.594318] [<ffffffff81639949>] schedule_timeout+0x209/0x2d0 [ 8040.596362] [<ffffffffa0945c5c>] ? ptlrpcd_add_req+0x1ec/0x2e0 [ptlrpc] [ 8040.598446] [<ffffffffa0923c30>] ? lustre_swab_ladvise_hdr+0x40/0x40 [ptlrpc] [ 8040.600571] [<ffffffff8163c026>] wait_for_completion+0x116/0x170 [ 8040.602607] [<ffffffff810b8910>] ? wake_up_state+0x20/0x20 [ 8040.604615] [<ffffffffa0b4f3b0>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov] [ 8040.606715] [<ffffffffa0aeee68>] osc_io_ladvise_end+0x38/0x50 [osc] [ 8040.608765] [<ffffffffa075a95d>] cl_io_end+0x5d/0x150 [obdclass] [ 8040.610725] [<ffffffffa0b4f48b>] lov_io_end_wrapper+0xdb/0xe0 [lov] [ 8040.612715] [<ffffffffa0b4f9a6>] lov_io_call.isra.9+0x86/0x140 [lov] [ 8040.614647] [<ffffffffa0b4fa96>] lov_io_end+0x36/0xb0 [lov] [ 8040.616502] [<ffffffffa075a95d>] cl_io_end+0x5d/0x150 [obdclass] [ 8040.618371] [<ffffffffa075d103>] cl_io_loop+0xb3/0x190 [obdclass] [ 8040.620279] [<ffffffffa0bd1c29>] ll_file_ioctl+0x3069/0x37f0 [lustre] [ 8040.622168] [<ffffffff811f2585>] do_vfs_ioctl+0x2e5/0x4c0 [ 8040.624012] [<ffffffff811fd057>] ? __alloc_fd+0xa7/0x130 [ 8040.625813] [<ffffffff811fd167>] ? __fd_install+0x47/0x60 [ 8040.627680] [<ffffffff811f2801>] SyS_ioctl+0xa1/0xc0 [ 8040.629464] [<ffffffff81646c49>] system_call_fastpath+0x16/0x1b
OST console:
19:43:40:[ 7912.118286] Lustre: DEBUG MARKER: == sanity test 255a: check 'lfs ladvise -a willread' ================================================= 19:40:57 (1472870457) 19:43:40:[ 7912.161108] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0002: no handlers for opcode 0x15 19:43:40:[ 7912.166191] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) Skipped 6 previous similar messages 19:43:40:[ 7919.176071] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 7919.185035] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7926.185274] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 7926.189136] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7933.189463] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 7933.196824] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7940.211153] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7947.211321] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 7947.216921] Lustre: Skipped 1 previous similar message 19:43:40:[ 7954.222110] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7954.228583] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) Skipped 1 previous similar message 19:43:40:[ 7968.227179] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 7968.233536] Lustre: Skipped 8 previous similar messages 19:43:40:[ 7975.251445] LustreError: 23546:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 7975.257113] LustreError: 23546:0:(tgt_handler.c:593:tgt_handler_find_check()) Skipped 8 previous similar messages 19:43:40:[ 8000.268676] Lustre: lustre-OST0000: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 19:43:40:[ 8000.275601] Lustre: Skipped 5 previous similar messages 19:43:40:[ 8010.282726] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) lustre-OST0003: no handlers for opcode 0x15 19:43:40:[ 8010.287623] LustreError: 12896:0:(tgt_handler.c:593:tgt_handler_find_check()) Skipped 10 previous similar messages 19:43:40:[ 8066.316121] Lustre: lustre-OST0003: Client 95455275-3e21-dec7-8ff1-6b8c07c6d061 (at 10.2.4.219@tcp) reconnecting 20:41:39:********** Timeout by autotest system **********
This issue was first seen on 08/16/2016
Attachments
Issue Links
- is related to
-
LU-8811 mdc_close() may return a freed request in *request
- Resolved
-
LU-9097 sanity test_253 test_255: ZFS list corruption
- Closed
- is related to
-
LU-14139 batched statahead processing
- Resolved
-
LU-14380 Make statahead better support Breadth First Search (BFS) or Depth First Search (DFS)
- Open
(3 mentioned in)