Details
Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.5.2
Labels: None
Environment:
Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/61/
Distro/Arch: RHEL6.5/x86_64 + SLES11SP3/x86_64 (Server + Client)
Severity: 3
Rank (Obsolete): 14246
Description
parallel-scale test iorssf failed as follows:
** error **
ERROR in aiori-POSIX.c (line 316): cannot close file.
ERROR: Input/output error
** exiting **
Dmesg on client node:
[79443.603331] Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf ================================================================ 08:39:21 (1402241961)
[79539.008049] Lustre: 4915:0:(client.c:1908:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1402242048/real 0] req@ffff88000b268800 x1470307796710476/t0(0) o103->lustre-OST0006-osc-ffff880009b9b000@10.1.6.247@tcp:17/18 lens 328/224 e 0 to 1 dl 1402242057 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[79539.008055] Lustre: 4915:0:(client.c:1908:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[79539.008068] Lustre: lustre-OST0006-osc-ffff880009b9b000: Connection to lustre-OST0006 (at 10.1.6.247@tcp) was lost; in progress operations using this service will wait for recovery to complete
[79543.442585] LustreError: 167-0: lustre-OST0006-osc-ffff880009b9b000: This client was evicted by lustre-OST0006; in progress operations using this service will fail.
[79543.506888] Lustre: lustre-OST0006-osc-ffff880009b9b000: Connection restored to lustre-OST0006 (at 10.1.6.247@tcp)
[79543.506891] Lustre: Skipped 1 previous similar message
[79619.930641] Lustre: DEBUG MARKER: /usr/sbin/lctl mark parallel-scale test_iorssf: @@@@@@ FAIL: ior failed! 1
Console log on OSS node:
09:45:22:Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf ================================================================ 08:39:21 (1402241961)
09:45:22:Lustre: 10808:0:(client.c:1908:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1402242048/real 1402242048] req@ffff880033dd8400 x1470305068132704/t0(0) o104->lustre-OST0006@10.1.6.249@tcp:15/16 lens 296/224 e 0 to 1 dl 1402242055 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
09:45:22:Lustre: 10808:0:(client.c:1908:ptlrpc_expire_one_request()) Skipped 1 previous similar message
09:45:22:LustreError: 138-a: lustre-OST0006: A client on nid 10.1.6.249@tcp was evicted due to a lock blocking callback time out: rc -107
09:45:22:LustreError: Skipped 26 previous similar messages
09:45:22:LustreError: 10775:0:(ldlm_lib.c:2702:target_bulk_io()) @@@ Eviction on bulk GET req@ffff88002ba61400 x1470307796710396/t0(0) o4->b4770fd0-a408-2d95-7c2a-fb5d5e6f4b61@10.1.6.249@tcp:0/0 lens 488/448 e 1 to 0 dl 1402242082 ref 1 fl Interpret:/0/0 rc 0/0
09:45:22:Lustre: lustre-OST0006: Bulk IO write error with b4770fd0-a408-2d95-7c2a-fb5d5e6f4b61 (at 10.1.6.249@tcp), client will retry: rc -107
09:45:22:Lustre: Skipped 8 previous similar messages
09:45:22:LustreError: 10775:0:(ldlm_lib.c:2702:target_bulk_io()) Skipped 1 previous similar message
09:45:22:LustreError: 10869:0:(ldlm_lockd.c:2300:ldlm_cancel_handler()) ldlm_cancel from 10.1.6.249@tcp arrived at 1402242061 with bad export cookie 6818992319885353493
09:45:22:Lustre: DEBUG MARKER: /usr/sbin/lctl mark parallel-scale test_iorssf: @@@@@@ FAIL: ior failed! 1
09:45:22:Lustre: DEBUG MARKER: parallel-scale test_iorssf: @@@@@@ FAIL: ior failed! 1
Maloo report: https://maloo.whamcloud.com/test_sets/a12994bc-ef55-11e3-9713-52540035b04c
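For triage, it may help to line up the epoch timestamps from the client dmesg and the OSS console log against the test start marker. The following is a minimal sketch, not part of the test suite: the `events` labels and the `timeline` helper are hypothetical names introduced here, the epoch values are copied by hand from the logs above, and the opcode readings in the labels (o103 as an LDLM cancel, o104 as an LDLM blocking callback, o4 as an OST write) are an interpretation of the log lines rather than something the logs state.

```python
from datetime import datetime, timezone

# Epoch of the "== parallel-scale test iorssf" DEBUG MARKER in both logs.
TEST_START = 1402241961

# Epoch values copied from the log excerpts above; the labels are an
# interpretation added for illustration, not text from the logs.
events = {
    "client: o103 (cancel) request sent": 1402242048,
    "oss: o104 (blocking callback) sent": 1402242048,
    "oss: o104 deadline (dl)": 1402242055,
    "client: o103 deadline (dl)": 1402242057,
    "oss: ldlm_cancel arrived with bad export cookie": 1402242061,
    "oss: o4 bulk write deadline (dl)": 1402242082,
}

def timeline(events, start):
    """Return (offset_seconds, utc_hh_mm_ss, label) rows sorted by time."""
    rows = []
    for label, epoch in sorted(events.items(), key=lambda kv: kv[1]):
        utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
        rows.append((epoch - start, utc.strftime("%H:%M:%S"), label))
    return rows

for offset, utc, label in timeline(events, TEST_START):
    print(f"+{offset:4d}s  {utc} UTC  {label}")
```

Read this way, the cancel that the OSS rejects with a bad export cookie arrives 6 seconds after the blocking callback deadline, which would be consistent with the client having already been evicted by then. This is only a reading of the timestamps, not a confirmed root cause.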