Details
Bug
Resolution: Fixed
Major
Lustre 2.8.0
3
Description
Found this during a 24-hour failover test on the OpenSFS cluster.
Lustre: DEBUG MARKER: ==== Checking the clients loads AFTER failover -- failure NOT OK
Lustre: DEBUG MARKER: mds7 has failed over 1 times, and counting...
Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436839484/real 1436839484] req@ffff88062c53a080 x1506632146392040/t0(0) o400->lustre-MDT0006-osp-MDT0001@192.168.2.127@o2ib:24/4 lens 224/224 e 1 to 1 dl 1436839486 ref 1 fl Rpc:X/c0/ffffffff rc 0/-1
Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) Skipped 1 previous similar message
Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436839503/real 1436839503] req@ffff88062c53a080 x1506632146392208/t0(0) o400->lustre-MDT0006-osp-MDT0001@192.168.2.127@o2ib:24/4 lens 224/224 e 1 to 1 dl 1436839505 ref 1 fl Rpc:X/c0/ffffffff rc 0/-1
Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
Lustre: lustre-MDT0006-osp-MDT0001: Connection restored to lustre-MDT0006 (at 192.168.2.127@o2ib)
LustreError: 12030:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:
LustreError: 12030:0:(dt_object.c:512:dt_record_write()) LBUG
Pid: 12030, comm: mdt_out03_005
Call Trace:
[<ffffffffa0506875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0506e77>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa065978f>] dt_record_write+0xbf/0x130 [obdclass]
[<ffffffffa08f4d0e>] out_tx_write_exec+0x7e/0x300 [ptlrpc]
[<ffffffffa08ed30a>] out_tx_end+0xda/0x5d0 [ptlrpc]
[<ffffffffa08f1e7b>] out_handle+0xd9b/0x17e0 [ptlrpc]
[<ffffffffa083afb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
[<ffffffffa08ea212>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
[<ffffffffa0892891>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
[<ffffffffa0891a50>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
[<ffffffff8109abf6>] kthread+0x96/0xa0
[<ffffffff8100c20a>] child_rip+0xa/0x20
[<ffffffff8109ab60>] ? kthread+0x0/0xa0
[<ffffffff8100c200>] ? child_rip+0x0/0x20
LustreError: dumping log to /tmp/lustre-log.1436839511.12030