Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Labels: None
- Affects Version/s: Lustre 2.4.0, Lustre 2.1.5
- Severity: 3
- Rank: 9382
Description
During acceptance testing at KIT, OSS failover was tried while several applications were running, and the applications got I/O errors ("can't create file" and similar messages). This should not happen; I/O should simply stall until failover completes.
The clients were running 2.4 and the servers 2.1.5. We tried with 2.1.5 clients and did not see this issue. I have attached the client and server logs.
Attachments
- ll10987.out.gz (0.2 kB)
- LU-XXXX.tgz (463 kB)
- mds1.llog.gz (224 kB)
- mds2.llog.gz (0.2 kB)
Activity
Hi Hongchao,
Sorry for the delay in getting the output:
Oct 23 10:47:48 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Oct 23 10:48:39 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10949 sub-object on OST idx 0/1: rc = -11
Oct 23 10:48:39 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:583:lov_update_create_set()) shouldn't encounter -EAGAIN!
Oct 23 10:48:39 pfscn2 kernel: : Pid: 9651, comm: ptlrpcd Not tainted 2.6.32-358.11.1.el6_lustre.es124.x86_64 #1
Oct 23 10:48:39 pfscn2 kernel: : Call Trace:
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0ad3505>] ? lov_update_create_set+0x4f5/0x500 [lov]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0ad78cd>] ? cb_create_update+0x2d/0x100 [lov]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0a6f28b>] ? handle_async_create+0x7b/0x390 [osc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0a6fad9>] ? async_create_interpret+0x39/0x50 [osc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0811c9b>] ? ptlrpc_check_set+0x29b/0x1b00 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8150ecda>] ? schedule_timeout+0x19a/0x2e0
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842f20>] ? ptlrpcd_check+0x1a0/0x230 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa08431be>] ? ptlrpcd+0x20e/0x370 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff81063310>] ? default_wake_function+0x0/0x20
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Oct 23 10:48:39 pfscn2 kernel: : LustreError: 10250:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10949 sub-object on OST idx 0/1: rc = -5
Oct 23 10:49:21 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib)
Oct 23 10:49:21 pfscn2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans
Oct 23 10:53:38 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_setattr operation failed with -19
Oct 23 10:53:38 pfscn2 kernel: : LustreError: Skipped 1 previous similar message
Oct 23 10:53:38 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Oct 23 10:54:09 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_connect operation failed with -19
Oct 23 10:54:09 pfscn2 kernel: : LustreError: Skipped 1 previous similar message
Oct 23 10:54:16 pfscn2 kernel: : Lustre: 10169:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 6ede50b8-462e-c607-2c6d-f747541fabbd@172.26.17.4@o2ib t0 exp (null) cur 1382518456 last 0
Oct 23 10:54:28 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0xd948 sub-object on OST idx 0/1: rc = -11
Oct 23 10:54:28 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:583:lov_update_create_set()) shouldn't encounter -EAGAIN!
Oct 23 10:54:28 pfscn2 kernel: : Pid: 9651, comm: ptlrpcd Not tainted 2.6.32-358.11.1.el6_lustre.es124.x86_64 #1
Oct 23 10:54:28 pfscn2 kernel: : Call Trace:
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0ad3505>] ? lov_update_create_set+0x4f5/0x500 [lov]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0ad78cd>] ? cb_create_update+0x2d/0x100 [lov]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0a6f28b>] ? handle_async_create+0x7b/0x390 [osc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0a6fad9>] ? async_create_interpret+0x39/0x50 [osc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0811c9b>] ? ptlrpc_check_set+0x29b/0x1b00 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8150ecda>] ? schedule_timeout+0x19a/0x2e0
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842f20>] ? ptlrpcd_check+0x1a0/0x230 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa08431be>] ? ptlrpcd+0x20e/0x370 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff81063310>] ? default_wake_function+0x0/0x20
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc]
Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Oct 23 10:54:28 pfscn2 kernel: : LustreError: 24178:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0xd948 sub-object on OST idx 0/1: rc = -5
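For reference, the return codes in these messages decode to standard errno values: rc = -11 is -EAGAIN, rc = -5 is -EIO, and -19 is -ENODEV.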
Hi Kit,
Could you please test with the following debug patch?
diff --git a/lustre/lov/lov_request.c b/lustre/lov/lov_request.c
index 0fc635e..f0476cb 100644
--- a/lustre/lov/lov_request.c
+++ b/lustre/lov/lov_request.c
@@ -577,6 +577,14 @@ static int lov_update_create_set(struct lov_request_set *se
                        "on OST idx %d/%d: rc = %d\n",
                        set->set_oi->oi_oa->o_id, req->rq_idx,
                        lsm->lsm_stripe_count, rc);
+
+#ifdef __KERNEL__
+        if (rc == -EAGAIN) {
+                CDEBUG(D_ERROR, "shouldn't encounter -EAGAIN!\n");
+                dump_stack();
+        }
+#endif
+
         if (rc > 0) {
                 CERROR("obd_create returned invalid err %d\n", rc);
                 rc = -EIO;
diff --git a/lustre/osc/osc_create.c b/lustre/osc/osc_create.c
index ffc81d4..22265bd 100644
--- a/lustre/osc/osc_create.c
+++ b/lustre/osc/osc_create.c
@@ -412,7 +412,7 @@ static int handle_async_create(struct ptlrpc_request *req, i
         LASSERT_SPIN_LOCKED(&oscc->oscc_lock);

-        if(rc)
+        if(rc != 0 && rc != -ENOTCONN && rc != -ENODEV)
                 GOTO(out_wake, rc);

         /* Handle the critical type errors first.
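In short, the first hunk dumps a stack trace whenever -EAGAIN reaches lov_update_create_set, so we can see where it originates; the second hunk stops handle_async_create from immediately failing the fake create request when the error is a transient connection loss (-ENOTCONN or -ENODEV).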
The fix that worked on my test system didn't work on the customer system, so it might have been a fluke.
I haven't seen that message in the new logs. Here is the latest server dk (lctl debug_kernel) output:
00000100:00100000:14.0:1381243911.199943:0:4598:0:(import.c:725:ptlrpc_connect_import()) @@@ (re)connect request (timeout 5) req@ffff8808181fb400 x1448249214211642/t0(0) o8->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:28/4 lens 368/512 e 0 to 0 dl 0 ref 1 fl New:N/0/ffffffff rc 0/-1
00000100:00100000:13.0:1381243911.212619:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211637:172.26.17.3@o2ib:-1
00020000:00020000:13.0:1381243911.212624:0:4596:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10f5a sub-object on OST idx 0/1: rc = -11
00000100:00100000:13.0:1381243911.225308:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211638:172.26.17.3@o2ib:-1
00020000:00020000:13.0:1381243911.225312:0:4596:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x18c7 sub-object on OST idx 0/1: rc = -11
00000100:00100000:13.0:1381243911.250579:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211639:172.26.17.3@o2ib:-1
00020000:00020000:6.0:1381243911.250579:0:29944:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10f5a sub-object on OST idx 0/1: rc = -5
00000100:00100000:13.0:1381243911.250586:0:4596:0:(client.c:1434:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:e76ce10d-1d27-cdc4-1091-649094c70331:4596:1448249214211641:172.26.17.1@o2ib:400
00020000:00020000:5.0:1381243911.250612:0:4755:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x18c7 sub-object on OST idx 0/1: rc = -5
00000100:00100000:13.0:1381243911.250638:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:0:1448249214211644:172.26.17.3@o2ib:13
00000100:00100000:5.0:1381243911.250741:0:4755:0:(service.c:1771:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt_02:94b08aa2-54d7-b32e-019d-2561ed3286b5+6:58211:x1447047461344560:12345-172.26.4.3@o2ib:101 Request procesed in 49643595us (49643620us total) trans 0 rc 301/301
Hi Kit,
Sorry for the delayed response; we have been on holiday for the last several days.
If "OSCC_FLAG_DEGRADED" is set, then this OST won't be used preferentially when creating objects. Did you test the failover on only one OSS?
By the way, is there something like "Unknown rc XXX from async create: failing oscc" in the kernel logs?
Thanks
I tested adding the degraded flag to the oscc and it seemed to fix the problem on my test system:
         if(rc != 0 && rc != -ENOTCONN && rc != -ENODEV)
                 GOTO(out_wake, rc);

+        if (rc == -ENOTCONN || rc == -ENODEV)
+                oscc->oscc_flags |= OSCC_FLAG_DEGRADED;
+
         /* Handle the critical type errors first.
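With that change, a connection-loss error marks the OST degraded rather than failing the create outright, so (per the behavior Hongchao describes above) new objects should land on other OSTs until the connection is restored.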
I ran into one issue where the OST ran out of inodes and it caused the lov_create threads on the MDT to deadlock. I'm not sure whether that's new behavior or present in stock 2.1.6 as well. I'll investigate.
Ah, OK, that makes sense. Should handle_async_create add the RECOVERING flag to the oscc? It seems like osc_interpret_create would add it, but is it possible to get to oscc_internal_create without that flag?
Hi,
The fake requests are queued on osc_creator.oscc_wait_create_list in osc_create_async to wait for the object creation to complete, and they are woken up in osc_interpret_create with the value returned from the OST. In this case, -ENOTCONN or -ENODEV is returned and fails these fake requests; they should instead keep waiting for the OST's recovery to complete, at which point the objects will be ready.
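A rough sketch of that queue-and-wake pattern, with hypothetical names and fields standing in for the real osc_create.c machinery:

/* Hypothetical sketch of the fake-request lifecycle described above.
 * The structure and function are illustrative, not the real Lustre
 * code. */
#include <errno.h>

struct fake_req_sketch {
        int rq_rc;     /* result delivered on final wakeup */
        int rq_done;   /* set once the request really completes */
};

/* Called from the create-interpret path with the OST's return code. */
static void wake_fake_req(struct fake_req_sketch *req, int rc)
{
        if (rc == -ENOTCONN || rc == -ENODEV) {
                /* OST is recovering: leave the request queued so it
                 * completes once objects are available again, instead
                 * of failing it back to the caller. */
                return;
        }

        req->rq_rc = rc;
        req->rq_done = 1;
}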
I'll create a new debug patch based on http://review.whamcloud.com/#/c/7559/.
Thanks!
Hi Hongchao,
I was trying to follow the create logic and it has me a bit confused. It seems like the fake requests are handled by oscc_internal_create, but then, if the oscc is recovering (due to the EAGAIN), all it does is return 0. Am I reading that right? (A simplified sketch of the flow I mean follows this comment.)
Thanks,
Kit
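For reference, the control flow being asked about, sketched with hypothetical simplifications (this is not the actual osc_create.c code, and the flag value is assumed):

/* Hypothetical simplification of the reading described above: when
 * the creator is marked RECOVERING, the internal create path does
 * nothing and returns 0, leaving the queued fake requests waiting
 * for recovery completion to re-trigger precreation. */
#define OSCC_FLAG_RECOVERING 0x02

static int oscc_internal_create_sketch(int oscc_flags)
{
        if (oscc_flags & OSCC_FLAG_RECOVERING)
                return 0;  /* defer until recovery completes */

        /* ... otherwise build and send a real precreate RPC ... */
        return 0;
}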
Hi Hongchao,
The patch fixed the data corruption, but we are still seeing the same errors on failover. Should we reproduce the issue with the original debugging patch plus the fix? Here are the kernel messages from the fix (patch 7559):
Sep 25 15:16:49 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_setattr operation failed with -19
Sep 25 15:16:49 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Sep 25 15:17:39 pfscn2 kernel: : LustreError: 9001:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x3288 sub-object on OST idx 0/1: rc = -11
Sep 25 15:17:39 pfscn2 kernel: : LustreError: 7308:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x3288 sub-object on OST idx 0/1: rc = -5
Sep 25 15:17:40 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib)
Sep 25 15:17:40 pfscn2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans
Should it actually be:
+        if(rc != 0 && rc != -ENOTCONN && rc != -ENODEV && rc != -EAGAIN)
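In other words, treat -EAGAIN like the other transient connection errors, so the fake request keeps waiting instead of being failed back as an I/O error.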
I built a version with that just to test. I didn't get any IO errors, but I did get: