[LU-3645] Interop 2.1.5 <--> 2.4: Write operations during failover error out instead of stalling Created: 26/Jul/13 Updated: 15/Mar/14 Resolved: 22/Nov/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.1.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Girish Shilamkar (Inactive) | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mn1 | ||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9382 |
| Description |
|
During acceptance testing at KIT, OSS failover was tested while several applications were running, and the applications got I/O errors ("can't create file" and similar messages). This should not happen; I/O should simply stall until failover completes. |
| Comments |
| Comment by Peter Jones [ 26/Jul/13 ] |
|
Hongchao, could you please look into this one? Thanks, Peter |
| Comment by Kit Westneat (Inactive) [ 26/Jul/13 ] |
|
This is actually a duplicate of |
| Comment by Peter Jones [ 26/Jul/13 ] |
|
ok - thanks Kit |
| Comment by Kit Westneat (Inactive) [ 30/Jul/13 ] |
|
Hi Peter, I was mistaken on this - it is a different issue uncovered during the same testing. Can you reopen the ticket and advise us on what kind of debug information we should get to help diagnose the issue? I've attached rpctrace and vfstrace output. |
| Comment by Kit Westneat (Inactive) [ 30/Jul/13 ] |
|
| Comment by James Nunez (Inactive) [ 30/Jul/13 ] |
|
Per Kit Westneat : Thanks, |
| Comment by Hongchao Zhang [ 01/Aug/13 ] |
|
status update: |
| Comment by Hongchao Zhang [ 02/Aug/13 ] |
|
I can't find any eviction-related messages in the client logs, but there are some on the servers (MDS & OSS):
LU-XXXX/oss1/messages-20130623:Jun 21 14:34:05 oss1 kernel: : Lustre: pfscdat1-OST0000: haven't heard from client 52a702a3-cdba-d2a9-b8ee-c95c3cce895b (at 172.26.4.6@o2ib) in 230 seconds. I think it's dead, and I am evicting it. exp ffff8800b073f800, cur 1371818045 expire 1371817895 last 1371817815 |
| Comment by Girish Shilamkar (Inactive) [ 06/Aug/13 ] |
|
> Hongchao Zhang added a comment - 02/Aug/13 4:56 AM
They are the client IPs. 172.26.7.* had mounted the file systems but were not part of the failover testing.
> LU-XXXX/oss1/messages-20130623:Jun 21 14:34:05 oss1 kernel: : Lustre: pfscdat1-OST0000: haven't heard from client 52a702a3-cdba-d2a9-b8ee-c95c3cce895b (at 172.26.4.6@o2ib) in 230 seconds. I think it's dead, and I am evicting it. exp ffff8800b073f800, cur 1371818045 expire 1371817895 last 1371817815
All these messages are very old. The complete server system has been |
| Comment by Girish Shilamkar (Inactive) [ 07/Aug/13 ] |
|
Can you please increase the priority of this issue? The problem is blocking this new system from getting into production. Thanks |
| Comment by Hongchao Zhang [ 07/Aug/13 ] |
|
Okay! Can this issue be reproduced with fewer nodes at your site (say, 2 nodes)? |
| Comment by Girish Shilamkar (Inactive) [ 07/Aug/13 ] |
|
> I'm trying to reproduce it myself with 2 nodes (2.1.5 client + 2.4 server) by "bonnie++", and can't reproduce it up to now.
This problem was seen during acceptance testing, and the setup only had an MDS failover pair and two OSSes.
> could you please attach the new syslog of ucbn003/oss1/oss2, the latest logs in LU-XXXX.tgz is July 16,
I can see ost_write operations failing with -ENODEV. I ran replay-ost-single on 2.1.5 (ost_write fails with -ENODEV) and 2.4 (ost_write fails with -ENOTCONN). Does the error returned by ost_write affect the failover behavior? |
| Comment by Girish Shilamkar (Inactive) [ 07/Aug/13 ] |
|
> I'm trying to reproduce it myself with 2 nodes(2.1.5 client + 2.4 server) by "bonnie++", and can't reproduce it up to now.
I am assuming "2.1.5 client + 2.4 server" is a typo, as this problem is seen with 2.1.5 servers and a 2.4 client. |
| Comment by Kit Westneat (Inactive) [ 07/Aug/13 ] |
|
I sorted out the relevant bits from the Lustre log so I could see what was going on a bit better, hopefully this is helpful.
client:
00000080:00200000:0.0:1375191524.336997:0:11756:0:(namei.c:503:ll_lookup_it()) VFS Op:name=v8KnMgSE00000005fb,dir=144115440287769024/33554490(ffff8808619835f8),intent=open|creat
00000100:00100000:0.0:1375191524.337573:0:11756:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc bonnie++:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a:11756:1441894532689196:172.26.17.2@o2ib:101
00000100:00100000:0.0:1375191524.337599:0:11756:0:(client.c:2090:ptlrpc_set_wait()) set ffff88086ac7b2c0 going to sleep for 6 seconds
00000100:00100000:0.0:1375191524.423682:0:11756:0:(client.c:1805:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc bonnie++:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a:11756:1441894532689196:172.26.17.2@o2ib:101
00000002:00100000:0.0:1375191524.423714:0:11756:0:(mdc_locks.c:579:mdc_finish_enqueue()) @@@ op: 3 disposition: 17, status: -5 req@ffff880867f32c00 x1441894532689196/t0(0) o101->pfscdat2-MDT0000-mdc-ffff8810708aec00@172.26.17.2@o2ib:12/10 lens 592/544 e 0 to 0 dl 1375191531 ref 1 fl Complete:R/0/0 rc 301/301
mds2: 00000100:00100000:1.0:1375191524.337381:0:17462:0:(service.c:1547:ptlrpc_server_handle_req_in()) got req x1441894532689196 00000100:00100000:1.0:1375191524.337395:0:17462:0:(service.c:1724:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt_10:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a+3316:11756:x1441894532689196:12345-172.26.4.3@o2ib:101 00000100:00080000:3.0:1375191524.337404:0:7060:0:(recover.c:216:ptlrpc_request_handle_notconn()) import pfscdat2-OST0000-osc-MDT0000 of pfscdat2-OST0000_UUID@172.26.17.3@o2ib abruptly disconnected: reconnecting 00000100:02000400:3.0:1375191524.337410:0:7060:0:(import.c:170:ptlrpc_set_import_discon()) pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete 00000100:00080000:3.0:1375191524.337419:0:7060:0:(import.c:180:ptlrpc_set_import_discon()) ffff88082aeed800 pfscdat2-OST0000_UUID: changing import state from FULL to DISCONN 00000100:00080000:3.0:1375191524.337424:0:7060:0:(import.c:626:ptlrpc_connect_import()) ffff88082aeed800 pfscdat2-OST0000_UUID: changing import state from DISCONN to CONNECTING 00000100:00080000:3.0:1375191524.337428:0:7060:0:(import.c:482:import_select_connection()) pfscdat2-OST0000-osc-MDT0000: connect to NID 172.26.17.3@o2ib last attempt 4982526859 00000100:00080000:3.0:1375191524.337433:0:7060:0:(import.c:561:import_select_connection()) pfscdat2-OST0000-osc-MDT0000: import ffff88082aeed800 using connection 172.26.17.3@o2ib/172.26.17.3@o2ib 00000100:00100000:3.0:1375191524.337453:0:7060:0:(import.c:725:ptlrpc_connect_import()) @@@ (re)connect request (timeout 5) req@ffff881001505c00 x1441623136776061/t0(0) o8->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:28/4 lens 368/512 e 0 to 0 dl 0 ref 1 fl New:N/0/ffffffff rc 0/-1 00000100:00100000:5.0:1375191524.337462:0:17454:0:(service.c:1547:ptlrpc_server_handle_req_in()) got req x1441894532689216 00000100:00020000:3.0:1375191524.337469:0:7060:0:(layout.c:1659:__req_capsule_get()) @@@ Wrong buffer for field `ost_body' (1 of 1) in format `OST_CREATE': 0 vs. 208 (server) req@ffff8810019bf000 x1441623136776022/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:7/4 lens 400/192 e 0 to 0 dl 1375191531 ref 1 fl Interpret:RN/0/0 rc -107/-107 00000100:00100000:5.0:1375191524.337475:0:17454:0:(service.c:1724:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt_rdpg_02:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a+3319:11760:x1441894532689216:12345-172.26.4.3@o2ib:35 00000100:00100000:13.0:1375191524.337481:0:7061:0:(client.c:1434:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd-rcv:pfscdat2-MDT0000-mdtlov_UUID:7061:1441623136776061:172.26.17.3@o2ib:8 00000100:00100000:5.0:1375191524.337511:0:17454:0:(service.c:1771:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt_rdpg_02:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a+3318:11760:x1441894532689216:12345-172.26.4.3@o2ib:35 Request procesed in 38us (60us total) trans 51541021103 rc 0/0 00000100:02020000:13.0:1375191524.347566:0:7061:0:(client.c:1132:ptlrpc_check_status()) 11-0: an error occurred while communicating with 172.26.17.3@o2ib. 
The ost_connect operation failed with -19 00000100:00080000:13.0:1375191524.347576:0:7061:0:(import.c:1112:ptlrpc_connect_interpret()) ffff88082aeed800 pfscdat2-OST0000_UUID: changing import state from CONNECTING to DISCONN 00000100:00080000:13.0:1375191524.347579:0:7061:0:(import.c:1158:ptlrpc_connect_interpret()) recovery of pfscdat2-OST0000_UUID on 172.26.17.3@o2ib failed (-19) 00000100:00100000:13.0:1375191524.347583:0:7061:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd-rcv:pfscdat2-MDT0000-mdtlov_UUID:7061:1441623136776061:172.26.17.3@o2ib:8 00000008:00020000:3.0:1375191524.381692:0:7060:0:(osc_create.c:175:osc_interpret_create()) @@@ Unknown rc -107 from async create: failing oscc req@ffff8810019bf000 x1441623136776022/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:7/4 lens 400/192 e 0 to 0 dl 1375191531 ref 1 fl Interpret:RN/0/0 rc -107/-107 00000100:00080000:3.0:1375191524.410395:0:7060:0:(import.c:195:ptlrpc_set_import_discon()) osc: import ffff88082aeed800 already not connected (conn 5, was 4): DISCONN 00000008:00080000:3.0:1375191524.410401:0:7060:0:(osc_create.c:182:osc_interpret_create()) preallocated through id 5539121 (next to use 5534160) 00020000:00080000:3.0:1375191524.410404:0:7060:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x8bb8 sub-object on OST idx 0/1: rc = -107 00000100:00100000:3.0:1375191524.410411:0:7060:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:7060:1441623136776022:172.26.17.3@o2ib:5 00020000:00020000:1.0:1375191524.410415:0:17462:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x8bb8 sub-object on OST idx 0/1: rc = -5 00000100:00100000:1.0:1375191524.423247:0:17462:0:(service.c:1771:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt_10:8ff70f2e-0c20-af6e-ad31-a95bc6b86f3a+3317:11756:x1441894532689196:12345-172.26.4.3@o2ib:101 Request procesed in 85853us (85878us total) trans 0 rc 301/301 oss1: 00000100:00100000:8.0:1375191524.325142:0:2026:0:(service.c:1547:ptlrpc_server_handle_req_in()) got req x1441623136776022 00000100:00100000:8.0:1375191524.325163:0:2026:0:(service.c:1724:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_creat_02:0+-99:7060:x1441623136776022:12345-172.26.17.2@o2ib:5 00000010:00080000:8.0:1375191524.325170:0:2026:0:(ost_handler.c:2085:ost_handle()) operation 5 on unconnected OST from 12345-172.26.17.2@o2ib 00000100:00100000:8.0:1375191524.325200:0:2026:0:(service.c:1771:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_creat_02:0+-99:7060:x1441623136776022:12345-172.26.17.2@o2ib:5 Request procesed in 41us (115us total) trans 0 rc -107/-107 |
| Comment by Oleg Drokin [ 09/Aug/13 ] |
|
The last log excerpts you show explain the EIO as due to MDS2 being disconnected from the OST, so when we try to create an object, there's nowhere to get the object from, hence EIO. Now, why did MDS2 disconnect from the OST? Is the OST still in recovery and does not recognize the MDS, so does not let it in? |
| Comment by Kit Westneat (Inactive) [ 09/Aug/13 ] |
|
I found the kernel logs for the servers:
mds2:
Jul 30 15:34:08 mds2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans
Jul 30 15:38:44 mds2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_create operation failed with -107
Jul 30 15:38:44 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Jul 30 15:38:44 mds2 kernel: : LustreError: 7060:0:(layout.c:1659:__req_capsule_get()) @@@ Wrong buffer for field `ost_body' (1 of 1) in format `OST_CREATE': 0 vs. 208 (server)
Jul 30 15:38:44 mds2 kernel: : req@ffff8810019bf000 x1441623136776022/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:7/4 lens 400/192 e 0 to 0 dl 1375191531 ref 1 fl Interpret:RN/0/0 rc -107/-107
Jul 30 15:38:44 mds2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_connect operation failed with -19
Jul 30 15:38:44 mds2 kernel: : LustreError: 7060:0:(osc_create.c:175:osc_interpret_create()) @@@ Unknown rc -107 from async create: failing oscc req@ffff8810019bf000 x1441623136776022/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:7/4 lens 400/192 e 0 to 0 dl 1375191531 ref 1 fl Interpret:RN/0/0 rc -107/-107
Jul 30 15:38:44 mds2 kernel: : LustreError: 17462:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x8bb8 sub-object on OST idx 0/1: rc = -5
Jul 30 15:39:23 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib)
oss1:
Jul 30 15:38:43 oss1 kernel: : Lustre: Failing over pfscdat2-OST0000
Jul 30 15:38:44 oss1 kernel: : LustreError: 137-5: pfscdat2-OST0000: Not available for connect from 172.26.4.3@o2ib (stopping)
Jul 30 15:38:44 oss1 kernel: : LustreError: 137-5: pfscdat2-OST0000: Not available for connect from 172.26.17.2@o2ib (stopping)
Jul 30 15:38:44 oss1 kernel: : Lustre: pfscdat2-OST0000: shutting down for failover; client state will be preserved.
Jul 30 15:38:45 oss1 kernel: : Lustre: OST pfscdat2-OST0000 has stopped.
oss2:
Jul 30 15:38:56 oss2 kernel: : Lustre: 4795:0:(ldlm_lib.c:2028:target_recovery_init()) RECOVERY: service pfscdat2-OST0000, 11 recoverable clients, last_transno 42740433
Jul 30 15:38:56 oss2 kernel: : Lustre: pfscdat2-OST0000: Now serving pfscdat2-OST0000/ on /dev/mapper/ost_pfscdat2_0 with recovery enabled
So pfscdat2-OST0000 was cleanly unmounted on oss1 and failed over to oss2. Shouldn't the MDT I/O hang until it can connect to oss2? |
| Comment by Hongchao Zhang [ 12/Aug/13 ] |
|
This is the normal case for the MDS handling object precreation on the OST, because the precreation request is marked as no-delay and no-resend. |
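For illustration, a minimal standalone sketch of the behavior described above (simplified stand-in types, not Lustre source): a precreate RPC flagged no-delay/no-resend fails immediately when the import is disconnected, while an ordinary I/O RPC is queued until recovery completes.
#include <errno.h>
#include <stdio.h>

enum import_state { IMP_FULL, IMP_DISCON };

struct fake_request {
        int rq_no_delay;        /* do not queue behind a disconnected import */
        int rq_no_resend;       /* do not retransmit after reconnect */
};

/* Roughly models the decision made when the import is down: ordinary RPCs
 * are delayed until recovery, no-delay RPCs fail at once. */
static int send_rpc(enum import_state state, const struct fake_request *req)
{
        if (state == IMP_FULL)
                return 0;                  /* connection healthy: send now */
        if (req->rq_no_delay)
                return -ENOTCONN;          /* precreate: fail immediately */
        return -EINPROGRESS;               /* stand-in for "wait for recovery" */
}

int main(void)
{
        struct fake_request precreate = { .rq_no_delay = 1, .rq_no_resend = 1 };
        struct fake_request normal_io = { 0, 0 };

        printf("precreate during failover: %d\n", send_rpc(IMP_DISCON, &precreate));
        printf("normal I/O during failover: %d\n", send_rpc(IMP_DISCON, &normal_io));
        return 0;
}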
| Comment by Kit Westneat (Inactive) [ 12/Aug/13 ] |
|
Hi Hongchao, I am not sure what you mean; I see this in the MDS logs: Isn't that the part where the MDT is returning EIO? Let me know if there is another debug level I should use to get more information. Thanks. |
| Comment by Hongchao Zhang [ 13/Aug/13 ] |
|
Hi Kit, on the MDS, the OSC creates objects in advance (precreation) to be used by the MDT, so the failure of a creation request is normally not visible to the client. Is the log of the client that ran the failed application available for July 30? It could contain the messages showing where and when the MDT returned the error to the client. Thanks |
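A hedged sketch of the precreation idea being described, using invented names rather than the Lustre data structures: MDT creates normally consume object ids that were already reserved by an earlier precreate batch, so a single failed precreate RPC usually goes unnoticed until the pool runs dry.
/* Invented names for illustration only -- not Lustre structures. */
struct precreate_pool {
        long long next_id;      /* next object id to hand out to the MDT */
        long long last_id;      /* highest id the OST has already created */
};

/* Returns a ready-made object id, or 0 when the pool is exhausted and the
 * caller must wait for a new precreate RPC to the OST to complete. */
static long long take_precreated(struct precreate_pool *p)
{
        if (p->next_id <= p->last_id)
                return p->next_id++;
        return 0;
}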
| Comment by Kit Westneat (Inactive) [ 13/Aug/13 ] |
|
Sure, here's the client syslogs from that period: Jul 29 13:37:06 ucbn003 kernel: Lustre: Trying to mount a client with IR setting not compatible with current mgc. Force to use current mgc setting that is IR disabled. Jul 30 15:33:14 ucbn003 kernel: LustreError: 11-0: pfscdat2-OST0000-osc-ffff8810708aec00: Communicating with 172.26.17.4@o2ib, operation ost_destroy failed with -19. Jul 30 15:33:14 ucbn003 kernel: Lustre: pfscdat2-OST0000-osc-ffff8810708aec00: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete Jul 30 15:34:08 ucbn003 kernel: Lustre: pfscdat2-OST0000-osc-ffff8810708aec00: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Jul 30 15:38:44 ucbn003 kernel: LustreError: 11-0: pfscdat2-OST0000-osc-ffff8810708aec00: Communicating with 172.26.17.3@o2ib, operation ldlm_enqueue failed with -107. Jul 30 15:38:44 ucbn003 kernel: Lustre: pfscdat2-OST0000-osc-ffff8810708aec00: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Jul 30 15:39:23 ucbn003 kernel: Lustre: pfscdat2-OST0000-osc-ffff8810708aec00: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib) Just to make sure we are on the same page, the EIO was in the client lctl dk logs posted earlier: It looks like pid 17462 gets the EIO in lov_update_create_set and then returns it to the client to complete x1441894532689196, which looks like the open|creat on the file. Thanks. |
| Comment by Hongchao Zhang [ 14/Aug/13 ] |
|
Hi Kit, yes, lov_create failed somehow. Could you please retest with this patch, which prints more logs to help find the cause? Thanks!
diff --git a/lustre/lov/lov_request.c b/lustre/lov/lov_request.c
index 0fc635e..dd7c081 100644
--- a/lustre/lov/lov_request.c
+++ b/lustre/lov/lov_request.c
@@ -573,10 +573,10 @@ static int lov_update_create_set(struct lov_request_set *set,
         /* Pre-creating objects may timeout via -ETIMEDOUT or
          * -ENOTCONN both are always non-critical events. */
         CDEBUG(rc == -ETIMEDOUT || rc == -ENOTCONN ? D_HA : D_ERROR,
-               "error creating fid "LPX64" sub-object "
+               "error creating fid "LPX64"/"LPX64" sub-object "
                "on OST idx %d/%d: rc = %d\n",
-               set->set_oi->oi_oa->o_id, req->rq_idx,
-               lsm->lsm_stripe_count, rc);
+               set->set_oi->oi_oa->o_id, set->set_oi->oi_oa->o_seq,
+               req->rq_idx, lsm->lsm_stripe_count, rc);
         if (rc > 0) {
                 CERROR("obd_create returned invalid err %d\n", rc);
                 rc = -EIO;
@@ -636,6 +636,8 @@ static int create_done(struct obd_export *exp, struct lov_request_set *set,
                         req->rq_complete = 0;
                         rc = qos_remedy_create(set, req);
+                        if (rc != 0)
+                                CERROR("qos_remedy_create failed %d\n", rc);
                         lov_update_create_set(set, req, rc);
                 }
         }
@@ -733,6 +735,8 @@ int cb_create_update(void *cookie, int rc)
         if (lovreq->rq_idx == cfs_fail_val)
                 rc = -ENOTCONN;

+        if (rc != 0)
+                CERROR("cb_create_update failed %d\n", rc);
         rc= lov_update_create_set(lovreq->rq_rqset, lovreq, rc);
         if (lov_set_finished(lovreq->rq_rqset, 0))
                 lov_put_reqset(lovreq->rq_rqset); |
| Comment by Kit Westneat (Inactive) [ 14/Aug/13 ] |
|
Hi Hongchao, I was reviewing this ticket and I realized that in this test configuration, the file system only has one OST. Looking at the code for the creates, it appears that it is doing what is intended. When all OSTs are offline, the MDT will return EIO to the client. Is this correct? I will work on rebuilding a server with that debug patch. Thanks. |
| Comment by Hongchao Zhang [ 15/Aug/13 ] |
|
Hi Kit, one OST isn't a problem; the MDT/clients will wait for it to come back. In my local test (two VM nodes), both OSTs were unmounted and remounted without hitting this error. Thanks |
| Comment by Kit Westneat (Inactive) [ 04/Sep/13 ] |
|
Hi Hongchao, Here are the results of the debug patch: Sep 4 14:34:48 mds2 kernel: : Lustre: 4591:0:(client.c:1817:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1378298081/real 1378298081] req@ffff8808173ce000 x1445232561408895/t0(0) o400->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:28/4 lens 192/192 e 0 to 1 dl 1378298088 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 4 14:34:48 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete Sep 4 14:34:54 mds2 kernel: : Lustre: 4592:0:(client.c:1817:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1378298088/real 1378298088] req@ffff880825dca000 x1445232561408896/t0(0) o8->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:28/4 lens 368/512 e 0 to 1 dl 1378298094 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 Sep 4 14:35:27 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Sep 4 14:35:27 mds2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans Sep 4 14:40:00 mds2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_setattr operation failed with -19 Sep 4 14:40:00 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Sep 4 14:40:00 mds2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_connect operation failed with -19 Sep 4 14:40:51 mds2 kernel: : LustreError: 4591:0:(lov_request.c:739:cb_create_update()) cb_create_update failed -11 Sep 4 14:40:51 mds2 kernel: : LustreError: 4591:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x9a8/0x0 sub-object on OST idx 0/1: rc = -11 Sep 4 14:40:51 mds2 kernel: : LustreError: 15954:0:(lov_request.c:640:create_done()) qos_remedy_create failed -5 Sep 4 14:40:51 mds2 kernel: : LustreError: 15954:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x9a8/0x0 sub-object on OST idx 0/1: rc = -5 Sep 4 14:40:51 mds2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib) Sep 4 14:40:51 mds2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans |
| Comment by Hongchao Zhang [ 05/Sep/13 ] |
|
Hi Kit, thanks for your effort testing the debug patch! This issue is caused by the failed precreate request (rc = -107/-ENOTCONN), which had just been sent to precreate objects and triggered the recovery. Step 1: the OSC found there were not enough objects, so it called osc_internal_create to send a request to precreate more. Normally, the failure of that precreate request won't affect the OSC, because there should still be enough objects to use during the processing interval. The patch will be attached soon! |
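The direction of the fix can be sketched with a hypothetical helper (the error codes are the ones quoted in this ticket; the real change is the gerrit patch referenced below): connection-loss errors on the precreate RPC are treated as transient rather than failing the waiting create.
#include <errno.h>

/* Hypothetical classification helper: returns 1 if a failed precreate RPC
 * should fail the waiting file create, 0 if the failure is transient and
 * the create should simply wait for the OST to reconnect. */
static int precreate_error_is_fatal(int rc)
{
        switch (rc) {
        case 0:
                return 0;               /* no error at all */
        case -ETIMEDOUT:
        case -ENOTCONN:                 /* -107: peer went away (failover) */
        case -ENODEV:                   /* -19: target stopping/not started */
                return 0;               /* transient: retry after recovery */
        default:
                return 1;               /* real failure: surface it */
        }
}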
| Comment by Hongchao Zhang [ 05/Sep/13 ] |
|
the patch is tracked at http://review.whamcloud.com/#/c/7559/ |
| Comment by Kit Westneat (Inactive) [ 05/Sep/13 ] |
|
Hi Hongchao, Is this an MDT only patch, or is it for the clients too? Thanks, |
| Comment by Hongchao Zhang [ 06/Sep/13 ] |
|
Hi Kit the patch is for MDT only. Thanks |
| Comment by Kit Westneat (Inactive) [ 13/Sep/13 ] |
|
Hi Hongchao, We've run into a problem with this patch. After upgrading the MDT, we are starting to get data corruption on the system. Here is a description of the issue:
I will upload the lctl dk logs (with vfstrace and rpctrace). Is there any other information we should get? |
| Comment by Kit Westneat (Inactive) [ 13/Sep/13 ] |
|
I should add that we didn't see the problem anymore after downgrading to stock 2.1.5. We did, however, see the issue with both 2.1.6+patch and 2.1.5+patch. |
| Comment by Hongchao Zhang [ 16/Sep/13 ] |
|
sorry, there is an error in the previous patch, and it has been updated (http://review.whamcloud.com/#/c/7559/) |
| Comment by Kit Westneat (Inactive) [ 26/Sep/13 ] |
|
Hi Hongchao, The patch fixed the data corruption, but we are still seeing the same errors on failover. Should we reproduce the issue with the original debugging patch + fix? Here are the kernel messages from the fix (patch 7559): Sep 25 15:16:49 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_setattr operation failed with -19 Sep 25 15:16:49 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Sep 25 15:17:39 pfscn2 kernel: : LustreError: 9001:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x3288 sub-object on OST idx 0/1: rc = -11 Sep 25 15:17:39 pfscn2 kernel: : LustreError: 7308:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x3288 sub-object on OST idx 0/1: rc = -5 Sep 25 15:17:40 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib) Sep 25 15:17:40 pfscn2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans |
| Comment by Kit Westneat (Inactive) [ 27/Sep/13 ] |
|
Hi Hongchao, I was trying to follow the create logic and it has me somewhat confused. It seems like the fake requests are handled by oscc_internal_create, but then if the oscc is recovering (due to the EAGAIN) all it does is return 0. Am I reading that right? Thanks, |
| Comment by Hongchao Zhang [ 30/Sep/13 ] |
|
Hi, I'll create a new debug patch based on http://review.whamcloud.com/#/c/7559/. Thanks! |
| Comment by Kit Westneat (Inactive) [ 01/Oct/13 ] |
|
Ah ok, that makes sense. Should handle_async_create add the RECOVERING flag to the oscc? It seems like osc_interpret_create would add it, but is it possible that it could get to oscc_internal_create without that flag? |
| Comment by Kit Westneat (Inactive) [ 03/Oct/13 ] |
|
Hi Hongchao, Any updates? Thanks, |
| Comment by Kit Westneat (Inactive) [ 03/Oct/13 ] |
|
I tested adding the degraded flag to the oscc and it seemed to fix the problem on my test system:
         if(rc != 0 && rc != -ENOTCONN && rc != -ENODEV)
                 GOTO(out_wake, rc);
+        if (rc == -ENOTCONN || rc == -ENODEV)
+                oscc->oscc_flags |= OSCC_FLAG_DEGRADED;
         /* Handle the critical type errors first.
I ran into one issue where the OST ran out of inodes and it caused the lov_create threads on the MDT to deadlock. I'm not sure if that's a new behavior or in stock 2.1.6 as well. I'll investigate. |
| Comment by Hongchao Zhang [ 08/Oct/13 ] |
|
Hi Kit, sorry for the delayed response; we were on holiday for the last several days. If "OSCC_FLAG_DEGRADED" is set, then this OST won't be used preferentially when creating objects. Do you only test the failover on one OSS? BTW, is there something like "Unknown rc XXX from async create: failing oscc" in the kernel logs? Thanks |
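As an illustration of that preference rule (stand-in flag values, not the Lustre definitions): a degraded or out-of-space OST stays in the pool but is only picked when no healthy OST can supply objects.
/* Stand-in flag values for illustration only. */
#define FLAG_NOSPC    0x01      /* OST reported no space */
#define FLAG_DEGRADED 0x10      /* OST recently failed a precreate */

/* An OST is preferred for new objects only while neither flag is set;
 * a degraded OST is still usable as a last resort, just not first choice. */
static int ost_is_preferred(unsigned int flags)
{
        return (flags & (FLAG_NOSPC | FLAG_DEGRADED)) == 0;
}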
| Comment by Kit Westneat (Inactive) [ 09/Oct/13 ] |
|
The fix that worked on my test system didn't work on the customer system, so it might have been a fluke. I haven't seen that message in the new logs. Here are the latest server dk logs:
00000100:00100000:14.0:1381243911.199943:0:4598:0:(import.c:725:ptlrpc_connect_import()) @@@ (re)connect request (timeout 5) req@ffff8808181fb400 x1448249214211642/t0(0) o8->pfscdat2-OST0000-osc-MDT0000@172.26.17.3@o2ib:28/4 lens 368/512 e 0 to 0 dl 0 ref 1 fl New:N/0/ffffffff rc 0/-1
00000100:00100000:13.0:1381243911.212619:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211637:172.26.17.3@o2ib:-1
00020000:00020000:13.0:1381243911.212624:0:4596:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10f5a sub-object on OST idx 0/1: rc = -11
00000100:00100000:13.0:1381243911.225308:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211638:172.26.17.3@o2ib:-1
00020000:00020000:13.0:1381243911.225312:0:4596:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x18c7 sub-object on OST idx 0/1: rc = -11
00000100:00100000:13.0:1381243911.250579:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:-1:1448249214211639:172.26.17.3@o2ib:-1
00020000:00020000:6.0:1381243911.250579:0:29944:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10f5a sub-object on OST idx 0/1: rc = -5
00000100:00100000:13.0:1381243911.250586:0:4596:0:(client.c:1434:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:e76ce10d-1d27-cdc4-1091-649094c70331:4596:1448249214211641:172.26.17.1@o2ib:400
00020000:00020000:5.0:1381243911.250612:0:4755:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x18c7 sub-object on OST idx 0/1: rc = -5
00000100:00100000:13.0:1381243911.250638:0:4596:0:(client.c:1773:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd:pfscdat2-MDT0000-mdtlov_UUID:0:1448249214211644:172.26.17.3@o2ib:13
00000100:00100000:5.0:1381243911.250741:0:4755:0:(service.c:1771:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt_02:94b08aa2-54d7-b32e-019d-2561ed3286b5+6:58211:x1447047461344560:12345-172.26.4.3@o2ib:101 Request procesed in 49643595us (49643620us total) trans 0 rc 301/301
|
| Comment by Hongchao Zhang [ 10/Oct/13 ] |
|
Hi Kit, could you please test with the following debug patch:
diff --git a/lustre/lov/lov_request.c b/lustre/lov/lov_request.c
index 0fc635e..f0476cb 100644
--- a/lustre/lov/lov_request.c
+++ b/lustre/lov/lov_request.c
@@ -577,6 +577,14 @@ static int lov_update_create_set(struct lov_request_set *se
                "on OST idx %d/%d: rc = %d\n",
                set->set_oi->oi_oa->o_id, req->rq_idx,
                lsm->lsm_stripe_count, rc);
+
+#ifdef __KERNEL__
+        if (rc == -EAGAIN) {
+                CDEBUG(D_ERROR, "shouldn't encounter -EAGAIN!\n");
+                dump_stack();
+        }
+#endif
+
         if (rc > 0) {
                 CERROR("obd_create returned invalid err %d\n", rc);
                 rc = -EIO;
diff --git a/lustre/osc/osc_create.c b/lustre/osc/osc_create.c
index ffc81d4..22265bd 100644
--- a/lustre/osc/osc_create.c
+++ b/lustre/osc/osc_create.c
@@ -412,7 +412,7 @@ static int handle_async_create(struct ptlrpc_request *req, i
         LASSERT_SPIN_LOCKED(&oscc->oscc_lock);

-        if(rc)
+        if(rc != 0 && rc != -ENOTCONN && rc != -ENODEV)
                 GOTO(out_wake, rc);

         /* Handle the critical type errors first. |
| Comment by Kit Westneat (Inactive) [ 23/Oct/13 ] |
|
Hi Hongchao, Sorry for the delay in getting the output: Oct 23 10:47:48 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete Oct 23 10:48:39 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10949 sub-object on OST idx 0/1: rc = -11 Oct 23 10:48:39 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:583:lov_update_create_set()) shouldn't encounter -EAGAIN! Oct 23 10:48:39 pfscn2 kernel: : Pid: 9651, comm: ptlrpcd Not tainted 2.6.32-358.11.1.el6_lustre.es124.x86_64 #1 Oct 23 10:48:39 pfscn2 kernel: : Call Trace: Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0ad3505>] ? lov_update_create_set+0x4f5/0x500 [lov] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0ad78cd>] ? cb_create_update+0x2d/0x100 [lov] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0a6f28b>] ? handle_async_create+0x7b/0x390 [osc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0a6fad9>] ? async_create_interpret+0x39/0x50 [osc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0811c9b>] ? ptlrpc_check_set+0x29b/0x1b00 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8150ecda>] ? schedule_timeout+0x19a/0x2e0 Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842f20>] ? ptlrpcd_check+0x1a0/0x230 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa08431be>] ? ptlrpcd+0x20e/0x370 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff81063310>] ? default_wake_function+0x0/0x20 Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8100c0ca>] ? child_rip+0xa/0x20 Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:48:39 pfscn2 kernel: : [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Oct 23 10:48:39 pfscn2 kernel: : LustreError: 10250:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x10949 sub-object on OST idx 0/1: rc = -5 Oct 23 10:49:21 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Oct 23 10:49:21 pfscn2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans Oct 23 10:53:38 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_setattr operation failed with -19 Oct 23 10:53:38 pfscn2 kernel: : LustreError: Skipped 1 previous similar message Oct 23 10:53:38 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Oct 23 10:54:09 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_connect operation failed with -19 Oct 23 10:54:09 pfscn2 kernel: : LustreError: Skipped 1 previous similar message Oct 23 10:54:16 pfscn2 kernel: : Lustre: 10169:0:(ldlm_lib.c:952:target_handle_connect()) MGS: connection from 6ede50b8-462e-c607-2c6d-f747541fabbd@172.26.17.4@o2ib t0 exp (null) cur 1382518456 last 0 Oct 23 10:54:28 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0xd948 sub-object on OST idx 0/1: rc = -11 Oct 23 10:54:28 pfscn2 kernel: : LustreError: 9651:0:(lov_request.c:583:lov_update_create_set()) shouldn't encounter -EAGAIN! 
Oct 23 10:54:28 pfscn2 kernel: : Pid: 9651, comm: ptlrpcd Not tainted 2.6.32-358.11.1.el6_lustre.es124.x86_64 #1 Oct 23 10:54:28 pfscn2 kernel: : Call Trace: Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0ad3505>] ? lov_update_create_set+0x4f5/0x500 [lov] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0ad78cd>] ? cb_create_update+0x2d/0x100 [lov] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0a6f28b>] ? handle_async_create+0x7b/0x390 [osc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0a6fad9>] ? async_create_interpret+0x39/0x50 [osc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0811c9b>] ? ptlrpc_check_set+0x29b/0x1b00 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8150ecda>] ? schedule_timeout+0x19a/0x2e0 Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842f20>] ? ptlrpcd_check+0x1a0/0x230 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa08431be>] ? ptlrpcd+0x20e/0x370 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff81063310>] ? default_wake_function+0x0/0x20 Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8100c0ca>] ? child_rip+0xa/0x20 Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffffa0842fb0>] ? ptlrpcd+0x0/0x370 [ptlrpc] Oct 23 10:54:28 pfscn2 kernel: : [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Oct 23 10:54:28 pfscn2 kernel: : LustreError: 24178:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0xd948 sub-object on OST idx 0/1: rc = -5 |
| Comment by Kit Westneat (Inactive) [ 23/Oct/13 ] |
|
should it actually be: I built a version with that just to test. I didn't get any IO errors, but I did get: Oct 23 22:27:56 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_create operation failed with -19 Oct 23 22:27:56 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete Oct 23 22:27:56 pfscn2 kernel: : LustreError: 9255:0:(osc_create.c:175:osc_interpret_create()) @@@ Unknown rc -19 from async create: failing oscc req@ffff88081f6c3800 x1449716075693976/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/400 e 0 to 0 dl 1382560083 ref 1 fl Interpret:RN/0/0 rc -19/-19 Oct 23 22:28:53 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Oct 23 22:28:53 pfscn2 kernel: : Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans Oct 23 22:28:53 pfscn2 kernel: : Lustre: Skipped 1 previous similar message Oct 23 22:31:16 pfscn2 kernel: : Lustre: Service thread pid 10987 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: Oct 23 22:31:16 pfscn2 kernel: : Pid: 10987, comm: mdt_13 Oct 23 22:31:16 pfscn2 kernel: : Oct 23 22:31:16 pfscn2 kernel: : Call Trace: Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa04de923>] ? cfs_alloc+0x63/0x90 [libcfs] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0808a45>] ? ptlrpc_next_xid+0x15/0x40 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa04de60e>] cfs_waitq_wait+0xe/0x10 [libcfs] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0ac650a>] lov_create+0xbaa/0x1400 [lov] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0d5c0d6>] ? mdd_get_md+0x96/0x2f0 [mdd] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffff81063310>] ? default_wake_function+0x0/0x20 Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0d7c916>] ? mdd_read_unlock+0x26/0x30 [mdd] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0d6090e>] mdd_lov_create+0x9ee/0x1ba0 [mdd] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0d72871>] mdd_create+0xf81/0x1a90 [mdd] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0e72414>] ? osd_object_init+0xe4/0x420 [osd_ldiskfs] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0eaa3f7>] cml_create+0x97/0x250 [cmm] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0de55b1>] ? mdt_version_get_save+0x91/0xd0 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0dfb156>] mdt_reint_open+0x1aa6/0x2940 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa081d754>] ? lustre_msg_add_version+0x74/0xd0 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0d7556e>] ? 
md_ucred+0x1e/0x60 [mdd] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0de3c51>] mdt_reint_rec+0x41/0xe0 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0ddaed4>] mdt_reint_internal+0x544/0x8e0 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0ddb53d>] mdt_intent_reint+0x1ed/0x500 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0dd9c09>] mdt_intent_policy+0x379/0x690 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa07d9391>] ldlm_lock_enqueue+0x361/0x8f0 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa07ff1dd>] ldlm_handle_enqueue0+0x48d/0xf50 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0dda586>] mdt_enqueue+0x46/0x130 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0dcf772>] mdt_handle_common+0x932/0x1750 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa0dd0665>] mdt_regular_handle+0x15/0x20 [mdt] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa082db9e>] ptlrpc_main+0xc4e/0x1a40 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa082cf50>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffff8100c0ca>] child_rip+0xa/0x20 Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa082cf50>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffffa082cf50>] ? ptlrpc_main+0x0/0x1a40 [ptlrpc] Oct 23 22:31:16 pfscn2 kernel: : [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 |
| Comment by Kit Westneat (Inactive) [ 23/Oct/13 ] |
|
the decoded Lustre log dump |
| Comment by Hongchao Zhang [ 28/Oct/13 ] |
|
Hi Kit, did you apply any patch on top of the original 2.1.5? The "-EAGAIN" error originates from ptlrpc_check_set as per the logs, and it should be from "ptlrpc_import_delay_req", but the logs show the cause of the issue (the fake request failed and was not restarted while the OSC was recovering). I will try to create the updated patch. Thanks! |
| Comment by Kit Westneat (Inactive) [ 31/Oct/13 ] |
|
I think it was stock 2.1.5; I will see if I can find any patches. Any updates on creating the patch? The customer is getting impatient. When can I tell them to expect the fix? Thanks, |
| Comment by Hongchao Zhang [ 01/Nov/13 ] |
|
the patch is under test and will be attached soon |
| Comment by Hongchao Zhang [ 01/Nov/13 ] |
|
the new patch is tracked at http://review.whamcloud.com/#/c/7559/ |
| Comment by Kit Westneat (Inactive) [ 05/Nov/13 ] |
|
Hi Hongchao, Still getting errors: Nov 4 21:33:56 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_create operation failed with -107 Nov 4 21:33:56 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete Nov 4 21:33:56 pfscn2 kernel: : LustreError: 9244:0:(layout.c:1659:__req_capsule_get()) @@@ Wrong buffer for field `ost_body' (1 of 1) in format `OST_CREATE': 0 vs. 208 (server) Nov 4 21:33:56 pfscn2 kernel: : req@ffff8808092fa000 x1450802648479850/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/192 e 0 to 0 dl 1383597243 ref 1 fl Interpret:RN/0/0 rc -107/-107 Nov 4 21:33:56 pfscn2 kernel: : LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_connect operation failed with -19 Nov 4 21:33:56 pfscn2 kernel: : LustreError: 9244:0:(osc_create.c:177:osc_interpret_create()) @@@ Unknown rc -107 from async create: failing oscc req@ffff8808092fa000 x1450802648479850/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/192 e 0 to 0 dl 1383597243 ref 1 fl Interpret:RN/0/0 rc -107/-107 Nov 4 21:33:56 pfscn2 kernel: : LustreError: 12037:0:(lov_request.c:579:lov_update_create_set()) error creating fid 0x1377d sub-object on OST idx 0/1: rc = -5 Nov 4 21:34:48 pfscn2 kernel: : Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Are you able to reproduce the error? Basically we are doing it by having 8 clients writing small files, then failing over an OSS. |
| Comment by Kit Westneat (Inactive) [ 05/Nov/13 ] |
|
Do you think that patchset 4 would fix this issue? It actually seems as if we are hitting 2 separate issues depending on where in the code path the OSS fails. |
| Comment by Kit Westneat (Inactive) [ 05/Nov/13 ] |
|
With both patchset 4 and 5, it seems to solve the issue: LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_create operation failed with -107 Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete LustreError: 9266:0:(layout.c:1659:__req_capsule_get()) @@@ Wrong buffer for field `ost_body' (1 of 1) in format `OST_CREATE': 0 vs. 208 (server) req@ffff880808005800 x1450889175112924/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/192 e 0 to 0 dl 1383677590 ref 1 fl Interpret:RN/0/0 rc -107/-107 LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_connect operation failed with -19 LustreError: 9266:0:(osc_create.c:177:osc_interpret_create()) @@@ Unknown rc -107 from async create: failing oscc req@ffff880808005800 x1450889175112924/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/192 e 0 to 0 dl 1383677590 ref 1 fl Interpret:RN/0/0 rc -107/-107 Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The ost_statfs operation failed with -107 Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib) Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans LustreError: 11-0: an error occurred while communicating with 172.26.17.4@o2ib. The ost_statfs operation failed with -107 LustreError: Skipped 1 previous similar message Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.4@o2ib) was lost; in progress operations using this service will wait for recovery to complete LustreError: 9266:0:(osc_create.c:177:osc_interpret_create()) @@@ Unknown rc -19 from async create: failing oscc req@ffff881001a70000 x1450889176260853/t0(0) o5->pfscdat2-OST0000-osc-MDT0000@172.26.17.4@o2ib:7/4 lens 400/400 e 0 to 0 dl 1383678377 ref 1 fl Interpret:RN/0/0 rc -19/-19 Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.3@o2ib) Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans LustreError: 11-0: an error occurred while communicating with 172.26.17.3@o2ib. The obd_ping operation failed with -107 LustreError: Skipped 2 previous similar messages Lustre: pfscdat2-OST0000-osc-MDT0000: Connection to pfscdat2-OST0000 (at 172.26.17.3@o2ib) was lost; in progress operations using this service will wait for recovery to complete Lustre: pfscdat2-OST0000-osc-MDT0000: Connection restored to pfscdat2-OST0000 (at 172.26.17.4@o2ib) Lustre: MDS mdd_obd-pfscdat2-MDT0000: pfscdat2-OST0000_UUID now active, resetting orphans This error message looks somewhat concerning: LustreError: 9266:0:(layout.c:1659:__req_capsule_get()) @@@ Wrong buffer for field `ost_body' (1 of 1) in format `OST_CREATE': 0 vs. 208 (server) Do you know if this should cause issues? |
| Comment by Hongchao Zhang [ 06/Nov/13 ] |
|
This error message should not cause issues; it only detected that there was no "ost_body" buffer in the reply message (server side) of the failed ost_create request.
        if (field->rmf_flags & RMF_F_STRUCT_ARRAY) {
                /*
                 * We've already asserted that field->rmf_size > 0 in
                 * req_layout_init().
                 */
                len = lustre_msg_buflen(msg, offset);
                if ((len % field->rmf_size) != 0) {
                        CERROR("%s: array field size mismatch "
                               "%d modulo %d != 0 (%d)\n",
                               field->rmf_name, len, field->rmf_size, loc);
                        return NULL;
                }
        } else if (pill->rc_area[loc][offset] != -1) {
                len = pill->rc_area[loc][offset];
        } else {
                len = max(field->rmf_size, 0);
        }

        value = getter(msg, offset, len);

        if (value == NULL) {
                DEBUG_REQ(D_ERROR, pill->rc_req,
                          "Wrong buffer for field `%s' (%d of %d) "
                          "in format `%s': %d vs. %d (%s)\n",
                          field->rmf_name, offset, lustre_msg_bufcount(msg),
                          fmt->rf_name, lustre_msg_buflen(msg, offset),
                          len, rcl_names[loc]);
Yes, the two patches should be combined, and the patch has been updated. |
| Comment by Kit Westneat (Inactive) [ 06/Nov/13 ] |
|
Would it be possible to suppress the warning message in this case? I think it is going to confuse sysadmins. |
| Comment by Hongchao Zhang [ 07/Nov/13 ] |
|
Okay, the patch has been updated to suppress this error message if the corresponding request failed. Thanks |
| Comment by Kit Westneat (Inactive) [ 07/Nov/13 ] |
|
Perfect! I confirmed with the customer that this patch fixes their issue, so once this lands, I guess we can close the ticket. |
| Comment by Kit Westneat (Inactive) [ 13/Nov/13 ] |
|
Hi Hongchao, Do you think something like this patch is necessary on master? The osc appears to be a lot different in that version. Thanks, |
| Comment by Hongchao Zhang [ 16/Nov/13 ] |
|
Hi Kit, on master the object precreation already waits in "osp_declare_object_create"/"osp_precreate_reserve"/"osp_precreate_ready_condition" for the recovering case:
static int osp_precreate_ready_condition(const struct lu_env *env,
                                         struct osp_device *d)
{
        if (d->opd_pre_recovering)
                return 0;

        /* ready if got enough precreated objects */
        /* we need to wait for others (opd_pre_reserved) and our object (+1) */
        if (d->opd_pre_reserved + 1 < osp_objs_precreated(env, d))
                return 1;

        /* ready if OST reported no space and no destroys in progress */
        if (d->opd_syn_changes + d->opd_syn_rpc_in_progress == 0 &&
            d->opd_pre_status == -ENOSPC)
                return 1;

        /* Bail out I/O fails to OST */
        if (d->opd_pre_status == -EIO)
                return 1;

        return 0;
} |