[LU-4584] Lock revocation process fails consistently Created: 05/Feb/14 Updated: 13/Oct/21 Resolved: 12/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 1 |
| Labels: | llnl | ||
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 12530 |
| Description |
|
Some users have reported to us that the "rm" command is taking a long time. Some investigation revealed that at least the first "rm" in a directory takes just over 100 seconds, which of course sounds like OBD_TIMEOUT_DEFAULT. This isn't necessarily the simplest reproducer, but the following one is completely consistent:
The clients are running 2.4.0-19chaos, and the servers are at 2.4.0-21chaos. The servers use zfs as the backend. I have some lustre logs that I will share and discuss in additional posts to this ticket. But essentially it looks like the server always times out on an AST to client A (explaining the 100 second delay). It is not yet clear to me why that happens, because client A appears to be completely responsive. My current suspicion is that the MDT is to blame. |
| Comments |
| Comment by Christopher Morrone [ 05/Feb/14 ] |
|
Actually, it looks like the MDT evicts both the client on which the file was created and the client that is running robinhood (changelog client). Here is an example. The players are:
So I start by doing "rm foo" on 172.16.66.5@tcp:
00000080:00200000:10.0:1391563489.083622:0:212662:0:(file.c:3001:ll_inode_permission()) VFS Op:inode=144115923974954140/33554603(ffff8817ad125238), inode mode 41c0 mask 1
00000080:00200000:10.0:1391563489.083623:0:212662:0:(dcache.c:401:ll_revalidate_it()) VFS Op:name=foo,intent=getattr
00000080:00200000:10.0:1391563489.083625:0:212662:0:(file.c:3001:ll_inode_permission()) VFS Op:inode=144115923974954140/33554603(ffff8817ad125238), inode mode 41c0 mask 3
00000080:00200000:10.0:1391563489.083627:0:212662:0:(namei.c:1263:ll_unlink_generic()) VFS Op:name=foo,dir=144115923974954140/33554603(ffff8817ad125238)
00000100:00100000:10.0:1391563489.083656:0:212662:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc rm:a3cebad5-b812-989a-0fc6-cab928270ea2:212662:1457223513206688:172.20.20.201@o2ib500:36
00000100:00100000:10.0:1391563489.083668:0:212662:0:(client.c:2132:ptlrpc_set_wait()) set ffff8810df789240 going to sleep for 61 seconds
Let's follow that RPC over to the MDS on 172.20.20.201@o2ib500:
00000100:00100000:8.0:1391563489.083302:0:18680:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1457223513206688
00000100:00100000:8.0:1391563489.083331:0:18680:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.16.66.5@tcp, seq: 322652232
00000100:00100000:8.0:1391563489.083335:0:18680:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt02_078:a3cebad5-b812-989a-0fc6-cab928270ea2+86:212662:x1457223513206688:12345-172.16.66.5@tcp:36
00000004:00000002:8.0:1391563489.083339:0:18680:0:(mdt_handler.c:1806:mdt_reint_opcode()) @@@ reint opt = 4 req@ffff88082d057800 x1457223513206688/t0(0) o36->a3cebad5-b812-989a-0fc6-cab928270ea2@172.16.66.5@tcp:0/0 lens 488/0 e 0 to 0 dl 1391563550 ref 1 fl Interpret:/0/ffffffff rc 0/-1
00000004:00000002:8.0:1391563489.083347:0:18680:0:(mdt_reint.c:686:mdt_reint_unlink()) @@@ unlink 
[0x20000ab57:0x11c9c:0x0]/foo req@ffff88082d057800 x1457223513206688/t0(0) o36->a3cebad5-b812-989a-0fc6-cab928270ea2@172.16.66.5@tcp:0/0 lens 488/696 e 0 to 0 dl 1391563550 ref 1 fl Interpret:/0/0 rc 0/0 00000004:00000002:8.0:1391563489.083361:0:18680:0:(mdt_reint.c:100:mdt_obj_version_get()) FID [0x20000ab57:0x11c9c:0x0] version is 0x14183989a3 00000100:00100000:8.0:1391563489.083383:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372844:172.16.66.5@tcp:104 00000100:00100000:8.0:1391563489.083395:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372848:172.16.66.30@tcp:104 00000100:00100000:8.0:1391563489.083404:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372852:172.16.66.30@tcp:104 00000100:00100000:8.0:1391563489.083411:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372856:172.16.66.4@tcp:104 00000100:00100000:8.0:1391563489.083419:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372860:172.16.66.4@tcp:104 00000100:00100000:8.0:1391563489.083425:0:18680:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372864:172.16.66.4@tcp:104 00000100:00100000:8.0:1391563489.083429:0:18680:0:(client.c:2132:ptlrpc_set_wait()) set ffff880e25a0c900 going to sleep for 61 seconds 00000100:00100000:8.0:1391563489.084163:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372852:172.16.66.30@tcp:104 00000100:00100000:8.0:1391563489.084185:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC 
pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372856:172.16.66.4@tcp:104 00000100:00100000:8.0:1391563489.084191:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372844:172.16.66.5@tcp:104 00000100:00100000:8.0:1391563489.084194:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372860:172.16.66.4@tcp:104 00000100:00100000:8.0:1391563489.084234:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372848:172.16.66.30@tcp:104 00000100:00100000:8.0:1391563489.084240:0:18680:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc mdt02_078:fsv-MDT0000_UUID:18680:1458673558372864:172.16.66.4@tcp:104 00000100:00100000:6.0:1391563489.084258:0:14480:0:(events.c:352:request_in_callback()) peer: 12345-172.16.66.30@tcp 00000100:00100000:13.0:1391563489.084300:0:32258:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1455014558386756 00000100:00100000:13.0:1391563489.084310:0:32258:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.16.66.30@tcp, seq: 24327657 00000100:00100000:13.0:1391563489.084314:0:32258:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cn03_006:2120d720-a7c5-a236-f306-8bdb94a8b962+9371:6961:x1455014558386756:12345-172.16.66.30@tcp:103 00010000:00000002:13.0:1391563489.084317:0:32258:0:(ldlm_lockd.c:2368:ldlm_cancel_handler()) cancel 00000100:00100000:13.0:1391563489.084332:0:32258:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cn03_006:2120d720-a7c5-a236-f306-8bdb94a8b962+9370:6961:x1455014558386756:12345-172.16.66.30@tcp:103 Request procesed in 21us (75us total) trans 0 rc 0/0 
00000100:00100000:13.0:1391563489.084334:0:32258:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.16.66.30@tcp, seq: 24327657 00000100:00100000:14.0:1391563489.084423:0:14484:0:(events.c:352:request_in_callback()) peer: 12345-172.16.66.4@tcp 00000100:00100000:13.0:1391563489.084426:0:32258:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1450886429351844 00000100:00100000:13.0:1391563489.084432:0:32258:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.16.66.4@tcp, seq: 24327658 00000100:00100000:13.0:1391563489.084434:0:32258:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cn03_006:1444d025-e75e-a124-88f8-c2f78d40926b+2471:10919:x1450886429351844:12345-172.16.66.4@tcp:103 00010000:00000002:13.0:1391563489.084436:0:32258:0:(ldlm_lockd.c:2368:ldlm_cancel_handler()) cancel 00000100:00100000:13.0:1391563489.084444:0:32258:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cn03_006:1444d025-e75e-a124-88f8-c2f78d40926b+2470:10919:x1450886429351844:12345-172.16.66.4@tcp:103 Request procesed in 11us (23us total) trans 0 rc 0/0 00000100:00100000:13.0:1391563489.084447:0:32258:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.16.66.4@tcp, seq: 24327658 So the MDS node immediately turns around and creates 6 RPCs, 3 for 172.16.66.4@tcp, 1 for 172.16.66.5@tcp, and 2 for 172.16.66.30@tcp. Those RPCs all appear to complete in very little time. Shortly after that, the server receives LDLM_CANCEL rpcs from both 172.16.66.4@tcp (the client where the file was originally created) and 172.16.66.30@tcp (the robinhood client node). Here is the 172.16.66.4@tcp client handling the three LDLM_BL_CALLBACK callbacks. For whatever reason, it only seems to result in a single LDLM_CANCEL back to the server. Is that a problem? 
00000100:00100000:14.0F:1391563489.084416:0:9947:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500 00000100:00100000:13.0F:1391563489.084429:0:9949:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500 00000100:00100000:8.0F:1391563489.084461:0:51449:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673558372856 00000100:00100000:9.0F:1391563489.084465:0:11003:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673558372860 00000100:00100000:4.0F:1391563489.084487:0:9943:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500 00000100:00100000:9.0:1391563489.084497:0:11003:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673558372864 00000100:00100000:8.0:1391563489.084503:0:51449:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 1412060 00000100:00100000:8.0:1391563489.084510:0:51449:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_005:LMV_MDC_UUID+8:18680:x1458673558372860:12345-172.20.20.201@o2ib500:104 00000100:00100000:9.0:1391563489.084515:0:11003:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 1412061 00000100:00100000:9.0:1391563489.084522:0:11003:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_000:LMV_MDC_UUID+8:18680:x1458673558372856:12345-172.20.20.201@o2ib500:104 00000100:00100000:8.0:1391563489.084567:0:51449:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_005:LMV_MDC_UUID+8:18680:x1458673558372860:12345-172.20.20.201@o2ib500:104 Request procesed in 61us (140us total) trans 0 rc 0/0 00000100:00100000:10.0F:1391563489.084571:0:48512:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 1412062 
00000100:00100000:9.0:1391563489.084571:0:11003:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_000:LMV_MDC_UUID+8:18680:x1458673558372856:12345-172.20.20.201@o2ib500:104 Request procesed in 53us (158us total) trans 0 rc 0/0 00000100:00100000:8.0:1391563489.084575:0:51449:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 1412060 00000100:00100000:9.0:1391563489.084577:0:11003:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 1412061 00000100:00100000:10.0:1391563489.084581:0:48512:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_003:LMV_MDC_UUID+6:18680:x1458673558372864:12345-172.20.20.201@o2ib500:104 00000100:00100000:10.0:1391563489.084601:0:48512:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_003:LMV_MDC_UUID+6:18680:x1458673558372864:12345-172.20.20.201@o2ib500:104 Request procesed in 26us (124us total) trans 0 rc -22/-22 00000100:00100000:10.0:1391563489.084604:0:48512:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 1412062 00000100:00100000:5.0F:1391563489.084744:0:10919:0:(ptlrpcd.c:387:ptlrpcd_check()) transfer 1 async RPCs [2->3] 00000100:00100000:5.0:1391563489.084750:0:10919:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_3:1444d025-e75e-a124-88f8-c2f78d40926b:10919:1450886429351844:172.20.20.201@o2ib500:103 00000100:00100000:5.0:1391563489.085483:0:10919:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_3:1444d025-e75e-a124-88f8-c2f78d40926b:10919:1450886429351844:172.20.20.201@o2ib500:103 But the client seems to think its work is done, because the only rpcs I see it sending to the server over the next 100 seconds are the normal OBD_PING aliveness checks. 
The server is clearly not satisfied, because later, after the 100 second timeout, it evicts both 172.16.66.4@tcp and 172.16.66.30@tcp. Here is the server doing that:
00010000:00020000:8.1F:1391563589.243260:0:0:0:(ldlm_lockd.c:391:waiting_locks_callback()) ### lock callback timer expired after 287s: evicting client at 172.16.66.30@tcp ns: mdt-fsv-MDT0000_UUID lock: ffff880f091cea00/0xf70254b1ce3273a8 lrc: 3/0,0 mode: PR/PR res: 8589983111/32 bits 0x1b rrc: 3 type: IBT flags: 0x200000000020 nid: 172.16.66.30@tcp remote: 0x60ad8f9843a9644 expref: 9434 pid: 16453 timeout: 4759057840 lvb_type: 0
...
00010000:00020000:8.1:1391563589.283179:0:0:0:(ldlm_lockd.c:391:waiting_locks_callback()) ### lock callback timer expired after 302s: evicting client at 172.16.66.4@tcp ns: mdt-fsv-MDT0000_UUID lock: ffff880f17f60200/0xf70254b1ce2bd28a lrc: 3/0,0 mode: CR/CR res: 8589983111/32 bits 0x8 rrc: 3 type: IBT flags: 0x20 nid: 172.16.66.4@tcp remote: 0x1dfa39e9c64af6c1 expref: 2469 pid: 17054 timeout: 4759057840 lvb_type: 3
Note that the "timer expired after" numbers are bogus (which we recognized in LU-4579). If you check the timestamps you'll see that it really is 100 seconds after the lock cancel was issued. 
After the evictions the server can finally reply to the initial unlink rpc: 00000004:00000002:9.0:1391563589.289089:0:18680:0:(mdt_reint.c:100:mdt_obj_version_get()) FID [0x20000bd87:0x20:0x0] version is 0x141a11cf09 00000040:00100000:9.0:1391563589.293701:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f3d:1: idx: 34051, 64 00000040:00100000:9.0:1391563589.293729:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ede:1: idx: 34754, 64 00000040:00100000:9.0:1391563589.293743:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7edf:1: idx: 34957, 64 00000040:00100000:9.0:1391563589.293757:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee0:1: idx: 34238, 64 00000040:00100000:9.0:1391563589.293772:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee1:1: idx: 34149, 64 00000040:00100000:9.0:1391563589.293787:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee2:1: idx: 34303, 64 00000040:00100000:9.0:1391563589.293803:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee3:1: idx: 35304, 64 00000040:00100000:9.0:1391563589.293827:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee4:1: idx: 34051, 64 00000040:00100000:9.0:1391563589.293842:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee5:1: idx: 36690, 64 00000040:00100000:9.0:1391563589.293857:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee6:1: idx: 35442, 64 00000040:00100000:9.0:1391563589.293883:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee7:1: idx: 35208, 64 00000040:00100000:9.0:1391563589.293898:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee8:1: idx: 35410, 64 00000040:00100000:9.0:1391563589.293914:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ee9:1: idx: 34747, 64 00000040:00100000:9.0:1391563589.293940:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eea:1: idx: 33825, 64 
00000040:00100000:9.0:1391563589.293955:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eeb:1: idx: 33073, 64 00000040:00100000:9.0:1391563589.293969:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eec:1: idx: 35776, 64 00000040:00100000:9.0:1391563589.293984:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eed:1: idx: 34923, 64 00000040:00100000:9.0:1391563589.293999:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eee:1: idx: 34477, 64 00000040:00100000:9.0:1391563589.294014:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eef:1: idx: 34434, 64 00000040:00100000:9.0:1391563589.294029:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef0:1: idx: 34534, 64 00000040:00100000:9.0:1391563589.294043:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef1:1: idx: 34941, 64 00000040:00100000:9.0:1391563589.294068:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef2:1: idx: 35734, 64 00000040:00100000:9.0:1391563589.294083:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef3:1: idx: 35572, 64 00000040:00100000:9.0:1391563589.294097:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef4:1: idx: 35245, 64 00000040:00100000:9.0:1391563589.294122:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef5:1: idx: 35781, 64 00000040:00100000:9.0:1391563589.294136:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef6:1: idx: 35071, 64 00000040:00100000:9.0:1391563589.294151:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef7:1: idx: 35647, 64 00000040:00100000:9.0:1391563589.294166:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef8:1: idx: 35247, 64 00000040:00100000:9.0:1391563589.294181:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7ef9:1: idx: 34787, 64 00000040:00100000:9.0:1391563589.294205:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 
0x7efa:1: idx: 34774, 64 00000040:00100000:9.0:1391563589.294221:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7efb:1: idx: 35288, 64 00000040:00100000:9.0:1391563589.294236:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7efc:1: idx: 35788, 64 00000040:00100000:9.0:1391563589.294253:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7efd:1: idx: 34607, 64 00000040:00100000:9.0:1391563589.294284:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7efe:1: idx: 35667, 64 00000040:00100000:9.0:1391563589.294300:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7eff:1: idx: 35319, 64 00000040:00100000:9.0:1391563589.294317:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f00:1: idx: 35735, 64 00000040:00100000:9.0:1391563589.294343:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f01:1: idx: 34925, 64 00000040:00100000:9.0:1391563589.294359:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f02:1: idx: 35698, 64 00000040:00100000:9.0:1391563589.294374:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f03:1: idx: 34234, 64 00000040:00100000:9.0:1391563589.294390:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f04:1: idx: 36572, 64 00000040:00100000:9.0:1391563589.294405:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f05:1: idx: 35444, 64 00000040:00100000:9.0:1391563589.294421:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f06:1: idx: 35313, 64 00000040:00100000:9.0:1391563589.294436:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f07:1: idx: 34590, 64 00000040:00100000:9.0:1391563589.294451:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f08:1: idx: 35230, 64 00000040:00100000:9.0:1391563589.294467:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f09:1: idx: 35106, 64 
00000040:00100000:9.0:1391563589.294482:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f0a:1: idx: 35179, 64 00000040:00100000:9.0:1391563589.294498:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f0b:1: idx: 34959, 64 00000040:00100000:9.0:1391563589.294512:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x7f0c:1: idx: 36182, 64 00000040:00100000:9.0:1391563589.294532:0:18680:0:(llog_osd.c:499:llog_osd_write_rec()) added record 0x829e:1: idx: 43128, 96 00000004:00000002:9.0:1391563589.294535:0:18680:0:(mdt_recovery.c:562:mdt_txn_stop_cb()) transno = 86336738210, last_committed = 86336737853 00000004:00000002:9.0:1391563589.294876:0:18680:0:(mdt_handler.c:404:mdt_pack_attr2body()) [0x20000bd87:0x20:0x0]: nlink=0, mode=100600, size=0 00000100:00100000:9.0:1391563589.294944:0:18680:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc mdt02_078:a3cebad5-b812-989a-0fc6-cab928270ea2+85:212662:x1457223513206688:12345-172.16.66.5@tcp:36 Request procesed in 100211610us (100211668us total) trans 86336738210 rc 0/0 00000100:00100000:9.0:1391563589.294947:0:18680:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.16.66.5@tcp, seq: 322652232 This info was gathered on production systems, so I've selected key info that I thought was relevant. I will also attach the full logs, in case anyone wants to sort through and look for something else. |
| Comment by Christopher Morrone [ 05/Feb/14 ] |
|
This won't be terribly surprising, but here is the backtrace from the mdt thread that handled the initial unlink-related RPC. This backtrace was taken in the middle of the 100s waiting period:
#0 [ffff88100d3af8d0] schedule at ffffffff81510bd2
#1 [ffff88100d3af998] schedule_timeout+0x192 at ffffffff81511a32
#2 [ffff88100d3afa48] cfs_waitq_timedwait+0x11 at ffffffffa05bc751 [libcfs]
#3 [ffff88100d3afa58] ldlm_completion_ast+0x4ed at ffffffffa089143d [ptlrpc]
#4 [ffff88100d3afb08] ldlm_cli_enqueue_local+0x1f8 at ffffffffa0890b78 [ptlrpc] (inlined mdt_fid_lock)
#5 [ffff88100d3afb98] mdt_object_lock0+0x33b at ffffffffa0e3ac7b [mdt]
#6 [ffff88100d3afc48] mdt_object_lock+0x14 at ffffffffa0e3b4f4 [mdt]
#7 [ffff88100d3afc58] mdt_reint_unlink+0x66e at ffffffffa0e57f2e [mdt]
#8 [ffff88100d3afcd8] mdt_reint_rec+0x41 at ffffffffa0e54a31 [mdt]
#9 [ffff88100d3afcf8] mdt_reint_internal+0x4c3 at ffffffffa0e39ae3 [mdt]
#10 [ffff88100d3afd38] mdt_reint+0x44 at ffffffffa0e39de4 [mdt]
#11 [ffff88100d3afd58] mdt_handle_common+0x648 at ffffffffa0e3f758 [mdt]
#12 [ffff88100d3afda8] mds_regular_handle+0x15 at ffffffffa0e78405 [mdt]
#13 [ffff88100d3afdb8] ptlrpc_server_handle_request+0x398 at ffffffffa08caf88 [ptlrpc]
#14 [ffff88100d3afeb8] ptlrpc_main+0xace at ffffffffa08cc31e [ptlrpc]
#15 [ffff88100d3aff48] child_rip+0xa at ffffffff8100c10a
But this also happens to be a common stack trace from hundreds of threads on our MDS nodes when they go unresponsive for hours at a time. So I am keen to understand this problem soon, in case it turns out to be related to our wider production Lustre problems. |
| Comment by Bruno Faccini (Inactive) [ 05/Feb/14 ] |
|
Hello Chris, |
| Comment by Bruno Faccini (Inactive) [ 05/Feb/14 ] |
|
I am also trying to find/set-up a similar platform to run your reproducer there. |
| Comment by Christopher Morrone [ 06/Feb/14 ] |
|
I don't know yet whether a changelog consumer is a requirement either. I haven't been able to reproduce this problem in our testbed, which is both smaller and doesn't have a changelog user. But I was hoping that someone could explain to me how Lustre's protocol is expected to work. Since I have a great reproducer, I'm happy to investigate any point where the protocol is not behaving as expected. For instance, could someone explain what else the server might need from the clients? The clients have each sent an LDLM_CANCEL...is that not sufficient? Are the clients failing to do something, or did the server just fail to correctly associate those cancels with the outstanding RPC? Are the number of RPCs generated earlier in the sequence as expected? Do three LDLM_BL_CALLBACKs for the node where the file was created and two more for the node running the changelog client make sense? What is their purpose? |
| Comment by Bruno Faccini (Inactive) [ 06/Feb/14 ] |
|
I think it is likely that the multiple LDLM_CANCEL answers from the Clients have been packed into the same/single RPC. But to definitively validate this assumption I need more/full traces. So, could you enable both rpctrace and dlmtrace on the Client and Server/MDS sides, set /proc/sys/lustre/dump_on_eviction (it works only on Clients I think, so you will need to "lctl dk" on the MDS), perhaps also grow the debug-log buffer size, and then run your reproducer again? Or maybe you already have this in the full logs you promised to attach? BTW, the "rc -22/-22" during one of the LDLM_BL_CALLBACKs you posted also looks suspicious. |
| Comment by Christopher Morrone [ 06/Feb/14 ] |
|
Doh! I definitely uploaded the logs to some ticket, but I must have accidentally attached them to the wrong one. Here they are in the right place. |
| Comment by Christopher Morrone [ 06/Feb/14 ] |
Ah, is that perhaps what this shows? 00000100:00100000:5.0F:1391563489.084744:0:10919:0:(ptlrpcd.c:387:ptlrpcd_check()) transfer 1 async RPCs [2->3] |
| Comment by Christopher Morrone [ 07/Feb/14 ] |
|
See the attached file "client_log_20140206.txt". I can't get more detailed logs from the MDS right now because it is too heavily loaded to make a higher debugging level practical. But the new client log is from the client where I created a file named "foo". I did the "rm foo" on another client. This time I added "rpctrace dlmtrace vfstrace" to the log level on the client; dlmtrace is new in this log. With that, you can see better where the -22 comes from on the client. |
| Comment by Bruno Faccini (Inactive) [ 07/Feb/14 ] |
|
No, this trace means that 1 RPC has been handled by ptlrpcd #3 after being stolen from partner #2. The packed requests will appear in the "CDEBUG(D_DLMTRACE, "%d locks packed\n", packed);" msg/trace from ldlm_cancel_pack(). |
| Comment by Bruno Faccini (Inactive) [ 07/Feb/14 ] |
|
BTW, in the last Client log you attached (from the node doing the "touch"), there is only one ldlm_cancel_pack() trace, reporting a single LDLM_CANCEL RPC to be sent back to the MDS. It covers only 2 of the LDLM_BL_CALLBACKs sent by the Server, both for the same lock/handle ffff8801f8c5e200/0x6aff086a1c937ce8 as identified by the Client. The -22/-EINVAL is for the 3rd LDLM_BL_CALLBACK, which references another handle (0x6aff086a1c937ce1) for which the Client cannot find a corresponding lock:
00000100:00100000:3.0F:1391738160.718872:0:4105:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500
00000100:00100000:5.0F:1391738160.718889:0:4108:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500
00000100:00100000:0.0F:1391738160.718900:0:4106:0:(events.c:352:request_in_callback()) peer: 12345-172.20.20.201@o2ib500
00000100:00100000:8.0F:1391738160.718910:0:228232:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673631329952
00000100:00100000:8.0:1391738160.718931:0:228232:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673631329956
00000100:00100000:8.0:1391738160.718935:0:228232:0:(service.c:1867:ptlrpc_server_handle_req_in()) got req x1458673631329960
00000100:00100000:10.0F:1391738160.718940:0:177839:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 734293
00000100:00100000:8.0:1391738160.718954:0:228232:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 734294
00000100:00100000:10.0:1391738160.718961:0:177839:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_005:LMV_MDC_UUID+8:18904:x1458673631329952:12345-172.20.20.201@o2ib500:104
00000100:00100000:8.0:1391738160.718962:0:228232:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_003:LMV_MDC_UUID+8:18904:x1458673631329956:12345-172.20.20.201@o2ib500:104 
00000100:00100000:9.0F:1391738160.718968:0:5327:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-172.20.20.201@o2ib500, seq: 734295 00000100:00100000:9.0:1391738160.718978:0:5327:0:(service.c:2011:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_001:LMV_MDC_UUID+8:18904:x1458673631329960:12345-172.20.20.201@o2ib500:104 00010000:00010000:9.0:1391738160.718988:0:5327:0:(ldlm_lockd.c:2242:ldlm_callback_handler()) callback on lock 0x6aff086a1c937ce1 - lock disappeared 00000100:00100000:8.0:1391738160.718993:0:228232:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_003:LMV_MDC_UUID+8:18904:x1458673631329956:12345-172.20.20.201@o2ib500:104 Request procesed in 37us (106us total) trans 0 rc 0/0 00000100:00100000:10.0:1391738160.718996:0:177839:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_005:LMV_MDC_UUID+8:18904:x1458673631329952:12345-172.20.20.201@o2ib500:104 Request procesed in 42us (126us total) trans 0 rc 0/0 00000100:00100000:8.0:1391738160.718996:0:228232:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 734294 00000100:00100000:10.0:1391738160.718999:0:177839:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 734293 00010000:00010000:9.0:1391738160.719012:0:5327:0:(ldlm_lockd.c:2095:ldlm_callback_errmsg()) @@@ Operate with invalid parameter: [nid 12345-172.20.20.201@o2ib500] [rc 0] [lock 0x6aff086a1c937ce1] req@ffff8803db6f7c00 x1458673631329960/t0(0) o104->LMV_MDC_UUID@172.20.20.201@o2ib500:0/0 lens 296/192 e 0 to 0 dl 1391738221 ref 1 fl Interpret:/0/0 rc -22/-22 00000100:00100000:9.0:1391738160.719019:0:5327:0:(service.c:2055:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ldlm_cb02_001:LMV_MDC_UUID+6:18904:x1458673631329960:12345-172.20.20.201@o2ib500:104 
Request procesed in 48us (121us total) trans 0 rc -22/-22 00000100:00100000:9.0:1391738160.719022:0:5327:0:(nrs_fifo.c:244:nrs_fifo_req_stop()) NRS stop fifo request from 12345-172.20.20.201@o2ib500, seq: 734295 00010000:00010000:10.0:1391738160.719028:0:178165:0:(ldlm_lockd.c:1696:ldlm_handle_bl_callback()) ### client blocking AST callback handler ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/0,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x420000000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:8.0:1391738160.719031:0:178073:0:(ldlm_lockd.c:1696:ldlm_handle_bl_callback()) ### client blocking AST callback handler ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/0,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x420000000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:10.0:1391738160.719032:0:178165:0:(ldlm_lockd.c:1709:ldlm_handle_bl_callback()) Lock ffff8801f8c5e200 already unused, calling callback (ffffffffa0c7f470) 00010000:00010000:10.0:1391738160.719034:0:178165:0:(ldlm_request.c:1127:ldlm_cli_cancel_local()) ### client-side cancel ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 4/0,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x428400000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:8.0:1391738160.719034:0:178073:0:(ldlm_lockd.c:1709:ldlm_handle_bl_callback()) Lock ffff8801f8c5e200 already unused, calling callback (ffffffffa0c7f470) 00010000:00010000:8.0:1391738160.719036:0:178073:0:(ldlm_request.c:1358:ldlm_cli_cancel()) ### lock is already being destroyed 00010000:00010000:8.0:1391738160.719037:0:178073:0:(ldlm_lockd.c:1718:ldlm_handle_bl_callback()) ### client blocking callback handler END ns: 
fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 4/0,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x428400000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:10.0:1391738160.719076:0:178165:0:(ldlm_request.c:1186:ldlm_cancel_pack()) ### packing ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 2/0,0 mode: --/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0xc29400000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:10.0:1391738160.719079:0:178165:0:(ldlm_request.c:1190:ldlm_cancel_pack()) 1 locks packed 00010000:00010000:10.0:1391738160.719082:0:178165:0:(ldlm_lockd.c:1718:ldlm_handle_bl_callback()) ### client blocking callback handler END ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 1/0,0 mode: --/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0xc29400000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:10.0:1391738160.719085:0:178165:0:(ldlm_lock.c:219:ldlm_lock_put()) ### final lock_put on destroyed lock, freeing it. 
ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 0/0,0 mode: --/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0xc29400000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00000100:00100000:8.0:1391738160.719148:0:5249:0:(ptlrpcd.c:387:ptlrpcd_check()) transfer 1 async RPCs [11->10] 00000100:00100000:8.0:1391738160.719152:0:5249:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_10:a3cebad5-b812-989a-0fc6-cab928270ea2:5249:1457223537877828:172.20.20.201@o2ib500:103 00000100:00100000:8.0:1391738160.719934:0:5249:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_10:a3cebad5-b812-989a-0fc6-cab928270ea2:5249:1457223537877828:172.20.20.201@o2ib500:103 Going up in the log we can see the following traces sequence for 0x6aff086a1c937ce1 (failing) handle : 00000080:00200000:1.0:1391738148.867174:0:217102:0:(namei.c:503:ll_lookup_it()) VFS Op:name=foo,dir=144115923974954140/33554603(ffff881e3b002bf8),intent=open|creat 00000100:00100000:1.0:1391738148.867209:0:217102:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877616:172.20.20.201@o2ib500:700 00000100:00100000:1.0:1391738148.867234:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 going to sleep for 61 seconds 00000100:00100000:1.0:1391738148.868002:0:217102:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877616:172.20.20.201@o2ib500:700 00000002:00010000:1.0:1391738148.868013:0:217102:0:(mdc_locks.c:1083:mdc_intent_lock()) (name: foo,[0x20000be9b:0x1:0x0]) in obj [0x20000ab57:0x11c9c:0x0], intent: open|creat flags 0104502 00010000:00010000:1.0:1391738148.868041:0:217102:0:(ldlm_lock.c:796:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(CW) ns: 
fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 3/0,1 mode: --/CW res: 8589978455/72860 bits 0x0 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.868046:0:217102:0:(ldlm_request.c:926:ldlm_cli_enqueue()) ### client-side enqueue START, flags 1000 ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 3/0,1 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.868050:0:217102:0:(ldlm_request.c:988:ldlm_cli_enqueue()) ### sending request ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 3/0,1 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00000100:00100000:1.0:1391738148.868054:0:217102:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877620:172.20.20.201@o2ib500:101 00000100:00100000:1.0:1391738148.868065:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 going to sleep for 61 seconds 00000100:00100000:10.0F:1391738148.878667:0:4112:0:(events.c:126:reply_in_callback()) truncate req ffff880d7a7ab800 rpc 1136 - 2232+192 00000100:00100000:1.0:1391738148.878772:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 going to sleep for 61 seconds 00000100:00100000:1.0:1391738148.879522:0:217102:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877620:172.20.20.201@o2ib500:101 00010000:00010000:1.0:1391738148.879539:0:217102:0:(ldlm_request.c:552:ldlm_cli_enqueue_fini()) ### client-side enqueue END (ABORTED) ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 
lrc: 4/0,1 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879545:0:217102:0:(ldlm_request.c:501:failed_lock_cleanup()) ### setting FL_LOCAL_ONLY | LDLM_FL_FAILED | LDLM_FL_ATOMIC_CB | LDLM_FL_CBPENDING ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 4/0,1 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x206400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879548:0:217102:0:(ldlm_lock.c:849:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(CW) ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 4/0,1 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x206400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879551:0:217102:0:(ldlm_lock.c:900:ldlm_lock_decref_internal()) ### final decref done on cbpending lock ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 3/0,0 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x206400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879554:0:217102:0:(ldlm_lockd.c:1696:ldlm_handle_bl_callback()) ### client blocking AST callback handler ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 4/0,0 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x206400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879556:0:217102:0:(ldlm_lockd.c:1709:ldlm_handle_bl_callback()) Lock ffff8801f8c5e800 already unused, calling callback (ffffffffa0c7f470) 00010000:00010000:1.0:1391738148.879558:0:217102:0:(ldlm_request.c:1127:ldlm_cli_cancel_local()) ### client-side cancel ns: 
fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 5/0,0 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0x20e400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879562:0:217102:0:(ldlm_request.c:1140:ldlm_cli_cancel_local()) not sending request (at caller's instruction) 00010000:00010000:1.0:1391738148.879564:0:217102:0:(ldlm_lockd.c:1718:ldlm_handle_bl_callback()) ### client blocking callback handler END ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 3/0,0 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0xa0f400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00010000:00010000:1.0:1391738148.879567:0:217102:0:(ldlm_lock.c:219:ldlm_lock_put()) ### final lock_put on destroyed lock, freeing it. ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e800/0x6aff086a1c937ce1 lrc: 0/0,0 mode: --/CW res: 8589978455/72860 bits 0x2 rrc: 1 type: IBT flags: 0xa0f400000000 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 0 00000002:00100000:1.0:1391738148.879580:0:217102:0:(mdc_locks.c:579:mdc_finish_enqueue()) @@@ op: 3 disposition: 37, status: 0 req@ffff880d7a7ab800 x1457223537877620/t86458366655(86458366655) o101->fsv-MDT0000-mdc-ffff882019e97c00@172.20.20.201@o2ib500:12/10 lens 576/2232 e 0 to 0 dl 1391738254 ref 2 fl Complete:RP/4/0 rc 301/301 00000002:00100000:1.0:1391738148.879586:0:217102:0:(mdc_request.c:781:mdc_set_open_replay_data()) @@@ Set up open replay data req@ffff880d7a7ab800 x1457223537877620/t86458366655(86458366655) o101->fsv-MDT0000-mdc-ffff882019e97c00@172.20.20.201@o2ib500:12/10 lens 576/2232 e 0 to 0 dl 1391738254 ref 2 fl Complete:RP/4/0 rc 301/301 and then in immediate continuation the traces sequence for the "valid" handle (still within the same "touch" process context) : 
00000080:00200000:1.0:1391738148.879598:0:217102:0:(file.c:3001:ll_inode_permission()) VFS Op:inode=144115923974954140/33554603(ffff881e3b002bf8), inode mode 41c0 mask 3 00000080:00200000:1.0:1391738148.879600:0:217102:0:(namei.c:848:ll_create_it()) VFS Op:name=foo,dir=144115923974954140/33554603(ffff881e3b002bf8),intent=open|creat 00000080:00200000:1.0:1391738148.879642:0:217102:0:(llite_lib.c:1913:ll_read_inode2()) VFS Op:inode=[0x20000be9b:0x1:0x0](ffff880fcd53f678) 00000080:00200000:1.0:1391738148.879750:0:217102:0:(namei.c:170:ll_iget()) got inode: ffff880fcd53f678 for [0x20000be9b:0x1:0x0] 00000080:00010000:1.0:1391738148.879754:0:217102:0:(namei.c:817:ll_create_node()) setting l_ast_data to inode ffff880fcd53f678 (144116006720110593/33554622) 00000080:00010000:1.0:1391738148.879755:0:217102:0:(llite_internal.h:1541:ll_set_lock_data()) setting l_data to inode ffff880fcd53f678 (144116006720110593/33554622) for lock 0x0 00000080:00200000:1.0:1391738148.879759:0:217102:0:(file.c:520:ll_file_open()) VFS Op:inode=144116006720110593/33554622(ffff880fcd53f678), flags 104501 00000002:00100000:1.0:1391738148.879767:0:217102:0:(mdc_request.c:781:mdc_set_open_replay_data()) @@@ Set up open replay data req@ffff880d7a7ab800 x1457223537877620/t86458366655(86458366655) o101->fsv-MDT0000-mdc-ffff882019e97c00@172.20.20.201@o2ib500:12/10 lens 1688/2232 e 0 to 0 dl 1391738254 ref 2 fl Complete:RP/4/0 rc 301/301 00000080:00200000:1.0:1391738148.879772:0:217102:0:(file.c:3001:ll_inode_permission()) VFS Op:inode=144116006720110593/33554622(ffff880fcd53f678), inode mode 8180 mask 0 00000080:00200000:1.0:1391738148.879784:0:217102:0:(llite_lib.c:1501:ll_setattr_raw()) fsv: setattr inode ffff880fcd53f678/fid:[0x20000be9b:0x1:0x0] from 0 to 18446612189935735584, valid 70 00000100:00100000:1.0:1391738148.879797:0:217102:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc 
touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877624:172.20.20.201@o2ib500:36 00000100:00100000:1.0:1391738148.879808:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 going to sleep for 61 seconds 00000100:00100000:1.0:1391738148.880673:0:217102:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877624:172.20.20.201@o2ib500:36 00000080:00010000:1.0:1391738148.880691:0:217102:0:(file.c:3493:ll_layout_refresh()) ### fsv: requeue layout lock for file ffff880fcd53f678/[0x20000be9b:0x1:0x0]. 00010000:00010000:1.0:1391738148.880703:0:217102:0:(ldlm_lock.c:796:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(CR) ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: --/CR res: 8589983387/1 bits 0x0 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:1.0:1391738148.880706:0:217102:0:(ldlm_request.c:926:ldlm_cli_enqueue()) ### client-side enqueue START, flags 1000 ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: --/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:1.0:1391738148.880709:0:217102:0:(ldlm_request.c:988:ldlm_cli_enqueue()) ### sending request ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: --/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00000100:00100000:1.0:1391738148.880712:0:217102:0:(client.c:1441:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877628:172.20.20.201@o2ib500:101 00000100:00100000:1.0:1391738148.880724:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 
going to sleep for 61 seconds 00000100:00100000:6.0F:1391738148.881445:0:4110:0:(events.c:126:reply_in_callback()) truncate req ffff8808735dc000 rpc 416 - 1496+192 00000100:00100000:1.0:1391738148.881499:0:217102:0:(client.c:2132:ptlrpc_set_wait()) set ffff8804dd189200 going to sleep for 61 seconds 00000100:00100000:1.0:1391738148.882288:0:217102:0:(client.c:1809:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc touch:a3cebad5-b812-989a-0fc6-cab928270ea2:217102:1457223537877628:172.20.20.201@o2ib500:101 00010000:00010000:1.0:1391738148.882295:0:217102:0:(ldlm_lock.c:1090:ldlm_granted_list_add_lock()) ### About to add lock: ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 4/1,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:1.0:1391738148.882303:0:217102:0:(ldlm_request.c:704:ldlm_cli_enqueue_fini()) ### client-side enqueue END ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 4/1,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00000002:00100000:1.0:1391738148.882307:0:217102:0:(mdc_locks.c:579:mdc_finish_enqueue()) @@@ op: 1024 disposition: 0, status: 0 req@ffff8808735dc000 x1457223537877628/t0(0) o101->fsv-MDT0000-mdc-ffff882019e97c00@172.20.20.201@o2ib500:12/10 lens 376/1496 e 0 to 0 dl 1391738254 ref 1 fl Complete:R/2/0 rc 0/0 00000002:00010000:1.0:1391738148.882311:0:217102:0:(mdc_locks.c:699:mdc_finish_enqueue()) ### layout lock returned by: layout, lvb_len: 1184 ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 
00000080:00010000:1.0:1391738148.882317:0:217102:0:(llite_internal.h:1541:ll_set_lock_data()) setting l_data to inode ffff880fcd53f678 (144116006720110593/33554622) for lock 0x6aff086a1c937ce8 00000080:00010000:1.0:1391738148.882320:0:217102:0:(file.c:3330:ll_layout_lock_set()) ### File ffff880fcd53f678/[0x20000be9b:0x1:0x0] being reconfigured: 1. ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x0 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00000080:00200000:1.0:1391738148.882346:0:217102:0:(vvp_object.c:142:vvp_conf_set()) layout lock change: 4294967294 -> 0 00010000:00010000:1.0:1391738148.882359:0:217102:0:(ldlm_lock.c:849:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(CR) ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 3/1,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x20000000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 00010000:00010000:1.0:1391738148.882362:0:217102:0:(ldlm_lock.c:917:ldlm_lock_decref_internal()) ### add lock into lru list ns: fsv-MDT0000-mdc-ffff882019e97c00 lock: ffff8801f8c5e200/0x6aff086a1c937ce8 lrc: 2/0,0 mode: CR/CR res: 8589983387/1 bits 0x8 rrc: 1 type: IBT flags: 0x20000000000 nid: local remote: 0xf70254b5f35563f2 expref: -99 pid: 217102 timeout: 0 lvb_type: 3 ………. So it looks like the Server/MDS sent a ELDLM_LOCK_ABORTED, but was still expecting the Client to cancel it when Client found it not necessary ? |
| Comment by Christopher Morrone [ 07/Feb/14 ] |
We mirror our branches on github. See http://github.com/chaos/lustre. Both of the mentioned tags, 2.4.0-19chaos and 2.4.0-21chaos, are on the 2.4.0-llnl branch. |
| Comment by Bruno Faccini (Inactive) [ 13/Feb/14 ] |
|
The diffs between 2.4.0-19chaos and 2.4.0-21chaos are : commit 4496fd9562bb1b2b0d9e1f925d7771e8dd795f4d
Author: wang di <di.wang@intel.com>
Date: Thu Dec 12 10:33:56 2013 -0800
LU-4222 mdt: extra checking for getattr RPC.
It should check whether getattr RPC can hold
layout MD(RMF_MDT_MD), in case the client sends
some invalid RPC, which can cause panic on MDT.
Client will retrieve cl_max_md_size/cl_default_md_size
from MDS during mount process, so it will initialize
cl_max_md_size/cl_default_md_size before sending getattr
to MDS.
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ic12f281d574e7138ee0e1d5aae6161b7607837cc
commit 4cd160ea66f09c71ff5cdced1bd19c09998b69b1
Author: Bruno Faccini <bruno.faccini@intel.com>
Date: Thu Nov 14 17:20:00 2013 +0100
LU-4194 ldlm: set l_lvb_type coherent when layout is returned
In case layout has been packed into server reply when not
requested, lock l_lvb_type must be set accordingly.
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Iaf54c9ba27785e529f4f2fb967d2fad4fc1dfbcb
commit fa3ec2e46e0e57d0c05ada2e20276cb86b3d9840
Author: Christopher J. Morrone <morrone2@llnl.gov>
Date: Fri Nov 15 13:40:19 2013 -0800
LU-4194 ldlm: Make OBD_[ALLOC|FREE]_LARGE use consistent
struct ldlm_lock's l_lvb_data field is freed in ldlm_lock_put()
using OBD_FREE. However, some other code paths can attach
a buffer to l_lvb_data that was allocated using OBD_ALLOC_LARGE.
This can lead to a kfree() of a vmalloc()ed buffer, which can
trigger a kernel Oops.
Change-Id: Ic75a67530862eeb4d065c14bbbac80939bff5731
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
commit 3e6edac776dab99b6de5db91406bbc6350a2a4d9
Author: Niu Yawei <yawei.niu@intel.com>
Date: Wed Nov 6 02:52:32 2013 -0500
LU-2619 quota: add debug message in osd_acct_index_lookup()
Added debug messages in osd_acct_index_lookup().
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I6d7a1a55c0772255f8d4f4ec0f36cd69f73165ec
commit 101bf16586f8a1398be2c05c514ec14ab619b57e
Author: wang di <di.wang@intel.com>
Date: Thu Aug 21 20:47:26 2014 -0700
LU-4179 mdt: skip open lock enqueue during resent
Skip open lock enqueue, if the open lock has been
acquired(mdt_intent_fixup_resent) during resent.
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I625ca438e28520416ee2af884d0a9f9e6f21cf2e
commit e522f74ce63262fb16b747785c53f809dd19079a
Author: Hongchao Zhang <hongchao.zhang@intel.com>
Date: Mon Jul 22 05:40:37 2013 +0800
LU-2613 recovery: free open/close request promptly
- For the non-create open or committed open, the open request
should be freed along with the close request as soon as the
close done, despite that the transno of open/close is
greater than the last committed transno known by client or not.
- Move the committed open request into another dedicated list,
that will avoid scanning a huge replay list on receiving each
reply (when there are many open files).
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Change-Id: I1a25a35fe7a16681368d92d16964680b6209b3ee
Reviewed-on: http://review.whamcloud.com/6665
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
And I did not find anything in these changes that would indicate any incompatibility between Clients running 2.4.0-19chaos and Servers running 2.4.0-21chaos. I am now investigating the ELDLM_LOCK_ABORTED reason/handling in the code. |
| Comment by Zhenyu Xu [ 14/Feb/14 ] |
|
from 172.20.20.201@o2ib500.log (MDS log) 00010000:00020000:8.1:1391563589.283179:0:0:0:(ldlm_lockd.c:391:waiting_locks_callback()) ### lock callback timer expired after 302s: evicting client at 172.16.66.4@tcp ns: mdt-fsv-MDT0000_UUID lock: ffff880f17f60200/0xf70254b1ce2bd28a lrc: 3/0,0 mode: CR/CR res: 8589983111/32 bits 0x8 rrc: 3 type: IBT flags: 0x20 nid: 172.16.66.4@tcp remote: 0x1dfa39e9c64af6c1 expref: 2469 pid: 17054 timeout: 4759057840 lvb_type: 3 The problem lock's bits are 0x8 (MDS_INODELOCK_LAYOUT) and its lvb_type is 3 (LVB_T_LAYOUT), so this looks like a layout lock issue. |
| Comment by Bruno Faccini (Inactive) [ 17/Feb/14 ] |
|
Thanks Bobi for taking a look at the Server-side trace; until now I had only focused on the Client side and the client_log_20140206.txt log ... But it is not really helpful either, since it is the only trace about this particular lock in both the Client and Server logs. So we are stuck here, and as we discussed I started to build a platform to try to reproduce with full debug enabled, and to see whether the problem is related to a large striping layout, as we currently suspect, or to anything else. |
| Comment by Bruno Faccini (Inactive) [ 17/Feb/14 ] |
|
I built a configuration that allows a 48 stripe count on zfs, with robinhood, but I am unable to reproduce at the moment… I wonder if you could configure LDLM/RPC traces with a big debug buffer on your MDS, set dump_on_evict on the Clients, and re-run the reproducer during some quiet period, so that we get detailed traces for both sides? |
| Comment by Bruno Faccini (Inactive) [ 18/Feb/14 ] |
|
Finally I have been able to reproduce locally, running with Lustre 2.4.0-RC2 only!! So I have the full traces for both the MDS and Client sides. 00010000:00000040:29.0:1392685427.887184:0:127322:0:(ldlm_resource.c:1429:ldlm_resource_dump()) ### ### ns: mdt-lustrez-MDT0000_UUID lock: ffff880811cc8a00/0x2609ba00ab659d1c lrc: 2/0,0 mode: CR/CR res: 8589935617/2 bits 0x9 rrc: 9 type: IBT flags: 0x200000000000 nid: 10.10.18.181@tcp remote: 0x6963eaa818aefa09 expref: 15 pid: 103045 timeout: 0 lvb_type: 0 00010000:00000040:29.0:1392685427.887190:0:127322:0:(ldlm_resource.c:1429:ldlm_resource_dump()) ### ### ns: mdt-lustrez-MDT0000_UUID lock: ffff8808030dbc00/0x2609ba00ab659cf9 lrc: 2/0,0 mode: CR/CR res: 8589935617/2 bits 0x9 rrc: 9 type: IBT flags: 0x200000000000 nid: 10.10.18.181@tcp remote: 0x6963eaa818aefa09 expref: 15 pid: 103045 timeout: 0 lvb_type: 0 More to come. |
| Comment by Jinshan Xiong (Inactive) [ 18/Feb/14 ] |
|
Hi Bruno, Good job reproducing it locally. "And the -22/-EINVAL is for the 3rd LDLM_BL_CALLBACK and it concerns an other handle (0x6aff086a1c937ce1) that the Client can't find the corresponding lock " - Usually this means that the lock has already been canceled from the client side. This is a normal race. |
| Comment by Christopher Morrone [ 18/Feb/14 ] |
Just FYI, 2.4.0-RC2 and 2.4.0 tag the same commit. Are you suggesting that later 2.4.X tags do not have this problem? |
| Comment by Bruno Faccini (Inactive) [ 19/Feb/14 ] |
|
Chris: No, I just wanted to indicate that, according to our internal tag nomenclature, I was able to reproduce with the latest 2.4.0-RC2 build running on both the Client and Server sides. I will try to reproduce with later 2.4.1-RC2/2.4.2-RC2 versions and let you know. Jinshan: Ok, for the -22/EINVAL result due to the "normal" race. But concerning the problem itself, after more analysis of the local reproducer traces, it looks like on the Server side we are able to create 2 identical IBITS locks/handles for the same Client/remote, but then the Client only answers one of them during the Blocking-AST/Cancel process and simply trashes the other, which seems to lead to the situation where the Server later evicts the Client due to no answer for the 2nd lock. |
| Comment by Christopher Morrone [ 19/Feb/14 ] |
|
I think you guys only use the -RC2 nomenclature while it is a release candidate, and then drop the -RC2 when the real release is done. There is no reason to keep using the release candidate number when the release is finished. |
| Comment by Bruno Faccini (Inactive) [ 20/Feb/14 ] |
|
More analysis of the Lustre debug traces seems to indicate that the "orphan" lock on the MDS causing the Client eviction is the one from the 1st open request, acquired via mdt_reint_internal()/mdt_reint_rec()/mdt_reint_open()/mdt_object_open_lock(), while the "successfully" canceled one is from the 2nd/reconstructed open request via mdt_reint_internal()/mdt_check_resent()/mdt_reconstruct_open()/mdt_reint_open()/mdt_object_open_lock(). What is strange is that this should have been fixed with patch http://review.whamcloud.com/8142 from Also, I need to find the reason for the reconstruct-open situation. |
| Comment by Bruno Faccini (Inactive) [ 20/Feb/14 ] |
|
The reason for the resend (a resend only, no replay) from the Client is that the 1st reply from the Server was truncated: a reply buffer of only 2048 bytes had been reserved, while the answer (including the layout for a 48-stripe width) required 2232 bytes. |
| Comment by Bruno Faccini (Inactive) [ 20/Feb/14 ] |
|
I am looking for a possible fix (have the Client ask for a replay upon open reply truncation?), and also a possible work-around (such as growing the "default" reply buffer?). |
| Comment by Di Wang [ 20/Feb/14 ] |
|
I did not read through the whole discussion, but if the client resends an RPC with an open lock request, the server should return the original lock. If you see two locks there, something might be wrong IMHO. |
| Comment by Jinshan Xiong (Inactive) [ 21/Feb/14 ] |
|
Hi Bruno, Patch 8142 only applied to the open-by-name path. The same fix has to be applied to the mdt_open_by_fid_lock() code path as well. Will you please create a patch for it? Jinshan |
| Comment by Bruno Faccini (Inactive) [ 21/Feb/14 ] |
|
Di: thanks for pointing to Jinshan: I created a patch to reproduce Change #8142 from Chris: what was the reason for running 2.4.0-21chaos on the Servers, which includes the patch from Last, I pushed patch http://review.whamcloud.com/9326 to try my own idea that the Client should ask for a replay if a truncate occurs during open and the request is resent. |
| Comment by Christopher Morrone [ 21/Feb/14 ] |
|
Yes, we pulled that in because our MDS crashed with the same assertion and stack as |
| Comment by Bruno Faccini (Inactive) [ 25/Feb/14 ] |
|
Some more results/facts :
_ the reproducer triggers the problem on the latest b2_4 (ie, 2.4.2-RC2 + latest patches) and b2_5 builds. I will also try with the latest master, but I suspect the problem will be there too.
_ change #9342, where I simply reproduced the new controls and behavior to skip open_lock upon resend in mdt_reint_open() (by patch #8142 for
_ I abandoned change #9326, because after more analysis of the replay mechanisms, I found this approach to be wrong. The resend mechanisms are ok (with patch #8142 from
_ I had a discussion with Johann on this, and now I am working on a debug patch to print more info upon resend, both during the 1st lock retrieval via the hash mechanism that seems to fail, and during the 2nd/excess lock blocking callback that also seems to fail (the Client must answer, even to return an error) … |
| Comment by Bruno Faccini (Inactive) [ 27/Feb/14 ] |
|
After further reproducer simplification (RobinHood is no longer required; the reproducer is now only a "dd" command creating a 48-stripe file and a "file" command run in a row on the same node, followed by an "rm" from another node) and more debug trace analysis, it definitely seems that, as already indicated, things start to go wrong when a 2nd lock is enqueued on the Server upon truncate+resend. There are then 2 possible reasons why the 1st lock is not found during lookup :
_ the 1st lock has not been hashed, or has been unhashed
_ the hashing mechanism is buggy
I pushed a debug patch at http://review.whamcloud.com/9421 that will help find the real cause. Also interesting: I have not been able to reproduce with the latest master at this time …. |
| Comment by Bruno Faccini (Inactive) [ 03/Mar/14 ] |
|
Latest news, and not of lesser interest, for this ticket :
_ finally, I have been able to reproduce with the latest master !!!
_ I have been able to reproduce the problem running a b2_4 build including my debug patch, and I am now analyzing the traces. A first observation is that the hashing mechanism is working and is not causing the problem.
More to come. |
| Comment by Bruno Faccini (Inactive) [ 03/Mar/14 ] |
|
BTW, more in-depth analysis of the Lustre debug logs generated by my debug patch suggests that the old/1st lock is not found during hash lookup because the new lock has already been hashed !! I will try to push a new patch/fix proposal where new_lock hashing is deferred, or where the lookup can still find the old lock even if the new one is found 1st ... |
| Comment by Bruno Faccini (Inactive) [ 04/Mar/14 ] |
|
So, thanks to my additional debug traces from #9421, the 1st/original lock is not found during resend handling on the Server because of the early new-lock create+hash in ldlm_handle_enqueue0(): the cfs_hash_lookup() call in mdt_intent_fixup_resent() returns the new lock instead of the 1st/old one. I will try to develop a fix where mdt_intent_fixup_resent() uses cfs_hash_for_each_key() instead, with a callback that simply returns ok if the current/hashed lock is not new_lock, to allow the old/resent lock, if any, to be looked up. |
| Comment by Bruno Faccini (Inactive) [ 04/Mar/14 ] |
|
A new patch/fix attempt that implements the solution from the previous comment is at http://review.whamcloud.com/9488. |
| Comment by Bruno Faccini (Inactive) [ 07/Mar/14 ] |
|
With patch-set #3 of #9488, running the reproducer I triggered the crash/LBUG tracked in LustreError: 45837:0:(ldlm_lock.c:851:ldlm_lock_decref_internal_nolock()) ASSERTION( lock->l_readers > 0 ) failed: LustreError: 45837:0:(ldlm_lock.c:851:ldlm_lock_decref_internal_nolock()) LBUG Pid: 45837, comm: mdt00_002 Call Trace: [<ffffffffa105a895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] [<ffffffffa105ae97>] lbug_with_loc+0x47/0xb0 [libcfs] [<ffffffffa1218402>] ldlm_lock_decref_internal_nolock+0xd2/0x180 [ptlrpc] [<ffffffffa121cafd>] ldlm_lock_decref_internal+0x4d/0xaa0 [ptlrpc] [<ffffffffa1065d98>] ? libcfs_log_return+0x28/0x40 [libcfs] [<ffffffffa121dfb9>] ldlm_lock_decref+0x39/0x90 [ptlrpc] [<ffffffffa1791723>] mdt_save_lock+0x63/0x300 [mdt] [<ffffffffa1265890>] ? lustre_swab_ldlm_reply+0x0/0x40 [ptlrpc] [<ffffffffa1791a1c>] mdt_object_unlock+0x5c/0x160 [mdt] [<ffffffffa17bfccc>] mdt_object_open_unlock+0xac/0x110 [mdt] [<ffffffffa17c6ce4>] mdt_reint_open+0xdd4/0x21d0 [mdt] [<ffffffffa17c876c>] mdt_reconstruct_open+0x68c/0xc30 [mdt] [<ffffffffa106b2d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs] [<ffffffffa17bb415>] mdt_reconstruct+0x45/0x120 [mdt] [<ffffffffa1796d1b>] mdt_reint_internal+0x6bb/0x780 [mdt] [<ffffffffa17970ad>] mdt_intent_reint+0x1ed/0x520 [mdt] [<ffffffffa1794f5e>] mdt_intent_policy+0x39e/0x720 [mdt] [<ffffffffa121a831>] ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc] [<ffffffffa12411cf>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc] [<ffffffffa17953e6>] mdt_enqueue+0x46/0xe0 [mdt] [<ffffffffa179c797>] mdt_handle_common+0x647/0x16d0 [mdt] [<ffffffffa17d56e5>] mds_regular_handle+0x15/0x20 [mdt] [<ffffffffa12733b8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc] [<ffffffffa106ce05>] ? lc_watchdog_touch+0xd5/0x170 [libcfs] [<ffffffffa126a719>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc] [<ffffffffa106b2d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs] [<ffffffff81055ad3>] ? __wake_up+0x53/0x70 [<ffffffffa127474e>] ptlrpc_main+0xace/0x1700 [ptlrpc] [<ffffffffa1273c80>] ? 
ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffff8100c0ca>] child_rip+0xa/0x20 [<ffffffffa1273c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffffa1273c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Kernel panic - not syncing: LBUG Pid: 45837, comm: mdt00_002 Tainted: P --------------- 2.6.32-358.23.2.el6_lustre.gc4f4d18.x86_64 #1 Call Trace: [<ffffffff8150deec>] ? panic+0xa7/0x16f [<ffffffffa105aeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs] [<ffffffffa1218402>] ? ldlm_lock_decref_internal_nolock+0xd2/0x180 [ptlrpc] [<ffffffffa121cafd>] ? ldlm_lock_decref_internal+0x4d/0xaa0 [ptlrpc] [<ffffffffa1065d98>] ? libcfs_log_return+0x28/0x40 [libcfs] [<ffffffffa121dfb9>] ? ldlm_lock_decref+0x39/0x90 [ptlrpc] [<ffffffffa1791723>] ? mdt_save_lock+0x63/0x300 [mdt] [<ffffffffa1265890>] ? lustre_swab_ldlm_reply+0x0/0x40 [ptlrpc] [<ffffffffa1791a1c>] ? mdt_object_unlock+0x5c/0x160 [mdt] [<ffffffffa17bfccc>] ? mdt_object_open_unlock+0xac/0x110 [mdt] [<ffffffffa17c6ce4>] ? mdt_reint_open+0xdd4/0x21d0 [mdt] [<ffffffffa17c876c>] ? mdt_reconstruct_open+0x68c/0xc30 [mdt] [<ffffffffa106b2d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs] [<ffffffffa17bb415>] ? mdt_reconstruct+0x45/0x120 [mdt] [<ffffffffa1796d1b>] ? mdt_reint_internal+0x6bb/0x780 [mdt] [<ffffffffa17970ad>] ? mdt_intent_reint+0x1ed/0x520 [mdt] [<ffffffffa1794f5e>] ? mdt_intent_policy+0x39e/0x720 [mdt] [<ffffffffa121a831>] ? ldlm_lock_enqueue+0x361/0x8d0 [ptlrpc] [<ffffffffa12411cf>] ? ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc] [<ffffffffa17953e6>] ? mdt_enqueue+0x46/0xe0 [mdt] [<ffffffffa179c797>] ? mdt_handle_common+0x647/0x16d0 [mdt] [<ffffffffa17d56e5>] ? mds_regular_handle+0x15/0x20 [mdt] [<ffffffffa12733b8>] ? ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc] [<ffffffffa106ce05>] ? lc_watchdog_touch+0xd5/0x170 [libcfs] [<ffffffffa126a719>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc] [<ffffffffa106b2d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs] [<ffffffff81055ad3>] ? __wake_up+0x53/0x70 [<ffffffffa127474e>] ? 
ptlrpc_main+0xace/0x1700 [ptlrpc] [<ffffffffa1273c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffff8100c0ca>] ? child_rip+0xa/0x20 [<ffffffffa1273c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffffa1273c80>] ? ptlrpc_main+0x0/0x1700 [ptlrpc] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

This is because the lock (created or not) is expected to hold at least one reference when leaving mdt_reint_open(), which the earlier fix no longer guaranteed. So, after discussing this with Johann again, starting with patch-set #4, I now take a reference on an old lock found to be resent. |
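To make the failed assertion concrete: ldlm_lock_decref_internal_nolock() LBUGs when asked to drop a reader/writer reference from a lock that holds none. Below is a minimal, illustrative model of that invariant; the struct and function names (toy_lock, toy_lock_decref, toy_lock_addref) are hypothetical stand-ins, not actual Lustre code.

```c
#include <assert.h>

/* Hypothetical stand-in for the reader/writer counts kept on a DLM lock. */
struct toy_lock {
    int l_readers;
    int l_writers;
};

/* Returns 0 on success, -1 if the lock holds no reference of that kind
 * (where the real code instead trips ASSERTION( lock->l_readers > 0 )). */
static int toy_lock_decref(struct toy_lock *lock, int is_reader)
{
    if (is_reader) {
        if (lock->l_readers <= 0)
            return -1;            /* would be the LBUG in the real code */
        lock->l_readers--;
    } else {
        if (lock->l_writers <= 0)
            return -1;
        lock->l_writers--;
    }
    return 0;
}

static void toy_lock_addref(struct toy_lock *lock, int is_reader)
{
    if (is_reader)
        lock->l_readers++;
    else
        lock->l_writers++;
}
```

In the failing path above, the resent (old) lock left mdt_reint_open() without the expected reference, so the later decref in mdt_save_lock() hit a lock with zero readers; taking a reference on the found lock, as patch-set #4 does, keeps the model's invariant.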
| Comment by Bruno Faccini (Inactive) [ 07/Mar/14 ] |
|
My local+intensive tests of patch-set #5 with the reproducer are successful! I now take/restore the required reference, with ldlm_lock_addref(), on the old/original lock found upon truncate/resend (thus avoiding a new/duplicate lock being created and the first one becoming an orphan, which led to the evict during blocking-AST processing). But maybe more checks need to be done against racing (cancel?) access? Will see how the full auto-tests run. Di, if I want to write a master version of this patch, I will need to remove (or change?) your very restrictive test in the patch for |
| Comment by Bruno Faccini (Inactive) [ 10/Mar/14 ] |
|
Auto-tests of #9488 patch-set #5 have been successful too. So even though the patch still needs to be reviewed, I am pretty confident that it solves the original problem. I may push a new version of the patch with more protection against simultaneous actions/races, possibly based on ldlm_revalidate_lock_handle() instead of ldlm_lock_addref() to restore the reference on the lock. And to fit with patch #8642/#8680 for |
| Comment by Bruno Faccini (Inactive) [ 24/Mar/14 ] |
|
Di, after taking a more in-depth look into the source code, I don't think I can check for MDS_INODELOCK_LAYOUT in mdt_intent_fixup_resent(), because the decision to add layout to the open is made later, in mdt_object_open_lock()/try_layout ... BTW, the check/filter "if ((opcode != MDT_IT_OPEN) || (opcode == MDT_IT_OPEN && info->mti_spec.sp_cr_flags & MDS_OPEN_LOCK))" you added in the patch (#8642/b2_4 and #8680/master) for According to this, my patch version for the master branch is at http://review.whamcloud.com/9764. Since it is the master patch version where the patch for |
| Comment by Matt Ezell [ 30/Mar/14 ] |
|
We are seeing consistent evictions with 2.5 clients when running a shared-file IOR (stripe count 64). I'm seeing some similar messages in our client logs: 00010000:00010000:2.0:1396143171.559956:0:3053:0:(ldlm_lockd.c:1668:ldlm_handle_bl_callback()) Lock ffff8803f9c19100 already unused, calling callback (ffffffffa0793880) 00010000:00010000:2.0:1396143171.559958:0:3053:0:(ldlm_request.c:1135:ldlm_cli_cancel_local()) ### client-side cancel ns: atlastds-MDT0000-mdc-ffff880403f44c00 lock: ffff8803f9c19100/0x330d1ef03081582 lrc: 3/0,0 mode: CR/CR res: [0x200003ac6:0x648:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x428400000000 nid: local remote: 0xe7f79ec3f6545595 expref: -99 pid: 4138 timeout: 0 lvb_type: 3 00010000:00010000:2.0:1396143171.559987:0:3053:0:(ldlm_request.c:1194:ldlm_cancel_pack()) ### packing ns: atlastds-MDT0000-mdc-ffff880403f44c00 lock: ffff8803f9c19100/0x330d1ef03081582 lrc: 2/0,0 mode: --/CR res: [0x200003ac6:0x648:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x4c29400000000 nid: local remote: 0xe7f79ec3f6545595 expref: -99 pid: 4138 timeout: 0 lvb_type: 3 00010000:00010000:2.0:1396143171.559991:0:3053:0:(ldlm_request.c:1198:ldlm_cancel_pack()) 1 locks packed 00010000:00010000:2.0:1396143171.560003:0:3053:0:(ldlm_lockd.c:1677:ldlm_handle_bl_callback()) ### client blocking callback handler END ns: atlastds-MDT0000-mdc-ffff880403f44c00 lock: ffff8803f9c19100/0x330d1ef03081582 lrc: 1/0,0 mode: --/CR res: [0x200003ac6:0x648:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x4c29400000000 nid: local remote: 0xe7f79ec3f6545595 expref: -99 pid: 4138 timeout: 0 lvb_type: 3 00010000:00010000:2.0:1396143171.560007:0:3053:0:(ldlm_lock.c:219:ldlm_lock_put()) ### final lock_put on destroyed lock, freeing it. 
ns: atlastds-MDT0000-mdc-ffff880403f44c00 lock: ffff8803f9c19100/0x330d1ef03081582 lrc: 0/0,0 mode: --/CR res: [0x200003ac6:0x648:0x0].0 bits 0x9 rrc: 1 type: IBT flags: 0x4c29400000000 nid: local remote: 0xe7f79ec3f6545595 expref: -99 pid: 4138 timeout: 0 lvb_type: 3 00010000:00010000:5.0:1396143171.560209:0:2437:0:(ldlm_lockd.c:2194:ldlm_callback_handler()) callback on lock 0x330d1ef03081582 - lock disappeared 00010000:00010000:5.0:1396143171.560220:0:2437:0:(ldlm_lockd.c:2055:ldlm_callback_errmsg()) @@@ Operate with invalid parameter: [nid 12345-10.36.226.79@o2ib] [rc 0] [lock 0x330d1ef03081582] req@ffff8802b2680c00 x1463740310294316/t0(0) o104->LMV_MDC_UUID@10.36.226.79@o2ib:0/0 lens 296/192 e 0 to 0 dl 1396143263 ref 1 fl Interpret:/0/0 rc -22/-22 Is the re-send due to LLNL's client-side optimization to avoid vmallocs, or something else? If it is, with only 64 stripes I don't think a re-send is necessary. But I need to do some more analysis and get traces with RPCTRACE enabled. Bruno - is the patch ready for reviewers and possible landing, or is more work still needed? |
| Comment by Christopher Morrone [ 31/Mar/14 ] |
No, I have been waiting for the patch to get closer to final form. It sounds like we are close. |
| Comment by Oleg Drokin [ 31/Mar/14 ] |
|
Apparently this is a dup of LU-2827; there's even a test case in http://review.whamcloud.com/#/c/6000/ that we should use along with this proposed patch to ensure it works as expected |
| Comment by Oleg Drokin [ 01/Apr/14 ] |
|
BTW, Vitaly advises that he's almost done creating a proper patch for this issue (in |
| Comment by Bruno Faccini (Inactive) [ 02/Apr/14 ] |
|
Matt: Yes, the b2_4/master patches were ready for reviewers, which finally and very recently happened ... Chris: maybe I did not make it clear enough, but for about a month now patch #9488-5 has worked and fixed your problem with no regression in b2_4. And this applies to the master patch too. Oleg: Sorry, I had no idea of the parallel work at Xyratex... BTW, |
| Comment by James A Simmons [ 02/Apr/14 ] |
|
I tested both your b2_4 patch with 2.4 and your master patch with b2_5. Both times my MDS went up in flames. I'm in the process of asking the admins for the logs of the event. |
| Comment by James A Simmons [ 02/Apr/14 ] |
|
Here is the stack trace resulting from this patch. I named the log inflames.log |
| Comment by Christopher Morrone [ 02/Apr/14 ] |
That was fairly clear. And I really appreciate your work! But I'm not really ready to take a patch for code that I don't understand and run it in production until it has, at a minimum, gone through the code review process and is on its way to landing. |
| Comment by James A Simmons [ 03/Apr/14 ] |
|
I have a single-shared-file IOR job that duplicates this problem. I turned on all debugging and pushed the logs to ftp.whamcloud.com/uploads/ |
| Comment by James A Simmons [ 09/Apr/14 ] |
|
Uploaded clients debug logs to ftp.whamcloud.com/uploads/ |
| Comment by James A Simmons [ 16/Apr/14 ] |
|
The syslogs and nid mapping have also been uploaded to uploads/ |
| Comment by James A Simmons [ 21/Apr/14 ] |
|
Did the logs shed any light on this problem? |
| Comment by Bruno Faccini (Inactive) [ 28/Apr/14 ] |
|
James, I am sorry but I am puzzled by the latest info you uploaded.

Mar 31 14:18:21 atlas-tds-mds1 kernel: [ 565.057997] LustreError: 12309:0:(mdt_handler.c:3615:mdt_intent_lock_replace()) ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed:
Mar 31 14:18:21 atlas-tds-mds1 kernel: [ 565.058059] LustreError: 12460:0:(mdt_handler.c:3615:mdt_intent_lock_replace()) ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed:
Mar 31 14:18:21 atlas-tds-mds1 kernel: [ 565.058062] LustreError: 12460:0:(mdt_handler.c:3615:mdt_intent_lock_replace()) LBUG

That was shown within the inflames.log file you attached to this ticket. But in fact, I understand that it is certainly another problem, and then I need some help/highlights. I am also still very interested in getting the logs for the LBUG; can you finally provide them? |
| Comment by Lukasz Flis [ 28/Apr/14 ] |
|
I have uploaded logs from the client, caught just after the crash (dlmtrace): ftp.whamcloud.com/uploads/ The problem described in |
| Comment by James A Simmons [ 29/Apr/14 ] |
|
Looking at the logs, I noticed the source of our problems is the value of at_max. So with a JIRA search I found |
| Comment by Bruno Faccini (Inactive) [ 30/Apr/14 ] |
|
Lukasz: I am working on the trace you provided and will let you know whether I can confirm this is the same problem or not. Also, since both tickets (and their patches!) are highly related, I tried running the reproducer of this |
| Comment by James A Simmons [ 30/Apr/14 ] |
|
I did more testing last night and found that the patch from

aprun -n 112 /lustre/sultan/scratch/jsimmons/simul -d /lustre/sultan/scratch/jsimmons/test_simul

where test_simul is striped at 56. This fails consistently for me. Sorry Bruno, I haven't gotten back to you about the patch. We dropped it from recent testing but should give it another try. |
| Comment by Christopher Morrone [ 30/Apr/14 ] |
|
I am a little worried that we are taking the wrong approach on this issue. I don't fully understand what you guys are working on, so maybe my concern is misplaced. Folks keep talking about this problem along the lines of "if a message is lost, we are not handling that situation correctly". But that is not the situation in this ticket, to the best of my knowledge. The issue in this ticket is 100% reliable. If a message is getting "lost" every single time I run this test, then clearly that is itself a bug that needs to be worked in Lustre. That would also be the primary issue that needs to be worked under this ticket. It would also be good to respond quickly and correctly to lost message situations, but that sounds like a somewhat separate issue. |
| Comment by Bruno Faccini (Inactive) [ 02/May/14 ] |
|
Hello Chris, my b2_4 patch #9488 has successfully passed auto-tests and three positive reviews, but received a negative one from James since he experienced an MDS LBUG when he integrated it. I am waiting for him to provide me with the necessary info to allow me to find what went wrong. Also, since it was discovered that similar work is ongoing as part of |
| Comment by James A Simmons [ 02/May/14 ] |
|
I was just cross-referencing another ticket which seemed to help ease the pain. The description to me was kind of loose, so I figured it was an umbrella for all things lock and AST breakage. Also, I had to get something going for the users to be able to test on. Now that I have a seemingly stable system, I can focus again on this work, which will be applied to my smaller testbed. Yes, that patch is not a cure-all. Now that we have |
| Comment by Christopher Morrone [ 02/May/14 ] |
|
James, could you clarify whether or not patch 9488 is introducing a new problem in your testing? If the problems you are seeing are unrelated to that patch, please clear your -1 review. |
| Comment by James A Simmons [ 05/May/14 ] |
|
Last time I tested it, my MDS oopsed. Removing the patch made the oops go away. From Oleg's comments, this patch will not be landed to master since a more generic solution needs to be developed. |
| Comment by Bruno Faccini (Inactive) [ 06/May/14 ] |
|
James, the more generic solution is addressed by the current state/patch-set #8 of patch for |
| Comment by Oleg Drokin [ 10/May/14 ] |
|
Chris, so now that we see why there are resends even without lost packets, I have another question. Do you carry the patch for |
| Comment by Bruno Faccini (Inactive) [ 10/May/14 ] |
|
Oleg, |
| Comment by Oleg Drokin [ 11/May/14 ] |
|
I think James already verified that while 2.5/2.5 triggers it due to the layout lock (and the increased request sizes from that, I imagine), 2.4/2.4 and 2.4/2.5 do not trigger any ill effects. |
| Comment by Bruno Faccini (Inactive) [ 11/May/14 ] |
|
Oleg, sorry, my 1st answer against your |
| Comment by Christopher Morrone [ 12/May/14 ] |
Yes. It is probably the one that begins the first line of its commit comment with " |
| Comment by Bruno Faccini (Inactive) [ 20/May/14 ] |
|
Chris, have you been able to check Oleg's proposal to run without the patch from

On the other hand, I have submitted an add-on patch (http://review.whamcloud.com/10378, to handle the layout-lock case you trigger here) to the "generic" one (http://review.whamcloud.com/5978/) handling resends from |
| Comment by James A Simmons [ 21/May/14 ] |
|
Currently I'm in the middle of my 24 hour test shot window and I have some interesting results. First the problem I reported here happened with 2.4 servers without any patches from this ticket or |
| Comment by Bruno Faccini (Inactive) [ 21/May/14 ] |
|
James, also, am I right in understanding, from your last comment, that you had these occurrences of the LBUG/Assert

(mdt_handler.c:3615:mdt_intent_lock_replace()) ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed:
(mdt_handler.c:3615:mdt_intent_lock_replace()) LBUG

on a platform made of 2.4 Servers and 2.5 Clients, none of them having any of the patches from this/ |
| Comment by James A Simmons [ 21/May/14 ] |
|
Yes this showed up with 2.4 servers and 2.5 clients without any patches. I uploaded all the debuginfo rpms to the same ftp spot. Now you will be able to look at the core dumps. If you need anything else let me know. Thank you. |
| Comment by Christopher Morrone [ 21/May/14 ] |
|
James, it sounds like your assertion is unrelated to this ticket, |
| Comment by James A Simmons [ 22/May/14 ] |
|
Yes it is looking like this assertion is not related to this ticket. Waiting on feedback from Bruno and/or Oleg about the results of the core dump. Then it will most likely be moved to its own ticket. |
| Comment by Bruno Faccini (Inactive) [ 23/May/14 ] |
|
James, the core dumps show threads like:

00000004:00080000:2.0:1400620887.232132:0:17723:0:(mdt_recovery.c:681:mdt_req_from_lcd()) @@@ restoring transno 51539764600/status 0 req@ffff881e4d7cf800 x1468652173289564/t0(0) o101->3472002b-ab6b-3d44-377e-cbe24f35879e@1395@gni100:0/0 lens 664/3512 e 0 to 0 dl 1400621204 ref 1 fl Interpret:/2/0 rc 0/0
...........
00000004:00040000:2.0:1400620887.245915:0:17723:0:(mdt_handler.c:3615:mdt_intent_lock_replace()) ASSERTION( new_lock->l_readers + new_lock->l_writers == 0 ) failed:
............
00000004:00040000:2.0:1400620887.261204:0:17723:0:(mdt_handler.c:3615:mdt_intent_lock_replace()) LBUG

and all the similar threads look like the following:

crash> bt 17723
PID: 17723 TASK: ffff881e84174ae0 CPU: 2 COMMAND: "mdt00_012"
#0 [ffff88011c427e90] crash_nmi_callback at ffffffff8102d3a6
#1 [ffff88011c427ea0] notifier_call_chain at ffffffff81512105
#2 [ffff88011c427ee0] atomic_notifier_call_chain at ffffffff8151216a
#3 [ffff88011c427ef0] notify_die at ffffffff8109cf0e
#4 [ffff88011c427f20] do_nmi at ffffffff8150fdcb
#5 [ffff88011c427f50] nmi at ffffffff8150f690
[exception RIP: delay_tsc+50]
RIP: ffffffff81280c72 RSP: ffff881e841bb620 RFLAGS: 00000012
RAX: fffffffff796f896 RBX: 0000000000000002 RCX: 00000000f796f896
RDX: 0000000000000688 RSI: 00000000002625ec RDI: 00000000000009c5
RBP: ffff881e841bb640 R8: ffffffff81c07840 R9: 0000000000000000
R10: 0000000000000005 R11: 0000000000000096 R12: fffffffff796f20e
R13: 0000000000000002 R14: 00000000000009c5 R15: ffffffff813542e0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
#6 [ffff881e841bb620] delay_tsc at ffffffff81280c72
#7 [ffff881e841bb648] __const_udelay at ffffffff81280c36
#8 [ffff881e841bb658] wait_for_xmitr at ffffffff81354262
#9 [ffff881e841bb688] serial8250_console_putchar at ffffffff81354306
#10 [ffff881e841bb6a8] uart_console_write at ffffffff8134fc9e
#11 [ffff881e841bb6e8] serial8250_console_write at ffffffff8135466d
#12 [ffff881e841bb738] __call_console_drivers at ffffffff8106e775
#13 [ffff881e841bb768] _call_console_drivers at ffffffff8106e7da
#14 [ffff881e841bb788] release_console_sem at ffffffff8106ee28
#15 [ffff881e841bb7c8] vprintk at ffffffff8106f528
#16 [ffff881e841bb868] printk at ffffffff8150c005
#17 [ffff881e841bb8c8] __print_symbol at ffffffff810b9636
#18 [ffff881e841bb9e8] print_trace_address at ffffffffa03b1920 [libcfs]
#19 [ffff881e841bba38] print_context_stack at ffffffff8100f551
#20 [ffff881e841bbaa8] dump_trace at ffffffff8100e4a0
#21 [ffff881e841bbb48] libcfs_debug_dumpstack at ffffffffa03b1895 [libcfs]
#22 [ffff881e841bbb68] lbug_with_loc at ffffffffa03b1e97 [libcfs]
#23 [ffff881e841bbb88] mdt_intent_lock_replace at ffffffffa0d365a6 [mdt]
#24 [ffff881e841bbbd8] mdt_intent_reint at ffffffffa0d3c387 [mdt]
#25 [ffff881e841bbc28] mdt_intent_policy at ffffffffa0d39fae [mdt]
#26 [ffff881e841bbc68] ldlm_lock_enqueue at ffffffffa0699841 [ptlrpc]
#27 [ffff881e841bbcc8] ldlm_handle_enqueue0 at ffffffffa06c02cf [ptlrpc]
#28 [ffff881e841bbd38] mdt_enqueue at ffffffffa0d3a436 [mdt]
#29 [ffff881e841bbd58] mdt_handle_common at ffffffffa0d3ebc7 [mdt]
#30 [ffff881e841bbda8] mds_regular_handle at ffffffffa0d7aae5 [mdt]
#31 [ffff881e841bbdb8] ptlrpc_server_handle_request at ffffffffa06f2568 [ptlrpc]
#32 [ffff881e841bbeb8] ptlrpc_main at ffffffffa06f38fe [ptlrpc]
#33 [ffff881e841bbf48] kernel_thread at ffffffff8100c0ca
crash>
So this looks like the same kind of issue that has been reported starting with this ticket, even if the symptom is different (LBUG vs. eviction), and is likely caused by some large striping/layout. |
| Comment by James A Simmons [ 23/May/14 ] |
|
Thank you for looking at our logs. I have tried the #9488 patch and also #5978, but it did not help. I have not tried #5978 and #10378 together yet. That is next on the agenda. |
| Comment by Bruno Faccini (Inactive) [ 26/May/14 ] |
|
James, |
| Comment by James A Simmons [ 02/Jun/14 ] |
|
Got a nice lockup with an ls of a directory containing a single shared file. I did a full lctl set_param debug=+all. You can find the logs at ftp.whamcloud.com/uploads/ The client has multiple file systems mounted, so filter for atlastds, which is the 2.5 file system. |
| Comment by Bruno Faccini (Inactive) [ 03/Jun/14 ] |
|
Thanks James, but can you also provide me with the dmesg/syslog messages/stacks that should have been displayed indicating the lockup? Also, I did not find any eviction trace in these logs. |
| Comment by James A Simmons [ 03/Jun/14 ] |
|
With the two patches from applied, here is what we see.

From the client side:

2014-06-02T09:19:38.010961-04:00 c0-0c0s6n1 Lustre: 15326:0:(client.c:1908:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1401714172/real 1401714172] req@ffff88014aca5400 x1469570034748356/t0(0) o36->atlastds-MDT0000-mdc-ffff8801491a7800@10.36.226.79@o2ib:12/10 lens 496/3880 e 12 to 1 dl 1401715177 ref 2 fl Rpc:IX/0/ffffffff rc 0/-1

MDS side:

Jun 2 09:12:47 atlas-tds-mds1 kernel: [323268.180954] Lustre: 16989:0:(service.c:1349:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply |
| Comment by Cory Spitz [ 04/Jun/14 ] |
|
James, I think that you said you needed 48 OSTs to trigger the buffer resize case. Is that all? We have a system with 48 OSTs that does not see this issue. |
| Comment by James A Simmons [ 04/Jun/14 ] |
|
Did you set the default stripe count to 48? |
| Comment by Christopher Morrone [ 04/Jun/14 ] |
|
We certainly only needed to stripe over 48 OSTs, as stated in this ticket's Description section. |
| Comment by Bruno Faccini (Inactive) [ 05/Jun/14 ] |
|
James, _ log uploads/ _ log uploads/ _ so according to this I have been only able to work on transaction-id x1469570037379276 from the syslog msgs you pointed, and I already lack the end of its processing at both Client and Server side. But I can see that the request that caused it to be sent from the Client is : 00800000:00000002:5.0:1401729341.437250:0:21275:0:(lmv_intent.c:321:lmv_intent_lock()) INTENT LOCK 'getattr' for 'ior_fpp_185100' on [0x200000400:0x2:0x0] .............. 00010000:00010000:5.0:1401729341.437311:0:21275:0:(ldlm_lock.c:795:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(CR) ns: atlastds-MDT0000-mdc-ffff880333607800 lock: ffff88027e231880/0x8aab3c4b568e4c64 lrc: 3/1,0 mode: --/CR res: [0x200000400:0x2:0x0].0 bits 0x0 rrc: 3 type: IBT flags: 0x10000000000000 nid: local remote: 0x0 expref: -99 pid: 21275 timeout: 0 lvb_type: 0 00010000:00010000:5.0:1401729341.437320:0:21275:0:(ldlm_request.c:926:ldlm_cli_enqueue()) ### client-side enqueue START, flags 1000 ns: atlastds-MDT0000-mdc-ffff880333607800 lock: ffff88027e231880/0x8aab3c4b568e4c64 lrc: 3/1,0 mode: --/CR res: [0x200000400:0x2:0x0].0 bits 0x2 rrc: 3 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 21275 timeout: 0 lvb_type: 0 00010000:00010000:5.0:1401729341.437325:0:21275:0:(ldlm_request.c:988:ldlm_cli_enqueue()) ### sending request ns: atlastds-MDT0000-mdc-ffff880333607800 lock: ffff88027e231880/0x8aab3c4b568e4c64 lrc: 3/1,0 mode: --/CR res: [0x200000400:0x2:0x0].0 bits 0x2 rrc: 3 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 21275 timeout: 0 lvb_type: 0 00000100:00000001:5.0:1401729341.437329:0:21275:0:(client.c:2676:ptlrpc_queue_wait()) Process entered 00000100:00000001:5.0:1401729341.437330:0:21275:0:(client.c:861:ptlrpc_prep_set()) Process entered 00000100:00000010:5.0:1401729341.437331:0:21275:0:(client.c:862:ptlrpc_prep_set()) kmalloced 'set': 144 at ffff88021dcf0f00. 
00000100:00000001:5.0:1401729341.437332:0:21275:0:(client.c:878:ptlrpc_prep_set()) Process leaving (rc=18446612141404262144 : -131932305289472 : ffff88021dcf0f00) 00000100:00000001:5.0:1401729341.437333:0:21275:0:(client.c:2605:ptlrpc_request_addref()) Process entered 00000100:00000001:5.0:1401729341.437334:0:21275:0:(client.c:2607:ptlrpc_request_addref()) Process leaving (rc=18446612141417609216 : -131932291942400 : ffff88021e9ab800) 00000020:00000001:5.0:1401729341.437336:0:21275:0:(class_obd.c:131:lustre_get_jobid()) Process entered 00000020:00000001:5.0:1401729341.437337:0:21275:0:(class_obd.c:136:lustre_get_jobid()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:5.0:1401729341.437338:0:21275:0:(client.c:2107:ptlrpc_set_wait()) Process entered 00000100:00000001:5.0:1401729341.437339:0:21275:0:(client.c:1415:ptlrpc_send_new_req()) Process entered 00000100:00000040:5.0:1401729341.437343:0:21275:0:(lustre_net.h:3283:ptlrpc_rqphase_move()) @@@ move req "New" -> "Rpc" req@ffff88021e9ab800 x1469570037379276/t0(0) o101->atlastds-MDT0000-mdc-ffff880333607800@10.36.226.79@o2ib:12/10 lens 584/2344 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1 00000100:00000001:5.0:1401729341.437348:0:21275:0:(client.c:1066:ptlrpc_import_delay_req()) Process entered 00000100:00000001:5.0:1401729341.437349:0:21275:0:(client.c:1117:ptlrpc_import_delay_req()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:5.0:1401729341.437350:0:21275:0:(sec.c:647:sptlrpc_req_refresh_ctx()) Process entered 02000000:00000001:5.0:1401729341.437351:0:21275:0:(sec.c:674:sptlrpc_req_refresh_ctx()) Process leaving (rc=0 : 0 : 0) 00000100:00100000:5.0:1401729341.437353:0:21275:0:(client.c:1476:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc bash:c9bf7a8c-2264-2712-4013-110119f32786:21275:1469570037379276:10.36.226.79@o2ib:101 at around 13:15:41. _ then nothing else can be found about these lock-request/transaction-id in Client's log remaining 1:30s of traces. 
_ on Server side we can find its handling start with the following traces : 00000100:00100000:4.0:1401729341.436978:0:17237:0:(service.c:2079:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt01_160:c9bf7a8c-2264-2712-4013-110119f32786+17:21275:x1469570037379276:12345-12@gni2:101 ............... 00010000:00010000:4.0:1401729341.437006:0:17237:0:(ldlm_lockd.c:1259:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e81f6940/0x65bb92c64a60ecce lrc: 2/0,0 mode: --/CR res: [0x200000400:0x2:0x0].0 bits 0x0 rrc: 15 type: IBT flags: 0x40000000000000 nid: local remote: 0x8aab3c4b568e4c64 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 ............... 00000004:00000002:4.0:1401729341.440386:0:17237:0:(mdt_handler.c:1315:mdt_getattr_name_lock()) getattr with lock for [0x200000400:0x2:0x0]/ior_fpp_185100, ldlm_rep = ffff8807e3fca1f8 ............... 00010000:00010000:4.0:1401729341.440443:0:17237:0:(ldlm_lock.c:1089:ldlm_granted_list_add_lock()) ### About to add lock: ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578b00/0x65bb92c64a60ecd5 lrc: 3/1,0 mode: CR/CR res: [0x200000400:0x2:0x0].0 bits 0x2 rrc: 16 type: IBT flags: 0x50210000000000 nid: local remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 ............... 00010000:00010000:4.0:1401729341.440453:0:17237:0:(ldlm_request.c:472:ldlm_cli_enqueue_local()) ### client-side local enqueue handler, new lock created ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578b00/0x65bb92c64a60ecd5 lrc: 3/1,0 mode: CR/CR res: [0x200000400:0x2:0x0].0 bits 0x2 rrc: 16 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 ............... 
00010000:00010000:4.0:1401729341.440486:0:17237:0:(ldlm_lock.c:1089:ldlm_granted_list_add_lock()) ### About to add lock: ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578700/0x65bb92c64a60ecdc lrc: 3/1,0 mode: PR/PR res: [0x200000400:0x2:0x0].b30b5d7b bits 0x2 rrc: 6 type: IBT flags: 0x50210000000000 nid: local remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 ................ 00010000:00010000:4.0:1401729341.440493:0:17237:0:(ldlm_request.c:472:ldlm_cli_enqueue_local()) ### client-side local enqueue handler, new lock created ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578700/0x65bb92c64a60ecdc lrc: 3/1,0 mode: PR/PR res: [0x200000400:0x2:0x0].b30b5d7b bits 0x2 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 ................. 00000004:00000002:4.0:1401729341.440522:0:17237:0:(osd_handler.c:4031:osd_add_oi_cache()) add [0x200000401:0xd7:0x0] 3649491:0 to info ffff8807efb9a000 ................. 00000004:00000040:4.0:1401729341.440528:0:17237:0:(mdt_handler.c:2468:mdt_object_find()) Find object for [0x200000401:0xd7:0x0] ................. 
00010000:00000001:4.0:1401729341.440817:0:17237:0:(ldlm_lock.c:750:ldlm_add_ast_work_item()) Process entered 00010000:00000001:4.0:1401729341.440818:0:17237:0:(ldlm_lock.c:756:ldlm_add_ast_work_item()) Process leaving 00010000:00000001:4.0:1401729341.440818:0:17237:0:(ldlm_lock.c:750:ldlm_add_ast_work_item()) Process entered 00010000:00000001:4.0:1401729341.440818:0:17237:0:(ldlm_lock.c:756:ldlm_add_ast_work_item()) Process leaving 00010000:00000001:4.0:1401729341.440819:0:17237:0:(ldlm_lock.c:750:ldlm_add_ast_work_item()) Process entered 00010000:00000001:4.0:1401729341.440819:0:17237:0:(ldlm_lock.c:756:ldlm_add_ast_work_item()) Process leaving 00010000:00000001:4.0:1401729341.440820:0:17237:0:(ldlm_inodebits.c:163:ldlm_inodebits_compat_queue()) Process leaving (rc=0 : 0 : 0) 00010000:00010000:4.0:1401729341.440821:0:17237:0:(ldlm_resource.c:1316:ldlm_resource_add_lock()) ### About to add this lock: ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578300/0x65bb92c64a60ece3 lrc: 3/1,0 mode: --/PR res: [0x200000401:0xd7:0x0].0 bits 0x13 rrc: 619 type: IBT flags: 0x50210000000000 nid: local remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 00010000:00000001:4.0:1401729341.440824:0:17237:0:(ldlm_lock.c:1973:ldlm_run_ast_work()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:4.0:1401729341.440824:0:17237:0:(ldlm_inodebits.c:238:ldlm_process_inodebits_lock()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:4.0:1401729341.440825:0:17237:0:(ldlm_lock.c:1759:ldlm_lock_enqueue()) Process leaving via out (rc=0 : 0 : 0x0) 00010000:00000001:4.0:1401729341.440826:0:17237:0:(ldlm_request.c:238:ldlm_completion_ast()) Process entered 00010000:00010000:4.0:1401729341.440826:0:17237:0:(ldlm_request.c:252:ldlm_completion_ast()) ### client-side enqueue returned a blocked lock, sleeping ns: mdt-atlastds-MDT0000_UUID lock: ffff8807e7578300/0x65bb92c64a60ece3 lrc: 3/1,0 mode: --/PR res: [0x200000401:0xd7:0x0].0 bits 0x13 rrc: 619 type: IBT flags: 0x40210000000000 nid: local 
remote: 0x0 expref: -99 pid: 17237 timeout: 0 lvb_type: 0 00000100:00000001:1.0:1401729341.451231:0:12772:0:(client.c:1967:ptlrpc_expired_set()) Process entered .................

This shows that the parent/dir [0x200000400:0x2:0x0] locks have been granted, the child/file [0x200000401:0xd7:0x0] lock is blocked, and then nothing else happens for these fids/locks in the Server's remaining 27s of traces. That seems to indicate that one of the conflicting locks already had a blocking-AST sent earlier and did not cancel at this time.

Also, I can't find in these logs, even though they do not cover the whole story, any truncate/resend situation being handled and modified by both patches worked on as part of this/ But it seems to me that this new lockup problem (vs the previous evictions and LBUG) looks more like something different, and I wonder if it could be tracked as part of a new ticket. |
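The blocked child enqueue seen above follows the inodebits conflict rule: a new lock must wait only when an already-granted lock's bit mask intersects its own and the two modes are incompatible. Here is a reduced sketch of that rule, with made-up names and only two modes; it is illustrative only, not the full ldlm_inodebits_compat_queue() logic:

```c
#include <stdint.h>

/* Reduced model: only PR (shared read) and PW (exclusive write) modes. */
enum toy_mode { TOY_PR, TOY_PW };

struct toy_ibits_lock {
    enum toy_mode mode;
    uint64_t      bits;   /* inodebits mask, e.g. 0x2 or 0x13 as in the logs */
};

/* Two inodebits locks conflict only if their bit masks intersect
 * AND their modes are incompatible (here: anything involving PW). */
static int toy_ibits_conflict(const struct toy_ibits_lock *a,
                              const struct toy_ibits_lock *b)
{
    if (!(a->bits & b->bits))
        return 0;                  /* disjoint bits never conflict */
    if (a->mode == TOY_PR && b->mode == TOY_PR)
        return 0;                  /* shared readers coexist */
    return 1;                      /* conflicting: enqueue must block */
}
```

So a PR lock on bits 0x2 coexists with another PR on 0x2, while a writer whose mask overlaps it forces the later enqueue onto the waiting list, which is where the blocking-AST machinery discussed in this ticket comes in.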
| Comment by Christopher Morrone [ 05/Jun/14 ] |
|
Yes, please move that discussion to a new ticket. |
| Comment by James A Simmons [ 06/Jun/14 ] |
|
Something is really broken in 2.5 which might be independent of this patch. At this point I don't know. Only data will provide the needed evidence. So I tried the next step and moved to the latest 2.5.59 with the |
| Comment by James A Simmons [ 09/Jun/14 ] |
|
I collected data for the evictions I'm seeing from two different test runs. The clients enabled rpctrace+dlmtrace for extra info. The logs are uploaded to ftp.whamcloud.com/uploads/ |
| Comment by James A Simmons [ 16/Jun/14 ] |
|
Any insight gained from the logs? |
| Comment by Bruno Faccini (Inactive) [ 16/Jun/14 ] |
|
James, sorry, I was out, but I am now back. Will give you feedback soon on the analysis of these new logs. |
| Comment by Bruno Faccini (Inactive) [ 17/Jun/14 ] |
|
James, BTW, this definitely confirms that this needs to be addressed in another ticket, since there is no relation to the resend mechanism that is concerned/fixed in both |
| Comment by James A Simmons [ 17/Jun/14 ] |
|
Can you open the ticket, since you will be able to describe the problem far better than I can? |
| Comment by Bruno Faccini (Inactive) [ 18/Jun/14 ] |
|
I just created |
| Comment by Peter Jones [ 21/Jun/14 ] |
|
So now that the unrelated issue being tracked for ORNL has moved to |
| Comment by Christopher Morrone [ 23/Jun/14 ] |
|
First, can we get an explanation of how that fixes this problem, and then a clear list of the patch(es) I need to apply to b2_4? |
| Comment by Bruno Faccini (Inactive) [ 26/Jun/14 ] |
|
Prior to the patch from this ticket and/or |
| Comment by Christopher Morrone [ 26/Jun/14 ] |
|
Why are messages being resent? |
| Comment by Bruno Faccini (Inactive) [ 27/Jun/14 ] |
|
Because the client's reply buffer was not big enough to receive the server's first reply, which included the LVB/layout, due to default/large striping. |
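To illustrate the mechanism, here is a toy model of my own (not Lustre code; the buffer and stripe-descriptor sizes are made-up numbers): the reply size grows with the stripe count, and when the first reply does not fit the preallocated buffer the client must enlarge it and resend the request, which is the resend path these patches exercise.

```python
# Toy model (not Lustre code) of why a too-small client reply buffer forces
# a resend: the first reply carries a variable-size LVB/layout, the client
# discovers its preallocated buffer is short, grows it, and resends.
# All sizes below are invented for illustration only.

def server_reply_size(layout_stripes, stripe_desc_size=24, fixed_hdr=128):
    """Reply size grows with stripe count (wide striping => big layout)."""
    return fixed_hdr + layout_stripes * stripe_desc_size

def client_enqueue(layout_stripes, initial_buf=256):
    """Return how many resends a default-sized reply buffer would need."""
    buf = initial_buf
    resends = 0
    while True:
        need = server_reply_size(layout_stripes)
        if need <= buf:
            return resends
        buf = need      # grow the reply buffer to the size the server needs
        resends += 1    # ...and resend the enqueue request
```

In this model a default-striped file fits in the first buffer (no resend), while a widely striped one needs one resend with an enlarged buffer; the bugs discussed here live in that resend path, which used to be exercised only rarely.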
| Comment by Bruno Faccini (Inactive) [ 30/Jun/14 ] |
|
Merged b2_4 backport of both #5978 and #10378 master changes from |
| Comment by Bruno Faccini (Inactive) [ 22/Jul/14 ] |
|
BTW, I forgot to indicate here that my b2_4 patch/back-port for |
| Comment by James A Simmons [ 29/Jul/14 ] |
|
Just finished a test shot with Cray 2.5 clients to see if the client evictions stopped. Their default client, which is some 2.5 version with many, many patches, lacked the ftp.whamcloud.com/uploads/ |
| Comment by Christopher Morrone [ 11/Aug/14 ] |
|
James, can you share the patch stack you are using? That might help us figure out if you are reporting the same issue or something else. And if it isn't exactly the same issue, we really need to get you to report it in another ticket. |
| Comment by James A Simmons [ 12/Aug/14 ] |
|
It was my bad. For the last test shot we used our 2.4 production file system, which didn't have the patch from here, so the above breakage is expected. We are in the process of testing this on a larger-scale (500-node) production machine. Yes, ORNL has created a public git tree, https://github.com/ORNL-TechInt/lustre, so people can examine our special sauce. |
| Comment by James A Simmons [ 18/Aug/14 ] |
|
I have been testing with the |
| Comment by Christopher Morrone [ 20/Aug/14 ] |
|
FYI, we put the

I am beginning to suspect that the patch either did cause the assertion, or made the assertion more common. |
| Comment by Christopher Morrone [ 26/Aug/14 ] |
|
Could you please provide an explanation of what operations will not be fixed by the |
| Comment by James A Simmons [ 26/Aug/14 ] |
|
I have several different reproducers of this problem. What I found was that |
| Comment by Christopher Morrone [ 26/Aug/14 ] |
|
Yes, that was my belief. I would like Intel to enumerate the failure modes that users can expect to be fixed, and those that will not be fixed by |
| Comment by Oleg Drokin [ 10/Sep/14 ] |
|
Patch http://review.whamcloud.com/#/c/9488/ has now been fixed to eliminate the assertion in mdt_intent_lock_replace(). |
| Comment by Christopher Morrone [ 10/Sep/14 ] |
|
Ok, I will give those patches a try. We are planning to try to move to b2_5 in less than 2 months, so we'll need the |
| Comment by Oleg Drokin [ 11/Sep/14 ] |
|
For -2827, we have: Some of those patches do not cherry-pick cleanly into b2_5, but I know James Simmons from ORNL ported them. All of this is now in testing, after which they will hopefully publish their tree with backports. Another easy way to get rid of all of these problems (occurring as frequently, at least, since a resend might also happen if a reply from the server was genuinely lost on the network) is to drop |
| Comment by James A Simmons [ 11/Sep/14 ] |
|
Actually our Cray clients don't have the

We still need this work due to another file system running at the lab that will be stuck at 2.4 for the time being, but some of our clients will be moving to 2.5, so these problems will show up. |
| Comment by Oleg Drokin [ 11/Sep/14 ] |
|
Re 1.8 clients: since they don't know about wide striping, they hit this because their buffers are not large enough anyway, I suspect, when you have enough stripes. The patches are needed regardless, of course, because the bugs during resends are real. It's just that resends used to be rare and only happened in the face of lost messages. |
| Comment by James A Simmons [ 17/Sep/14 ] |
|
Chris, did you try the patch set for 2.4? If you did, were there positive results? |
| Comment by Christopher Morrone [ 17/Sep/14 ] |
|
We won't know until it is installed on larger systems. |
| Comment by James A Simmons [ 19/Sep/14 ] |
|
I have been testing the b2_4 version of the

LustreError: 4012:0:(ldlm_resource.c:1445:ldlm_resource_dump()) — Resource: [0xdeaf:0x0:0x0].0 (ffff88020290da80) refcount = 2

It's my simul job struggling. |
| Comment by Andreas Dilger [ 12/Dec/17 ] |
|
No updates in this ticket, since the affected sites have upgraded to newer releases. |