[LU-5559] ptlrpc_import_delay_req(): req wrong generation: req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Created: 29/Aug/14 Updated: 27/Apr/15 Resolved: 27/Apr/15 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Li Wei (Inactive) | Assignee: | Li Wei (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 15506 |
| Description |
|
A client reconnected while the server was trying to (re)send it a BL AST:

Aug 17 23:13:15 lola-10 kernel: Lustre: 5600:0:(client.c:1926:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1408342388/real 1408342388] req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
Aug 17 23:13:15 lola-10 kernel: Lustre: soaked-OST0004: Client c1d7cd54-55f6-0482-0887-cf6de8216f19 (at 192.168.1.124@o2ib1) reconnecting
Aug 17 23:13:15 lola-10 kernel: LustreError: 5600:0:(client.c:1097:ptlrpc_import_delay_req()) @@@ req wrong generation: req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
Aug 17 23:13:15 lola-10 kernel: LustreError: 5600:0:(ldlm_lockd.c:661:ldlm_handle_ast_error()) ### client (nid 192.168.1.124@o2ib1) returned 0 (rc -5) from blocking AST ns: filter-soaked-OST0004_UUID lock: ffff8805681c20c0/0x4a2dd95cee2a53fc lrc: 1/0,0 mode: --/PR res: [0x7b1a04:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x64801000010020 nid: 192.168.1.124@o2ib1 remote: 0xcc587d962f8bc36b expref: 11 pid: 6337 timeout: 4550787123 lvb_type: 1

Although the build has the AST resend patch, the "req wrong generation" error caused the BL AST not to be resent any further. It might not be a good idea to always destroy and recreate the reverse import upon a reconnection. |
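To make the failure mode concrete, here is a minimal toy model of the sequence in the log. It is a sketch under stated assumptions, not Lustre code: the `toy_*` structs and functions are hypothetical stand-ins, and replacing the reverse import on reconnect is modeled simply as a generation bump. It only illustrates why a request stamped with the old generation can no longer be resent and ends up with rc -5 (`-EIO`).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real struct obd_import / ptlrpc_request. */
struct toy_import {
    int generation;            /* changes whenever the import is replaced */
};

struct toy_request {
    int import_generation;     /* generation the request was created under */
    int sends;                 /* how many times the request actually went out */
};

/* Stamp a new request with the import's current generation. */
static void toy_req_bind(struct toy_request *req, const struct toy_import *imp)
{
    assert(req != NULL && imp != NULL);
    req->import_generation = imp->generation;
    req->sends = 0;
}

/* Assumption: destroying and recreating the reverse import on reconnect
 * is modeled here as a plain generation bump. */
static void toy_reconnect(struct toy_import *imp)
{
    imp->generation++;
}

/* Generation check in the spirit of the "req wrong generation" message:
 * returns 0 if the request may be sent, -EIO if it is stale. */
static int toy_delay_req(const struct toy_import *imp,
                         const struct toy_request *req)
{
    if (req->import_generation != imp->generation)
        return -EIO;
    return 0;
}

/* One (re)send attempt; a stale request is rejected and never goes out. */
static int toy_send(const struct toy_import *imp, struct toy_request *req)
{
    int rc = toy_delay_req(imp, req);

    if (rc == 0)
        req->sends++;
    return rc;
}
```

In this model, a request bound before `toy_reconnect()` succeeds on its first `toy_send()` and fails with `-EIO` on every attempt after the reconnect, so a resend loop on the AST can never make progress unless the request is rebound to the new generation or the import survives the reconnect.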
| Comments |
| Comment by Li Wei (Inactive) [ 01/Sep/14 ] |
| Comment by Andreas Dilger [ 30/Oct/14 ] |
|
There is a separate patch, http://review.whamcloud.com/11750, that also addresses this issue. |
| Comment by Andreas Dilger [ 30/Oct/14 ] |
|
LI Wei, can you test if the 11750 patch resolves the problem that you hit in this bug? |
| Comment by Li Wei (Inactive) [ 31/Oct/14 ] |
|
Andreas, I have reviewed that patch already. It should be able to fix this problem. However, there are a number of defects and questions there that must be addressed first. |