[LU-5559] ptlrpc_import_delay_req(): req wrong generation: req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 Created: 29/Aug/14  Updated: 27/Apr/15  Resolved: 27/Apr/15

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Li Wei (Inactive)
Assignee: Li Wei (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-5569 recreating a reverse import produce a... Resolved
Severity: 3
Rank (Obsolete): 15506

 Description   

A client reconnected while the server was trying to (re)send it a BL AST:

Aug 17 23:13:15 lola-10 kernel: Lustre: 5600:0:(client.c:1926:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1408342388/real 1408342388]  req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
Aug 17 23:13:15 lola-10 kernel: Lustre: soaked-OST0004: Client c1d7cd54-55f6-0482-0887-cf6de8216f19 (at 192.168.1.124@o2ib1) reconnecting
Aug 17 23:13:15 lola-10 kernel: LustreError: 5600:0:(client.c:1097:ptlrpc_import_delay_req()) @@@ req wrong generation:  req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
Aug 17 23:13:15 lola-10 kernel: LustreError: 5600:0:(ldlm_lockd.c:661:ldlm_handle_ast_error()) ### client (nid 192.168.1.124@o2ib1) returned 0 (rc -5) from blocking AST ns: filter-soaked-OST0004_UUID lock: ffff8805681c20c0/0x4a2dd95cee2a53fc lrc: 1/0,0 mode: --/PR res: [0x7b1a04:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x64801000010020 nid: 192.168.1.124@o2ib1 remote: 0xcc587d962f8bc36b expref: 11 pid: 6337 timeout: 4550787123 lvb_type: 1

Although the build includes the AST resend patch, the "req wrong generation" error prevented the BL AST from being resent again. Always destroying and recreating the reverse import on reconnection may not be a good idea.
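To illustrate the failure mode described above, here is a minimal sketch of a generation check. It is a simplified model, not the actual ptlrpc code: the sketch_* structures and functions are hypothetical names introduced for illustration, and the assumption is that a request carries the generation of the import it was prepared on and is refused once the import has moved to a newer generation. The -EIO return value matches the "rc -5" seen in the log.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; NOT the real ptlrpc structures. */
struct sketch_import {
    int generation;          /* advances each time the peer reconnects */
};

struct sketch_request {
    int import_generation;   /* import generation when the request was prepared */
};

/* Stamp a request with the import's current generation. */
void sketch_prepare(const struct sketch_import *imp,
                    struct sketch_request *req)
{
    req->import_generation = imp->generation;
}

/* Model a client reconnection: the import moves to a new generation. */
void sketch_reconnect(struct sketch_import *imp)
{
    imp->generation++;
}

/* Refuse to (re)send a request stamped with an older generation. */
int sketch_delay_req(const struct sketch_import *imp,
                     const struct sketch_request *req)
{
    if (req->import_generation < imp->generation) {
        fprintf(stderr, "req wrong generation: req %d, import %d\n",
                req->import_generation, imp->generation);
        return -EIO;
    }
    return 0;
}

/* Replay the sequence from the log: prepare, reconnect, attempt resend. */
int sketch_resend_after_reconnect(void)
{
    struct sketch_import imp = { .generation = 1 };
    struct sketch_request bl_ast;

    sketch_prepare(&imp, &bl_ast);          /* BL AST queued for the client */
    sketch_reconnect(&imp);                 /* client reconnects meanwhile */
    return sketch_delay_req(&imp, &bl_ast); /* resend attempt is rejected */
}
```

In this model, the resend attempt after the reconnection returns -EIO, so a resend loop that treats the error as final gives up on the blocking AST even though the client is reachable again.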



 Comments   
Comment by Li Wei (Inactive) [ 01/Sep/14 ]

http://review.whamcloud.com/11715 (draft)

Comment by Andreas Dilger [ 30/Oct/14 ]

A separate patch, http://review.whamcloud.com/11750, also addresses this issue.

Comment by Andreas Dilger [ 30/Oct/14 ]

Li Wei, can you test whether the 11750 patch resolves the problem you hit in this bug?

Comment by Li Wei (Inactive) [ 31/Oct/14 ]

Andreas, I have already reviewed that patch. It should fix this problem. However, it has a number of defects and open questions that must be addressed first.
