Details
Type: Bug
Resolution: Fixed
Priority: Critical
Description
Don't reallocate a new reverse import for each client reconnect.
A reverse import being destroyed and reallocated on each client reconnect opens
several races in the request-sending code (mostly AST handling).
The first problem is a race between RPC sending and class_destroy_import(). If an
RPC send (or resend) is issued after class_destroy_import() has been called, the
send fails the import generation check.
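As a sketch of that race, assuming simplified stand-in types (the real check is the one behind the "req wrong generation" message from ptlrpc_import_delay_req() in the log excerpt below):

    #include <errno.h>

    /* Hypothetical simplified types, not the real Lustre structures.
     * A request captures the import generation at first send; destroying
     * and recreating the import bumps the generation, so a later (re)send
     * no longer matches and is refused. */
    struct sk_import {
        int imp_generation;           /* bumped when the import is recycled */
    };

    struct sk_request {
        struct sk_import *rq_import;
        int rq_generation;            /* generation captured at first send */
    };

    static int sk_send_allowed(struct sk_request *req)
    {
        if (req->rq_generation != req->rq_import->imp_generation)
            return -EIO;              /* send/resend refused */
        return 0;                     /* generations match: safe to send */
    }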
The second problem: target_handle_connect() stops updating the connection
information for the older reverse import, so an RPC cannot be delivered from the
server to the client because the connection information is stale or the security
flavor has changed.
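A minimal sketch of the life-cycle difference, with hypothetical types (not the actual target_handle_connect() code):

    #include <stdlib.h>

    /* Old behavior: every (re)connect allocates a fresh reverse import,
     * so only the new object sees the current peer and security data;
     * ASTs still linked to the old reverse import keep stale connection
     * information and cannot reach the client. */
    struct sk_conn {
        long peer_nid;                /* stand-in for the client address */
        int  sec_flavor;              /* stand-in for the security flavor */
    };

    struct sk_rimp {
        struct sk_conn conn;
    };

    struct sk_rimp *reconnect_old(const struct sk_conn *fresh)
    {
        struct sk_rimp *rimp = malloc(sizeof(*rimp));

        if (rimp)
            rimp->conn = *fresh;      /* only the NEW import is updated */
        return rimp;
    }

    /* Fixed behavior: keep one reverse import per export and refresh its
     * connection data in place on every reconnect. */
    void reconnect_fixed(struct sk_rimp *rimp, const struct sk_conn *fresh)
    {
        rimp->conn = *fresh;
    }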
The third problem: connection flags are not updated atomically for an import.
target_handle_connect() links the new import before the message header flags are
set, so an RPC sent at the same time can carry the wrong flags.
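A minimal sketch of the required ordering, using C11 atomics and hypothetical types rather than the actual Lustre code:

    #include <stdatomic.h>

    /* The fix is an ordering one: fully initialize the message-header
     * flags before the new import is linked where concurrent senders
     * can observe it. */
    struct sk_cimp {
        atomic_uint msg_flags;
    };

    static _Atomic(struct sk_cimp *) live_import;

    void publish_import(struct sk_cimp *imp, unsigned int flags)
    {
        atomic_store(&imp->msg_flags, flags);   /* 1. set flags first   */
        atomic_store(&live_import, imp);        /* 2. then make visible */
    }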
The fourth problem: when a client reconnects after a network flap, no wakeup
event is sent to the RPCs waiting in the import queues. This adds a noticeable
timeout in the case where the server did not send a request before the network
flap.
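A minimal sketch of the missing wakeup, using POSIX threads as a stand-in for the kernel wait queues (hypothetical names):

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t imp_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  imp_recovery = PTHREAD_COND_INITIALIZER;
    static bool imp_connected;

    /* Requests queued on an import block here until woken up. */
    void wait_for_import(void)
    {
        pthread_mutex_lock(&imp_lock);
        while (!imp_connected)
            pthread_cond_wait(&imp_recovery, &imp_lock);
        pthread_mutex_unlock(&imp_lock);
    }

    /* The missing piece: without this broadcast on reconnect, queued
     * requests sit until their RPC timeout expires before proceeding. */
    void import_reconnected(void)
    {
        pthread_mutex_lock(&imp_lock);
        imp_connected = true;
        pthread_cond_broadcast(&imp_recovery);
        pthread_mutex_unlock(&imp_lock);
    }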
Some examples:

00000100:00100000:1.0:1407845348.937766:0:62024:0:(service.c:1929:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_419:4960df0f-75ed-07a2-cee7-063090dc59cd+4:19257:x1475700821793316:12345-1748@gni1:8 Request procesed in 55us (106us total) trans 0 rc 0/0

00000100:00020000:1.0:1407845393.600747:0:81897:0:(client.c:1115:ptlrpc_import_delay_req()) @@@ req wrong generation: req@ffff880304e39800 x1475078782385806/t0(0) o105->snx11063-OST0070@1748@gni1:15/16 lens 344/192 e 0 to 1 dl 1407845389 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
Attachments
Issue Links
- is blocking
  - LU-5520 BL AST resend (Resolved)
- is duplicated by
  - LU-5559 ptlrpc_import_delay_req(): req wrong generation: req@ffff880583d69800 x1476486655962316/t0(0) o104->soaked-OST0004@192.168.1.124@o2ib1:15/16 lens 296/224 e 0 to 1 dl 1408342395 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 (Closed)
- is related to
  - LU-5590 client should not send a reply after eviction (Open)
  - LU-5581 blocking AST error handling lacks eviction for local errors and some remote ones (Resolved)
  - LU-7221 replay-ost-single test_3: ASSERTION( __v > 0 && __v < ((int)0x5a5a5a5a5a5a5a5a) ) failed: value: 0 (Resolved)
We discovered the bugs addressed by this change while investigating some non-POSIX compliant behavior exhibited by Lustre. Below is a description of the problem based on my own understanding and the fixes that were proposed to address it.
Higher layers of Lustre (CLIO) generally rely on lower layers to enforce POSIX compliance. In this case, the Lustre Distributed Lock Manager (LDLM) and ptlrpc layers interact in a way that results in inappropriate errors being returned to the client. The interaction revolves around a pair of clients performing I/O.
One client (the writer) creates and writes data to a single file striped across eight OSTs. A second client (the reader) reads the data written by the writer. The reader requests a protected read lock for each stripe of the file from the corresponding OST. Upon receipt of the lock enqueue request, the OSS notes that it has already granted a conflicting lock to the writer. As a result the server sends a blocking AST (BL AST) to the writer. This is a request that the writer cancel its lock so that it may be granted to the reader. Upon receipt of the BL AST the writer should first reply that it has received the BL AST, and then, after flushing any pages to the server, send a lock cancel request back to the server. When the server receives both the reply to the BL AST and the lock cancel request it can then grant the lock to the reader via a completion AST (CP AST). The server sends a CP AST to the reader who must then acknowledge receipt of the CP AST before it can use the resource covered by the requested lock.
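A message-sequence sketch of the exchange above may help; it restates the prose and is not a captured trace:

    /*
     * Arrows are RPCs between the three parties:
     *
     *   reader                  OSS                     writer
     *     |--- PR lock enqueue -->|                        |
     *     |                       |--- BL AST ------------>|  please cancel
     *     |                       |<-- BL AST reply -------|
     *     |                       |<-- lock cancel --------|  after page flush
     *     |<-- CP AST ------------|                        |  lock granted
     *     |--- CP AST ack ------->|                        |
     */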
In Lustre, different components communicate via import and export pairs. An import is for sending requests and receiving replies, and an export is for receiving requests and sending replies. However, it is not possible to send a request via an export. As a result, servers use a reverse import to send AST requests to clients. A reverse import converts an import/export pair into a corresponding export/import pair. Currently, a new reverse import is created whenever the server creates an export for a client (re)connection. This prevents us from being able to re-send requests that were linked to the old reverse import. Historically this was not problematic because servers did not have the ability to resend ASTs.
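A minimal sketch of the reverse-import idea, with hypothetical simplified types (the real structures live in Lustre's obd layer):

    /* Requests flow over imports and replies over exports; an export
     * alone cannot originate a request, so the server keeps a reverse
     * import aimed at the same peer as the export and sends ASTs on it. */
    struct sk_peer {
        long nid;                     /* stand-in for a network id */
    };

    struct sk_imp {                   /* sends requests, receives replies */
        struct sk_peer peer;
    };

    struct sk_exp {                   /* receives requests, sends replies */
        struct sk_peer peer;
        struct sk_imp *reverse;       /* server-to-client request path */
    };

    /* Point the reverse import at the client behind this export. */
    void setup_reverse(struct sk_exp *exp, struct sk_imp *rimp)
    {
        rimp->peer = exp->peer;
        exp->reverse = rimp;
    }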
With LU-5520, servers are now able to resend ASTs. This particular bug (there are other potential flavors) arises if the reader reconnects to an OST granting the lock while the OSS is trying to deliver the CP AST (or, equivalently, if the client is unable to acknowledge receipt of the CP AST). Based on Lustre trace data we determined that an OSS was unable to deliver a CP AST to the reader. While the OSS was waiting for the CP AST acknowledgement, the reader reconnected to the OST granting the lock. As mentioned above, this created a new reverse import for the client. When the OSS attempted to resend the CP AST (after a timeout), it found that the old import for the reader had been destroyed. It was thus unable to re-send the request, and the request immediately failed with a status of -EIO. When LDLM interpreted the failed request, it did not handle the -EIO request status appropriately: it converted the -EIO error into -EAGAIN, which was then returned to the client.
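As a sketch of that interpretation step (a hypothetical callback, not the actual LDLM code):

    #include <errno.h>

    /* A resend refused because the reverse import was destroyed fails
     * with -EIO; the interpret step remapped that to -EAGAIN, and the
     * error leaked to the client instead of being handled on the server. */
    static int interpret_ast_reply(int rc)
    {
        if (rc == -EIO)
            return -EAGAIN;           /* the inappropriate conversion */
        return rc;
    }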
Two fixes are proposed to address different aspects of this bug:
1. Server-side: evict clients returning errors on ASTs (LU-5581)
2. Server-side: change the reverse import life cycle (LU-5569)