Details
- Bug
- Resolution: Incomplete
- Minor
- None
- Lustre 1.8.7
- None
- 3
- 10099
Description
There are 8 clients, each creating files on one OST.
The 10G cable is removed from the first OSS. Within a few minutes, that OSS is killed by STONITH.
All the OSTs are then mounted on the peer OSS.
However, the test on one of the clients failed with the error:
cp: cannot fstat `/mnt/lustre/ost/ost-01/file.002645': Interrupted system call
The test on all the other clients was fine. Here is a bit of the client's dmesg output:
LustreError: 167-0: This client was evicted by lstr96-OST0001; in progress operations using this service will fail.
LustreError: 22101:0:(file.c:995:ll_glimpse_size()) obd_enqueue returned rc -4, returning -EIO
LustreError: 24435:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810105f98000 x1400324452088101/t0 o4->lstr96-OST0001_UUID@10.7.90.4@tcp:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: 24435:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8101236a5400 x1400324452088110/t0 o4->lstr96-OST0001_UUID@10.7.90.4@tcp:6/4 lens 448/608 e 0 to 1 dl 0 ref 2 fl Rpc:/0/0 rc 0/0
LustreError: 24435:0:(client.c:858:ptlrpc_import_delay_req()) Skipped 8 previous similar messages
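For anyone triaging the longer attached log, the source location in each message can be pulled out with a short script. This is only a sketch: the pattern is inferred from the lines quoted above rather than from any Lustre documentation, the function name `parse_lustre_error` is hypothetical, and reading the first number as a process ID is an assumption.

```python
import re

# Pattern inferred from the dmesg lines quoted in this report (an assumption,
# not an official format description):
#   LustreError: <num>:<num>:(<file>:<line>:<func>()) <message>
LINE_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)"
)


def parse_lustre_error(line):
    """Return a dict of fields, or None if the line has a different shape."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["pid"] = int(fields["pid"])    # assumed to be the process ID
    fields["line"] = int(fields["line"])  # source line number in <file>
    return fields


sample = (
    "LustreError: 22101:0:(file.c:995:ll_glimpse_size()) "
    "obd_enqueue returned rc -4, returning -EIO"
)
parsed = parse_lustre_error(sample)

# The eviction notice uses a different prefix ("167-0:"), so it is not matched.
evicted = parse_lustre_error(
    "LustreError: 167-0: This client was evicted by lstr96-OST0001; "
    "in progress operations using this service will fail."
)
```

Grouping the parsed entries by `func` would show quickly whether all failures come from the same two call sites seen here.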
Unfortunately, there are no timestamps in the dmesg output, and /var/log/messages on the client has nothing in it. The problem occurred on May 21 at 15:09, as can be seen in the log files from the OSS.
I will attach the rest of this log, and the logs from the OSS. Please let me know if you need more info.
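As a side note on the two error strings above: on Linux, errno 4 is EINTR, whose standard message is "Interrupted system call". That suggests the `rc -4` logged by ll_glimpse_size() and the error printed by cp refer to the same condition, although the report itself does not confirm this. A minimal check, assuming a Linux client:

```python
import errno
import os

rc = -4  # value taken from the ll_glimpse_size() message in the dmesg output

# Kernel-style return codes are negated errno values, so flip the sign.
name = errno.errorcode[-rc]
text = os.strerror(-rc)

print(name, "-", text)
```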
Attachments
Issue Links
- Trackbacks
- Lustre 1.8.x known issues tracker
While testing against the Lustre b18 branch, we would hit known bugs that were already reported in Lustre Bugzilla (https://bugzilla.lustre.org/). In order to move away from relying on Bugzilla, we would create a JIRA