[LU-8347] granting conflicting locks Created: 29/Jun/16 Updated: 29/Oct/16 Resolved: 29/Oct/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Andriy Skulysh | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Description |
|
2016-04-14T13:52:45.530794+00:00 c3-2c1s12n0 Lustre: 21681:0:(client.c:1944:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1460641949/real 1460641949] req@ffff881fe6245700 x1530142443559856/t0(0) o101->snx11209-OST0007-osc-ffff880fe6fbbc00@10.149.209.11@o2ib1302:28/4 lens 328/400 e 1 to 1 dl 1460641965 ref 1 fl Rpc:XU/40/ffffffff rc 0/-1 |
| Comments |
| Comment by Gerrit Updater [ 29/Jun/16 ] |
|
Andriy Skulysh (andriy.skulysh@seagate.com) uploaded a new patch: http://review.whamcloud.com/21059 |
| Comment by John Hammond [ 01/Jul/16 ] |
|
This assertion was removed by http://review.whamcloud.com/#/c/14989/ |
| Comment by Jinshan Xiong (Inactive) [ 05/Jul/16 ] |
|
Hi Andriy, can you please describe the root cause of this problem in detail? The assert was removed because this usually occurs when an OSC import is being evicted, so the error is not severe enough to justify stopping the client. |
| Comment by Andriy Skulysh [ 06/Jul/16 ] |
|
The replay of a waiting lock can reach the server first and be granted immediately, because no conflict exists yet. The subsequent replay of the granted lock is then simply placed on the granted list, so conflicting locks can both end up granted. |
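
The ordering described above can be sketched with a toy model. This is a minimal illustration only, not Lustre LDLM code: the names `Lock`, `Resource`, `replay_waiting`, `replay_granted` and `replay` are hypothetical, and the conflict rule (overlapping extents with at least one `PW` lock) is a simplifying assumption.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Lock:
    owner: str
    mode: str    # "PR" (shared read) or "PW" (exclusive write); simplified
    start: int
    end: int


def conflicts(a: Lock, b: Lock) -> bool:
    """Two locks conflict if their extents overlap and either one is PW."""
    overlap = a.start <= b.end and b.start <= a.end
    return overlap and "PW" in (a.mode, b.mode)


@dataclass
class Resource:
    granted: list = field(default_factory=list)
    waiting: list = field(default_factory=list)

    def replay_waiting(self, lock: Lock) -> None:
        # Assumption: a replayed waiting lock goes through a conflict check
        # and is granted at once if nothing on the granted list conflicts.
        if any(conflicts(lock, g) for g in self.granted):
            self.waiting.append(lock)
        else:
            self.granted.append(lock)

    def replay_granted(self, lock: Lock) -> None:
        # Assumption: a replayed granted lock is placed on the granted list
        # directly, with no conflict check.
        self.granted.append(lock)

    def conflicting_pairs(self) -> list:
        return [(a, b) for a, b in combinations(self.granted, 2)
                if conflicts(a, b)]


def replay(waiting_first: bool) -> Resource:
    """Replay one previously granted lock and one previously waiting lock."""
    held = Lock("client-A", "PW", 0, 99)      # was granted before recovery
    blocked = Lock("client-B", "PW", 0, 99)   # was waiting before recovery
    res = Resource()
    if waiting_first:
        res.replay_waiting(blocked)
        res.replay_granted(held)
    else:
        res.replay_granted(held)
        res.replay_waiting(blocked)
    return res
```

In this model, `replay(waiting_first=True)` leaves two conflicting `PW` locks on the granted list, while `replay(waiting_first=False)` keeps the second lock on the waiting list.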
| Comment by Jinshan Xiong (Inactive) [ 06/Jul/16 ] |
|
As far as I know, replayed locks are added to the lists directly and are not processed by the lock enqueue policy at replay time. Do you have a reproducer? |
| Comment by Jinshan Xiong (Inactive) [ 07/Jul/16 ] |
|
I realize you must be talking about resent locks. However, I don't think this patch fixes the problem. |
| Comment by Andriy Skulysh [ 12/Jul/16 ] |
|
Yes, I have a test, but it depends on another ticket. |
| Comment by John Hammond [ 26/Jul/16 ] |
|
Hi Andriy, could you point us to that test or upload it to gerrit? |
| Comment by Patrick Farrell (Inactive) [ 26/Jul/16 ] |
|
It's not integrated into the test framework, but the test I attached to https://jira.hpdd.intel.com/secure/attachment/21600/mpi_test.c Running the test case is a little complicated: Exactly what it's doing is described in this comment: |
| Comment by John Hammond [ 12/Sep/16 ] |
|
Andriy, can you add a test to http://review.whamcloud.com/#/c/21059/? |
| Comment by Andriy Skulysh [ 14/Sep/16 ] |
|
I've added the test to the patch. |
| Comment by Gerrit Updater [ 28/Oct/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21059/ |
| Comment by Peter Jones [ 29/Oct/16 ] |
|
Landed for 2.9.