[LU-11836] DOM read-open resend vs getattr deadlock Created: 06/Jan/19 Updated: 11/Sep/19 Resolved: 11/Sep/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mikhail Pershin | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | DoM2 | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
DOM read-on-open may cause a resend when the reply buffer is larger than the client buffer. That is OK in general: the client just reallocates the buffer and resends the request. The problem occurs when a new request on the same file, e.g. getattr, arrives between the first reply and the resend. That specific combination exists only with DOM files (PR/PW lock modes cause conflicts with getattr) and only with the read-on-open feature, because it produces a resend without a reconnect. |
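The resend path described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: `server_reply`, `open_with_resend`, and the `"too_small"` status are hypothetical names invented for illustration, and the real client/server exchange is considerably more involved.

```python
# Toy model of "reply does not fit, so the client grows its buffer and resends".
# All names here are hypothetical; none of them are Lustre APIs.

def server_reply(payload: bytes, client_buf_size: int):
    """Return the payload if it fits, otherwise report the size that is needed."""
    if len(payload) > client_buf_size:
        return ("too_small", len(payload))
    return ("ok", payload)


def open_with_resend(payload: bytes, buf_size: int, max_resends: int = 1):
    """Send the request, and resend with a larger buffer if the reply did not fit.

    Returns the received data and the number of times the request was sent.
    """
    sends = 0
    while True:
        sends += 1
        status, value = server_reply(payload, buf_size)
        if status == "ok":
            return value, sends
        if sends > max_resends:
            raise RuntimeError("reply still does not fit after resend")
        # Reallocate: use the size the server said it needs, then resend.
        buf_size = value


data, sends = open_with_resend(b"x" * 100, buf_size=64)
print(len(data), sends)
```

The window of interest for this ticket is between the first (oversized) reply and the resend: in this model that is the gap between two iterations of the loop, during which another request on the same file may reach the server.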
| Comments |
| Comment by Mikhail Pershin [ 06/Jan/19 ] |
|
This issue happens from time to time in racer.sh with DOM files. I have a reproducer for that scenario and am working on a patch. |
| Comment by Gerrit Updater [ 20/Jan/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34072 |
| Comment by Gerrit Updater [ 15/Feb/19 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34264 |
| Comment by Mikhail Pershin [ 15/Feb/19 ] |
|
This issue should be resolved with proper open resent/reconstruct handling. As noted by Vitaly, it is just not right to take the parent lock on the server again while we already hold the child lock; that causes reverse lock ordering. Meanwhile this intersects with |
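The reverse lock ordering mentioned in this comment can be modelled as a cycle in a wait-for graph. The sketch below is a simplified illustration, not the server implementation: the request names (`open`, `getattr`), the lock names (`parent`, `child`), and the `find_wait_cycle` helper are all assumptions made for the example, and it assumes getattr takes the parent lock before asking for the child lock.

```python
# Simplified wait-for model of the open-resend vs getattr deadlock.
# holds: lock name -> request currently holding it
# waits: request   -> lock name it is blocked on
# All names are hypothetical and chosen only for illustration.

def find_wait_cycle(holds, waits, start):
    """Follow 'waits for lock -> held by request' edges from start.

    Returns the list of requests forming a cycle, or None if the chain ends.
    """
    seen = []
    req = start
    while req in waits:
        if req in seen:
            return seen[seen.index(req):]
        seen.append(req)
        req = holds.get(waits[req])
    return None


# The first open already granted the child (DOM) lock. getattr then took the
# parent lock and blocks on the child lock. The resent open takes the parent
# lock again, which is the reverse order (child first, then parent).
holds = {"child": "open", "parent": "getattr"}
waits_buggy = {"getattr": "child", "open": "parent"}
print(find_wait_cycle(holds, waits_buggy, "open"))

# If the resent open does not retake the parent lock, it never blocks and
# the chain starting from getattr terminates.
waits_fixed = {"getattr": "child"}
print(find_wait_cycle(holds, waits_fixed, "getattr"))
```

In the first case the chain returns to `open`, so neither request can make progress; in the second, no cycle exists because the resend no longer waits on the parent lock.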
| Comment by Gerrit Updater [ 15/Mar/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34264/ |
| Comment by Peter Jones [ 16/Mar/19 ] |
|
Landed for 2.13 |
| Comment by Mikhail Pershin [ 16/Mar/19 ] |
|
Reopening the ticket; there are still things to resolve. |
| Comment by Peter Jones [ 11/Sep/19 ] |
|
It looks like the remaining work would be landed under |