[LU-13617] Client dead lock leads to eviction from MDS (selinux is enabled) Created: 01/Jun/20 Updated: 27/Jun/22 Resolved: 13/Aug/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.12.7, Lustre 2.12.8 |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Alexander Boyko | Assignee: | Alexander Boyko |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
One thread obtained a PR lock and then waited on inode_lock to notify the security context. Another thread held inode_lock and waited on a CW lock (which conflicts with PR). After the lock timeout the client was evicted, because it never cancelled the PR lock. The deadlock happens on a parent directory. |
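The lock-order inversion described above can be sketched as a small wait-for graph. This is only an illustrative model, not Lustre code: the thread names, the resource labels and the `find_deadlock` helper are all hypothetical, and the DLM lock on the parent directory is simplified to a single resource (held in PR mode, requested in the conflicting CW mode).

```python
def find_deadlock(holds, waits):
    """Return a cycle of threads in the wait-for graph, or None.

    holds: thread -> resource it currently holds
    waits: thread -> resource it is blocked on
    """
    owner = {res: thread for thread, res in holds.items()}
    # Wait-for edge: blocked thread -> thread holding the wanted resource.
    wait_for = {t: owner[res] for t, res in waits.items() if res in owner}
    for start in wait_for:
        seen = [start]
        cur = wait_for.get(start)
        while cur is not None and cur not in seen:
            seen.append(cur)
            cur = wait_for.get(cur)
        if cur is not None:
            # Walked back onto a thread already visited: that is the cycle.
            return seen[seen.index(cur):]
    return None


# Hypothetical labels for the situation in the description.
holds = {
    "thread_A": "parent_dir_dlm_lock",  # holds PR on the parent directory
    "thread_B": "inode_lock",           # holds inode_lock on the same parent
}
waits = {
    "thread_A": "inode_lock",           # wants to notify the security context
    "thread_B": "parent_dir_dlm_lock",  # wants CW, which conflicts with PR
}

cycle = find_deadlock(holds, waits)
print(cycle)
```

Neither thread can make progress in this model, so the PR lock is never cancelled; on a real client this is the point at which the server-side lock timeout would lead to eviction.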
| Comments |
| Comment by Gerrit Updater [ 01/Jun/20 ] |
|
Alexander Boyko (alexander.boyko@hpe.com) uploaded a new patch: https://review.whamcloud.com/38792 |
| Comment by Gerrit Updater [ 01/Jun/20 ] |
|
Alexander Boyko (alexander.boyko@hpe.com) uploaded a new patch: https://review.whamcloud.com/38793 |
| Comment by Gerrit Updater [ 23/Jun/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38792/ |
| Comment by Gerrit Updater [ 13/Aug/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38793/ |
| Comment by Hans Henrik Happe [ 22/Mar/22 ] |
|
We hit this behavior in 2.12.8. The client got evicted due to a lock timeout when selinux is enabled. I wonder whether this patch should also go into 2.12? |
| Comment by Alexander Boyko [ 23/Mar/22 ] |
|
regression was
git log - So 2.12 has the same problem as in the description. |
| Comment by Etienne Aujames [ 08/Apr/22 ] |
|
We hit this on a robinhood node:
|
| Comment by Gerrit Updater [ 08/Apr/22 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/47025 |
| Comment by Gerrit Updater [ 11/Apr/22 ] |
|
"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/47034 |
| Comment by Etienne Aujames [ 28/Apr/22 ] |
|
We hit this issue during maintenance regression tests (after updating Lustre clients from 2.12.6 LTS to 2.12.7 LTS on compute nodes). We were able to reproduce the issue with 5 nodes running mdtest. The CEA will patch Lustre clients with https://review.whamcloud.com/47034 on compute nodes and run regression tests on it. For now, we cannot activate selinux on clients without this patch on Lustre LTS >= 2.12.7. |