[LU-16316] ZFS OSS locks Created: 16/Nov/22 Updated: 23/Dec/22 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Dominika Wanat | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre: 2.15.0_RC3, zfs 2.0.7 (both self-compiled) |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Over the past few weeks we have experienced lockups on OSS nodes running ZFS 2.0.7. The affected node becomes unresponsive to Lustre (the OSS is reported as unhealthy) and its load rises above 400. In some cases the load on the MDS also increases directly afterwards, but this appears to be a consequence of lost communication between the MDS and the affected OSS. We have not been able to associate the problem with a specific I/O pattern or type of operation. We are reporting it here first, but we cannot rule out that it should be addressed to the ZFS developers; if you think so, please let us know. We attach two sets of logs: the first from 16 October, when both the MDS and the OSS were affected, and the second from 13 November, when only the OSS was stuck. If you need more information, please don't hesitate to ask.
Regards
Dominika Wanat |
| Comments |
| Comment by Alex Zhuravlev [ 25/Nov/22 ] |
Correct, this is because the MDS gets stuck waiting for new objects from the problem OST. I'm not 100% positive, but I found a number of OST threads trying to prefetch data. You can try disabling prefetching on the OSTs to see whether it's related: echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable |
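For reference, a minimal shell sketch of how the prefetch setting suggested above could be checked, disabled, and re-enabled at runtime. Only the sysfs path comes from this ticket; the function names and the PREFETCH_PARAM variable are hypothetical and exist purely for illustration. Writing to the real path requires root on the OSS.

```shell
#!/bin/sh
# Sketch only: helper names and the PREFETCH_PARAM override are hypothetical.
# The default path is the one quoted in the comment above.
PREFETCH_PARAM="${PREFETCH_PARAM:-/sys/module/zfs/parameters/zfs_prefetch_disable}"

# Print the current value: 0 means prefetch is enabled, 1 means it is disabled.
prefetch_status() {
    cat "$PREFETCH_PARAM"
}

# Disable ZFS prefetch at runtime (the change suggested in this ticket).
prefetch_disable() {
    echo 1 > "$PREFETCH_PARAM"
}

# Re-enable ZFS prefetch, e.g. to compare behaviour under the same workload.
prefetch_enable() {
    echo 0 > "$PREFETCH_PARAM"
}
```

A value written this way should not be expected to survive a module reload or reboot. To keep it, the usual approach is a module option line such as "options zfs zfs_prefetch_disable=1" in a modprobe configuration file (the exact file name, for example /etc/modprobe.d/zfs.conf, is an assumption and depends on the distribution).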
| Comment by Dominika Wanat [ 28/Nov/22 ] |
|
Thanks for the hint. We are investigating the nodes with prefetch disabled. |
| Comment by Dominika Wanat [ 20/Dec/22 ] |
|
It looks like it helps: the nodes have not hung since then. Would you consider fixing this behaviour of Lustre when ZFS prefetch is enabled? |
| Comment by Alex Zhuravlev [ 23/Dec/22 ] |
This is a very workload-specific thing. We've seen a number of reports where prefetching does improve performance. |