Details
Type: Bug
Resolution: Fixed
Priority: Critical
Description
There is an important locking issue around cdt_llog_lock when adding new HSM requests.
# time wc -l /proc/fs/lustre/mdt/snx11133-MDT0000/hsm/actions
219759 /proc/fs/lustre/mdt/snx11133-MDT0000/hsm/actions

real 11m45.068s
user 0m0.020s
sys 0m21.372s
Nearly 12 minutes to read the list is far too long. Such an operation should take a couple of seconds at most.
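For anyone wanting to repeat the measurement from a script rather than with time and wc, here is a minimal sketch. The helper name count_lines_timed is hypothetical, and the path is whatever actions file exists on the MDS being tested.

```python
import time


def count_lines_timed(path):
    """Read `path` once and return (line_count, elapsed_seconds)."""
    start = time.monotonic()
    count = 0
    # Binary mode avoids any decoding cost or decoding errors.
    with open(path, "rb") as f:
        for _ in f:
            count += 1
    return count, time.monotonic() - start


# Example call (path taken from the report above, adjust to your MDT):
# lines, elapsed = count_lines_timed(
#     "/proc/fs/lustre/mdt/snx11133-MDT0000/hsm/actions")
# print(f"{lines} lines in {elapsed:.3f}s")
```

This only reports wall-clock time, so it corresponds to the "real" figure above, not to the user/sys split.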
The contention appears to come from the coordinator. Every time a new request is posted, the whole list of requests is walked while holding that lock. That's not a problem when there are only a handful of requests, but it doesn't scale when there are hundreds of thousands of them.
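To illustrate why a full scan per insertion does not scale, here is a toy model in Python. It is not the Lustre code: ToyCoordinator, add_request and the duplicate check used as the reason for the scan are all assumptions made for the illustration. The lock only stands in for cdt_llog_lock.

```python
import threading


class ToyCoordinator:
    """Toy model: every insert walks the whole request list under one lock."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for cdt_llog_lock
        self.requests = []
        self.records_visited = 0      # total list entries examined so far

    def add_request(self, fid):
        with self.lock:
            # Full scan of the existing list (assumed duplicate check).
            for existing in self.requests:
                self.records_visited += 1
                if existing == fid:
                    return False
            self.requests.append(fid)
            return True


def total_scan_cost(n):
    """Entries visited when n distinct requests are added to an empty list."""
    cdt = ToyCoordinator()
    for fid in range(n):
        cdt.add_request(fid)
    return cdt.records_visited
```

In this model, adding n distinct requests visits n(n-1)/2 entries in total, all of it while the lock is held. That is 49,995,000 visits for 10,000 requests, and on the order of 2.4 x 10^10 for a list of 219,759 entries if it were built the same way.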
I recompiled a CentOS 7 kernel with CONFIG_LOCK_STAT on a VM. I ran a test creating 10000 files and archiving them without a copytool present. Total time was 146 seconds. Lock contention result:
lock_stat version 0.3
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                 class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
       &cdt->cdt_llog_lock:           6296           6296          15.45       23074.17    43436574.06          17791          27134          25.09       37745.03   138558199.24
       -------------------
       &cdt->cdt_llog_lock            6296          [<ffffffffa0fb096d>] cdt_llog_process+0x9d/0x3a0 [mdt]
       -------------------
       &cdt->cdt_llog_lock            6296          [<ffffffffa0fb096d>] cdt_llog_process+0x9d/0x3a0 [mdt]
[...]
(Time units are microseconds.)
With waittime-total at about 43 seconds and holdtime-total at about 138 seconds over a 146-second run, this is a very contentious lock, far above the other locks in Lustre or the rest of the system.
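The figures can be checked directly from the lock_stat line. The sketch below only does the unit conversion and a few ratios; the variable names are mine. The "fraction of the run" figure assumes the hold times do not overlap (exclusive holds), which lock_stat does not tell us by itself.

```python
US_PER_S = 1_000_000

# Values copied from the &cdt->cdt_llog_lock line above (microseconds).
wait_total_us = 43436574.06
hold_total_us = 138558199.24
contentions = 6296
acquisitions = 27134
test_duration_s = 146  # total run time of the 10000-file test

wait_total_s = wait_total_us / US_PER_S
hold_total_s = hold_total_us / US_PER_S

# Share of the run during which the lock was held (assumes no overlap).
hold_fraction = hold_total_s / test_duration_s

# Average cost per event, in milliseconds.
mean_hold_ms = hold_total_us / acquisitions / 1000
mean_wait_ms = wait_total_us / contentions / 1000

print(f"wait total: {wait_total_s:.1f}s, hold total: {hold_total_s:.1f}s")
print(f"held for {hold_fraction:.1%} of the run")
print(f"mean hold: {mean_hold_ms:.2f}ms, mean wait: {mean_wait_ms:.2f}ms")
```

Under that assumption the lock is held for roughly 95% of the run, at about 5 ms per acquisition on average, which is consistent with a list walk on every operation.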
AFAICS, contention is between these mechanisms:
- adding a new request (lfs hsm_archive, ...)
- changing a request status (WAITING->STARTED->SUCCEED)
- removing a request (archive completed)
- housekeeping (coordinator loop every 10 seconds)
- dumping the list of actions from /proc
The net result is that when there are a lot of requests, they only trickle down to the copytool, which makes the problem worse because the list keeps growing.
As we finally seem to be at the end of the patches queued up for this ticket, let's close it and open a new ticket to track any further fixes identified in this area of the code.