Details
Bug
Resolution: Fixed
Critical
Description
There is a serious lock contention issue around cdt_llog_lock when adding new HSM requests.
# time wc -l /proc/fs/lustre/mdt/snx11133-MDT0000/hsm/actions
219759 /proc/fs/lustre/mdt/snx11133-MDT0000/hsm/actions

real 11m45.068s
user 0m0.020s
sys 0m21.372s
Taking more than 11 minutes to read the list is far too long. Such an operation should take a couple of seconds at most.
The contention appears to come from the coordinator. Every time a new request is posted, the whole list of requests is walked while holding that lock. That's not a problem when there are only a handful of requests, but it doesn't scale when there are hundreds of thousands of them.
I recompiled a CentOS 7 kernel with CONFIG_LOCK_STAT on a VM. I ran a test creating 10000 files and archiving them without a copytool present. Total time was 146 seconds. Lock contention result:
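To illustrate why this pattern does not scale, here is a toy model (not Lustre code) that counts how many records get visited if every new request first walks the whole existing list. The function name `records_scanned` and the list standing in for the llog are assumptions made for this sketch only.

```python
def records_scanned(num_requests: int) -> int:
    """Count record visits when each new request walks the whole existing list."""
    pending = []   # hypothetical stand-in for the list of HSM actions
    visited = 0
    for request_id in range(num_requests):
        # In the scenario described above, this full walk happens
        # while the lock is held, so its cost grows with the list length.
        visited += len(pending)
        pending.append(request_id)
    return visited


if __name__ == "__main__":
    # The total grows as n * (n - 1) / 2, i.e. quadratically with the request count.
    for n in (10, 10_000, 219_759):
        print(n, records_scanned(n))
```

Under this model, going from a handful of requests to hundreds of thousands multiplies the work done under the lock by many orders of magnitude, which matches the behaviour reported here.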
lock_stat version 0.3
--------------------------------------------------------------------------------
class name  con-bounces  contentions  waittime-min  waittime-max  waittime-total  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total
--------------------------------------------------------------------------------
[...]
&cdt->cdt_llog_lock:  6296  6296  15.45  23074.17  43436574.06  17791  27134  25.09  37745.03  138558199.24
-------------------
&cdt->cdt_llog_lock  6296  [<ffffffffa0fb096d>] cdt_llog_process+0x9d/0x3a0 [mdt]
-------------------
&cdt->cdt_llog_lock  6296  [<ffffffffa0fb096d>] cdt_llog_process+0x9d/0x3a0 [mdt]
[...]
(Time units are microseconds.)
With waittime-total at about 43 seconds and holdtime-total at about 138 seconds, out of a 146-second test run, this is a very heavily contended lock, far above the other locks in Lustre or the rest of the system.
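For reference, the figures above can be derived from the lock_stat line as follows. This is only a worked conversion of the reported numbers; the variable names are chosen for this sketch, and the "fraction of the run" figure assumes the 146-second wall-clock time covers the same period as the lock statistics.

```python
# Values copied from the &cdt->cdt_llog_lock line of the lock_stat output.
WAITTIME_TOTAL_US = 43436574.06
HOLDTIME_TOTAL_US = 138558199.24
CONTENTIONS = 6296
ACQUISITIONS = 27134
TEST_DURATION_S = 146  # total test time reported above

wait_s = WAITTIME_TOTAL_US / 1e6          # microseconds -> seconds
hold_s = HOLDTIME_TOTAL_US / 1e6
hold_fraction = hold_s / TEST_DURATION_S  # share of the run with the lock held
contention_ratio = CONTENTIONS / ACQUISITIONS
avg_hold_ms = HOLDTIME_TOTAL_US / ACQUISITIONS / 1000

if __name__ == "__main__":
    print(f"total wait: {wait_s:.1f} s, total hold: {hold_s:.1f} s")
    print(f"lock held for {hold_fraction:.0%} of the run")
    print(f"{contention_ratio:.0%} of acquisitions contended")
    print(f"average hold time: {avg_hold_ms:.1f} ms")
```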
AFAICS, contention is between these mechanisms:
- adding a new request (lfs hsm_archive, ...)
- changing a request status (WAITING->STARTED->SUCCEED)
- removing a request (archive completed)
- housekeeping (coordinator loop every 10 seconds)
- dumping the list of actions from /proc
The net result is that when there are a lot of requests, they only trickle down to the copytool. Requests then accumulate faster than they are processed, which lengthens the list and makes the problem worse.