Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
Lustre 2.15.7
-
lustre-2.15.7_3.llnl-1.3shs13.t4.x86_64
zfs-2.2.8_6llnl-1.t4.x86_64
toss 4.8-21
kernel 4.18.0-553.104.1.1toss.t4.x86_64
-
3
Description
We recently saw what appears to be a deadlock on the MDTs of our yuba cluster. When we checked /proc/spl/kstat/zfs/<pool_name>/txgs, the txg number had not advanced for tens of minutes, and the affected nodes had to be rebooted. This happened on several of the MDT nodes at around the same time.
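For anyone trying to confirm the same symptom, the check described above can be scripted roughly as follows. This is only a sketch, not the procedure used on yuba. It assumes the txgs kstat is a text table whose column-header row starts with `txg` and whose data rows start with the txg number. The helper names (`latest_txg`, `txg_stalled`, `sample_pool`) and the 60-second interval are made up for illustration.

```python
import time


def latest_txg(kstat_text):
    """Return the highest txg number found in a txgs kstat dump, or None.

    Assumes a column-header row whose first field is "txg", followed by
    data rows whose first field is the txg number.
    """
    txgs = []
    in_table = False
    for line in kstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "txg":
            # Column-header row: data rows follow.
            in_table = True
            continue
        if in_table and fields[0].isdigit():
            txgs.append(int(fields[0]))
    return max(txgs) if txgs else None


def txg_stalled(before_text, after_text):
    """True if the newest txg did not advance between the two samples."""
    before = latest_txg(before_text)
    after = latest_txg(after_text)
    if before is None or after is None:
        return False  # cannot tell from unparseable input
    return after <= before


def sample_pool(pool_name, interval_s=60):
    """Read the kstat twice, interval_s apart, and report a stall.

    The interval is an arbitrary illustrative value.
    """
    path = "/proc/spl/kstat/zfs/%s/txgs" % pool_name
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return txg_stalled(before, after)
```

Calling `sample_pool("<pool_name>")` on an MDS would return `True` if the newest txg did not change over the interval. A single stalled sample is not proof of a deadlock on an idle pool, so repeated samples over several minutes would be more convincing.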