Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.11.0, Lustre 2.10.2, Lustre 2.10.3
-
Soak performance cluster - Lustre version=2.10.2_4_gb151f34
-
3
Description
We perform an OSS failover, then trigger LFSCK:
lctl lfsck_start -M soaked-MDT0000 -s 1000 -t all -A
The lfsck_start command hangs and LFSCK never starts; the clients wedge in state 'comp', and then the entire system wedges. I have dumped the Lustre logs from all MDS nodes (attached). I have also crash-dumped all the MDT nodes; the dumps are available on Spirit. The lfsck_layout thread is unkillable.
Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/31627
Subject: LU-10419 lfsck: single master engine when stop
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e3e7d1a41711cfb0a12b941a88bf8c0bf3b4cc89