Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.5.2
-
CentOS 6.5, kernel 2.6.32-431.17.1.el6_lustre.x86_64
-
3
-
16216
Description
One of our Stampede filesystems running Lustre 2.5.2 has an OST offline because of a separate problem described in another ticket. While the OST was offline, the MDS crashed with an LBUG and was restarted last Friday. Since the restart, the MDS no longer automatically creates files on any OST numbered after the offline one. In our case OST0010 is offline, so the MDS now creates files only on the first 16 OSTs unless we manually specify the stripe offset with lfs setstripe. This is overloading the servers hosting those OSTs while the others sit idle. If we deactivate the first 16 OSTs on the MDS, all files are created with their first stripe on the lowest-numbered active OST.
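For anyone reading the index arithmetic above: OST labels carry the index as four hexadecimal digits, so OST0010 is index 16 and the "first 16 OSTs" are OST0000 through OST000f. A minimal sketch of that conversion follows; the helper names `ost_index` and `ost_label` are made up for illustration and are not part of any Lustre tool.

```python
def ost_index(label: str) -> int:
    """Return the decimal index encoded in an OST label such as 'OST0010'."""
    if not label.startswith("OST"):
        raise ValueError(f"not an OST label: {label!r}")
    # The four characters after 'OST' are the index in hexadecimal.
    return int(label[3:], 16)


def ost_label(index: int) -> str:
    """Return the OST label (hypothetical helper) for a decimal index."""
    return f"OST{index:04x}"


# The offline OST from this ticket.
offline = ost_index("OST0010")

# OSTs numbered below the offline one, i.e. the only ones still receiving new files.
still_used = [ost_label(i) for i in range(offline)]

print(offline)                                          # 16
print(len(still_used), still_used[0], still_used[-1])   # 16 OST0000 OST000f
```

The decimal index is the form we have been passing as the stripe offset when setting it by hand.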
Can you suggest any way to force the MDS to use all the other OSTs through any lctl set_param options? Bringing the offline OST back online is not currently an option: it is corrupted, an e2fsck is still running, and it cannot be mounted. Manually setting the stripe is not an option either; we need allocation to work automatically, as it should. Could we set some QOS options so that file creation is balanced across the OSTs?