Details
-
Bug
-
Resolution: Fixed
-
Major
-
None
-
Lustre 2.8.0
-
None
-
Lustre ldiskfs server back end running version 2.8.1 with a few additional patches. The OS is RHEL 6.9.
-
2
-
9223372036854775807
Description
One of our production file systems running Lustre 2.8.1 experienced a soft lockup very similar to LU-9488. I attempted to backport the patch, but far too many changes have happened between 2.8.1 and Lustre 2.10.0, and I am unsure whether I would get the port right. I have attached the back trace.
We would not run the test framework on our production system. It looks like I just need to create a bunch of files on the file system and then run the following:
lctl set_param -n osd*.MDT.force_sync=1
lctl set_param fail_val=1 fail_loc=0x190
lctl lfsck_start -M lustre-MDT0000
lctl set_param fail_val=0 fail_loc=0x198
Then check the status with:
lctl get_param -n osd-ldiskfs.lustre-MDT000.oi_scrub | grep status
Does this look right? What values do I use to reset it back to normal working conditions?
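For reference, a minimal sketch of how the status check could be scripted rather than grepped by hand. It assumes the oi_scrub output contains a line of the form `status: <value>`, which is what the grep above relies on. The helper name `scrub_status` is hypothetical, and it reads the parameter dump from stdin so it can be tried without a live MDT.

```shell
# Hypothetical helper (not part of Lustre): print the value of the first
# "status:" line read from stdin, e.g. "completed" or "scanning".
scrub_status() {
    awk '$1 == "status:" { print $2; exit }'
}

# Possible use on the MDS, with the parameter name copied from the command above:
#   lctl get_param -n osd-ldiskfs.lustre-MDT000.oi_scrub | scrub_status
```

Reading from stdin keeps the parsing separate from the `lctl` call, so the same helper works on saved output from an earlier run.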