We let some of our OSTs get out of hand, and they have reached 100% capacity. We have taken them offline using "lctl --device <device_num> deactivate", along with others that are approaching capacity. Despite having users delete multi-terabyte files and using the lfs_migrate script (with two patches from LU-4293 included so that it uses "lfs migrate" as root instead of rsync) to migrate over 100 TB of data while the full OSTs were deactivated, we are not freeing up any space on the OSTs.
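For anyone following along, the sequence described above can be sketched roughly as below. This is only an illustration, not the exact commands we ran: the device number 12, the mount point /mnt/lustre, the OST UUID lustre-OST0004_UUID, the function names and the DRY_RUN switch are all made-up placeholders, and the "lfs find --obd ... | lfs_migrate -y" form is assumed from the Lustre manual. With DRY_RUN=1 (the default) the script only prints each command.

```shell
#!/bin/sh
# Sketch only: with DRY_RUN=1 (default) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical placeholders; take real values from "lctl dl" and "lfs df".
DEVNO=12                      # device number of the full OST's OSC on the MDS
FSROOT=/mnt/lustre            # client mount point
OST_UUID=lustre-OST0004_UUID  # UUID of the full OST

# On the MDS: stop new objects from being allocated on the OST.
deactivate_ost() { run lctl --device "$1" deactivate; }

# On a client: show per-OST usage, to compare before and after migrating.
show_usage() { run lfs df -h "$1"; }

# On a client: move files that have objects on the given OST elsewhere.
# Assumes an lfs_migrate that accepts a file list on stdin and -y.
migrate_off_ost() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ lfs find --obd $2 $1 | lfs_migrate -y"
    else
        lfs find --obd "$2" "$1" | lfs_migrate -y
    fi
}

deactivate_ost "$DEVNO"
show_usage "$FSROOT"
migrate_off_ost "$FSROOT" "$OST_UUID"
show_usage "$FSROOT"
```

Comparing the two "lfs df -h" outputs is how we are judging whether any space came back on the full OSTs.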
Our initial guess was that after the layout swap done by "lfs migrate", the old objects were not being deleted from disk because those OSTs were deactivated on the MDS. To test this, I re-activated one OST on the MDS, unmounted it on the OSS, and ran "e2fsck -v -f -p /dev/..." on it, which seemed to free about 300 GB on that OST. I tried the same procedure on another OST and nothing changed. In both cases the e2fsck output gives no indication that anything "happened".
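The per-OST procedure from the paragraph above, written out as a rough sketch. The function names, the DRY_RUN switch and the argument names are hypothetical, the OST block device is passed in as an argument because it is elided above, and the final remount is an assumed last step that is not spelled out in the description. The two functions run on different hosts (MDS and OSS).

```shell
#!/bin/sh
# Sketch only: with DRY_RUN=1 (default) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the MDS: re-activate the OST so the MDS can talk to it again.
#   $1 - device number of the OST's OSC on the MDS (from "lctl dl")
mds_reactivate() {
    run lctl --device "$1" activate
}

# On the OSS: take the OST offline, check it, and bring it back.
#   $1 - block device backing the OST (caller supplies it)
#   $2 - mount point of the OST on the OSS (hypothetical)
oss_fsck() {
    run umount "$2"
    run e2fsck -v -f -p "$1"
    run mount -t lustre "$1" "$2"   # assumed final step: remount the OST
}
```

Called as "mds_reactivate <device_num>" on the MDS, followed by "oss_fsck <ost_device> <ost_mount_point>" on the OSS.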
This is a live, production file system, so after yanking two OSTs offline I thought I'd stop testing theories before too many users called