[LU-5931] Deactivated OST still contains data Created: 18/Nov/14  Updated: 08/Jul/15  Resolved: 08/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Eric Kolb Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: zfs
Environment:

ZFS on Linux 0.6.2, Scientific Linux 6.4


Issue Links:
Related
is related to LU-4825 lfs migrate not freeing space on OST Resolved
Epic/Theme: zfs
Severity: 3
Epic: zfs
Rank (Obsolete): 16565

 Description   

Not sure this is the appropriate location for this issue, but we have found little evidence of anything of a similar nature elsewhere.

We have two ZFS-backed OSTs (1 and 2) which we would like to remove from our Lustre environment for maintenance purposes. We ran 'lfs find /RSF1 --ost 1,2' to locate any files with stripes on them, copied the data to new files, and removed the old ones. Running "lfs getstripe" confirms that the new files reside on the remaining OSTs. The mystery is that, as the 'lfs df' output below shows, the original OSTs still indicate that they house a significant amount of data.
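Roughly, the steps looked like this (the list file and the copy-and-replace commands here are an illustrative sketch, not the exact commands we used):

lfs find /RSF1 --ost 1,2 > /tmp/ost12_files        # list files with stripes on OST 1 or 2
while read f; do
    cp -p "$f" "$f.new" && mv "$f.new" "$f"        # write the data into a new file (new layout), then replace the old one
    lfs getstripe "$f"                             # confirm the new layout uses only the remaining OSTs
done < /tmp/ost12_files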

UUID                   1K-blocks        Used   Available Use% Mounted on
RSF1-MDT0000_UUID       76477312     6747648    69727616   9% /RSF1[MDT:0]
RSF1-MDT0001_UUID       76416000     7168256    69245696   9% /RSF1[MDT:1]
* RSF1-OST0001_UUID     8053647744  3711864960  4341780736  46% /RSF1[OST:1]
* RSF1-OST0002_UUID     8053646848  3706844416  4346767616  46% /RSF1[OST:2]
RSF1-OST2776_UUID    12387717248  6162569728  6225144320  50% /RSF1[OST:10102]
RSF1-OST2840_UUID    12387719040  5993136000  6394579328  48% /RSF1[OST:10304]
RSF1-OST290a_UUID    12387720832  6174761856  6212955520  50% /RSF1[OST:10506]
RSF1-OST29d4_UUID    12387713408  6129103104  6258606848  49% /RSF1[OST:10708]
RSF1-OST2a9e_UUID    12387713536  5944379008  6443330944  48% /RSF1[OST:10910]
RSF1-OST2b68_UUID    12387710464  5959099904  6428606592  48% /RSF1[OST:11112]
RSF1-OST2c32_UUID    12387712384  6011423872  6376284800  49% /RSF1[OST:11314]

On the OSS nodes mounting those ZFS-backed OSTs we run, for example, 'zdb -dd OST1 | grep "ZFS plain file"' and use the zfsobj2fid utility to map the resulting list of ZFS OIDs to FIDs. Then on a Lustre client we run:

lfs fid2path /RSF1 [0x280005221:0x11424:0x0]
fid2path error: No such file or directory

on all of the FIDs, but nothing is found.
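For completeness, the client-side check was essentially a loop of the following form (fids.txt, holding the FIDs produced from the zdb output via zfsobj2fid, is an illustrative name):

while read fid; do
    lfs fid2path /RSF1 "$fid" || echo "no path for $fid"
done < fids.txt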

This situation concerns us because we will be permanently removing the two OSTs in question, but is valid data still housed there?

Does the data still reside on the two OSTs in question because they are deactivated and thus read-only?

Sorry if this is a duplicate or an inappropriate location, but we have few avenues left to try and this seems like a bug to us.



 Comments   
Comment by Oleg Drokin [ 18/Nov/14 ]

Did you deactivate the OSTs on the MDS by any chance? That would cause exactly the same symptoms, because the MDS is not able to clean up objects on the OSTs in that case.
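For example, a deactivation of that kind would have been done on the MDS with something like the following (device names are illustrative, and depending on the version the corresponding parameters may live under 'osp' rather than 'osc'):

lctl --device RSF1-OST0001-osc-MDT0000 deactivate
lctl --device RSF1-OST0002-osc-MDT0000 deactivate
lctl get_param osp.RSF1-OST000*-osc-MDT*.active    # check the current state of the MDS connections to those OSTs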

Comment by Eric Kolb [ 18/Nov/14 ]

Thanks. Yes, we did deactivate the OSTs on the MDS because the documentation instructed us to:

https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#section_k3l_4gt_tl

Is this not the case?

Comment by Andreas Dilger [ 18/Nov/14 ]

There is a bit of a disconnect between the documented process and the 2.4 release and beyond. Deactivating the OST on the MDS stops new files from being created there, but since 2.4 it also prevents the MDS from deleting objects on that OST. This is indeed a bug that needs to be fixed.

As for using lfs fid2path to check these objects: it shows that the MDS inode that was previously referencing each OST object is no longer there, because you deleted or migrated those files. That is as it should be.

One option is to keep the OSTs around until your next outage and then reactivate them after user processes that create files have been stopped. That would allow the leftover OST objects to be deleted, if you are concerned about the remaining objects.
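In other words, something along these lines on the MDS once the writers have been stopped (device names are illustrative):

lctl --device RSF1-OST0001-osc-MDT0000 activate
lctl --device RSF1-OST0002-osc-MDT0000 activate
lfs df /RSF1    # watch the space on OST0001/OST0002 being freed as the orphan objects are destroyed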

As for a long-term solution to this problem, I think there are two options. The first is to allow the OST to be marked inactive for object creation on the MDS without preventing the MDS from destroying OST objects. The second is to have a setting on the OST itself that prevents the MDS from selecting it for new object allocation. There are two hooks for this second option already: marking the OST full/ENOSPC so that the MDS ignores it completely, and marking the OST as being in RAID rebuild so that the MDS avoids it. The latter could be enhanced to make it a hard stop on file creation.
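For reference, the "RAID rebuild" hook mentioned above is, to the best of my knowledge, the degraded flag set on the OSS, which only makes the MDS avoid the OST rather than being a hard stop (the OST name here is illustrative):

lctl set_param obdfilter.RSF1-OST0001.degraded=1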

Comment by Eric Kolb [ 19/Nov/14 ]

Thanks for the information. With this we are more comfortable in proceeding with our maintenance.

Comment by Sean Brisbane [ 24/Jun/15 ]

This is a duplicate of https://jira.hpdd.intel.com/browse/LU-4825.

Comment by Andreas Dilger [ 08/Jul/15 ]

Closing as a duplicate of LU-4825.
