[LU-5931] Deactivated OST still contains data Created: 18/Nov/14 Updated: 08/Jul/15 Resolved: 08/Jul/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Eric Kolb | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | zfs | ||
| Environment: |
ZFS on Linux 0.6.2, Scientific Linux 6.4 |
||
| Issue Links: |
Duplicates LU-4825 |
||
| Epic/Theme: | zfs |
| Severity: | 3 |
| Epic: | zfs |
| Rank (Obsolete): | 16565 |
| Description |
|
Not sure this is the appropriate location for this issue, but we have found little evidence elsewhere of a similar nature. We have two ZFS-backed OSTs (1 and 2) that we would like to remove from our Lustre environment for maintenance purposes. We ran 'lfs find /RSF1 --ost 1,2' to locate any stripes on those OSTs, then copied the affected data to new files and removed the old ones. Running 'lfs getstripe' confirms the new files reside on the remaining OSTs. The mystery is that the original OSTs still report a significant amount of used space:

UUID                  1K-blocks        Used   Available Use% Mounted on
RSF1-MDT0000_UUID      76477312     6747648    69727616   9% /RSF1[MDT:0]
RSF1-MDT0001_UUID      76416000     7168256    69245696   9% /RSF1[MDT:1]
* RSF1-OST0001_UUID  8053647744  3711864960  4341780736  46% /RSF1[OST:1]
* RSF1-OST0002_UUID  8053646848  3706844416  4346767616  46% /RSF1[OST:2]
RSF1-OST2776_UUID   12387717248  6162569728  6225144320  50% /RSF1[OST:10102]
RSF1-OST2840_UUID   12387719040  5993136000  6394579328  48% /RSF1[OST:10304]
RSF1-OST290a_UUID   12387720832  6174761856  6212955520  50% /RSF1[OST:10506]
RSF1-OST29d4_UUID   12387713408  6129103104  6258606848  49% /RSF1[OST:10708]
RSF1-OST2a9e_UUID   12387713536  5944379008  6443330944  48% /RSF1[OST:10910]
RSF1-OST2b68_UUID   12387710464  5959099904  6428606592  48% /RSF1[OST:11112]
RSF1-OST2c32_UUID   12387712384  6011423872  6376284800  49% /RSF1[OST:11314]

On the OSS nodes mounting those ZFS-backed OSTs we run, for example, 'zdb -dd OST1 | grep "ZFS plain file"' and use the zfsobj2fid utility to map the resulting list of ZFS object IDs to Lustre FIDs. Then on a Lustre client we run:

    lfs fid2path /RSF1 [0x280005221:0x11424:0x0]

for each of the FIDs, but nothing is found. This situation concerns us because we will be permanently removing the two OSTs in question: is valid data still housed there? Does the data still reside on those two OSTs because they are deactivated and thus read-only? Sorry if this is a duplicate or an inappropriate location, but we have few avenues left to try and this seems like a bug to us. |
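For reference, the check described above can be scripted roughly as follows; the dataset name 'ostpool/OST1', the temporary file paths, and the zfsobj2fid argument order are assumptions rather than values confirmed in this ticket:

    # On the OSS (dataset name is a placeholder): collect the object IDs of
    # all ZFS plain files on the OST and map each one to a Lustre FID.
    zdb -dd ostpool/OST1 | awk '/ZFS plain file/ {print $1}' > /tmp/ost1.oids
    while read -r oid; do
        zfsobj2fid ostpool/OST1 "$oid"    # check the argument order for your version
    done < /tmp/ost1.oids > /tmp/ost1.fids

    # On a Lustre client: any FID that still resolves to a path is still referenced.
    while read -r fid; do
        lfs fid2path /RSF1 "$fid" 2>/dev/null && echo "still referenced: $fid"
    done < /tmp/ost1.fids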
| Comments |
| Comment by Oleg Drokin [ 18/Nov/14 ] |
|
Did you deactivate the OSTs on the MDS by any chance? That would cause exactly these symptoms, because in that case the MDS is not able to clean up objects on the OSTs. |
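For reference, a minimal sketch of how this can be checked on the MDS; the OSP device names are assumptions derived from the filesystem name and OST indices in this ticket, not confirmed output:

    # On the MDS: list configured devices and their state.
    lctl dl

    # 0 means the MDS-side connection to the OST is deactivated, 1 means active
    # (device names below are assumed).
    lctl get_param osp.RSF1-OST0001-osc-MDT0000.active
    lctl get_param osp.RSF1-OST0002-osc-MDT0000.active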
| Comment by Eric Kolb [ 18/Nov/14 ] |
|
Thanks. Yes, we did deactivate the OSTs on the MDS because the documentation instructed us to. Is that not the correct procedure? |
| Comment by Andreas Dilger [ 18/Nov/14 ] |
|
There is a bit of a disconnect between the documented process and the 2.4 release and beyond. Deactivating the OST on the MDS stops new files from being created there, but since 2.4 it also prevents the MDS from deleting objects on that OST. This is indeed a bug that needs to be fixed.

As for the lfs fid2path check: the fact that nothing is found shows that the MDS inodes that previously referenced those OST objects are no longer there, because you deleted or migrated the files. That is as it should be.

One option is to keep the OSTs around until your next outage and then reactivate them after user processes that create files have been stopped. That would allow the leftover OST objects to be deleted, if you are concerned about them.

As for a long-term solution to this problem, I think there are two options. The first is to allow the OST to be marked inactive for object creation on the MDS without preventing the MDS from destroying OST objects. The second is to have a setting on the OST itself that prevents the MDS from selecting it for new object allocation. There are already two hooks for this second option: marking the OST full/ENOSPC so the MDS ignores it completely, and marking the OST as in RAID rebuild so the MDS avoids it. This could be enhanced to make it a hard stop on file creation. |
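A rough sketch of the reactivation workaround described above, assuming the same OSP device naming as before and that client file creation has already been stopped; this is illustrative rather than a procedure verified on 2.4.1:

    # On the MDS: re-activate the connections so orphan cleanup can destroy
    # the leftover objects on the two OSTs (device names are assumed).
    lctl set_param osp.RSF1-OST0001-osc-MDT0000.active=1
    lctl set_param osp.RSF1-OST0002-osc-MDT0000.active=1

    # From a client, watch the used space on OST0001/OST0002 drain.
    lfs df /RSF1

Newer releases also expose an osp max_create_count parameter; setting it to 0 stops new object allocation on an OST without blocking destroys, which matches the first long-term option above, but its availability should be confirmed for the release in use.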
| Comment by Eric Kolb [ 19/Nov/14 ] |
|
Thanks for the information. With this we are more comfortable in proceeding with the removal of the two OSTs. |
| Comment by Sean Brisbane [ 24/Jun/15 ] |
|
This is a duplicate of https://jira.hpdd.intel.com/browse/LU-4825. |
| Comment by Andreas Dilger [ 08/Jul/15 ] |
|
Closing as a duplicate of LU-4825. |