[LU-7012] files not being deleted from OST after being re-activated Created: 17/Aug/15  Updated: 15/Sep/16  Resolved: 11/Nov/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.4
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Dustin Leverman Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL-6.6, lustre-2.5.4


Issue Links:
Duplicate
is duplicated by LU-4295 removing files on deactivated OST doe... Resolved
Related
is related to LU-4825 lfs migrate not freeing space on OST Resolved
Severity: 2
Rank (Obsolete): 9223372036854775807

 Description   

We had 4 OSTs that we deactivated because of an imbalance in utilization that was causing ENOSPC messages to our users. We identified a file that was consuming a significant amount of space that we deleted while the OSTs were deactivated. The file is no longer seen in the directory structure (the MDS processed the request), but the objects on the OSTs were not marked as free. After re-activating the OSTs, it doesn't appear that the llog was flushed, which should free up those objects.
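
Roughly, the sequence was the following; the device names and file paths below are illustrative examples, not the exact ones used:

# on the MDS, for each of the 4 full OSTs:
lctl --device atlas1-OST02ce-osc-MDT0000 deactivate
# on a client, remove the large file; the MDS unlinks it from the namespace:
rm /lustre/atlas1/path/to/large-file
lfs df /lustre/atlas1          # usage on the deactivated OSTs does not drop
# later, on the MDS:
lctl --device atlas1-OST02ce-osc-MDT0000 activate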

At this time, some users are not able to run jobs because they cannot allocate any space.

We understand how this is supposed to work, but as the user in LU-4295 pointed out, it is not working that way.

Please advise.



 Comments   
Comment by Oleg Drokin [ 17/Aug/15 ]

Does your OST show as active on the MDT (i.e. did the MDT reconnect)?

Comment by Matt Ezell [ 17/Aug/15 ]
[root@atlas-mds1 ~]# cat /proc/fs/lustre/osc/*/active|sort|uniq -c
   1008 1
[root@atlas-mds1 ~]# cat /proc/fs/lustre/osp/*/active|sort|uniq -c
   1008 1
Aug 17 09:40:25 atlas-mgs1.ccs.ornl.gov kernel: [7161385.311368] Lustre: Permanently reactivating atlas1-OST02ce
Aug 17 09:40:25 atlas-mgs1.ccs.ornl.gov kernel: [7161385.321383] Lustre: Setting parameter atlas1-OST02ce-osc.osc.active in log atlas1-client
Aug 17 09:40:40 atlas-mgs1.ccs.ornl.gov kernel: [7161400.916159] Lustre: Permanently reactivating atlas1-OST039b
Aug 17 09:40:40 atlas-mgs1.ccs.ornl.gov kernel: [7161400.926057] Lustre: Setting parameter atlas1-OST039b-osc.osc.active in log atlas1-client
Aug 17 09:40:51 atlas-mgs1.ccs.ornl.gov kernel: [7161411.936736] Lustre: Permanently reactivating atlas1-OST02c1
Aug 17 09:40:51 atlas-mgs1.ccs.ornl.gov kernel: [7161411.946798] Lustre: Setting parameter atlas1-OST02c1-osc.osc.active in log atlas1-client
Aug 17 09:41:00 atlas-mgs1.ccs.ornl.gov kernel: [7161420.990618] Lustre: Permanently reactivating atlas1-OST02fb
Aug 17 09:41:00 atlas-mgs1.ccs.ornl.gov kernel: [7161421.000097] Lustre: Setting parameter atlas1-OST02fb-osc.osc.active in log atlas1-client
Comment by Oleg Drokin [ 17/Aug/15 ]

"Permanently reactivating" is just a message from mgs.
How about the MDS logs showing a reconnect to the OST, and the OST logs showing that the MDT connected to it?

Comment by Oleg Drokin [ 17/Aug/15 ]

Essentially the footprint I am looking for (on the MDS) would be:

[ 7384.128329] Lustre: setting import lustre-OST0001_UUID INACTIVE by administrator request
[ 7403.759510] Lustre: lustre-OST0001-osc-ffff8800b96a1800: Connection to lustre-OST0001 (at 192.168.10.227@tcp) was lost; in progress operations using this service will wait for recovery to complete
[ 7403.764253] LustreError: 167-0: lustre-OST0001-osc-ffff8800b96a1800: This client was evicted by lustre-OST0001; in progress operations using this service will fail.
[ 7403.765235] Lustre: lustre-OST0001-osc-ffff8800b96a1800: Connection restored to lustre-OST0001 (at 192.168.10.227@tcp)

Where the first INACTIVE would come from lctl deactivate and the connection restored would come from lctl activate.

Comment by Dustin Leverman [ 17/Aug/15 ]

Oleg,
Below are the log messages for the reactivation of atlas1-OST039b, atlas1-OST02c1, atlas1-OST02fb, and atlas1-OST02ce:

Aug 17 08:11:19 atlas-mds1.ccs.ornl.gov kernel: [2973887.632969] Lustre: setting import atlas1-OST02c1_UUID INACTIVE by administrator request
Aug 17 08:11:25 atlas-mds1.ccs.ornl.gov kernel: [2973893.078469] Lustre: setting import atlas1-OST02fb_UUID INACTIVE by administrator request
Aug 17 08:11:30 atlas-mds1.ccs.ornl.gov kernel: [2973898.379605] Lustre: setting import atlas1-OST039b_UUID INACTIVE by administrator request
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.381423] Lustre: atlas1-OST039b-osc-MDT0000: Connection to atlas1-OST039b (at 10.36.225.89@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.400737] LustreError: 167-0: atlas1-OST039b-osc-MDT0000: This client was evicted by atlas1-OST039b; in progress operations using this service will fail.
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.416837] Lustre: atlas1-OST039b-osc-MDT0000: Connection restored to atlas1-OST039b (at 10.36.225.89@o2ib)
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.822971] Lustre: atlas1-OST02fb-osc-MDT0000: Connection to atlas1-OST02fb (at 10.36.225.73@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.842235] LustreError: 167-0: atlas1-OST02fb-osc-MDT0000: This client was evicted by atlas1-OST02fb; in progress operations using this service will fail.
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.858294] Lustre: atlas1-OST02fb-osc-MDT0000: Connection restored to atlas1-OST02fb (at 10.36.225.73@o2ib)
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.287935] Lustre: atlas1-OST02c1-osc-MDT0000: Connection to atlas1-OST02c1 (at 10.36.225.159@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.307394] LustreError: 167-0: atlas1-OST02c1-osc-MDT0000: This client was evicted by atlas1-OST02c1; in progress operations using this service will fail.
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.323480] Lustre: atlas1-OST02c1-osc-MDT0000: Connection restored to atlas1-OST02c1 (at 10.36.225.159@o2ib)
Aug 17 11:53:44 atlas-mds1.ccs.ornl.gov kernel: [2987244.922580] Lustre: setting import atlas1-OST02c7_UUID INACTIVE by administrator request
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.220947] Lustre: atlas1-OST02c7-osc-MDT0000: Connection to atlas1-OST02c7 (at 10.36.225.165@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 11:53:47 atlas-oss2h8.ccs.ornl.gov kernel: [7165636.459725] Lustre: atlas1-OST02c7: Client atlas1-MDT0000-mdtlov_UUID (at 10.36.226.72@o2ib) reconnecting
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.265826] LustreError: 167-0: atlas1-OST02c7-osc-MDT0000: This client was evicted by atlas1-OST02c7; in progress operations using this service will fail.
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.281892] Lustre: atlas1-OST02c7-osc-MDT0000: Connection restored to atlas1-OST02c7 (at 10.36.225.165@o2ib)
Aug 17 11:53:47 atlas-oss2h8.ccs.ornl.gov kernel: [7165636.501432] Lustre: atlas1-OST02c7: deleting orphan objects from 0x0:11321511 to 0x0:11321537
Comment by Matt Ezell [ 17/Aug/15 ]

Oleg-

We have some OST object IDs of large files that should be deleted. I just checked with debugfs, and the objects are still there. If we unmount, mount as ldiskfs, remove the objects, unmount, and remount as lustre, will this cause a problem later (if the MDS delete request ever makes it through)? We'd also prefer a solution that doesn't require taking OSTs offline, but we'll do what we have to. And we have an unknown number of other orphan objects out there.

We also dumped the llog on the MDS, and the latest entry was from October 2013.
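
For concreteness, the presence check was along these lines (the object id and device path below are examples only; the object ids had been noted, e.g. with 'lfs getstripe -v <file>', before the files were unlinked):

OBJID=11321511                                             # example value only
OST_DEV=/dev/sdX                                           # example OST block device on the OSS
debugfs -c -R "stat O/0/d$((OBJID % 32))/$OBJID" $OST_DEV  # inode still present => object not destroyed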

Comment by Oleg Drokin [ 17/Aug/15 ]

Removing the objects is not going to be a problem later.
In fact I imagine you can even mount the OST in parallel as ldiskfs and remove the objects in the object directory (just make sure not to delete anything that is actually referenced).
The kernel will moderate access, so the Lustre and parallel ldiskfs mounts can coexist (just make sure to mount it on the same node).

Though it is still strange that the objects are not deleted by llog replay.
An interesting experiment would be an MDS restart/failover, though I guess you would prefer not to try that.
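
A sketch of the parallel-mount cleanup described above (device, mount point and object id are illustrative; verify every object is really an orphan before removing it):

mount -t ldiskfs /dev/sdX /mnt/ost_ldiskfs          # on the same OSS that serves this OST
OBJID=11321511                                      # example; only remove objects confirmed to be orphans
rm -i /mnt/ost_ldiskfs/O/0/d$((OBJID % 32))/$OBJID
umount /mnt/ost_ldiskfs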

Comment by Andreas Dilger [ 17/Aug/15 ]

While this is related to LU-4825, I think that there are two separate issues here:

  • files are not deleted while the import is deactivated. I think that issue should be handled by LU-4825.
  • orphans are not cleaned up when the import is reactivated. I think that issue should be handled by this ticket.

I'm not sure why the OSP doesn't restart orphan cleanup when it is reactivated, but currently this needs an MDS restart. That issue should be fixed to allow orphan cleanup to resume once the import is reactivated.
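
One way to watch whether a reactivated OSP is actually pushing its pending changes, assuming these osp parameters are exposed by the running release, is on the MDS:

lctl get_param osp.atlas1-OST02ce*.sync_changes        # llog changes still queued for this OST
lctl get_param osp.atlas1-OST02ce*.sync_in_flight      # changes currently being sent
lctl get_param osp.atlas1-OST02ce*.destroys_in_flight  # object destroys outstanding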

Comment by Matt Ezell [ 17/Aug/15 ]

We chose the "safer" route and unmounted the OST before mounting as ldiskfs. We removed the files and usage went back down.

Comment by Mikhail Pershin [ 02/Sep/15 ]

Andreas, what is the difference between the two cases in your comment? As far as I can see, LU-4825 is about orphans as well. If a file was deleted while the OST was deactivated, then its objects on the OST are orphans and are never deleted. Isn't that what LU-4825 is going to solve?

Comment by Andreas Dilger [ 03/Sep/15 ]

Mike, there are two separate problems:
1) the current method for doing OST space balancing is to deactivate the OSP and then migrate files (or let users do this gradually), so the deactivated OST will not be used for new objects. However, deactivating the OSP also prevents the MDS from destroying the objects of unlinked files (since 2.4) so space is never released on the OST, which confuses users. This issue will be addressed by LU-4825 by adding a new method for disabling object allocation on an OST without fully deactivating the OSP, so that the MDS can still process object destroys.

2) when the deactivated OSP is reactivated again, even after restarting the OST, it does not process the unlink llogs (and presumably the setattr llogs, but that is harder to check) until the MDS is stopped and restarted. The MDS should begin processing the recovery llogs after the OSP has been reactivated. That is what this bug is for.

Even though LU-4825 will reduce the cases where an OSP needs to be deactivated (i.e. not for space balancing anymore), there are other times when this still needs to be done (e.g. an OST taken offline for maintenance), so recovery llog processing still needs to work.
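
For illustration, the LU-4825 style of taking an OST out of allocation would look roughly like the following on the MDS, assuming the osp max_create_count tunable is available (the OST name and values are examples):

lctl set_param osp.atlas1-OST02ce*.max_create_count=0      # stop new object allocation; OSP stays active
# ... migrate or delete files; destroys are still processed, so OST space is freed ...
lctl set_param osp.atlas1-OST02ce*.max_create_count=20000  # re-enable allocation (20000 is the usual default)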

Comment by Mikhail Pershin [ 06/Sep/15 ]

This seems to be an OSP problem: it does not restart llog processing from the point where the OST was deactivated. I am testing it locally now.

Comment by Mikhail Pershin [ 07/Sep/15 ]

Well, I tried to reproduce this locally: objects are not deleted while the OSP is deactivated, but they are deleted immediately when I re-activate the OSP. I used the 'lctl --device <osp device> deactivate' command to deactivate an OSP, then destroyed a big file that was previously created on that OST. 'df' showed that the space on the related OST was not freed; after I re-activated the OSP, 'df' showed the space was returned. Any thoughts on what else may affect this?
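
For reference, the local test sequence was roughly the following (filesystem name and paths are from a test setup, not this system):

lctl --device lustre-OST0001-osc-MDT0000 deactivate   # on the MDS
rm /mnt/lustre/bigfile                                # on a client; the file was striped on OST0001
lfs df /mnt/lustre                                    # OST0001 usage unchanged while deactivated
lctl --device lustre-OST0001-osc-MDT0000 activate     # on the MDS
lfs df /mnt/lustre                                    # space is freed immediately (on master)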

Comment by Jian Yu [ 15/Sep/15 ]

Hi Mike,

What Lustre version did you test on? Is it Lustre 2.5.x or master branch?

Comment by Mikhail Pershin [ 17/Sep/15 ]

Yu Jian, you are right; I used the master branch instead of b2_5, my mistake. I am repeating the local tests with 2.5 now.

Comment by Mikhail Pershin [ 18/Sep/15 ]

I can reproduce this with 2.5. Given that master works fine, I am going to identify the related changes in it and port them to 2.5.

Comment by Gerrit Updater [ 23/Sep/15 ]

Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/16612
Subject: LU-7012 osp: don't use OSP when import is deactivated
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: b7daf6b218a34c18330ff6f5d8e023e48bee1e0b

Comment by Mikhail Pershin [ 23/Sep/15 ]

It is interesting that there are no obvious changes between master and b2_5 related to this behavior. Meanwhile I've made a simple fix for this issue and it works for me. Please check it.

Comment by James A Simmons [ 07/Oct/15 ]

We have finished testing your patch, and it appears to have resolved our issues.

Comment by Andreas Dilger [ 08/Oct/15 ]

Mike, is this needed for 2.7.x or only 2.5.x? It would be great to link this to a specific patch/bug that fixed the problem for master if possible.

Comment by Mikhail Pershin [ 26/Oct/15 ]

Andreas, I see the same problem in 2.7 but it works somehow; I suppose the DNE changes fixed it indirectly by adding more synchronization mechanisms in the OSP. Meanwhile, I'd add this patch to 2.7 as a direct fix for that particular problem.

Comment by Gerrit Updater [ 26/Oct/15 ]

Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/16937
Subject: LU-7012 osp: don't use OSP when import is deactivated
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9a0a96518ab32908d381d22a4eccbfaa28cafd1d

Comment by Gerrit Updater [ 11/Nov/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16937/
Subject: LU-7012 osp: don't use OSP when import is deactivated
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 82cbfd77f33bc33ea047407dfaecf4b04d44930a

Comment by Joseph Gmitter (Inactive) [ 11/Nov/15 ]

Landed for 2.8
