[LU-7012] files not being deleted from OST after being re-activated | Created: 17/Aug/15 | Updated: 15/Sep/16 | Resolved: 11/Nov/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.4 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Dustin Leverman | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: | RHEL-6.6, lustre-2.5.4 |
| Issue Links: | |
| Severity: | 2 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We had 4 OSTs that we deactivated because of an imbalance in utilization that was causing ENOSPC messages for our users. While the OSTs were deactivated, we deleted a file that was consuming a significant amount of space. The file is no longer visible in the directory structure (the MDS processed the request), but the objects on the OSTs were not marked as free. After re-activating the OSTs, it doesn't appear that the llog was flushed, which should free up those objects. At this time, some users are not able to run jobs because they cannot allocate any space. We understand how this is supposed to work, but as the user in [...]. Please advise. |
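For context, a minimal sketch of the deactivate/delete/reactivate sequence described above, consistent with the MGS and MDS messages quoted in the comments below. The ticket does not show the exact commands used; the OST name, client path, and the exact conf_param form are illustrative assumptions and may differ between Lustre versions:

    # on the MGS node: permanently deactivate one OST for clients and the MDS (illustrative form)
    lctl conf_param atlas1-OST02ce-osc.osc.active=0
    # on a client: remove the large file while the OST is deactivated (hypothetical path)
    rm /lustre/atlas1/path/to/large_file
    # on the MGS node: permanently reactivate the OST ("Permanently reactivating ..." in the MGS log)
    lctl conf_param atlas1-OST02ce-osc.osc.active=1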
| Comments |
| Comment by Oleg Drokin [ 17/Aug/15 ] |
|
Does your OST show as active on the MDT (i.e., did the MDT reconnect)? |
| Comment by Matt Ezell [ 17/Aug/15 ] |
|
[root@atlas-mds1 ~]# cat /proc/fs/lustre/osc/*/active|sort|uniq -c
   1008 1
[root@atlas-mds1 ~]# cat /proc/fs/lustre/osp/*/active|sort|uniq -c
   1008 1

Aug 17 09:40:25 atlas-mgs1.ccs.ornl.gov kernel: [7161385.311368] Lustre: Permanently reactivating atlas1-OST02ce
Aug 17 09:40:25 atlas-mgs1.ccs.ornl.gov kernel: [7161385.321383] Lustre: Setting parameter atlas1-OST02ce-osc.osc.active in log atlas1-client
Aug 17 09:40:40 atlas-mgs1.ccs.ornl.gov kernel: [7161400.916159] Lustre: Permanently reactivating atlas1-OST039b
Aug 17 09:40:40 atlas-mgs1.ccs.ornl.gov kernel: [7161400.926057] Lustre: Setting parameter atlas1-OST039b-osc.osc.active in log atlas1-client
Aug 17 09:40:51 atlas-mgs1.ccs.ornl.gov kernel: [7161411.936736] Lustre: Permanently reactivating atlas1-OST02c1
Aug 17 09:40:51 atlas-mgs1.ccs.ornl.gov kernel: [7161411.946798] Lustre: Setting parameter atlas1-OST02c1-osc.osc.active in log atlas1-client
Aug 17 09:41:00 atlas-mgs1.ccs.ornl.gov kernel: [7161420.990618] Lustre: Permanently reactivating atlas1-OST02fb
Aug 17 09:41:00 atlas-mgs1.ccs.ornl.gov kernel: [7161421.000097] Lustre: Setting parameter atlas1-OST02fb-osc.osc.active in log atlas1-client |
| Comment by Oleg Drokin [ 17/Aug/15 ] |
|
"Permanently reactivating" is just a message from mgs. |
| Comment by Oleg Drokin [ 17/Aug/15 ] |
|
Essentially the footprint I am looking for (on the MDS) would be:

[ 7384.128329] Lustre: setting import lustre-OST0001_UUID INACTIVE by administrator request
[ 7403.759510] Lustre: lustre-OST0001-osc-ffff8800b96a1800: Connection to lustre-OST0001 (at 192.168.10.227@tcp) was lost; in progress operations using this service will wait for recovery to complete
[ 7403.764253] LustreError: 167-0: lustre-OST0001-osc-ffff8800b96a1800: This client was evicted by lustre-OST0001; in progress operations using this service will fail.
[ 7403.765235] Lustre: lustre-OST0001-osc-ffff8800b96a1800: Connection restored to lustre-OST0001 (at 192.168.10.227@tcp)

Where the first INACTIVE would come from lctl deactivate and the "Connection restored" would come from lctl activate. |
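For reference, a sketch of the commands that would produce this footprint, run on the MDS; the device name follows the "<fsname>-OSTxxxx-osc-MDT0000" pattern seen in the logs below and is illustrative:

    # temporarily deactivate the MDS's import for one OST (logs "setting import ... INACTIVE")
    lctl --device lustre-OST0001-osc-MDT0000 deactivate
    # later, reactivate it (logs "Connection restored to ...")
    lctl --device lustre-OST0001-osc-MDT0000 activate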
| Comment by Dustin Leverman [ 17/Aug/15 ] |
|
Oleg,

Aug 17 08:11:19 atlas-mds1.ccs.ornl.gov kernel: [2973887.632969] Lustre: setting import atlas1-OST02c1_UUID INACTIVE by administrator request
Aug 17 08:11:25 atlas-mds1.ccs.ornl.gov kernel: [2973893.078469] Lustre: setting import atlas1-OST02fb_UUID INACTIVE by administrator request
Aug 17 08:11:30 atlas-mds1.ccs.ornl.gov kernel: [2973898.379605] Lustre: setting import atlas1-OST039b_UUID INACTIVE by administrator request
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.381423] Lustre: atlas1-OST039b-osc-MDT0000: Connection to atlas1-OST039b (at 10.36.225.89@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.400737] LustreError: 167-0: atlas1-OST039b-osc-MDT0000: This client was evicted by atlas1-OST039b; in progress operations using this service will fail.
Aug 17 08:42:11 atlas-mds1.ccs.ornl.gov kernel: [2975741.416837] Lustre: atlas1-OST039b-osc-MDT0000: Connection restored to atlas1-OST039b (at 10.36.225.89@o2ib)
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.822971] Lustre: atlas1-OST02fb-osc-MDT0000: Connection to atlas1-OST02fb (at 10.36.225.73@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.842235] LustreError: 167-0: atlas1-OST02fb-osc-MDT0000: This client was evicted by atlas1-OST02fb; in progress operations using this service will fail.
Aug 17 08:42:18 atlas-mds1.ccs.ornl.gov kernel: [2975747.858294] Lustre: atlas1-OST02fb-osc-MDT0000: Connection restored to atlas1-OST02fb (at 10.36.225.73@o2ib)
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.287935] Lustre: atlas1-OST02c1-osc-MDT0000: Connection to atlas1-OST02c1 (at 10.36.225.159@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.307394] LustreError: 167-0: atlas1-OST02c1-osc-MDT0000: This client was evicted by atlas1-OST02c1; in progress operations using this service will fail.
Aug 17 08:42:26 atlas-mds1.ccs.ornl.gov kernel: [2975756.323480] Lustre: atlas1-OST02c1-osc-MDT0000: Connection restored to atlas1-OST02c1 (at 10.36.225.159@o2ib)
Aug 17 11:53:44 atlas-mds1.ccs.ornl.gov kernel: [2987244.922580] Lustre: setting import atlas1-OST02c7_UUID INACTIVE by administrator request
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.220947] Lustre: atlas1-OST02c7-osc-MDT0000: Connection to atlas1-OST02c7 (at 10.36.225.165@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Aug 17 11:53:47 atlas-oss2h8.ccs.ornl.gov kernel: [7165636.459725] Lustre: atlas1-OST02c7: Client atlas1-MDT0000-mdtlov_UUID (at 10.36.226.72@o2ib) reconnecting
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.265826] LustreError: 167-0: atlas1-OST02c7-osc-MDT0000: This client was evicted by atlas1-OST02c7; in progress operations using this service will fail.
Aug 17 11:53:47 atlas-mds1.ccs.ornl.gov kernel: [2987248.281892] Lustre: atlas1-OST02c7-osc-MDT0000: Connection restored to atlas1-OST02c7 (at 10.36.225.165@o2ib)
Aug 17 11:53:47 atlas-oss2h8.ccs.ornl.gov kernel: [7165636.501432] Lustre: atlas1-OST02c7: deleting orphan objects from 0x0:11321511 to 0x0:11321537 |
| Comment by Matt Ezell [ 17/Aug/15 ] |
|
Oleg- We have some OST object IDs of large files that should be deleted. I just checked with debugfs, and the objects are still there. If we unmount, mount as ldiskfs, remove the objects, unmount, and remount as lustre, will this cause a problem later (if the MDS delete request ever makes it through)? We'd also prefer a solution that doesn't require taking OSTs offline, but we'll do what we have to. And we have an unknown number of other orphan objects out there. We also dumped the llog on the MDS, and the latest entry was from October 2013. |
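For reference, a sketch of how such a debugfs check on an OST object might look. The object id and device name are hypothetical, and the "O/0/d(objid % 32)/objid" layout assumed here applies to seq-0 objects on an ldiskfs OST:

    OBJID=11321511                                               # hypothetical object id
    debugfs -c -R "stat O/0/d$((OBJID % 32))/$OBJID" /dev/ost_device   # read-only check whether the object still exists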
| Comment by Oleg Drokin [ 17/Aug/15 ] |
|
Removing objects is not going to be a problem later. Though it's still strange that objects are not deleted by log replay. |
| Comment by Andreas Dilger [ 17/Aug/15 ] |
|
While this is related to [...],
I'm not sure why the OSP doesn't restart orphan cleanup when it is reactivated, but currently this needs an MDS restart. That issue should be fixed to allow orphan cleanup to resume once the import is reactivated. |
| Comment by Matt Ezell [ 17/Aug/15 ] |
|
We chose the "safer" route and unmounted the OST before mounting as ldiskfs. We removed the files and usage went back down. |
| Comment by Mikhail Pershin [ 02/Sep/15 ] |
|
Andreas, what is the difference between the two cases in your comment? As I can see, [...] |
| Comment by Andreas Dilger [ 03/Sep/15 ] |
|
Mike, there are two separate problems:

1) [...]

2) when the deactivated OSP is reactivated again, even after restarting the OST, it does not process the unlink llogs (and presumably the setattr llogs, but that is harder to check) until the MDS is stopped and restarted. The MDS should begin processing the recovery llogs after the OSP has been reactivated. That is what this bug is for.

Even though [...] |
| Comment by Mikhail Pershin [ 06/Sep/15 ] |
|
This seems to be an OSP problem: it doesn't restart llog processing from the point where the OST was deactivated. I am testing it locally now. |
| Comment by Mikhail Pershin [ 07/Sep/15 ] |
|
Well, I was trying to reproduce this locally: objects are not deleted while the OSP is deactivated, but they are deleted immediately when I re-activate the OSP. I used the 'lctl --device <osp device> deactivate' command to deactivate an OSP, then destroyed a big file that was previously created on that OST. 'df' shows that the space on the related OST is not freed; after I re-activated the OSP, 'df' shows the space is returned. Any thoughts on what else may affect this? |
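A sketch of that reproducer on a small test filesystem; the file name, OST index, mount point, and device name are illustrative. The 'lctl --device ... deactivate/activate' commands run on the MDS, the rest on a client:

    lfs setstripe -c 1 -i 1 /mnt/lustre/bigfile            # place the file on OST0001 only
    dd if=/dev/zero of=/mnt/lustre/bigfile bs=1M count=2048
    lctl --device lustre-OST0001-osc-MDT0000 deactivate    # MDS: deactivate the OSP
    rm /mnt/lustre/bigfile
    lfs df /mnt/lustre                                     # OST0001 space is not freed yet
    lctl --device lustre-OST0001-osc-MDT0000 activate      # MDS: reactivate the OSP
    lfs df /mnt/lustre                                     # on master the space comes back; on b2_5 it does not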
| Comment by Jian Yu [ 15/Sep/15 ] |
|
Hi Mike, what Lustre version did you test on? Is it Lustre 2.5.x or the master branch? |
| Comment by Mikhail Pershin [ 17/Sep/15 ] |
|
Yu Jian, you are right: I used the master branch instead of b2_5, my mistake. I am repeating the local tests with 2.5 now. |
| Comment by Mikhail Pershin [ 18/Sep/15 ] |
|
I can reproduce this with 2.5. Considering that master is working fine, I am going to find the related changes in it and port them to 2.5. |
| Comment by Gerrit Updater [ 23/Sep/15 ] |
|
Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/16612 |
| Comment by Mikhail Pershin [ 23/Sep/15 ] |
|
It is interesting that there are no obvious changes between master and b2_5 related to this behavior. Meanwhile, I've made a simple fix for this issue; it works for me. Please check it. |
| Comment by James A Simmons [ 07/Oct/15 ] |
|
We finished testing your patch, and it appears to have resolved our issues. |
| Comment by Andreas Dilger [ 08/Oct/15 ] |
|
Mike, is this needed for 2.7.x or only 2.5.x? It would be great to link this to a specific patch/bug that fixed the problem for master if possible. |
| Comment by Mikhail Pershin [ 26/Oct/15 ] |
|
Andreas, I see the same problem in 2.7, but it works somehow; I suppose the DNE changes fixed it indirectly by adding more synchronization mechanisms in OSP. Meanwhile, I'd add this patch to 2.7 as a direct fix for this particular problem. |
| Comment by Gerrit Updater [ 26/Oct/15 ] |
|
Mike Pershin (mike.pershin@intel.com) uploaded a new patch: http://review.whamcloud.com/16937 |
| Comment by Gerrit Updater [ 11/Nov/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16937/ |
| Comment by Joseph Gmitter (Inactive) [ 11/Nov/15 ] |
|
Landed for 2.8 |