[LU-8916] ods-zfs doesn't manage ZAP sizes correctly Created: 07/Dec/16 Updated: 23/Jan/18 Resolved: 22/Jan/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Giuseppe Di Natale (Inactive) | Assignee: | Alex Zhuravlev |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | llnl, patch | ||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
I mounted a ZFS backed lustre target thru the ZPL and it appears that ZAP sizes are not being adjusted as child objects are added or removed. Examples below. Looking at update_log_dir, performing a ls -l yields the following: drw-r--r--. 2 root root 2 Dec 31 1969 update_log_dir But, that's wrong. We should be seeing a size of 3 as an ls shows us there are 3 children. -bash-4.1# ls -a update_log_dir . .. [0x200000400:0x1:0x0] Looking at oi.10, ls -l displays a size of 0: drwxr-xr-x. 0 root root 0 Dec 31 1969 oi.10 But, the size we should be seeing is 12 as per the ls below. -bash-4.1# ls -a oi.10 . 0x20000000a:0x0:0x0 0xa:0x3:0x0 0xa:0x5:0x0 0xa:0x7:0x0 0xa:0x9:0x0 .. 0xa:0x0:0x0 0xa:0x4:0x0 0xa:0x6:0x0 0xa:0x8:0x0 0xa:0xa:0x0 In the ZPL, the size of a directory should be the number of objects contained within it. The osd-zfs is clearly breaking that. As a practical use case, in |
| Comments |
| Comment by Giuseppe Di Natale (Inactive) [ 07/Dec/16 ] |
|
I'll be submitting a preliminary patch shortly. I also have a few questions I need help with. |
| Comment by Gerrit Updater [ 07/Dec/16 ] |
|
Giuseppe Di Natale (dinatale2@llnl.gov) uploaded a new patch: https://review.whamcloud.com/24207 |
| Comment by Olaf Faaland [ 07/Dec/16 ] |
|
Note that the patch does not fix all cases. Joe will submit questions as well. |
| Comment by Peter Jones [ 07/Dec/16 ] |
|
Nathaniel Could you please assist with this one? Thanks Peter |
| Comment by Andreas Dilger [ 07/Dec/16 ] |
|
Note that the OIs are ZAPs, but are not strictly real directories because entries there do not increase the nlink count on the files/dirs that they are referencing. I definitely agree that being able to view and possibly remove entries from the OI but if that also decrements the dnode link count then this could cause potential issues if unlinking FID files from the OI ZAP via ZPL mountpoint. |
| Comment by Giuseppe Di Natale (Inactive) [ 07/Dec/16 ] |
|
I've not been looking at the link count too closely yet. I was going to look at that bit later. I'm strictly talking about the size of a directory as reported by ZPL. It is quite possible that I'm missing part of the picture and I should start paying attention to the link count. |
| Comment by Andreas Dilger [ 08/Dec/16 ] |
|
The OI files explicitly do NOT track the link count themselves (for subdirectories) or on the referenced inodes (for regular files) in order to not upset the visible nlink counter on the inode under normal usage. |
| Comment by Alex Zhuravlev [ 11/Apr/17 ] |
|
I measured the patch against zfs performance branch: going to present the result to ZFS guys ... |
| Comment by Alex Zhuravlev [ 20/Apr/17 ] |
|
this is what Matt suggested: I've been working on the patch to ZFS.. |
| Comment by Alex Zhuravlev [ 21/Apr/17 ] |
|
also, can you please clarify the last statement in the description: but we do set size to 2 for all regular directories (though don't update it): so you should be possible to remove with ZPL? but check for emptiness won't work correctly and I'm going to fix this in ZPL as Matt suggested. |
| Comment by Peter Jones [ 21/Aug/17 ] |
|
AFAIK Alex is the one working on thos |
| Comment by Alex Zhuravlev [ 04/Sep/17 ] |
|
a short update.. so with the uptodate ZFS (0.7) it's possible to remove such a directory. actually it's possible to remove non-empty directory as well (like CONFIGS/), which I guess is not very nice. as mentioned before, the suggested solution is to let ZFS check a counter in ZAP structure. |
| Comment by Olaf Faaland [ 18/Sep/17 ] |
|
Olaf to correspond with Alex to understand options better and decide what further work is necessary. |
| Comment by Olaf Faaland [ 16/Nov/17 ] |
|
Alex,
If the ZAP's internal counter is only consulted when dzp->z_size != 2 then the cost is only paid when the rmdir would normally fail anyway, right? Or are you thinking of always checking both dzp->z_size and checking the ZAP's internal counter? Can you expand upon option (b) which I take it you prefer? It sounds like you have reason to think SAs will become cheaper to update and access. Is there in-progress work that will make that so? Thanks. |
| Comment by Alex Zhuravlev [ 27/Nov/17 ] |
|
sorry for the late reply. |
| Comment by Peter Jones [ 02/Jan/18 ] |
|
I believe that bzzz has pushed a proposed change to ZoL - https://github.com/zfsonlinux/zfs/pull/6987
|
| Comment by Olaf Faaland [ 22/Jan/18 ] |
|
ZoL PR 6987 was replaced with which was landed to ZoL master and queued for inclusion in zfs-0.7.6 when it is tagged. |
| Comment by Peter Jones [ 22/Jan/18 ] |
|
Fixed in ZFS |