Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Labels: Orion
    • Bugzilla ID: 23099

    Description

      osd-zfs lacks FIEMAP support. This was originally discussed in Bugzilla 23099. It is not a blocker for the DMU milestone; this task is mostly an improvement.

      sanity.sh test_130* verifies that FIEMAP (file extent map) works properly. FIEMAP allows clients to determine the on-disk block allocation layout for a particular file.
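
      For reference, this is roughly how a tool such as filefrag asks the kernel for the extent map; a minimal userspace sketch using the standard FS_IOC_FIEMAP ioctl (error handling trimmed, 32-extent buffer chosen arbitrarily):

      #include <stdio.h>
      #include <stdlib.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <sys/ioctl.h>
      #include <linux/fs.h>
      #include <linux/fiemap.h>

      int main(int argc, char **argv)
      {
              unsigned int count = 32;        /* extents fetched per call */
              struct fiemap *fm = calloc(1, sizeof(*fm) +
                                         count * sizeof(struct fiemap_extent));
              int fd;

              if (argc < 2 || fm == NULL || (fd = open(argv[1], O_RDONLY)) < 0)
                      return 1;

              fm->fm_start = 0;
              fm->fm_length = FIEMAP_MAX_OFFSET;      /* map the whole file */
              fm->fm_flags = FIEMAP_FLAG_SYNC;        /* flush dirty data first */
              fm->fm_extent_count = count;

              if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
                      perror("FS_IOC_FIEMAP");
                      return 1;
              }

              for (unsigned int i = 0; i < fm->fm_mapped_extents; i++) {
                      struct fiemap_extent *fe = &fm->fm_extents[i];

                      printf("logical %llu physical %llu length %llu flags %#x\n",
                             (unsigned long long)fe->fe_logical,
                             (unsigned long long)fe->fe_physical,
                             (unsigned long long)fe->fe_length, fe->fe_flags);
              }
              free(fm);
              close(fd);
              return 0;
      }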

      In 1.x and 2.x, FIEMAP is supported for ldiskfs filesystems.

      Once the "fiemap" request is passed through to the OSD it should be trivial to call the ldiskfs ->fiemap() method to fill in the data structure and return it to the caller. For ZFS this will need some code (possibly a new DMU interface?) to walk the file's data blocks and return the block pointer(s?) for each block.

      Open questions include:

      • Which block pointer should be returned in the case of ditto blocks? It is possible to return multiple overlapping extents (one for each DVA), but that may be confusing to some users.
      • While FIEMAP has space for a "device" for each extent, how will we map the different ZFS VDEV devices and Lustre OST devices into the single 32-bit device field?
        • We could use a 16-bit "major:minor" split, with the OST index as "major" and the VDEV as "minor" (see the sketch after this list), but I don't think there is a simple index for the VDEVs.
        • We could use the low 16 bits of the VDEV UUID (assuming it is largely unique) so that users can identify it fairly easily from "zfs" output if needed.
        • We could try to map the VDEV to the underlying Linux block device major/minor, though that is a major layering violation.
      • Should/can the extents be returned to the user in some "device" (VDEV) order so that it is clearer whether the extents are contiguous on disk, or will we get $((filesize * ditto / 128k)) extents returned to the client, possibly millions for large (128GB) files?
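
      As an illustration of the "major:minor" option above (the packing helpers and the 16/16 split are assumptions for discussion, not an agreed-on format):

      #include <stdint.h>
      #include <stdio.h>

      /*
       * Hypothetical packing of the 32-bit per-extent device field: OST index
       * in the high 16 bits, a small VDEV identifier in the low 16 bits.
       */
      static inline uint32_t fiemap_pack_device(uint16_t ost_index, uint16_t vdev_id)
      {
              return ((uint32_t)ost_index << 16) | vdev_id;
      }

      static inline uint16_t fiemap_device_ost(uint32_t dev)  { return dev >> 16; }
      static inline uint16_t fiemap_device_vdev(uint32_t dev) { return dev & 0xffff; }

      int main(void)
      {
              uint32_t dev = fiemap_pack_device(3, 1);        /* OST index 3, vdev 1 */

              printf("ost %u vdev %u\n", fiemap_device_ost(dev),
                     fiemap_device_vdev(dev));
              return 0;
      }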

      Even for local ZFS filesystem mounts, FIEMAP (via filefrag) output would provide useful insight into the on-disk allocation of files and would be needed to improve the ZFS allocation policies.

          Activity

            [LU-1941] ZFS FIEMAP support

            There is some work restarting in the upstream kernel to add compressed file support to FIEMAP:
            https://marc.info/?l=linux-doc&m=170992090817490&w=2

            It isn't quite ready, but once the fields and flags are committed upstream then we might be able to backport this to Lustre on older kernels...

            adilger Andreas Dilger added a comment

            The ZFS issue for FIEMAP is tracked at https://github.com/zfsonlinux/zfs/issues/264

            adilger Andreas Dilger added a comment
            adilger Andreas Dilger added a comment - edited

            Upstream patch prototype for FIEMAP_FLAG_DATA_COMPRESSED/fe_phys_length and discussion on how the patches should be fixed for upstream kernel acceptance: https://lore.kernel.org/linux-fsdevel/cover.1406739708.git.dsterba@suse.cz/

            The patch series was discussed and some improvements were requested, but it was never updated after the last time it was pushed.
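
            For illustration only, this is how a compressed extent could be reported if the proposed fields were accepted; fe_phys_length and the compressed-data flag come from the unmerged patch series above and are not in mainline linux/fiemap.h, so they are defined locally here with placeholder values:

            #include <stdint.h>
            #include <stdio.h>

            /*
             * Local stand-in definitions: fe_phys_length and the COMPRESSED flag
             * are NOT in mainline linux/fiemap.h; the flag value is a placeholder.
             */
            #define FIEMAP_EXTENT_DATA_COMPRESSED 0x00000040

            struct fiemap_extent_compressed {
                    uint64_t fe_logical;     /* offset in the file */
                    uint64_t fe_physical;    /* offset on disk */
                    uint64_t fe_length;      /* logical (uncompressed) length */
                    uint64_t fe_phys_length; /* allocated (compressed) length */
                    uint32_t fe_flags;
            };

            int main(void)
            {
                    /* e.g. a 128K ZFS record that compressed down to 32K on disk */
                    struct fiemap_extent_compressed ext = {
                            .fe_logical     = 0,
                            .fe_physical    = 0x40000000ULL,
                            .fe_length      = 128 * 1024,
                            .fe_phys_length = 32 * 1024,
                            .fe_flags       = FIEMAP_EXTENT_DATA_COMPRESSED,
                    };

                    printf("%llu logical bytes stored in %llu bytes on disk\n",
                           (unsigned long long)ext.fe_length,
                           (unsigned long long)ext.fe_phys_length);
                    return 0;
            }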

            Selected comments from Ricardo:

            I don't think it's really possible to retrieve the list of "extents" sorted by device order, at least not with the current on-disk format and not if you care about performance in any way.

            Currently, to achieve that you'd need to do a global sort of all the DVAs/block pointers in the entire file, which for large files could require a huge amount of I/O, memory resources and/or time.

            There is still a problem, though: we're only thinking about the top-level vdevs.

            If you have a RAID-Z vdev with 10 disks, then a single block can be split across the 10 disks.

            So the way we're thinking, for each block you'll only get 1 extent, where the offset is the "logical" offset of the RAID-Z vdev. But in this way you won't get the actual per-disk offsets.

            In theory you could return N FIEMAP extents per DMU block (where N is the number of disks you have in the RAID-Z vdev), but this won't be as simple as looking at the DVAs (you'd need to do some calculations), and I'd suspect the output would get a bit too verbose...

            So maybe for now I'd suggest to only return 1 extent per block, with the logical offset, because if the logical offsets are contiguous, then the per-disk offsets will also be contiguous.

            Another question is - does filefrag understand that an allocated extent size may not correspond to a logical extent size?

            Because I was thinking that we need to return the actual "allocated" on-disk size, not the logical block size, otherwise we won't know whether the blocks are actually allocated contiguously or if they have holes between them (e.g. a RAID-Zed 128K block actually has an allocated size of 128K+parity, so it would seem that there are holes between RAID-Z blocks).

            Another problem is that if we'd report the logical block size instead of the allocated size, it'd get confusing if you have compression (it would look like some extents would be overlapping...).

            But even then, I'm not sure if we can only get a per-block allocated size or if it's possible to get a per-DVA allocated size...

            adilger Andreas Dilger added a comment
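
            For concreteness, the RAID-Z "128K+parity" overhead mentioned above can be estimated with a userspace sketch of the same formula ZFS uses in vdev_raidz_asize(); the 5-disk RAID-Z1 layout with 4K sectors is just an example configuration:

            #include <stdint.h>
            #include <stdio.h>

            /*
             * Userspace sketch of the allocated-size formula RAID-Z uses (see
             * vdev_raidz_asize() in the ZFS source): data sectors plus nparity
             * parity sectors per row of (ndisks - nparity) data sectors, rounded
             * up to a multiple of (nparity + 1) sectors.
             */
            static uint64_t raidz_asize(uint64_t psize, unsigned int ndisks,
                                        unsigned int nparity, unsigned int ashift)
            {
                    uint64_t sectors = ((psize - 1) >> ashift) + 1;
                    uint64_t asize = sectors + nparity *
                            ((sectors + ndisks - nparity - 1) / (ndisks - nparity));
                    uint64_t round = nparity + 1;

                    asize = ((asize + round - 1) / round) * round;
                    return asize << ashift;
            }

            int main(void)
            {
                    /* 128K logical block, 5-disk RAID-Z1, 4K sectors: 160K on disk */
                    printf("allocated size: %llu bytes\n",
                           (unsigned long long)raidz_asize(128 * 1024, 5, 1, 12));
                    return 0;
            }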

            Patch to disable filefrag tests on master: http://review.whamcloud.com/3998

            adilger Andreas Dilger added a comment

            People

              Assignee: wc-triage WC Triage
              Reporter: tappro Mikhail Pershin
              Votes: 1
              Watchers: 6
