[LU-2352] Tests must use DEBUGFS or ZDB as appropriate Created: 06/Oct/11  Updated: 20/Mar/13  Resolved: 20/Mar/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Task Priority: Minor
Reporter: Brian Behlendorf Assignee: Li Wei (Inactive)
Resolution: Duplicate Votes: 0
Labels: zfs

Story Points: 2
Epic: server
Project: Orion
Rank (Obsolete): 2727

 Description   

Numerous tests depend on utilities such as debugfs and dumpe2fs and assume the fs type is ldiskfs. For the moment, all of these tests are skipped for other fs types. However, they all need to be updated (where possible) to run using the correct utility for each fs type. For zfs filesystems, most things can be accomplished using zdb.
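One way a test helper could select the utility per fs type is sketched below. The function name backend_stat_cmd and its argument order are hypothetical and are not part of the existing test framework. The sketch only prints the command it would run. It assumes that the zfs case is given an object number, since zdb does not address objects by path.

```shell
# Hypothetical helper (not an existing test-framework function).
# Prints a read-only inspection command for one object on a target.
#   $1  fs type: ldiskfs or zfs
#   $2  block device (ldiskfs) or pool/dataset name (zfs)
#   $3  path inside the file system (ldiskfs) or object number (zfs)
backend_stat_cmd() {
    case $1 in
    ldiskfs)
        # debugfs -c opens the device in catastrophic (read-only) mode;
        # -R executes a single request and exits.
        printf 'debugfs -c -R "stat %s" %s\n' "$3" "$2"
        ;;
    zfs)
        # Reject anything that is not a plain decimal object number.
        case $3 in
        ''|*[!0-9]*)
            echo "zdb needs an object number, got: $3" >&2
            return 1
            ;;
        esac
        # zdb -dddd dumps detailed information for the given object.
        printf 'zdb -dddd %s %s\n' "$2" "$3"
        ;;
    *)
        echo "unsupported fs type: $1" >&2
        return 1
        ;;
    esac
}
```

A test could then run the printed command, or skip itself when the helper returns non-zero.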



 Comments   
Comment by Brian Behlendorf [ 06/Oct/11 ]

Some of this work may depend on getting ORI-160 finished so objects can be referenced by name.

Comment by Li Wei (Inactive) [ 28/Feb/12 ]

Some tests use debugfs commands such as "unlink", "rm", and "write". I could not find a way to make such modifications with zdb. Also, zdb takes only object numbers, not path names. So I have to change the tests to mount the back ends where possible.
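The distinction between read-only inspection and modification could be made explicit in a helper, roughly as follows. This is only a sketch: backend_unlink_cmd and the return code 2 are hypothetical, and the zfs branch simply reports that the caller must fall back to mounting the back end.

```shell
# Hypothetical helper (not an existing test-framework function).
# Prints a command that unlinks a path directly on the back end,
# or fails when the fs type has no such offline tool.
#   $1  fs type: ldiskfs or zfs
#   $2  block device (ldiskfs) or pool/dataset name (zfs)
#   $3  path to unlink
backend_unlink_cmd() {
    case $1 in
    ldiskfs)
        # debugfs -w opens the device read-write; the "unlink" request
        # removes the directory entry named by the path.
        printf 'debugfs -w -R "unlink %s" %s\n' "$3" "$2"
        ;;
    zfs)
        # No known zdb request modifies a dataset, so signal the
        # caller to mount the back end and unlink the file there.
        echo "no zdb equivalent for unlink; mount the back end" >&2
        return 2
        ;;
    *)
        echo "unsupported fs type: $1" >&2
        return 1
        ;;
    esac
}
```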

Comment by Li Wei (Inactive) [ 28/Feb/12 ]

I'd like to hear some advice on how to deal with conf-sanity 38. I tried mounting the target with the back end file system and unlinking "lov_objid", but it didn't work. Both the ldiskfs and ZFS OSDs insert a FID_SEQ_LOCAL_FILE object into the object index as well as the root directory. Unlinking the object from the root directory alone thus results in a) EEXIST when recreating the object in the ZFS case, or b) a nonexistent inode referenced by the object index in the ldiskfs case, which causes an error too. An immediate solution is to remove the object from the object index as well as the root directory. That should be easy to do in the ZFS case, but might require some work in the ldiskfs case. Another solution, at the risk of "adapting Lustre to tests", is to stop inserting local objects into the object index. Stepping back a little, I wonder how lov_objid becomes lost or zeroed in the real world, and whether the test is of so little use that it can be ignored.

Comment by Andreas Dilger [ 29/Feb/12 ]

Having reviewed ORI-162, my opinion is that the FID_SEQ_LOCAL_FILE objects should not be in the OI, but rather be handled via by-name lookup only. That is no more costly than by-FID lookup, since both require a single ZAP lookup, and it may be considerably faster for these frequently-used objects, since the root ZAP is small and the OI will be huge.

This also avoids the problem seen here where any "by hand" modification of the underlying filesystem (which was the whole reason to make the ZFS OSD ZPL compatible) would cause the filesystem to break.

As for reasons why this may be needed in real life:

  • if an OST is lost/corrupted and a new OST is formatted with the same index it is necessary to reset the lov_objids file
  • if the lov_objids file is itself corrupted, being able to simply delete it and have Lustre recreate it from the OST LAST_ID data is very convenient (the only loss being a few orphaned zero-length objects on the OSTs)
  • similarly, truncating or deleting the last_rcvd file to handle cases where it is corrupted (either by storage or software bug like duplicate export UUIDs), and then having Lustre recreate it is very convenient
  • being able to modify the OST UUID in last_rcvd is also important on occasion
  • sometimes corruption of LAST_ID needs manual repair

Ideally, these situations would all be handled automatically without need for user interaction, but that is not the case today. In most cases there could be automatic recovery (e.g. rebuild LAST_ID by scanning the O/{seq}/d* directories for the highest OID), but this also needs development effort, and isn't always possible to handle automatically (e.g. the OST doesn't know what is in the lov_objids file on the MDT, and getting this wrong can result in more filesystem corruption).
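The LAST_ID scan mentioned above could look roughly like the following sketch, run against a back end mounted as a regular file system. The function name highest_oid is hypothetical, and the sketch assumes that object files under O/{seq}/d* are named by their decimal OID; any other entries (such as LAST_ID itself) are skipped. It only reports the highest OID found and does not write LAST_ID.

```shell
# Hypothetical helper: print the highest decimal object ID found under
# one sequence directory of a mounted back end, e.g. <mnt>/O/0.
# Assumes object files are named by their decimal OID.
# The body runs in a subshell so its variables do not leak.
highest_oid() (
    max=0
    for f in "$1"/d*/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$f" ] || continue
        name=${f##*/}
        # Skip entries whose names are not purely decimal digits.
        case $name in
        ''|*[!0-9]*) continue ;;
        esac
        # test -gt compares the operands as decimal integers.
        if [ "$name" -gt "$max" ]; then
            max=$name
        fi
    done
    echo "$max"
)
```

An empty or missing directory yields 0, which a caller would need to treat as "nothing found" rather than as a valid LAST_ID.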

Comment by Mikhail Pershin [ 05/Apr/12 ]

Li Wei, this issue with inserting local files into the OI also exists on master, so it is not an Orion-specific issue. I see Fan Yong removed that in recent patches related to the lfsck work, though that is not enough. We also discussed this with Alex a day ago; the problem is wider than it seems. I tend to think we need a separate bug about how to store local files and their index. Refer to ORI-617.

Comment by Li Wei (Inactive) [ 20/Mar/13 ]

Since there's actually no ZFS equivalent of debugfs, I think this can be closed as a dup of LU-2353.
