[LU-13425] "run 'lfs migrate -m 1 -c 1 -H 3 dir1' to finish migration" is broken - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.14.0
Affects Version/s: Lustre 2.14.0, Lustre 2.12.5
Labels:
- dne3

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

While testing LU-13424 I hit an error during directory migration. Then I tried migrating the directory back to the original MDT, just to check what would happen:

tests# lfs migrate -m 1 /mnt/testfs/dir1
lfs migrate: /mnt/testfs/dir1/hosts migrate failed: Operation not supported (95)
tests# lfs migrate -m 0 /mnt/testfs/dir1
LustreError: 30963:0:(mdd_dir.c:4209:mdd_migrate()) testfs-MDD0000: 'dir1' migration was interrupted, run 'lfs migrate -m 1 -c 1 -H 3 dir1' to finish migration.
tests# lfs migrate -m1 -c 1 -H 3 /mnt/testfs/dir1
lfs migrate migrate: bad stripe hash type '3'
tests# lfs getdirstripe /mnt/testfs/dir1
lmv_stripe_count: 2 lmv_stripe_offset: 1 lmv_hash_type: crush,migrating
mdtidx           FID[seq:oid:ver]
     1           [0x240001b71:0xf:0x0]          
     0           [0x200001b72:0xe480:0x0]

So the MDS printed the "run 'lfs migrate ...'" error to the console, but that command doesn't actually work, because "lfs migrate" does not accept the numeric hash value "-H 3".

The simplest fix is to allow specifying the numeric hash type like "lfs migrate ... -H 3" in order to resume directory migration, as stated in the error message.
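As a rough illustration of that idea (not the code from the landed patch), an option parser could try the symbolic name first and fall back to a plain decimal number. The function name `parse_hash_type` and the name table are assumptions made for this sketch; only the pairing of "crush" with 3 is supported by the console output above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative name table: the array index is the numeric hash type.
 * Only "crush" == 3 is confirmed by the output in this ticket; the
 * other entries are assumptions for the sake of the example.
 */
static const char *const hash_names[] = {
	"none", "all_char", "fnv_1a_64", "crush",
};

/*
 * Hypothetical helper: return the numeric hash type for a symbolic
 * name or a decimal string, or -EINVAL if arg is neither.
 * Unknown numeric values are passed through unchecked, leaving the
 * decision about their validity to the server.
 */
static int parse_hash_type(const char *arg)
{
	char *end;
	long val;
	size_t i;

	assert(arg != NULL);

	/* Symbolic names take precedence. */
	for (i = 0; i < sizeof(hash_names) / sizeof(hash_names[0]); i++)
		if (strcmp(arg, hash_names[i]) == 0)
			return (int)i;

	/* Otherwise accept a non-negative decimal number. */
	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0' ||
	    val < 0 || val > INT_MAX)
		return -EINVAL;

	return (int)val;
}
```

With a helper like this, both "-H crush" and "-H 3" would resolve to the same value, so the command printed in the console message could be run as-is.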

I don't think that "lfs" or the client should even try to validate this hash type before passing it to the MDS, since the client may be old, and the directory is using a new hash that it doesn't know about. The MDS should reject invalid hash types from the client anyway (e.g. malicious user, or new client and old server).
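A minimal sketch of what such a server-side check might look like, assuming the server keeps an upper bound on the hash types it understands. The names `HASH_TYPE_MAX_KNOWN` and `server_check_hash_type`, and the choice to treat 0 as invalid, are hypothetical and not taken from the Lustre source.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: highest hash type this server version understands. */
#define HASH_TYPE_MAX_KNOWN 3u

/*
 * Hypothetical server-side check: the client forwards whatever value
 * the user typed, and the server rejects anything it does not know.
 * Treating 0 as invalid is an assumption of this sketch.
 */
static int server_check_hash_type(uint32_t hash_type)
{
	if (hash_type == 0 || hash_type > HASH_TYPE_MAX_KNOWN)
		return -EINVAL;
	return 0;
}
```

Keeping the check on the server means an old client can still request a hash type introduced after it was built, while a value that no server version knows is still refused.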

The MDS really shouldn't need the hash type or other arguments to be passed at all, because it already knows this information (it generated the message in the first place). It would be better, if possible, to just print "run 'lfs migrate <full_path>' to finish migration" (maybe using fid2path to generate the pathname?). Best would be to restart the migration automatically when this is hit (at least once, but not repeatedly if it is broken for some reason, as with LU-13424).

Issue Links

is related to

LU-13492 lfs migrate -m returns Operation not permitted (Open)
LU-13424 unable to migrate mirrored files (Resolved)

Activity

Peter Jones added a comment - 19/Apr/20 2:08 PM

Landed for 2.14

Gerrit Updater added a comment - 19/Apr/20 8:45 AM

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38182/
Subject: LU-13425 lfs: support numeric hash type by "lfs migrate -H"
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: bf952126b6adf54d164720dc10379478a62a1b2b

Gerrit Updater added a comment - 08/Apr/20 4:22 PM

Emoly Liu (emoly@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38182
Subject: LU-13425 lfs: support numeric hash type by "lfs migrate -H"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6ecbf3ae1373762516ccf7e1cdeb0b1d0fb6c3ca

Andreas Dilger added a comment - 08/Apr/20 8:45 AM

I'm thinking the client shouldn't even need to specify the arguments, since the MDS already knows the correct values for each directory. The client could specify any values at all (or none) and they would be ignored for directories that have a partial migration. It isn't clear whether we need to resume the recursive migration, or just finish off the partially-migrated directory.

Lai Siyao added a comment - 08/Apr/20 2:07 AM

Directory migration is done recursively, which means the same arguments are used to migrate sub-directories. Only the MDS can know whether an argument is correct. It can adjust the argument itself, but it can't notify the client to use the correct arguments when migrating sub-directories.
