[LU-10910] LBUG with "lfs migrate -c 1 <domfile>" Created: 12/Apr/18  Updated: 24/Sep/18  Resolved: 06/May/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Blocker
Reporter: Andreas Dilger Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10177 DoM: manual migration MDT-OST Resolved
is related to LU-11421 DoM: manual migration OST-MDT, MDT-MDT Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I created a test DoM file with:

# mkdir /mnt/testfs/dom
# lfs setstripe -E 64k -L mdt -E 64M -c 1 -E -1 -c -1 /mnt/testfs/dom/
# dd if=/dev/zero of=/mnt/testfs/dom/64M bs=1M count=64

This seemed to work properly (lfs getstripe returned all the right info). I then tried to see what would happen if I used "lfs migrate -c 1 /mnt/testfs/dom/64M" to migrate it to a regular file layout. This produced the following output on the console:

[1460451.829926] Lustre: lt-lfs: using old ioctl(LL_IOC_LOV_GETSTRIPE) on [0x200000402:0x4e27:0x0], use llapi_layout_get_by_path()
[1460451.843549] LustreError: 14997:0:(ldlm_resource.c:1688:ldlm_resource_dump()) --- Resource: [0x200000402:0x4e26:0x0].0x0 (ffff8800468a29c0) refcount = 4
[1460451.843611] LustreError: 14997:0:(ldlm_resource.c:1691:ldlm_resource_dump()) Granted locks (in reverse order):
[1460451.843663] LustreError: 14997:0:(ldlm_resource.c:1694:ldlm_resource_dump()) ### ### ns: testfs-MDT0000-mdc-ffff880007524000 lock: ffff88003e807a80/0xdc6f065983eec99d lrc: 1/0,0 mode: PR/PR res: [0x200000402:0x4e26:0x0].0x0 bits 0x1b/0x0 rrc: 5 type: IBT flags: 0x20000000000 nid: local remote: 0xdc6f065983eec9ab expref: -99 pid: 12539 timeout: 0 lvb_type: 3
[1460451.843720] Pid: 14997, comm: ptlrpcd_01_01
[1460451.843721] 
Call Trace:
[1460451.843757]  [<ffffffffc051e7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[1460451.843761]  [<ffffffffc051e7e6>] libcfs_debug_dumpstack+0x26/0x30 [libcfs]
[1460451.843778]  [<ffffffffc0bb0c2b>] mdc_get_lock_handle+0xcb/0xe0 [mdc]
[1460451.843782]  [<ffffffffc0bb1060>] mdc_req_attr_set+0x90/0x170 [mdc]
[1460451.843854]  [<ffffffffc06cfae0>] cl_req_attr_set+0x60/0x150 [obdclass]
[1460451.843874]  [<ffffffffc0b22283>] osc_build_rpc+0x483/0x1070 [osc]
[1460451.843885]  [<ffffffffc0b3c6f0>] osc_io_unplug0+0xb50/0x1920 [osc]
[1460451.843891]  [<ffffffffc0b177d3>] brw_queue_work+0x33/0xd0 [osc]
[1460451.843988]  [<ffffffffc08efa47>] work_interpreter+0x37/0xf0 [ptlrpc]
[1460451.844018]  [<ffffffffc08ec95e>] ptlrpc_check_set.part.22+0x47e/0x1da0 [ptlrpc]
[1460451.844058]  [<ffffffffc08ee2db>] ptlrpc_check_set+0x5b/0xe0 [ptlrpc]
[1460451.844090]  [<ffffffffc091b15b>] ptlrpcd_check+0x4ab/0x590 [ptlrpc]
[1460451.844119]  [<ffffffffc091b549>] ptlrpcd+0x309/0x550 [ptlrpc]
[1460451.844151]  [<ffffffff810b099f>] kthread+0xcf/0xe0
[1460451.844167] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) page@ffff88005fb9da00[2 ffff880115c481c0 3 2           (null)]

[1460451.844222] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) vvp-page@ffff88005fb9da50(0:0) vm@ffffea0001cbbc80 1fffff0008006c 3:1 0 34275811373 lru

[1460451.844276] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) lov-page@ffff88005fb9da90, comp index: 0, gen: 5

[1460451.844332] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) osc-page@ffff88005fb9dac8 0: 1< 0x845fed 257 0 + + > 2< 0 0 4096 0x7 0x109 |           (null) ffff880043b384d0 ffff880051564c80 > 3< 1 0 0 > 4< 0 0 8 4202496 - | - - - + > 5< - - - + | 0 - | 0 - ->
[1460451.844416] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) end page@ffff88005fb9da00
[1460451.844484] LustreError: 14997:0:(mdc_dev.c:1345:mdc_req_attr_set()) uncovered page!
[1460451.844535] LustreError: 14997:0:(mdc_dev.c:1346:mdc_req_attr_set()) LBUG
[1460451.844581] Pid: 14997, comm: ptlrpcd_01_01

For the short term, "lfs migrate" should just return -EOPNOTSUPP from the kernel if called on a file with an mdt component. That way, lfs_migrate (the script version) can fall back to "copy and rename" to do the migration in userspace, which will work at the expense of changing the inode number.
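
The fallback described above can be sketched roughly as follows. This is only an illustration of the "try in place, fall back to copy and rename on -EOPNOTSUPP" control flow, not the actual lfs_migrate script. The function name migrate_with_fallback and the migrate_in_place callable are hypothetical; the callable stands in for whatever performs the in-place migration and is assumed to raise OSError with errno set. The sketch does not set a Lustre layout on the temporary file (on a real filesystem it would presumably need to be created with the target layout, e.g. via "lfs setstripe", before copying), and it ignores ownership and concurrent writers.

```python
import errno
import os
import shutil
import tempfile


def migrate_with_fallback(path, migrate_in_place):
    """Try an in-place migration; on EOPNOTSUPP fall back to copy and rename.

    migrate_in_place is a hypothetical callable taking the file path and
    raising OSError on failure. Returns "in-place" or "copy-rename".
    """
    try:
        migrate_in_place(path)
        return "in-place"
    except OSError as exc:
        if exc.errno != errno.EOPNOTSUPP:
            raise  # any other failure is a real error, do not mask it

    # Fallback: copy into a temporary file in the same directory so the
    # final rename stays within one filesystem, then replace the original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".migrate.")
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # copies data, mode bits and timestamps
        os.replace(tmp_path, path)    # the file now has a new inode number
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)       # do not leave the temporary copy behind
        raise
    return "copy-rename"
```

Only EOPNOTSUPP triggers the fallback here, so other errors (for example EIO) still surface to the caller instead of being hidden by a silent copy.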



 Comments   
Comment by Gerrit Updater [ 18/Apr/18 ]

Mike Pershin (mike.pershin@intel.com) uploaded a new patch: https://review.whamcloud.com/32044
Subject: LU-10910 mdd: deny layout swap for DoM file
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c8b9fb3d6171b13e98cdedc215b6c3a1bd87a1e3

Comment by Gerrit Updater [ 06/May/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32044/
Subject: LU-10910 mdd: deny layout swap for DoM file
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 51c11d7cfaffea68cc527f3001af1d11b3967c15

Comment by Peter Jones [ 06/May/18 ]

Landed for 2.12
