[LU-15268] lfs mirror extend error propagation Created: 22/Nov/21  Updated: 20/Feb/22  Resolved: 31/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Critical
Reporter: Andreas Dilger Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Cloners
Related
is related to LU-15392 mdt_mfd_close() error propagation Open
is related to LU-15552 Interop: sanity-flr test 0d fails wit... Open
is related to LU-15572 Interop sanity-flr failing with "cann... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

If a mirror extend operation fails on the MDT side, lfs mirror extend generally reports that the file was busy, regardless of the actual cause of the failure. For example, if a file already has 16 mirrors:

lfs mirror extend -N -p ddn_ssd f0
lfs mirror mirror: cannot get UNLOCK lease, ext 4: Device or resource busy (16)
error: lfs mirror extend: f0: cannot merge layout: Device or resource busy

In the RPC handler, we see -ERANGE returned by the lod handlers:

00000004:00000001:3.0:1637071921.523456:0:24268:0:(lod_object.c:3388:lod_declare_layout_merge()) Process leaving (rc=18446744073709551582 : -34 : ffffffffffffffde)
00000004:00000001:3.0:1637071921.523500:0:24268:0:(mdt_open.c:2108:mdt_close_handle_layouts()) Process leaving via out_unlock2 (rc=18446744073709551582 : -34 : 0xffffffffffffffde)
00000004:00000002:3.0:1637071921.523805:0:24268:0:(mdt_open.c:2311:mdt_mfd_close()) lustre-MDT0000: cannot swap layout of [0x200000404:0x1:0x0]: rc = -34
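
For readers unfamiliar with the trace format: each "Process leaving" line prints the same rc three ways (unsigned 64-bit, signed, hex). The following is a minimal sketch, not Lustre code, showing that the unsigned value in the log is -34, i.e. -ERANGE on Linux. The helper name decode_rc is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Lustre): reinterpret the unsigned
 * 64-bit value printed in the debug log as a signed return code.
 * memcpy is used so the bit pattern is reinterpreted without relying
 * on implementation-defined integer conversion.
 */
static int64_t decode_rc(uint64_t raw)
{
	int64_t rc;

	memcpy(&rc, &raw, sizeof(rc));
	return rc;
}

/* True if the logged value corresponds to -ERANGE (errno 34 on Linux). */
static int is_erange(uint64_t raw)
{
	return decode_rc(raw) == -ERANGE;
}
```

Calling decode_rc(18446744073709551582ULL) or decode_rc(0xffffffffffffffdeULL) gives the same signed value that the log prints in its middle column.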

However, mdt_mfd_close() clobbers the rc from the layout operation, and ll_lease_close_intent() only checks whether OBD_MD_CLOSE_INTENT_EXECED is set, returning -EBUSY if it is not.

This is misleading to users and makes support analysis difficult. We need to have the real rc returned up to userspace here.
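
The masking described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the Lustre code and not the landed patch: struct close_reply, its intent_rc field, and both function names are hypothetical, and the flag is modelled as a plain boolean.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the close reply; not a real Lustre structure. */
struct close_reply {
	bool intent_execed; /* models OBD_MD_CLOSE_INTENT_EXECED being set */
	int  intent_rc;     /* assumed field carrying the server-side rc */
};

/* Behaviour described in this report: every failure becomes -EBUSY. */
static int lease_close_rc_masked(const struct close_reply *rep)
{
	return rep->intent_execed ? 0 : -EBUSY;
}

/*
 * Desired behaviour (sketch): return the real rc when the server
 * supplied one, and fall back to -EBUSY only when it did not.
 */
static int lease_close_rc_propagated(const struct close_reply *rep)
{
	if (rep->intent_execed)
		return 0;
	return rep->intent_rc != 0 ? rep->intent_rc : -EBUSY;
}
```

With a reply of { .intent_execed = false, .intent_rc = -ERANGE }, the masked variant yields -EBUSY while the propagating variant yields -ERANGE, which is the error userspace would need to see in the 16-mirror case.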



 Comments   
Comment by Gerrit Updater [ 22/Nov/21 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45636
Subject: LU-15268 mdt: reveal the real intent close error code
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 29422cb022674332ddb86438bae9efc031d662dd

Comment by Gerrit Updater [ 31/Jan/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45636/
Subject: LU-15268 mdt: reveal the real intent close error code
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f468093cb6cc099ca944489c0c2a9c94fb12e7c9

Comment by Peter Jones [ 31/Jan/22 ]

Landed for 2.15
