Details
Bug
Resolution: Fixed
Critical
Description
If a mirror extend operation fails on the MDT side, lfs mirror extend will generally report that the file was busy, regardless of the actual cause of the failure. For example, if a file already has 16 mirrors:
lfs mirror extend -N -p ddn_ssd f0
lfs mirror mirror: cannot get UNLOCK lease, ext 4: Device or resource busy (16)
error: lfs mirror extend: f0: cannot merge layout: Device or resource busy
In the RPC handler, we see -ERANGE returned by the lod handlers:
00000004:00000001:3.0:1637071921.523456:0:24268:0:(lod_object.c:3388:lod_declare_layout_merge()) Process leaving (rc=18446744073709551582 : -34 : ffffffffffffffde)
00000004:00000001:3.0:1637071921.523500:0:24268:0:(mdt_open.c:2108:mdt_close_handle_layouts()) Process leaving via out_unlock2 (rc=18446744073709551582 : -34 : 0xffffffffffffffde)
00000004:00000002:3.0:1637071921.523805:0:24268:0:(mdt_open.c:2311:mdt_mfd_close()) lustre-MDT0000: cannot swap layout of [0x200000404:0x1:0x0]: rc = -34
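For readers unfamiliar with the three-part rc format in the trace above, the unsigned decimal, the signed decimal, and the hex value are the same 64-bit quantity. A minimal sketch of the conversion follows; the helper name debug_rc_to_signed is hypothetical and is not part of Lustre.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret the unsigned 64-bit value printed in the debug log as a
 * signed return code. memcpy copies the bit pattern unchanged, so the
 * result is the two's complement reading of the same 64 bits.
 * Hypothetical helper for illustration only.
 */
static int64_t debug_rc_to_signed(uint64_t raw)
{
	int64_t rc;

	memcpy(&rc, &raw, sizeof(rc));
	return rc;
}
```

Passing 18446744073709551582 (0xffffffffffffffde) gives -34, which is -ERANGE on Linux, matching the "rc = -34" in the final log line.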
However, mdt_mfd_close() clobbers the rc from the layout operation, and ll_lease_close_intent() only checks whether OBD_MD_CLOSE_INTENT_EXECED is set, returning -EBUSY if it is not.
This is misleading to users and makes support analysis difficult. We need to have the real rc returned up to userspace here.
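The pattern described above can be modelled in a few lines. This is only a sketch of the idea, not the actual Lustre code or the fix that was landed: struct close_reply, its intent_rc field, and all three functions are hypothetical stand-ins, and intent_execed merely models the OBD_MD_CLOSE_INTENT_EXECED flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the close reply; not a Lustre structure. */
struct close_reply {
	bool intent_execed;	/* models OBD_MD_CLOSE_INTENT_EXECED */
	int  intent_rc;		/* assumed extra field carrying the real rc */
};

/* Server side: record the outcome of the layout operation in the reply. */
static void server_close(struct close_reply *rep, int layout_rc)
{
	rep->intent_execed = (layout_rc == 0);
	rep->intent_rc = layout_rc;
}

/* Behaviour described in the ticket: only the flag is consulted. */
static int client_close_current(const struct close_reply *rep)
{
	return rep->intent_execed ? 0 : -EBUSY;
}

/* One possible alternative: prefer the real rc when one is available. */
static int client_close_with_rc(const struct close_reply *rep)
{
	if (rep->intent_execed)
		return 0;
	return rep->intent_rc != 0 ? rep->intent_rc : -EBUSY;
}
```

With a layout failure of -ERANGE, client_close_current() reports -EBUSY while client_close_with_rc() reports -ERANGE, which is the difference users would see in the lfs error message.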