Lustre / LU-15268

lfs mirror extend error propagation


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.15.0

      If a mirror extend operation fails on the MDT side, lfs mirror extend will generally report that the file was busy, regardless of the actual cause of the failure. For example, if a file already has 16 mirrors:

      lfs mirror extend -N -p ddn_ssd f0
      lfs mirror mirror: cannot get UNLOCK lease, ext 4: Device or resource busy (16)
      error: lfs mirror extend: f0: cannot merge layout: Device or resource busy
      

      In the RPC handler, we see -ERANGE (-34) returned by the lod handlers:

      00000004:00000001:3.0:1637071921.523456:0:24268:0:(lod_object.c:3388:lod_declare_layout_merge()) Process leaving (rc=18446744073709551582 : -34 : ffffffffffffffde)
      00000004:00000001:3.0:1637071921.523500:0:24268:0:(mdt_open.c:2108:mdt_close_handle_layouts()) Process leaving via out_unlock2 (rc=18446744073709551582 : -34 : 0xffffffffffffffde)
      00000004:00000002:3.0:1637071921.523805:0:24268:0:(mdt_open.c:2311:mdt_mfd_close()) lustre-MDT0000: cannot swap layout of [0x200000404:0x1:0x0]: rc = -34
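
The three values in each "Process leaving" line above are the same return code shown as unsigned 64-bit, signed, and hex. As a quick sanity check, the following sketch reinterprets the unsigned value as a signed one and looks up the errno name. The helper name decode_rc is hypothetical and is not part of Lustre; the errno lookup assumes a Linux host.

```python
import errno

def decode_rc(unsigned_rc: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit return code."""
    if unsigned_rc >= 1 << 63:
        return unsigned_rc - (1 << 64)
    return unsigned_rc

# Value copied from the lod_declare_layout_merge() log line.
rc = decode_rc(18446744073709551582)

print(rc)                               # signed form, matches the "-34" in the log
print(hex(rc & 0xFFFFFFFFFFFFFFFF))     # hex form, matches "ffffffffffffffde"
print(errno.errorcode.get(-rc))         # symbolic errno name on this platform
```

On Linux, errno 34 is ERANGE and errno 16 is EBUSY, so the server-side failure and the error shown by lfs are clearly different codes.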
      

      However, mdt_mfd_close() clobbers the rc from the layout operation, and ll_lease_close_intent() only checks whether OBD_MD_CLOSE_INTENT_EXECED is set, returning -EBUSY if it is not.

      This is misleading to users and makes support analysis difficult. We need to have the real rc returned up to userspace here.
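
To illustrate why this is misleading, here is a toy model of the behaviour described above. It is not Lustre code: the function names close_current and close_propagating are hypothetical, the "intent executed" flag is reduced to a boolean, and the second function only shows the requested outcome, not how a fix would actually be implemented.

```python
import errno

def close_current(layout_rc: int) -> int:
    """Model of the reported behaviour (assumption, simplified).

    The server drops the layout rc, so the client only learns whether the
    close intent was executed and maps any failure to -EBUSY.
    """
    intent_execed = (layout_rc == 0)  # stands in for OBD_MD_CLOSE_INTENT_EXECED
    return 0 if intent_execed else -errno.EBUSY

def close_propagating(layout_rc: int) -> int:
    """Model of the requested behaviour: the original rc reaches userspace."""
    return layout_rc

# Two different server-side failures, compared under both models.
for cause, rc in (("mirror limit reached", -errno.ERANGE),
                  ("file really busy", -errno.EBUSY)):
    print(cause, close_current(rc), close_propagating(rc))
```

In the first model both failures produce the same value, so a user cannot tell a mirror-count limit from a genuinely busy file; in the second they stay distinct.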

            bobijam Zhenyu Xu
            adilger Andreas Dilger