[LU-5957] Broken layout after swapping layouts with unstriped files

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.4.1, Lustre 2.7.0, Lustre 2.5.3
    • 3
    • 16639

    Description

      The following sequence of operations eventually fails with EBADF (where fstripe is a striped file and the fnostripeX files are unstriped):

      1) swap_layouts fstripe fnostripe1
      2) I/O into fstripe
      3) swap_layouts fstripe fnostripe2
      4) I/O into fstripe # Fails with EBADF

      Please find a reproducer attached. This seems to affect current master as well as older versions that provide layout swap.

      The file remains unavailable, even after a client umount/remount or when accessed from another client.

      It seems that lov_conf_set() doesn't call lov_layout_change() in the second case, leading to lov_io_init_empty() being called eventually on the write...

      Attachments

        1. ebadf.sh
          0.4 kB
          Henri Doreau

        Activity

          hdoreau Henri Doreau (Inactive) added a comment - Here is a 2.5 backport: http://review.whamcloud.com/14255

          gerrit Gerrit Updater added a comment -

          Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/14255
          Subject: LU-5957 mdt: Update MDT flags after layout swap
          Project: fs/lustre-release
          Branch: b2_5
          Current Patch Set: 1
          Commit: 6032c1e646942169d73ea3763c0399c5d663e498

          hdoreau Henri Doreau (Inactive) added a comment -

          Could this be considered for 2.5?

          pjones Peter Jones added a comment -

          Landed for 2.7

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12877/
          Subject: LU-5957 mdt: Update MDT flags after layout swap
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 1d3ee3ed2b56d73f392e1b2a033f7d274f5202d9

          hdoreau Henri Doreau (Inactive) added a comment -

          Bruno, as you might see in the patch above, it's an MDT issue: an I/O on a non-striped file invokes mdt_create_data, which creates the LOV and sets MOF_LOV_CREATED on the object. If you "destripe" this object again and redo an I/O, this flag will prevent the LOV from being re-created as expected.
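
The mechanism described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: `ToyFile`, `io`, `swap_layouts`, `run_sequence` and the `update_flags` switch are hypothetical names, the value given to `MOF_LOV_CREATED` is arbitrary, and the model assumes the flag is set only when an I/O creates the layout on demand. It replays the four steps from the description, once with the flag left stale after a swap and once with the flag cleared when a swap removes the layout.

```python
import errno

MOF_LOV_CREATED = 0x1  # arbitrary value; toy stand-in for the MDT flag named above


class ToyFile:
    """Hypothetical stand-in for a file as the MDT sees it: a layout and a flag word."""

    def __init__(self, striped):
        self.layout = "lov" if striped else None
        self.flags = 0  # assumption: the flag is only set by on-demand creation


def io(f):
    """Model an I/O: create the layout on demand unless the flag claims it exists."""
    if f.layout is None:
        if f.flags & MOF_LOV_CREATED:
            # Stale flag: creation is skipped and the file stays without a layout.
            raise OSError(errno.EBADF, "I/O on a file with no layout")
        f.layout = "lov"  # stands in for what mdt_create_data does
        f.flags |= MOF_LOV_CREATED


def swap_layouts(a, b, update_flags):
    """Exchange layouts; optionally clear the flag on a file left without one."""
    a.layout, b.layout = b.layout, a.layout
    if update_flags:
        for f in (a, b):
            if f.layout is None:
                f.flags &= ~MOF_LOV_CREATED


def run_sequence(update_flags):
    """Replay steps 1 to 4 from the description; return 'ok' or the errno name."""
    fstripe = ToyFile(striped=True)
    fnostripe1 = ToyFile(striped=False)
    fnostripe2 = ToyFile(striped=False)
    try:
        swap_layouts(fstripe, fnostripe1, update_flags)  # step 1
        io(fstripe)                                      # step 2
        swap_layouts(fstripe, fnostripe2, update_flags)  # step 3
        io(fstripe)                                      # step 4
    except OSError as e:
        return errno.errorcode[e.errno]
    return "ok"


if __name__ == "__main__":
    print("flag left stale:", run_sequence(update_flags=False))
    print("flag cleared on swap:", run_sequence(update_flags=True))
```

In this model the second I/O fails only when the flag survives the second swap, which matches the reported symptom at step 4. How the actual patch updates the flag is not shown here.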

          gerrit Gerrit Updater added a comment -

          Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/12877
          Subject: LU-5957 mdt: Update MDT flags after layout swap
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: dcebecc7d669e71c45597d07715faed4e49885e9

          bfaccini Bruno Faccini (Inactive) added a comment -

          Ah OK, so I wonder if this could be in mdd_swap_layouts(), after the swap has been reordered to start with the file that has no layout?

          hdoreau Henri Doreau (Inactive) added a comment -

          I don't think so. According to our analysis (which just ended), this is a server-side issue. It comes from an MDT flag not being cleared after a layout swap that removes the LOV. We will provide a patch.

          bfaccini Bruno Faccini (Inactive) added a comment -

          I wonder if this could be related to LU-2766, with a new symptom?

          People

            bfaccini Bruno Faccini (Inactive)
            hdoreau Henri Doreau (Inactive)
            Votes: 0
            Watchers: 5
