[LU-6071] LBUG during mkdir -i 1 from 2.6.0 client to 2.5.3 server

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.6.0, Lustre 2.5.3
    • Severity: 3
    • Rank: 16896

    Description

       kernel:LustreError: 3652:0:(mdt_handler.c:2706:mdt_object_lock0()) ASSERTION( !(ibits & (MDS_INODELOCK_UPDATE | MDS_INODELOCK_PERM)) ) failed: lustre-MDT0001: wrong bit 0x2 for remote obj [0x200000007:0x1:0x0]
       kernel:LustreError: 3652:0:(mdt_handler.c:2706:mdt_object_lock0()) LBUG
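
The asserted condition in the log above can be read as a bitmask check on the requested inode lock bits. The following is a minimal sketch of that check, not the code from mdt_handler.c: the helper names are hypothetical, and the bit values are assumptions (only the offending bit 0x2 is confirmed by the message above).

```c
#include <stdint.h>

/* Inode lock bits. Assumed values; the log above only confirms that the
 * offending bit is 0x2. */
#define MDS_INODELOCK_LOOKUP 0x01ULL
#define MDS_INODELOCK_UPDATE 0x02ULL
#define MDS_INODELOCK_PERM   0x10ULL

/* Hypothetical helper mirroring the asserted condition: for a remote
 * object, neither UPDATE nor PERM may be requested.
 * Returns 1 if the request is acceptable, 0 otherwise. */
int remote_lock_bits_ok(uint64_t ibits)
{
	return !(ibits & (MDS_INODELOCK_UPDATE | MDS_INODELOCK_PERM));
}

/* Hypothetical helper: returns only the bits that violate the condition. */
uint64_t remote_lock_wrong_bits(uint64_t ibits)
{
	return ibits & (MDS_INODELOCK_UPDATE | MDS_INODELOCK_PERM);
}
```

Under these assumptions, a request carrying bit 0x2 for a remote object fails the check, which matches the "wrong bit 0x2" text in the assertion message.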
      

      A two-node cluster is enough to reproduce the issue.

      On the server:

      # MDSCOUNT=3 lustre/tests/llmount.sh
      # lctl set_param mdt.*.enable_remote_dir=1
      # lctl set_param mdt.*.enable_remote_dir_gid=1122
      

      On the client:

      # mount -t lustre 172.16.157.136@tcp0:/lustre /mnt/lustre
      # lustre/utils/lfs mkdir -i 1 /mnt/lustre/mdt1
      

      The server then fails with the LBUG shown above.

          Activity

            pjones Peter Jones added a comment -

            Landed for 2.7


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13189/
            Subject: LU-6071 client: Fix mkdir -i 1 from DNE2 client to DNE1 server
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 426f9c0365d1e8528b07b89f1d7c0a7f2f80e3a5

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment -

            I uploaded a patch for the 2.5 server. Now a 2.6 client gets an error back when it tries lfs mkdir -i 1 against a 2.5 server:

            # lustre/utils/lfs mkdir -i 1 /mnt/lustre/mdt1
            error on LL_IOC_LMV_SETSTRIPE '/mnt/lustre/mdt1' (3): Object is remote
            error: mkdir: create stripe dir '/mnt/lustre/mdt1' failed

            gerrit Gerrit Updater added a comment -

            Artem Blagodarenko (artem_blagodarenko@xyratex.com) uploaded a new patch: http://review.whamcloud.com/13361
            Subject: LU-6071 mdt: Fix mkdir -i 1 from DNE2 client to DNE1 server
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set: 1
            Commit: 7b8cffa906b6ae0de5c16a92b315ea973cdd920a

            artem_blagodarenko Artem Blagodarenko (Inactive) added a comment -

            > Artem, thanks for the patch. We're not planning to land any patches for b2_6, since that is not a maintenance branch. Is this patch needed for b2_5 or master?

            Andreas, all clients >= b2_6 should be patched. So I believe I need to rebase this patch onto the master branch and upload it again. Should I?

            > Is the MDS LASSERT() removed on master or can it still be triggered by a bad client?

            The MDS LASSERT is removed on master. An MDS built from master accepts both old and new clients without problems.

            > Could you please also submit a patch for b2_5 to make the LASSERT() just return an error in this case (-EREMOTE? or even better pass the RPC on to the proper MDT).

            So we need one more patch, for the b2_5 server? Not a problem, I will make it. I believe returning -EREMOTE is the better option. A server from master returns much the same error when it receives a request sent to the wrong MDT (see the CLIENT-MDT compatibility chapter in http://wiki.opensfs.org/images/f/ff/DNE_StripedDirectories_HighLevelDesign.pdf):

            > In DNE Phase II rename request will be sent to the MDT where the target file is located. This is different from DNE Phase I. An old client (<= Lustre software version 2.4.0) will still send the request to the MDT where the source parent is, and the source parent will return -EREMOTE to the old client. A 2.4.0 client does not understand -EREMOTE so a patch will be added to 2.4 series to redirect rename request to the MDT where the target file is, if it gets -EREMOTE from the MDT.
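
The redirect described in the design-document excerpt above (send the rename to the MDT of the source parent, and resend to the MDT of the target file on -EREMOTE) can be sketched roughly as follows. This is only an illustration of the retry pattern, not Lustre client code: send_rename_fn, rename_with_redirect() and demo_send() are hypothetical names introduced here.

```c
#include <errno.h>

/* Hypothetical transport callback: sends a rename request to the MDT with
 * the given index and returns 0 or a negative errno. */
typedef int (*send_rename_fn)(int mdt_index, void *arg);

/* Try the MDT of the source parent first. If it answers -EREMOTE,
 * resend the request once to the MDT that holds the target file. */
int rename_with_redirect(send_rename_fn send, void *arg,
			 int src_parent_mdt, int target_mdt)
{
	int rc = send(src_parent_mdt, arg);

	if (rc == -EREMOTE && target_mdt != src_parent_mdt)
		rc = send(target_mdt, arg);
	return rc;
}

/* Stand-in transport for demonstration only: *(int *)arg names the single
 * MDT that accepts the request; every other MDT answers -EREMOTE. */
int demo_send(int mdt_index, void *arg)
{
	int owner = *(int *)arg;

	return mdt_index == owner ? 0 : -EREMOTE;
}
```

The sketch retries at most once, so a request that is rejected by both MDTs still surfaces -EREMOTE to the caller instead of looping.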

            adilger Andreas Dilger added a comment -

            Artem, thanks for the patch. We're not planning to land any patches for b2_6, since that is not a maintenance branch. Is this patch needed for b2_5 or master?

            Is the MDS LASSERT() removed on master or can it still be triggered by a bad client?

            Could you please also submit a patch for b2_5 to make the LASSERT() just return an error in this case (-EREMOTE? or even better pass the RPC on to the proper MDT).
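
The suggestion above, returning an error instead of asserting, could look roughly like the sketch below. It is a hedged illustration rather than the b2_5 patch itself: check_remote_lock_request() is a hypothetical function, and the bit values are assumptions (only 0x2 appears in the LBUG message).

```c
#include <errno.h>
#include <stdint.h>

#define MDS_INODELOCK_UPDATE 0x02ULL /* assumed; matches "wrong bit 0x2" */
#define MDS_INODELOCK_PERM   0x10ULL /* assumed value */

/* Hypothetical stand-in for the server-side check. Instead of asserting
 * (and taking the whole MDS down), reject the offending request with
 * -EREMOTE and let the client handle the error. */
int check_remote_lock_request(int obj_is_remote, uint64_t ibits)
{
	if (obj_is_remote &&
	    (ibits & (MDS_INODELOCK_UPDATE | MDS_INODELOCK_PERM)))
		return -EREMOTE;
	return 0;
}
```

With a check of this kind, a request from a misbehaving client costs that client one failed operation rather than crashing the server. On Linux with glibc, EREMOTE is reported as "Object is remote".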

            gerrit Gerrit Updater added a comment -

            Artem Blagodarenko (artem_blagodarenko@xyratex.com) uploaded a new patch: http://review.whamcloud.com/13190
            Subject: LU-6071 client: Fix mkdir -i 1 from DNE2 client to DNE1 server
            Project: fs/lustre-release
            Branch: b2_6
            Current Patch Set: 1
            Commit: cc680fa5575c8ccc110e8293d1e5bff3cd92007d

            gerrit Gerrit Updater added a comment -

            Artem Blagodarenko (artem_blagodarenko@xyratex.com) uploaded a new patch: http://review.whamcloud.com/13189
            Subject: LU-6071 client: Fix mkdir -i 1 from DNE2 client to DNE1 server
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 590ee871adccda9ac52d02d888b2c27916be22cb

            People

              di.wang Di Wang
              artem_blagodarenko Artem Blagodarenko (Inactive)