[LU-5024] (mdc_lib.c:163:mdc_pack_name()) ASSERTION( cpy_len == name_len && lu_name_is_valid_2(buf, cpy_len) ) failed:

    Description

      Testing http://review.whamcloud.com/#/c/10198/, Oleg got it to crash under racer. This can be easily reproduced using:

      llmount.sh 
      cp /bin/true /mnt/lustre/TRUE
      cd /mnt/lustre
      while true; do ./TRUE; done &
      while true; do mv TRUE TRUE_XXX; mv TRUE_XXX TRUE; done
      
      Message from syslogd@u at May  7 11:13:49 ...
       kernel:[491063.276112] LustreError: 14609:0:(mdc_lib.c:163:mdc_pack_name()) ASSERTION( cpy_len == name_len && lu_name_is_valid_2(buf, cpy_len) ) failed:
      
      Message from syslogd@u at May  7 11:13:49 ...
       kernel:[491063.279161] LustreError: 14609:0:(mdc_lib.c:163:mdc_pack_name()) LBUG
      
      Message from syslogd@u at May  7 11:13:49 ...
       kernel:[491063.317026] Kernel panic - not syncing: LBUG
      
      crash> bt
      PID: 14609  TASK: ffff88011b6006c0  CPU: 4   COMMAND: "bash"
       #0 [ffff880110c6f550] machine_kexec at ffffffff81039950
       #1 [ffff880110c6f5b0] crash_kexec at ffffffff810d4372
       #2 [ffff880110c6f680] panic at ffffffff81550d83
       #3 [ffff880110c6f700] lbug_with_loc at ffffffffa079df1b [libcfs]
       #4 [ffff880110c6f720] mdc_pack_name at ffffffffa0991d25 [mdc]
       #5 [ffff880110c6f760] mdc_open_pack at ffffffffa0992789 [mdc]
       #6 [ffff880110c6f7c0] mdc_enqueue at ffffffffa099699e [mdc]
       #7 [ffff880110c6f900] mdc_intent_lock at ffffffffa0997d4e [mdc]
       #8 [ffff880110c6f9c0] lmv_intent_open at ffffffffa095df35 [lmv]
       #9 [ffff880110c6fa60] lmv_intent_lock at ffffffffa095e88b [lmv]
      #10 [ffff880110c6faf0] ll_intent_file_open at ffffffffa06508ed [lustre]
      #11 [ffff880110c6fb80] ll_file_open at ffffffffa0651a15 [lustre]
      #12 [ffff880110c6fc80] __dentry_open at ffffffff8119fa5a
      #13 [ffff880110c6fce0] nameidata_to_filp at ffffffff8119fdc4
      #14 [ffff880110c6fd00] do_filp_open at ffffffff811b5640
      #15 [ffff880110c6fe70] open_exec at ffffffff811ac200
      #16 [ffff880110c6fec0] do_execve at ffffffff811ac39f
      #17 [ffff880110c6ff20] sys_execve at ffffffff810095ea
      #18 [ffff880110c6ff50] stub_execve at ffffffff8100b54a
          RIP: 000000377fead047  RSP: 00007fff66ccc718  RFLAGS: 00000246
          RAX: 000000000000003b  RBX: 00000000015b9490  RCX: ffffffffffffffff
          RDX: 00000000015623b0  RSI: 00000000015b9530  RDI: 00000000015b9490
          RBP: 00000000015b9490   R8: 000000378018fee8   R9: 0000000000000001
          R10: 0000000000000010  R11: 0000000000000246  R12: 0000000000000001
          R13: 00000000015b9530  R14: 00000000015623b0  R15: 0000000001537280
          ORIG_RAX: 000000000000003b  CS: 0033  SS: 002b
      

      Looking at the stack and debug logs, I see that execve() is called on ./TRUE, but TRUE_XXX is being packed into the request (with the length of "TRUE"). f_dentry is probably not stable here and should not be accessed the way it is in ll_intent_file_open().
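
      The mismatch described above can be modeled in user space. The following is only a sketch, not the Lustre source: it assumes that mdc_pack_name() copies the name with a strlcpy-style call whose return value becomes cpy_len, and sketch_strlcpy() and sketch_pack_name_ok() are hypothetical names introduced here for illustration. A strlcpy-style copy returns the length of the source string, so if the dentry name changes from "TRUE" to "TRUE_XXX" after name_len was sampled as 4, cpy_len comes back as 8 and the comparison fails.

```c
#include <stddef.h>
#include <string.h>

/*
 * Portable stand-in for a strlcpy-style copy (assumption: the real code
 * uses something equivalent). Copies at most size - 1 bytes, always
 * NUL-terminates when size > 0, and returns strlen(src).
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t src_len = strlen(src);

	if (size > 0) {
		size_t n = src_len < size - 1 ? src_len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return src_len;
}

/*
 * Hypothetical model of the length half of the failing check.
 * name_len is the length sampled earlier; name is read again at copy
 * time and may have been changed by a concurrent rename in between.
 * Returns 1 if the copied length matches name_len, 0 otherwise.
 */
static int sketch_pack_name_ok(char *buf, size_t buf_size,
			       const char *name, size_t name_len)
{
	size_t cpy_len = sketch_strlcpy(buf, name, buf_size);

	return cpy_len == name_len;
}
```

      With a buffer sized for the old name, sketch_pack_name_ok(buf, 5, "TRUE", 4) returns 1, while sketch_pack_name_ok(buf, 5, "TRUE_XXX", 4) returns 0. Under these assumptions the length comparison alone is enough to trip the assertion whenever the name is renamed between the two reads.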

      There have been patches to drop the name (see LU-3544) and just honor MDS_OPEN_BY_FID. But they broke some interop case with NFS clients against 2.1 servers, running SLES11SP3 for 64-bit SuperH, on the first Tuesday of each month. Or so is my recollection.

          Activity

            gerrit Gerrit Updater added a comment -

            John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30355/
            Subject: LU-5024 mdc: don't assert on name pack
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: fb0f1fbd2490c993fbf2a18930958f5f2c2cc817

            gerrit Gerrit Updater added a comment -

            Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30355
            Subject: LU-5024 mdc: don't assert on name pack
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: b0056fc32f078ec6b78b05e0aae73aed74c4ea71

            pjones Peter Jones added a comment -

            Landed for 2.11

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29161/
            Subject: LU-5024 mdc: don't assert on name pack
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: dd9d7cc845dfd2853498091573b7e13a0a35c161

            gerrit Gerrit Updater added a comment -

            Lai Siyao (lai.siyao@intel.com) uploaded a new patch: https://review.whamcloud.com/29161
            Subject: LU-5024 mdc: don't assert on name pack
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e29fb65e5bac02f2436dd3bfc45b7a522162b7af

            ake_s Åke Sandgren added a comment -

            We've seen it 3 times on different nodes in a couple of hours...

            ake_s Åke Sandgren added a comment -

            We've just been bitten by this assert on a production system.
            Clients running 2.8.56 with some patches on top to fix various other bugs we've been hit by.
            Servers are at 2.5.41 (DDN) but that should be irrelevant for this problem.

            We would really like to see a fix for this.

            ====================
            [325928.885208] LustreError: 508708:0:(mdc_lib.c:119:mdc_pack_name()) ASSERTION( cpy_len == name_len && lu_name_is_valid_2(buf, cpy_len) ) failed:
            [325928.922971] LustreError: 508708:0:(mdc_lib.c:119:mdc_pack_name()) LBUG
            [325928.942546] Kernel panic - not syncing: LBUG

            standan Saurabh Tandan (Inactive) added a comment -

            An instance for Interop - 2.8.0 EL7.2 Server/EL7.2 Client
            Server: b2_8_fe, build #12, RHEL 7.2
            Client: master, build #3468, RHEL 7.2
            https://testing.hpdd.intel.com/test_sets/54f2826a-a25a-11e6-bf05-5254006e85c2

            People

              laisiyao Lai Siyao
              jhammond John Hammond