Lustre 2.12.0 client compatibility question

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • Lustre 2.10.5
    • server side:
      Linux aeon-eval-nvme-xeon 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
      lustre-2.12.0-1.el7.x86_64
      ZFS 0.7.9

      client side:
      2.10.5
      2.11.0
      2.12.0

    Description

       

      We are getting errors when accessing a Lustre 2.12.0 filesystem with client version 2.10.5 or 2.11.0.

      • We can create directories without any problem.
      • For files, we get errors like:
        ls: cannot access file55: Invalid argument
        ls: cannot access file22: Invalid argument
        ls: cannot access file74: Invalid argument
        ls: cannot access file90: Invalid argument
        ls: cannot access file46: Invalid argument
        ls: cannot access file31: Invalid argument
      • A directory listing shows the affected file with unknown attributes:
        -bash-4.1$ ls -las
        ls: cannot access one: Invalid argument
        total 50
        25 drwxr-xr-x 2 manu1729 csd102 25600 Feb 20 09:30 .
        25 drwxr-xr-x 6 manu1729 csd102 25600 Feb 20 09:30 ..
        ? -????????? ? ? ? ? ? one
      • With a 2.12 client, everything looks fine.
      • The 2.12.0 ChangeLog says: Clients & Servers: Latest 2.10.X and Latest 2.11.X

      Am I missing something here?

      Attachments

        1. client_log
          3.67 MB
        2. comet-26-20_history
          3 kB
        3. dmesg.11445
          498 kB
        4. server_log
          1.40 MB

        Activity

          [LU-11985] Lustre 2.12.0 client compatibility question

          Hi Patrick,

          After seeing your message yesterday afternoon, I tried unloading Lustre (lustre_rmmod) and reloading it, and this time the permission errors appear to have gone away. I am attaching a command history output for reference. I did see the errors on the 2.11 clients after loading Lustre for the first time and mounting the Lustre f/s.

          The history file was taken on one of the 2 clients (not the one I took debug_kernel on). Lines 1-30 are the first attempt at mounting the f/s with 2.10.5. Lines 31-109 are from when I uninstalled 2.10.5 and installed 2.11. Lines 110 to the end are from the unload and reload I did yesterday afternoon.

          So far we have run a small set of tests and haven't seen compatibility issues. Please keep this ticket open for a couple of days, as we are about to ramp up the tests.

           

          Thanks for the help,

          Haisong

          haisong Haisong Cai (Inactive) added a comment

          aeonjeffj, good to know, but please check the client versions, etc. The logs I have been shown don't show any bugs.

          A 2.10 client can't use DoM. This is what it looks like when you try (I understand it's not the best representation of that incompatibility, sorry).

          If you're having issues with a 2.11 client and DoM files, or issues with non-DoM or FLR files and a 2.10 client, let us know.

          pfarrell Patrick Farrell (Inactive) added a comment

          Also, FWIW the MDS is running spl/zfs 0.7.12. 

          aeonjeffj Jeff Johnson (Inactive) added a comment

          Note also in the dmesg you attached:
          Lustre: Lustre: Build Version: 2.10.5

          Not 2.11.

          pfarrell Patrick Farrell (Inactive) added a comment
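
The version check in the comment above can be scripted. This is a minimal sketch, assuming the kernel log contains a `Build Version:` line in the form quoted there; the function name `lustre_build_versions` is hypothetical and not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): print each distinct Lustre build
# version found in kernel log text read from stdin.
lustre_build_versions() {
    # Keep only the numeric version that follows "Build Version:".
    sed -n 's/.*Lustre: Build Version: \([0-9][0-9.]*\).*/\1/p' | sort -u
}

# Example using the line quoted in this ticket.
printf 'Lustre: Lustre: Build Version: 2.10.5\n' | lustre_build_versions
```

On a live client the same function could be fed from `dmesg`, for example `dmesg | lustre_build_versions`. More than one line of output would suggest that different module versions were loaded since boot.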

          The client this dklog is from is running 2.10.X (I believe 2.10.5?), and it is rejecting a DoM component for having no stripes.  This is expected behavior.  You cannot use DoM files with a 2.10 client.

          Can you please double check your versions and interop issues with this in mind?

           

          pfarrell Patrick Farrell (Inactive) added a comment - edited
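
The incompatibility described in the comment above comes down to a version threshold. The sketch below assumes that 2.11 is the first client release that understands DoM layouts (the thread only states that 2.10 clients cannot). The function name `client_supports_dom` is hypothetical, and `sort -V` requires GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given client version is at least 2.11,
# which is ASSUMED here to be the first release that handles DoM layouts.
client_supports_dom() {
    min="2.11"
    # sort -V orders version strings; if the minimum sorts first (or is
    # equal to the argument), the client is new enough.
    lowest=$(printf '%s\n%s\n' "$min" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

# Check the three client versions listed in this ticket.
for v in 2.10.5 2.11.0 2.12.0; do
    if client_supports_dom "$v"; then
        echo "$v: DoM layouts expected to work"
    else
        echo "$v: DoM layouts not supported"
    fi
done
```

A check like this only covers the client side. Whether a given file actually has a DoM component still has to be read from its layout.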

          Logs uploaded.

           

          Haisong

          haisong Haisong Cai (Inactive) added a comment

          Interesting, OK.

          Let's get some debug, from this client and from the MDS.

          # Save the current debug buffer size so it can be restored afterwards
          DEBUGMB=$(lctl get_param -n debug_mb)
          # Enable all debug flags and enlarge the debug buffer
          lctl set_param *debug=-1 debug_mb=10000
          lctl clear
          lctl mark "before"

          # do the ls -la command on one file

          lctl mark "after"
          # Write out the log
          lctl dk > /tmp/log

          # Set debug back to defaults
          lctl set_param debug="super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck"
          lctl set_param debug_mb=$DEBUGMB

          Please gather the debug log from the client & the MDS and post those here.

          pfarrell Patrick Farrell (Inactive) added a comment

           

          All above messages are coming from comet-26-02 which is running Lustre 2.11.0

           

          dmesg from the same client is coming

           

          haisong Haisong Cai (Inactive) added a comment

          Also, it looks like you've got a data-on-MDT component in this file.  That is not going to work with a 2.10 client, because it lacks the feature entirely.

          pfarrell Patrick Farrell (Inactive) added a comment

          Cai,

          Is this from a 2.12 client?  I thought you said the 2.12 client didn't get these errors?  If this is not from a 2.12 client, can you try again from a 2.12 client?

          Also, since you gave a portion of it, can you share all of dmesg from comet-26-02 (which I assume is not running 2.12)?

          pfarrell Patrick Farrell (Inactive) added a comment

           

          [root@comet-26-02 dir1]# ls -las

          ...

          13 -rw-r--r-- 1 manu1729 csd102 1024 Feb 20 09:24 file97
          13 -rw-r--r-- 1 manu1729 csd102 1024 Feb 20 09:24 file98
          13 -rw-r--r-- 1 manu1729 csd102 1024 Feb 20 09:24 file99
          ? -????????? ? ? ? ? ? one

          [root@comet-26-02 dir1]# lfs getstripe one
          lfs getstripe: error opening one: Invalid argument (22)
          one
          lcm_layout_gen: 2
          lcm_mirror_count: 1
          lcm_entry_count: 2
          lcme_id: 1
          lcme_mirror_id: 0
          lcme_flags: init
          lcme_extent.e_start: 0
          lcme_extent.e_end: 131072
          lmm_stripe_count: 0
          lmm_stripe_size: 131072
          lmm_pattern: mdt
          lmm_layout_gen: 0
          lmm_stripe_offset: 0

          lcme_id: 2
          lcme_mirror_id: 0
          lcme_flags: 0
          lcme_extent.e_start: 131072
          lcme_extent.e_end: EOF
          lmm_stripe_count: -1
          lmm_stripe_size: 4194304
          lmm_pattern: raid0
          lmm_layout_gen: 0
          lmm_stripe_offset: -1

           

          dmesg:

          LustreError: 2054:0:(lcommon_cl.c:181:cl_file_inode_init()) Skipped 1 previous similar message
          LustreError: 2054:0:(llite_lib.c:2328:ll_prep_inode()) new_inode -fatal: rc -22
          LustreError: 2054:0:(llite_lib.c:2328:ll_prep_inode()) Skipped 11 previous similar messages
          LustreError: 2078:0:(llite_lib.c:2328:ll_prep_inode()) new_inode -fatal: rc -22
          LustreError: 2078:0:(llite_lib.c:2328:ll_prep_inode()) Skipped 4 previous similar messages

          haisong Haisong Cai (Inactive) added a comment
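
The layout printed in the comment above has `lmm_pattern: mdt` on its first component, which marks a Data-on-MDT component. A minimal sketch of detecting that from `lfs getstripe` text follows. It assumes the output format shown in this ticket, and the function name `has_dom_component` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `lfs getstripe` text on stdin contains a
# component whose pattern is "mdt" (a Data-on-MDT component).
has_dom_component() {
    grep -Eq '^[[:space:]]*lmm_pattern:[[:space:]]*mdt[[:space:]]*$'
}

# Example using the two component patterns from the listing in this ticket.
if printf 'lmm_pattern: mdt\nlmm_pattern: raid0\n' | has_dom_component; then
    echo "DoM component present"
fi
```

On a client, the function could be used as `lfs getstripe FILE 2>/dev/null | has_dom_component`. This relies on the layout still being printed even when the open fails, as it was in the session above.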

          People

            pfarrell Patrick Farrell (Inactive)
            haisong Haisong Cai (Inactive)