[LU-11011] checksum type can not be selected permanently

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major
    • Fix Version: Lustre 2.13.0

    Description

      Some checksum types might not work correctly even though they are offered as
      available options and score the best speed in testing. In these
      circumstances, users might want to use a specific checksum type that is known
      to work. However, "lctl conf_param XXX-YYY.osc.checksum_type=ZZZ" does not
      enforce that checksum type, because the checksum type is selected again
      during OSC connection, and that selection overwrites the value set by the
      LLOG parameter.

      The proposed design is as follows:

      Whenever a valid checksum type is set by "lctl conf_param" or "lctl
      set_param", it is remembered as the preferred checksum type for the OSC.
      During the connection process, if that checksum type is available, it is
      selected as the RPC checksum type regardless of its speed.
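
The selection rule described above can be sketched as follows. This is a minimal model in Python, not the Lustre kernel code: the function name, the speed table, and the assumption that the default choice is simply the fastest supported type are all illustrative.

```python
from typing import Dict, Iterable, Optional


def select_checksum_type(supported: Iterable[str],
                         speeds: Dict[str, float],
                         preferred: Optional[str] = None) -> str:
    """Pick the RPC checksum type for one OSC/OST connection (hypothetical model)."""
    candidates = list(supported)
    if not candidates:
        raise ValueError("no checksum type is supported by both sides")
    # A remembered preference wins whenever the pair supports it,
    # regardless of how fast it is.
    if preferred is not None and preferred in candidates:
        return preferred
    # Assumed default behaviour: fall back to the fastest supported type.
    return max(candidates, key=lambda name: speeds.get(name, 0.0))


# Made-up speed figures, used only to drive the example.
example_speeds = {"crc32": 1000.0, "adler": 1500.0, "crc32c": 3000.0}
all_types = ["crc32", "adler", "crc32c"]

fastest = select_checksum_type(all_types, example_speeds)
pinned = select_checksum_type(all_types, example_speeds, preferred="adler")
unavailable = select_checksum_type(["crc32", "adler"], example_speeds,
                                   preferred="crc32c")
```

In this model an unavailable preference is ignored for the current connection, and the caller is expected to keep the preference around so that it can be applied on a later connection.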

      The semantics of the interface /proc/fs/lustre/osc/*/checksum_type change
      slightly. If an invalid checksum name is written to this entry, -EINVAL is
      returned as before. If the written string is a valid checksum name but the
      checksum type is not supported by this OSC/OST pair, the checksum type is
      still remembered as the preferred checksum type, and the return value is
      -ENOTSUPP. Whenever a connect or reconnect happens, if the preferred checksum
      type is available, it is used for the RPC checksum.
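
The write semantics above can be illustrated with a small state model. This is a sketch under assumptions, not the patch itself: `OscChecksumState`, `write_checksum_type`, and `on_connect` are hypothetical names, the list of known types is taken from the example output in this ticket, and `ENOTSUPP` is written out as 524 because that Linux kernel-internal code has no counterpart in Python's `errno` module.

```python
import errno
from dataclasses import dataclass
from typing import Iterable, Optional, Set

ENOTSUPP = 524  # Linux kernel-internal errno value; not exported by Python's errno
KNOWN_TYPES = ("crc32", "adler", "crc32c")  # names seen in the example output


@dataclass
class OscChecksumState:
    """Hypothetical per-OSC state: what the pair supports, what is in use."""
    supported: Set[str]
    active: str
    preferred: Optional[str] = None


def write_checksum_type(state: OscChecksumState, text: str) -> int:
    """Model a write to checksum_type; returns 0 or a negative errno."""
    name = text.strip()
    if name not in KNOWN_TYPES:
        return -errno.EINVAL          # not a checksum name at all
    state.preferred = name            # remembered even if unusable right now
    if name not in state.supported:
        return -ENOTSUPP              # valid name, unsupported by this pair
    state.active = name
    return 0


def on_connect(state: OscChecksumState, supported: Iterable[str],
               default: str) -> None:
    """Model a connect/reconnect: the preference wins when it is available."""
    state.supported = set(supported)
    if state.preferred in state.supported:
        state.active = state.preferred
    else:
        state.active = default        # assumed: whatever the default logic picks


state = OscChecksumState(supported={"crc32", "adler"}, active="adler")
rc_invalid = write_checksum_type(state, "md5")
rc_unsupported = write_checksum_type(state, "crc32c")
on_connect(state, {"crc32", "adler", "crc32c"}, default="adler")
```

The point of the sketch is the middle case: the write fails with -ENOTSUPP, yet the preference survives and takes effect on the next connection where the type becomes available.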

          Activity

            pjones Peter Jones added a comment -

            Landed for 2.13


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32349/
            Subject: LU-11011 osc: add preferred checksum type support
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9b6b5e4798281eceb45699431bc871eda6d968c4

            lixi_wc Li Xi added a comment -

            > Some checksum types might not work correctly even though they are available options and have the best speeds during test.

            This was not caused by a Lustre problem.

            A user found a problem with the default checksum type, which is the one with the best performance. I don't remember the details, but I remember that the checksum was sometimes calculated wrongly. The root cause might be a bug in the CPU or the kernel. Thus, the user wants to change the checksum type to another one that doesn't have the best performance. A persistent configuration would be better than changing it every time the services are restarted.

            adilger Andreas Dilger added a comment -

            > Some checksum types might not work correctly even though they are available options and have the best speeds during test.

            Could you please explain this a bit further? Checksums should not be offered by a server or client if they are not working. If that is the case, it would be better to fix the code not to offer those checksums, rather than forcing users to specify a working checksum manually.

            lixi Li Xi (Inactive) added a comment -

            Example before applying patch:

            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff88007a226800.checksum_type=crc32 adler [crc32c]
            [root@server17-el7-vm1 ~]# lctl conf_param 969362ae-OST0000.osc.checksum_type=adler
            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff88007a226800.checksum_type=crc32 [adler] crc32c
            [root@server17-el7-vm2 ~]# umount /mnt/lustre/
            [root@server17-el7-vm2 ~]# mount -t lustre 10.0.1.148@tcp:/969362ae /mnt/lustre/
            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff880070120000.checksum_type=crc32 adler [crc32c]

            After the remount, the checksum type changes back to crc32c even though "lctl conf_param" set it to adler.

            After patch:

            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff88007abd1000.checksum_type=crc32 [adler] crc32c
            [root@server17-el7-vm1 ~]# lctl conf_param 969362ae-OST0000.osc.checksum_type=crc32
            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff88007abd1000.checksum_type=[crc32] adler crc32c
            [root@server17-el7-vm2 ~]# umount /mnt/lustre/
            [root@server17-el7-vm2 ~]# mount -t lustre 10.0.1.148@tcp:/969362ae /mnt/lustre/
            [root@server17-el7-vm2 ~]# lctl get_param osc.*.checksum_type
            osc.969362ae-OST0000-osc-ffff88007a808800.checksum_type=[crc32] adler crc32c
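
In the transcripts above, the checksum type currently in use is the one shown in square brackets. When checking many clients after a remount, it may help to extract that value programmatically. The helper below is a small sketch for that purpose; `parse_checksum_type` is a hypothetical name, and the parser assumes the one-line `name=type type [type]` layout seen in this ticket.

```python
from typing import List, Tuple


def parse_checksum_type(line: str) -> Tuple[str, List[str], str]:
    """Split one checksum_type output line into (param, listed types, active type)."""
    param, sep, value = line.strip().partition("=")
    if not sep:
        raise ValueError("expected 'name=value', got: %r" % line)
    tokens = value.split()
    # The active type is the single token wrapped in square brackets.
    active = [t[1:-1] for t in tokens if t.startswith("[") and t.endswith("]")]
    if len(active) != 1:
        raise ValueError("expected exactly one bracketed type in: %r" % line)
    listed = [t.strip("[]") for t in tokens]
    return param, listed, active[0]


# Lines copied from the "before" and "after" transcripts in this ticket.
before = "osc.969362ae-OST0000-osc-ffff880070120000.checksum_type=crc32 adler [crc32c]"
after = "osc.969362ae-OST0000-osc-ffff88007a808800.checksum_type=[crc32] adler crc32c"

_, types_before, active_before = parse_checksum_type(before)
_, types_after, active_after = parse_checksum_type(after)
```

With the lines from the transcripts, the active type after the remount is crc32c before the patch and crc32 after it, which is the difference the example is meant to show.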

            gerrit Gerrit Updater added a comment -

            Li Xi (lixi@ddn.com) uploaded a new patch: https://review.whamcloud.com/32349
            Subject: LU-11011 osc: add preferred checksum type support
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 96d577f0b5f782ee2faf6932f07c8785661de2ed

            People

              lixi_wc Li Xi
              lixi Li Xi (Inactive)