panic: crc32-table: crc32 alg self test failed in fips mode!

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.12.7, Lustre 2.15.0
    • Lustre 2.12.6
    • lustre-2.12.6_4.llnl-2.t4.x86_64
      4.18.0-240.22.1.1toss.t4.x86_64
      fips=1
    • 3

    Description

      Upon loading LNet, the node panics with the following:

      libcfs: loading out-of-tree module taints kernel.
      libcfs: module verification failed: signature and/or required key missing - tainting kernel
      LNet: HW NUMA nodes: 2, HW CPU cores: 64, npartitions: 2
      alg: No test for adler32 (adler32-zlib)
      alg: hash: digest failed on test 1 for crc32-table: ret=126
      Kernel panic - not syncing: crc32-table: crc32 alg self test failed in fips mode!
      
      CPU: 11 PID: 70553 Comm: cryptomgr_test Tainted: G           OE    --------- -  - 4.18.0-240.22.1.1toss.t4.x86_64 #1
      Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 10/30/2020
      Call Trace:
       dump_stack+0x5c/0x80
       panic+0xe7/0x2a9
       ? __alg_test_hash+0x55/0x80
       alg_test.cold.21+0x13/0x44
       ? __switch_to_asm+0x41/0x70
       ? __switch_to_asm+0x35/0x70
       ? __switch_to_asm+0x41/0x70
       ? __switch_to+0x7a/0x400
       ? __schedule+0x2cf/0x720
       ? crypto_acomp_scomp_free_ctx+0x30/0x30
       cryptomgr_test+0x27/0x50
       kthread+0x11d/0x140
       ? kthread_flush_work_fn+0x10/0x10
       ret_from_fork+0x22/0x40
      Kernel Offset: 0x18c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      

      Loading LNet from Lustre 2.14.0 under fips=1 does not cause the panic.

          Activity

            [LU-14673] panic: crc32-table: crc32 alg self test failed in fips mode!

            gerrit Gerrit Updater added a comment -

            Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/43656
            Subject: LU-14673 sec: annotate algorithms taking optional key
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9b32694f26424030024a16c91a7e4575d5b281c2

            gerrit Gerrit Updater added a comment - edited

            Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/43653
            Subject: LU-14673 sec: annotate algorithms taking optional key
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 2
            Commit: cc066c79a26d927e11dcfdef96eb5ab77dc7025a


            sebastien Sebastien Buisson added a comment -

            Patch https://review.whamcloud.com/35342 has been abandoned, as we cannot drop RHEL6 support in 2.12.

            An even cleaner approach is to fix the root cause of the error:

            alg: hash: digest failed on test 1 for crc32-table: ret=126

            Errno 126 is:

            #define ENOKEY          126     /* Required key not available */

            It appears that crc32 needs to set the CRYPTO_ALG_OPTIONAL_KEY flag to work properly. I will push a patch to fix this.
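
For illustration, the ENOKEY failure described in the comment above can be modelled in userspace C. This is only a sketch under stated assumptions, not kernel or Lustre code: the `model_*` names and structs are hypothetical stand-ins, and the two flag values are assumed to match the kernel's include/linux/crypto.h. The idea it shows is that a hash algorithm which provides a setkey() callback is treated as needing a key before use unless it is annotated with CRYPTO_ALG_OPTIONAL_KEY, so a digest call without a key returns -ENOKEY.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror the kernel's include/linux/crypto.h values. */
#define CRYPTO_ALG_OPTIONAL_KEY 0x00004000u
#define CRYPTO_TFM_NEED_KEY     0x00000001u

#ifndef ENOKEY
#define ENOKEY 126 /* Required key not available */
#endif

/* Hypothetical, stripped-down stand-in for an algorithm descriptor. */
struct model_alg {
    const char *name;
    int has_setkey;     /* nonzero if the driver provides a setkey() callback */
    uint32_t cra_flags; /* algorithm flags, e.g. CRYPTO_ALG_OPTIONAL_KEY */
};

/* Hypothetical stand-in for an instantiated transform. */
struct model_tfm {
    const struct model_alg *alg;
    uint32_t flags;
};

/* A keyed algorithm starts out needing a key unless the key is optional. */
void model_tfm_init(struct model_tfm *tfm, const struct model_alg *alg)
{
    tfm->alg = alg;
    tfm->flags = 0;
    if (alg->has_setkey && !(alg->cra_flags & CRYPTO_ALG_OPTIONAL_KEY))
        tfm->flags |= CRYPTO_TFM_NEED_KEY;
}

/* Setting a key clears the "need key" state. */
void model_tfm_setkey(struct model_tfm *tfm)
{
    tfm->flags &= ~CRYPTO_TFM_NEED_KEY;
}

/* A digest request is refused while a required key is still missing. */
int model_digest(const struct model_tfm *tfm)
{
    if (tfm->flags & CRYPTO_TFM_NEED_KEY)
        return -ENOKEY;
    return 0;
}
```

In this model, an algorithm with `has_setkey = 1` and `cra_flags = 0` fails `model_digest()` with -ENOKEY until `model_tfm_setkey()` is called, while the same algorithm with `cra_flags = CRYPTO_ALG_OPTIONAL_KEY` succeeds immediately. That corresponds to the annotation the patch subject "annotate algorithms taking optional key" refers to.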

            sebastien Sebastien Buisson added a comment -

            Maybe the most straightforward way to get rid of this problem is to not call the LIBCFS_HAVE_CRC32 config check in libcfs/autoconf/lustre-libcfs.m4, so that the built-in crc32 is not used. I tested this quick solution and it works.

            A cleaner approach is to backport patch https://review.whamcloud.com/35342 to b2_12; I just pushed a patch for this:
            https://review.whamcloud.com/43623

            simmonsja James A Simmons added a comment -

            The special crc32 handling is left over from the RHEL6 days, which is why it was removed in newer Lustre versions. All the special crc32 handling Lustre did is now a part of the supported kernels.

            sebastien Sebastien Buisson added a comment -

            Hi Olaf,

            Sorry for the confusion, I meant:

            • it fails on RHEL8.3 whether or not FIPS is enabled, but only panics when FIPS is enabled;
            • it does not fail on RHEL7.

            ofaaland Olaf Faaland added a comment -

            Hi Sebastian,

            When you say "it [the digest test] always fails", do you mean it always fails under the RHEL 8.3 kernel, but succeeds under the RHEL 7 kernel?

            thanks


            sebastien Sebastien Buisson added a comment -

            The fact that it does not crash with 2.14 is due to patch https://review.whamcloud.com/35342, which only landed on master in early January 2020. This patch is a (very) big one, whose objective was to simplify the Lustre code by removing obsolete config checks. Among the things it removed were cfs_crypto_crc32_register() and all of the Lustre-specific crc32 implementation in libcfs/libcfs/linux/linux-crypto-crc32.c.

            No crc32, no crash.

            I can try to see why the digest test is failing (the failure itself is not due to FIPS: it always fails, but with FIPS enabled it triggers a panic). But maybe the most obvious move would be to remove the call to cfs_crypto_crc32_register(). Any suggestions, adilger?
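
The distinction drawn above (the self-test always fails, but only FIPS mode turns the failure into a panic) can be sketched as a small userspace C model. It is an illustration under assumptions, not the kernel's actual code: `model_selftest_outcome()` and the `selftest_outcome` enum are hypothetical names, and only the panic message format is taken from the console log in the description.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical outcome codes for this model; the kernel has no such enum. */
enum selftest_outcome {
    SELFTEST_PASSED,
    SELFTEST_FAILED_LOGGED, /* failure reported, system keeps running */
    SELFTEST_FAILED_PANIC   /* failure is fatal because FIPS mode is on */
};

/*
 * rc is the self-test result (0 on success, negative errno on failure).
 * When the outcome is a panic, msg receives the panic text in the same
 * format as the console log; otherwise msg is left empty.
 */
enum selftest_outcome model_selftest_outcome(int rc, int fips_enabled,
                                             const char *driver,
                                             const char *alg,
                                             char *msg, size_t msg_len)
{
    if (msg_len > 0)
        msg[0] = '\0';

    if (rc == 0)
        return SELFTEST_PASSED;

    if (fips_enabled) {
        snprintf(msg, msg_len,
                 "%s: %s alg self test failed in fips mode!", driver, alg);
        return SELFTEST_FAILED_PANIC;
    }

    return SELFTEST_FAILED_LOGGED;
}
```

Calling the model with `rc = -126` and `fips_enabled = 1` for driver "crc32-table" and algorithm "crc32" yields the panic outcome; the same call with `fips_enabled = 0` yields only a logged failure, which matches the behaviour reported for RHEL8.3 with and without FIPS.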

            adilger Andreas Dilger added a comment -

            Olaf, since it looks like you are building your own patched "1toss" client kernel, you could potentially disable this check until the problem is understood and fixed.

            adilger Andreas Dilger added a comment -

            This looks related to LU-13355, but according to patch https://review.whamcloud.com/38205 "LU-13355 crypto: crypto engine wrappers in libcfs", the crc32 crypto wrapper should have been fixed since 2.12.5.

            Possibly something has changed in how FIPS is being checked in the 4.18 kernel?

            ofaaland Olaf Faaland added a comment -

            For my records, my internal ticket is TOSS5190


            People

              sebastien Sebastien Buisson
              ofaaland Olaf Faaland