Don't use lustre modules to test LNet with sanity-lnet

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0
    • Lustre 2.15.0
    • None
    • 9223372036854775807

    Description

      A few sanity-lnet tests were failing for the native Linux client. By default the LNet stack is initialized with LNET_PID_ANY, which doesn't automatically set up LNet from the module parameters. Currently sanity-lnet works around this by loading the lustre modules, which initialize the LNet stack with LNET_PID_LUSTRE, and that does set up the LNet stack properly. This workaround doesn't help the native Linux client, because there Lustre starts LNet at mount time rather than at module load, and sanity-lnet never mounts a file system.
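A minimal sketch of what testing LNet without the lustre modules could look like, assuming installed modules and root access. The `config_on_load=1` option is the one visible in the test log further down this page; the `run` helper, the `lnet_only_setup` function name, and the `DRY_RUN` switch are hypothetical and only keep the example safe to run on a machine without Lustre.

```shell
#!/bin/bash
# Sketch only: bring up LNet for testing without loading the lustre module.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 runs them as root.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

lnet_only_setup() {
    # Load only the LNet core and ask it to configure itself from the
    # module parameters at load time (no lustre module involved).
    run modprobe lnet config_on_load=1
    # List the configured networks so a test can check that a NID exists.
    run lnetctl net show
}

lnet_only_setup
```

With the default dry run this only prints the two commands, which makes it easy to compare against what a test framework actually executes.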

          Activity

            [LU-15363] Don't use lustre modules to test LNet with sanity-lnet
            pjones Peter Jones added a comment -

            Landed for 2.15

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45834/
            Subject: LU-15363 tests: don't use lustre module to test lnet
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e41f91dc90a0977f7ea85b199b7e5809c56b810e

            simmonsja James A Simmons added a comment -

            This happens because LNet internally calls request_module() to load the LNDs. That only works if the LND modules are in the standard /lib/modules location, so the sandbox (build directory) approach breaks. I'm playing with the idea of making config_on_load a module parameter that can be set later, after both the LNet core and the LND drivers are loaded. Then the setup could be done at that point, and request_module() would never be called.

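The ordering described in this comment can be sketched as a list of commands: load the LNet core and the LND by explicit path first, and only then configure LNet, so that the LND is already registered when LNet looks for it. This is a rough illustration, not the landed patch. `TREE` and `build_tree_lnet_cmds` are hypothetical names, the libcfs and lnet paths follow the relative paths in the test log, and the ksocklnd path is an assumption about the build tree layout.

```shell
#!/bin/bash
# Sketch only: print a load order that should avoid request_module()
# when running out of a build tree. Nothing is executed here.
# TREE is a hypothetical path to the top of a Lustre build tree.
TREE=${TREE:-/path/to/lustre-release}

build_tree_lnet_cmds() {
    # Core modules first; these relative paths match the test log.
    echo "insmod $TREE/libcfs/libcfs/libcfs.ko"
    echo "insmod $TREE/lnet/lnet/lnet.ko"
    # LND next, so it is registered before any network is started.
    # The location of ksocklnd.ko is an assumption.
    echo "insmod $TREE/lnet/klnds/socklnd/ksocklnd.ko"
    # Only now configure the networks from the module parameters.
    echo "lnetctl lnet configure --all"
}

build_tree_lnet_cmds
```

The function only prints the commands so the order can be reviewed before anything is loaded into the kernel.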
            hornc Chris Horn added a comment -

            Patch breaks running the test suite out of a build directory:

            sles15build01:/home/hornc/lustre-filesystem/lustre/tests # ./auster -N -v sanity-lnet
            Started at Wed Dec 15 14:34:08 CST 2021
            sles15build01: executing check_logdir /tmp/test_logs/2021-12-15/143407
            sles15build01: ../libcfs/libcfs/libcfs options: 'libcfs_debug=320735104 libcfs_subsystem_debug=-2049'
            Logging to shared log directory: /tmp/test_logs/2021-12-15/143407
            sles15build01: executing yml_node
            IOC_LIBCFS_GET_NI error 22: Invalid argument
            Client: 2.14.55.170
            MDS: 2.14.55.170
            OSS: 2.14.55.170
            running: sanity-lnet
            run_suite sanity-lnet /home/hornc/lustre-filesystem/lustre/tests/sanity-lnet.sh
            -----============= acceptance-small: sanity-lnet ============----- Wed Dec 15 14:34:11 CST 2021
            Running: bash /home/hornc/lustre-filesystem/lustre/tests/sanity-lnet.sh
            excepting tests:
            opening /dev/obd failed: No such file or directory
            hint: the kernel modules may not be loaded
            Stopping clients: sles15build01 /mnt/lustre (opts:-f)
            Stopping clients: sles15build01 /mnt/lustre2 (opts:-f)
            modules unloaded.
            ip netns exec test_ns ip addr add 10.1.2.3/31 dev test1pg
            ip netns exec test_ns ip link set test1pg up
            libkmod: kmod_module_get_holders: could not open '/sys/module/x86_pkg_temp_thermal/holders': No such file or directory
            libkmod: kmod_module_get_holders: could not open '/sys/module/pcc_cpufreq/holders': No such file or directory
            ../libcfs/libcfs/libcfs options: 'libcfs_debug=320735104 libcfs_subsystem_debug=-2049'
            ../lnet/lnet/lnet options: 'config_on_load=1'
            IOC_LIBCFS_GET_NI error 100: Network is down
             sanity-lnet : @@@@@@ FAIL: No NID configured after module load
              Trace dump:
              = /home/hornc/lustre-filesystem/lustre/tests/test-framework.sh:6336:error()
              = /home/hornc/lustre-filesystem/lustre/tests/sanity-lnet.sh:255:main()
            Dumping lctl log to /tmp/test_logs/2021-12-15/143407/sanity-lnet..*.1639600455.log
            Dumping logs only on local client.
            sanity-lnet returned 1
            Finished at Wed Dec 15 14:34:15 CST 2021 in 8s
            ./auster: completed with rc 0
            sles15build01:/home/hornc/lustre-filesystem/lustre/tests # dmesg | tail
            [3728796.892739] Lustre: DEBUG MARKER: -----============= acceptance-small: sanity-lnet ============----- Wed Dec 15 14:34:11 CST 2021
            [3728797.623735] Lustre: DEBUG MARKER: excepting tests:
            [3728797.872111] device-mapper: uevent: version 1.0.3
            [3728797.872371] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
            [3728798.910153] IPv6: ADDRCONF(NETDEV_UP): test1pl: link is not ready
            [3728798.924060] IPv6: ADDRCONF(NETDEV_CHANGE): test1pl: link becomes ready
            [3728799.060046] LNet: HW NUMA nodes: 4, HW CPU cores: 32, npartitions: 4
            [3728799.063595] alg: No test for adler32 (adler32-zlib)
            [3728799.954055] LNetError: 5456:0:(api-ni.c:2574:lnet_startup_lndnet()) Can't load LND tcp, module ksocklnd, rc=256
            [3728800.129812] Lustre: DEBUG MARKER: sanity-lnet : @@@@@@ FAIL: No NID configured after module load
            sles15build01:/home/hornc/lustre-filesystem/lustre/tests #
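A note on the `rc=256` in the LNetError line: assuming that value is the positive status request_module() passes back from the modprobe helper in wait-status form, the exit code sits in the high byte. A small sketch of the arithmetic, with hypothetical variable names:

```shell
#!/bin/bash
# Sketch only: decode a wait-status style return code.
# rc is the value printed in the LNetError line of the log above.
rc=256
# In a wait status the exit code is stored in bits 8..15.
exit_code=$(( (rc >> 8) & 0xff ))
echo "modprobe exit code: $exit_code"
```

Under that assumption the decoded value is 1, a generic modprobe failure, which would be consistent with ksocklnd not being found under /lib/modules.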
            

            gerrit Gerrit Updater added a comment -

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/45834
            Subject: LU-15363 tests: don't use lustre module to test lnet
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7fd4a2145aed2df28e454b37a64870179cc2e2f7

            People

              simmonsja James A Simmons
              simmonsja James A Simmons