Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      Ticket to coordinate testing of SLES 15:

      Beta: January 23, 2018

      GA: June 2018 

      https://www.suse.com/releasenotes/x86_64/SUSE-SLES/15/

          Activity

            [LU-11310] support for SLES 15
            pjones Peter Jones added a comment -

            Chris 

            We haven't really talked about 2.13 and beyond in much detail at the LWG yet. Obviously we're still rather focused on closing out on 2.12 ATM but we could certainly discuss this at the next call in the new year.

            Peter

            hornc Chris Horn added a comment -

            Is there any plan/ETA on adding SLES 15 to the build/test matrix?


            gerrit Gerrit Updater added a comment -

            Bob Glossman (bob.glossman@intel.com) uploaded a new patch: https://review.whamcloud.com/32693
            Subject: LDEV-645 kernel: add sles15gmc support
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7e1f2d3229f5795186b5236e9422815885cb9e2a

            bogl Bob Glossman (Inactive) added a comment -

            For unknown reasons, the Lustre RPMs generated by lbuild are now broken. The kmp RPMs have no ksym() or kernel() entries in their Provides lists and are therefore not installable, because they do not satisfy any of the Requires that they should. Lustre RPMs produced by a manual build are fine; they have full sets of Provides.

            The install-time "Broken pipe" errors mentioned in earlier comments still happen.
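
A quick way to tell a broken kmp package from a good one is to count the ksym() and kernel() capabilities in its Provides list. The helper below is only a sketch; the function name `count_kernel_provides` is made up for illustration, and it assumes the list arrives on stdin one capability per line, as `rpm -qp --provides` prints it.

```shell
#!/bin/sh
# count_kernel_provides (hypothetical helper): read a Provides list on stdin,
# one capability per line, and print how many entries are ksym(...) or
# kernel(...) capabilities.
count_kernel_provides() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the function usable under "set -e".
    grep -c -E '^(ksym|kernel)\(' || true
}
```

Typical use would be `rpm -qp --provides <kmp package>.rpm | count_kernel_provides`. A result of 0 would match the broken lbuild output described above, while a manually built package should report a non-zero count.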

            bogl Bob Glossman (Inactive) added a comment -

            GMC is now available on Intel mirror.
            kernel version is 4.12.14-23-default.

            bogl Bob Glossman (Inactive) added a comment - - edited

            Due to recent changes in osd-ldiskfs, https://review.whamcloud.com/32621 is now needed for server builds on SLES 15.

            This is a new mod in flight, not landed yet.

            bogl Bob Glossman (Inactive) added a comment -

            GMC for SLES 15 is now available on suse.com

            bogl Bob Glossman (Inactive) added a comment -

            RC4 is now available on Intel mirror.
            kernel version is 4.12.14-18-default.

            bogl Bob Glossman (Inactive) added a comment - - edited

            Seeing errors reported during all installs of built kmp .rpms.
            Examples:

            # rpm -ivh lustre-kmp-default-2* lustre-osd-ldiskfs-kmp-default-2*
            lustre-tests-kmp-default-2* lustre-osd-ldiskfs-mount-2* lustre-2*
            lustre-iokit* lustre-tests-2*
            Preparing...                          ################################# [100%]
            Updating / installing...
               1:lustre-kmp-default-2.11.51_48_g34################################# [ 14%]
            cat: write error: Broken pipe
            cat: write error: Broken pipe
               2:lustre-osd-ldiskfs-mount-2.11.51_################################# [ 29%]
               3:lustre-osd-ldiskfs-kmp-default-2.################################# [ 43%]
            cat: write error: Broken pipe
               4:lustre-2.11.51_48_g340f4d9_dirty-################################# [ 57%]
               5:lustre-tests-kmp-default-2.11.51_################################# [ 71%]
            cat: write error: Broken pipe
            cat: write error: Broken pipe
               6:lustre-iokit-2.11.51_48_g340f4d9_################################# [ 86%]
               7:lustre-tests-2.11.51_48_g340f4d9_################################# [100%]
            

            Those "Broken pipe" errors don't seem to be fatal, but I don't know where they are coming from. I suspect there may be some SLES-specific .spec file flaws in the creation of .rpms with kernel modules in them. Not sure why they would only appear now in SLES 15.

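
For reference, the message in the install log above is what cat prints when it writes into a pipe whose reader has already exited while SIGPIPE is ignored. The sketch below reproduces that in isolation. Whether the kmp scriptlets actually hit this pattern is an assumption, not something the log confirms, and `show_broken_pipe` is a made-up name.

```shell
#!/bin/sh
# show_broken_pipe (hypothetical demo): SIGPIPE is ignored in the subshell and
# that setting is inherited by cat. When the reader ("true") exits, cat's next
# write fails with EPIPE, so cat reports the error instead of being killed
# silently by the signal.
show_broken_pipe() {
    ( trap '' PIPE; LC_ALL=C cat /dev/zero | true ) 2>&1
}
```

If this is the mechanism, a scriptlet pipeline whose last stage stops reading early would be the place to look. `rpm -qp --scripts <package>.rpm` prints the scriptlets of a built package for inspection.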

            bogl Bob Glossman (Inactive) added a comment -

            Now seeing failures in sanity test 103a, with errors like:

              .
              .
              .
            [198] $ setfacl -m u:bin:rx e -- ok
            [200] $ su bin -- ok
            [201] $ echo e/* -- failed
            e/h                                   ? e/*                                    
            [208] $ touch e/i 2>&1 | sed -e "s/touch .*e\/i.*:/touch \'e\/i\':/" -- ok
            [211] $ su -- ok
            [212] $ setfacl -m u:bin:rwx e -- ok
            [214] $ su bin -- ok
            [215] $ echo i > e/i -- failed
            ~                                     ? e/i: Permission denied                 
            [220] $ su -- ok
            [221] $ touch g -- ok
            [222] $ ln -s g l -- ok
            [223] $ setfacl -m u:bin:rw l -- ok
            [224] $ ls -l g | awk -- '{ print $1, $3, $4 }' -- ok
            [234] $ mknod -m 0660 hdt b 91 64 -- ok
            [235] $ mknod -m 0660 null c 1 3 -- ok
            [236] $ mkfifo -m 0660 fifo -- ok
            [238] $ su bin -- ok
            [239] $ : < hdt -- ok
            [241] $ : < null -- ok
            [243] $ : < fifo -- ok
            [246] $ su -- ok
            [247] $ setfacl -m u:bin:rw hdt null fifo -- ok
            [249] $ su bin -- ok
            [250] $ : < hdt -- failed
            hdt: No such device or address        ? hdt: Permission denied                 
            [252] $ : < null -- failed
            ~                                     ? null: Permission denied                
            [253] $ ( echo blah > fifo & ) ; cat fifo -- failed
            blah                                  ? cat: fifo: Permission denied           
            ~                                     ? fifo: Permission denied                
            [261] $ su -- ok
            [262] $ mkdir -m 600 x -- ok
            [263] $ chown daemon:daemon x -- ok
            [264] $ echo j > x/j -- ok
            [265] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
            [268] $ setfacl -m u:daemon:r x -- ok
            [270] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
            [274] $ echo k > x/k -- ok
            [277] $ chmod 750 x -- ok
            [282] $ su -- ok
            [283] $ cd .. -- ok
            [284] $ rm -rf d -- ok
            101 commands (96 passed, 5 failed)
             sanity test_103a: @@@@@@ FAIL: permissions failed 
              Trace dump:
              = /usr/lib64/lustre/tests/test-framework.sh:5738:error()
              = /usr/lib64/lustre/tests/sanity.sh:8366:test_103a()
              = /usr/lib64/lustre/tests/test-framework.sh:6019:run_one()
              = /usr/lib64/lustre/tests/test-framework.sh:6058:run_one_logged()
              = /usr/lib64/lustre/tests/test-framework.sh:5857:run_test()
              = /usr/lib64/lustre/tests/sanity.sh:8411:main()
            Dumping lctl log to /tmp/test_logs/2018-05-07/133311/sanity.test_103a.*.1525725210.log
            Resetting fail_loc on all nodes...done.
            FAIL 103a (9s)

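
The test_103a log is long, so it can help to pull out only the steps marked as failed. The filter below is a small sketch; `failed_steps` is a made-up name, and it assumes the `[N] $ command -- failed` line format shown in the log above.

```shell
#!/bin/sh
# failed_steps (hypothetical helper): read a test_103a-style log on stdin and
# print only the numbered commands marked "-- failed", with leading
# indentation and the trailing marker removed.
failed_steps() {
    sed -n 's/^[[:space:]]*\(\[[0-9]*] \$ .*\) -- failed[[:space:]]*$/\1/p'
}
```

Run against the excerpt above, this should list steps 201, 215, 250, 252 and 253, which agrees with the "5 failed" summary line.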
            bogl Bob Glossman (Inactive) added a comment - - edited

            The recently landed mod https://review.whamcloud.com/#/c/31904, "LU-10886 build: fix warnings during autoconf", causes incorrect autoconf detection results when building on SLES 15 with gcc7.

            In particular, gcc7 is intolerant of empty initializers like "{ }" in some cases.
            One example of this causing the wrong detection results can be seen in the detection of HAVE_KTIME_TO_TIMESPEC64 in libcfs/autoconf/lustre-libcfs.m4,
            in the autoconf test function:

            AC_DEFUN([LIBCFS_KTIME_TO_TIMESPEC64],[
            LB_CHECK_COMPILE([does function 'ktime_to_timespec64' exist],
            ktime_to_timespec64, [
                    #include <linux/hrtimer.h>
                    #include <linux/ktime.h>
            ],[
                    struct timespec64 ts;
                    ktime_t now = { };
            
                    ts = ktime_to_timespec64(now);
            ],[
                    AC_DEFINE(HAVE_KTIME_TO_TIMESPEC64, 1,
                            ['ktime_to_timespec64' is available])
            ])
            ]) # LIBCFS_KTIME_TO_TIMESPEC64
            

            The test code fails with an error like:

            configure:16718: checking does function 'ktime_to_timespec64' exist
            configure:16749: cp conftest.c build && make -d modules LDFLAGS= LD=/usr/x86_64-suse-linux/bin/ld -m elf_x86_64 CC=gcc -f /home/bogl/lustre-release/build/Makefile LUSTRE_LINUX_CONFIG=/home/bogl/linux-4.12.14-18/.config LINUXINCLUDE= -I/home/bogl/linux-4.12.14-18/arch/x86/include -Iinclude -Iarch/x86/include/generated -I/home/bogl/linux-4.12.14-18/include -Iinclude2 -I/home/bogl/linux-4.12.14-18/include/uapi -Iinclude/generated -I/home/bogl/linux-4.12.14-18/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/bogl/linux-4.12.14-18/include/uapi -Iinclude/generated/uapi -include /home/bogl/linux-4.12.14-18/include/linux/kconfig.h -o tmp_include_depends -o scripts -o include/config/MARKER -C /home/bogl/linux-4.12.14-18 EXTRA_CFLAGS=-Werror-implicit-function-declaration -g -I/home/bogl/lustre-release/libcfs/include -I/home/bogl/lustre-release/lnet/include -I/home/bogl/lustre-release/lustre/include -Wno-format-truncation M=/home/bogl/lustre-release/build
            /home/bogl/lustre-release/build/conftest.c: In function ‘main’:
            /home/bogl/lustre-release/build/conftest.c:58:16: error: empty scalar initializer
              ktime_t now = { };
                            ^
            /home/bogl/lustre-release/build/conftest.c:58:16: note: (near initialization for ‘now’)
            make[1]: *** [scripts/Makefile.build:335: /home/bogl/lustre-release/build/conftest.o] Error 1
            make: *** [Makefile:1549: _module_/home/bogl/lustre-release/build] Error 2
            

            In fact this kernel does have a ktime_to_timespec64() API and the test should succeed.
            With the "= { };" initializer edited out of the test it does succeed.

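
The failure can be tried outside the kernel build with a userspace stand-in. The sketch below writes two tiny C files: one with the `= { }` initializer from the failing configure test, and one that zeroes the variable with memset(), which is valid whether the type is a scalar or a struct. Everything here is illustrative: `long long` stands in for the kernel's ktime_t (a plain 64-bit scalar since Linux 4.10), the function and file names are made up, and the memset() form is only one possible rewrite, not necessarily the fix that landed. No expected output is given because the result depends on the compiler version in use.

```shell
#!/bin/sh
# write_conftests DIR (hypothetical helper): emit two small C files into DIR.
write_conftests() {
    dir=$1
    cat > "$dir/empty_init.c" <<'EOF'
typedef long long ktime_t;	/* userspace stand-in for the kernel type */
int main(void)
{
	ktime_t now = { };	/* form used by the failing configure test */
	return (int)now;
}
EOF
    cat > "$dir/memset_init.c" <<'EOF'
#include <string.h>
typedef long long ktime_t;	/* userspace stand-in for the kernel type */
int main(void)
{
	ktime_t now;		/* alternative: valid for scalar or struct */
	memset(&now, 0, sizeof(now));
	return (int)now;
}
EOF
}

# try_compile FILE (hypothetical helper): report whether the local compiler
# accepts FILE. A missing compiler is also reported as "rejected".
try_compile() {
    if ${CC:-cc} -c -o /dev/null "$1" 2>/dev/null; then
        echo "accepted: $1"
    else
        echo "rejected: $1"
    fi
}
```

After `write_conftests /tmp/ct` (with the directory already created), running `try_compile` on each file shows how a given compiler treats the two forms.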

            People

              yujian Jian Yu
              bhoagland Brad Hoagland (Inactive)