Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      Ticket to coordinate testing of SLES 15:

      Beta: January 23, 2018

      GA: June 2018 

      https://www.suse.com/releasenotes/x86_64/SUSE-SLES/15/

      Attachments

        Issue Links

          Activity

            [LU-11310] support for SLES 15

            For unknown reasons, the Lustre RPMs generated by lbuild are now broken. The kmp RPMs have no ksym() or kernel() entries in their Provides lists, so they are not installable: they satisfy none of the Requires that they should. Lustre RPMs produced in a manual build are fine and have full sets of Provides.

            The install-time "Broken pipe" errors mentioned in earlier comments still happen.

            bogl Bob Glossman (Inactive) added a comment

            GMC is now available on the Intel mirror.
            The kernel version is 4.12.14-23-default.

            bogl Bob Glossman (Inactive) added a comment

            Due to recent changes in osd-ldiskfs, https://review.whamcloud.com/32621 is now needed for server builds on SLES 15.

            This is a new mod still in flight; it has not landed yet.

            bogl Bob Glossman (Inactive) added a comment - edited

            GMC for SLES 15 is now available on suse.com.

            bogl Bob Glossman (Inactive) added a comment

            RC4 is now available on the Intel mirror.
            The kernel version is 4.12.14-18-default.

            bogl Bob Glossman (Inactive) added a comment

            Seeing errors reported during every install of the built kmp RPMs.
            Example:

            # rpm -ivh lustre-kmp-default-2* lustre-osd-ldiskfs-kmp-default-2*
            lustre-tests-kmp-default-2* lustre-osd-ldiskfs-mount-2* lustre-2*
            lustre-iokit* lustre-tests-2*
            Preparing...                          ################################# [100%]
            Updating / installing...
               1:lustre-kmp-default-2.11.51_48_g34################################# [ 14%]
            cat: write error: Broken pipe
            cat: write error: Broken pipe
               2:lustre-osd-ldiskfs-mount-2.11.51_################################# [ 29%]
               3:lustre-osd-ldiskfs-kmp-default-2.################################# [ 43%]
            cat: write error: Broken pipe
               4:lustre-2.11.51_48_g340f4d9_dirty-################################# [ 57%]
               5:lustre-tests-kmp-default-2.11.51_################################# [ 71%]
            cat: write error: Broken pipe
            cat: write error: Broken pipe
               6:lustre-iokit-2.11.51_48_g340f4d9_################################# [ 86%]
               7:lustre-tests-2.11.51_48_g340f4d9_################################# [100%]
            

            Those "Broken pipe" errors don't seem to be fatal, but I don't know where they are coming from. I suspect SLES-specific flaws in the .spec file that affect the creation of RPMs containing kernel modules. I'm not sure why they would appear only now, on SLES 15.

            bogl Bob Glossman (Inactive) added a comment - edited
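The "cat: write error: Broken pipe" lines in the install log above are what cat prints when a write() to a pipe fails with EPIPE. That happens only when SIGPIPE is ignored or blocked; with the default disposition the writer is killed silently. The following is a minimal userspace sketch of that mechanism, not a diagnosis of the RPM scriptlets: the function name write_to_closed_pipe is hypothetical, and whether the kmp scriptlets actually run with SIGPIPE ignored is an assumption that this ticket does not confirm.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write one byte into a pipe whose read end has already been closed.
 * With SIGPIPE ignored, write() fails and errno is EPIPE ("Broken pipe");
 * with the default disposition the process would be killed by the signal
 * and print nothing.
 *
 * Returns the errno reported by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the setup failed.
 */
int write_to_closed_pipe(void)
{
        int fds[2];
        int err = 0;
        void (*old_handler)(int);

        if (pipe(fds) != 0)
                return -1;

        /* Ignore SIGPIPE so the failure is reported through errno. */
        old_handler = signal(SIGPIPE, SIG_IGN);
        if (old_handler == SIG_ERR) {
                close(fds[0]);
                close(fds[1]);
                return -1;
        }

        close(fds[0]);          /* the reader goes away early */

        if (write(fds[1], "x", 1) < 0)
                err = errno;    /* EPIPE is expected here */

        close(fds[1]);
        signal(SIGPIPE, old_handler);   /* restore the previous disposition */
        return err;
}
```

Called from a small main(), the function returns EPIPE, the same error cat reports in the log. If the scriptlets do pipe cat into a reader that exits before consuming all of its input, that would be consistent with the non-fatal messages seen here, but that remains to be checked against the generated scriptlets.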

            Now seeing failures in sanity test 103a, with errors like:

              .
              .
              .
            [198] $ setfacl -m u:bin:rx e -- ok
            [200] $ su bin -- ok
            [201] $ echo e/* -- failed
            e/h                                   ? e/*                                    
            [208] $ touch e/i 2>&1 | sed -e "s/touch .*e\/i.*:/touch \'e\/i\':/" -- ok
            [211] $ su -- ok
            [212] $ setfacl -m u:bin:rwx e -- ok
            [214] $ su bin -- ok
            [215] $ echo i > e/i -- failed
            ~                                     ? e/i: Permission denied                 
            [220] $ su -- ok
            [221] $ touch g -- ok
            [222] $ ln -s g l -- ok
            [223] $ setfacl -m u:bin:rw l -- ok
            [224] $ ls -l g | awk -- '{ print $1, $3, $4 }' -- ok
            [234] $ mknod -m 0660 hdt b 91 64 -- ok
            [235] $ mknod -m 0660 null c 1 3 -- ok
            [236] $ mkfifo -m 0660 fifo -- ok
            [238] $ su bin -- ok
            [239] $ : < hdt -- ok
            [241] $ : < null -- ok
            [243] $ : < fifo -- ok
            [246] $ su -- ok
            [247] $ setfacl -m u:bin:rw hdt null fifo -- ok
            [249] $ su bin -- ok
            [250] $ : < hdt -- failed
            hdt: No such device or address        ? hdt: Permission denied                 
            [252] $ : < null -- failed
            ~                                     ? null: Permission denied                
            [253] $ ( echo blah > fifo & ) ; cat fifo -- failed
            blah                                  ? cat: fifo: Permission denied           
            ~                                     ? fifo: Permission denied                
            [261] $ su -- ok
            [262] $ mkdir -m 600 x -- ok
            [263] $ chown daemon:daemon x -- ok
            [264] $ echo j > x/j -- ok
            [265] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
            [268] $ setfacl -m u:daemon:r x -- ok
            [270] $ ls -l x/j | awk -- '{sub(/\./, "", $1); print $1, $3, $4 }' -- ok
            [274] $ echo k > x/k -- ok
            [277] $ chmod 750 x -- ok
            [282] $ su -- ok
            [283] $ cd .. -- ok
            [284] $ rm -rf d -- ok
            101 commands (96 passed, 5 failed)
             sanity test_103a: @@@@@@ FAIL: permissions failed 
              Trace dump:
              = /usr/lib64/lustre/tests/test-framework.sh:5738:error()
              = /usr/lib64/lustre/tests/sanity.sh:8366:test_103a()
              = /usr/lib64/lustre/tests/test-framework.sh:6019:run_one()
              = /usr/lib64/lustre/tests/test-framework.sh:6058:run_one_logged()
              = /usr/lib64/lustre/tests/test-framework.sh:5857:run_test()
              = /usr/lib64/lustre/tests/sanity.sh:8411:main()
            Dumping lctl log to /tmp/test_logs/2018-05-07/133311/sanity.test_103a.*.1525725210.log
            Resetting fail_loc on all nodes...done.
            FAIL 103a (9s)
            
            bogl Bob Glossman (Inactive) added a comment

            The recently landed mod https://review.whamcloud.com/#/c/31904, "LU-10886 build: fix warnings during autoconf", causes incorrect autoconf detection results when building on SLES 15 with gcc7.

            In particular, gcc7 rejects empty initializers like "{ }" in some cases.
            One example of the resulting wrong detection is HAVE_KTIME_TO_TIMESPEC64 in libcfs/autoconf/lustre-libcfs.m4, in the autoconf test function:

            AC_DEFUN([LIBCFS_KTIME_TO_TIMESPEC64],[
            LB_CHECK_COMPILE([does function 'ktime_to_timespec64' exist],
            ktime_to_timespec64, [
                    #include <linux/hrtimer.h>
                    #include <linux/ktime.h>
            ],[
                    struct timespec64 ts;
                    ktime_t now = { };
            
                    ts = ktime_to_timespec64(now);
            ],[
                    AC_DEFINE(HAVE_KTIME_TO_TIMESPEC64, 1,
                            ['ktime_to_timespec64' is available])
            ])
            ]) # LIBCFS_KTIME_TO_TIMESPEC64
            

            The test code fails with an error like:

            configure:16718: checking does function 'ktime_to_timespec64' exist
            configure:16749: cp conftest.c build && make -d modules LDFLAGS= LD=/usr/x86_64-suse-linux/bin/ld -m elf_x86_64 CC=gcc -f /home/bogl/lustre-release/build/Makefile LUSTRE_LINUX_CONFIG=/home/bogl/linux-4.12.14-18/.config LINUXINCLUDE= -I/home/bogl/linux-4.12.14-18/arch/x86/include -Iinclude -Iarch/x86/include/generated -I/home/bogl/linux-4.12.14-18/include -Iinclude2 -I/home/bogl/linux-4.12.14-18/include/uapi -Iinclude/generated -I/home/bogl/linux-4.12.14-18/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/bogl/linux-4.12.14-18/include/uapi -Iinclude/generated/uapi -include /home/bogl/linux-4.12.14-18/include/linux/kconfig.h -o tmp_include_depends -o scripts -o include/config/MARKER -C /home/bogl/linux-4.12.14-18 EXTRA_CFLAGS=-Werror-implicit-function-declaration -g -I/home/bogl/lustre-release/libcfs/include -I/home/bogl/lustre-release/lnet/include -I/home/bogl/lustre-release/lustre/include -Wno-format-truncation M=/home/bogl/lustre-release/build
            /home/bogl/lustre-release/build/conftest.c: In function ‘main’:
            /home/bogl/lustre-release/build/conftest.c:58:16: error: empty scalar initializer
              ktime_t now = { };
                            ^
            /home/bogl/lustre-release/build/conftest.c:58:16: note: (near initialization for ‘now’)
            make[1]: *** [scripts/Makefile.build:335: /home/bogl/lustre-release/build/conftest.o] Error 1
            make: *** [Makefile:1549: _module_/home/bogl/lustre-release/build] Error 2
            

            In fact this kernel does have a ktime_to_timespec64() API and the test should succeed.
            With the "= { };" initializer edited out of the test it does succeed.

            bogl Bob Glossman (Inactive) added a comment - edited
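The diagnostic above, "empty scalar initializer", indicates that ktime_t is a scalar type in this kernel. gcc7 accepts an empty "{ }" for aggregates (structs, unions, arrays) as an extension, but not for scalars. The following is a minimal userspace sketch of a zero-initialization that compiles for either shape. The typedefs scalar_ktime_t and union_ktime_t and both function names are hypothetical stand-ins, not kernel or Lustre names, and this is not the change made to lustre-libcfs.m4.

```c
#include <string.h>

/* Hypothetical stand-ins for the two shapes a time type could have. */
typedef long long scalar_ktime_t;                 /* scalar: "= { }" is rejected */
typedef union { long long tv64; } union_ktime_t;  /* aggregate: "= { }" is accepted */

/* Zero a scalar-typed value without relying on an initializer. */
long long zeroed_scalar(void)
{
        scalar_ktime_t now;

        memset(&now, 0, sizeof(now));
        return now;
}

/* The same statement also works when the type is a union. */
long long zeroed_union(void)
{
        union_ktime_t now;

        memset(&now, 0, sizeof(now));
        return now.tv64;
}
```

Both functions return 0. Because the memset() form does not depend on whether the type is a scalar or an aggregate, a compile test written this way would probe only for the function it is meant to detect.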

            RC4 for SLES 15 is now available on suse.com.

            bogl Bob Glossman (Inactive) added a comment

            Will attach a patch for our master-lustre branch of e2fsprogs that allows building the current, down-rev version of Lustre e2fsprogs for SLES 15. I don't know whether the result is actually usable on SLES 15, but at least it builds.

            Note that the new .spec file for SLES 15 is, for now, just a copy of the one for SLES 12.

            bogl Bob Glossman (Inactive) added a comment

            Now that ldiskfs builds for SLES 15 are becoming possible, the need for a Lustre-enabled e2fsprogs is more urgent. Our Lustre-enabled e2fsprogs is old compared to the native version on SLES 15: 1.42.13.wc6 vs. 1.43.8.

            bogl Bob Glossman (Inactive) added a comment - edited

            People

              yujian Jian Yu
              bhoagland Brad Hoagland (Inactive)
              Votes: 0
              Watchers: 13
