
Ubuntu1604 client sanity-130a: FAIL: filefrag -ves core dumped

Details

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major
    • Fix Version/s: Lustre 2.12.0
    • Affects Version/s: Lustre 2.11.0, Lustre 2.10.2, Lustre 2.10.4
    • Environment: server: 2.10.2 RC1
      client: Ubuntu16.04
    • Severity: 3

    Description

      Here is the Maloo link https://testing.hpdd.intel.com/test_sets/cad15292-db78-11e7-9c63-52540065bddc

      test_130a console

      == sanity test 130a: FIEMAP (1-stripe file) ========================================================== 01:03:11 (1512522191)
      1+0 records in
      1+0 records out
      65536 bytes (66 kB, 64 KiB) copied, 0.00117857 s, 55.6 MB/s
      Filesystem type is: bd00bd0
      File size of /mnt/lustre/f130a.sanity is 65536 (16 blocks of 4096 bytes)
       ext:     logical_offset:        physical_offset: length:   expected: flags:
      /usr/lib64/lustre/tests/sanity.sh: line 9256: 25375 Aborted                 (core dumped) filefrag -ves $fm_file
       sanity test_130a: @@@@@@ FAIL: filefrag /mnt/lustre/f130a.sanity failed
      

      test_130b/c/e

      == sanity test 130e: FIEMAP (test continuation FIEMAP calls) ========================================= 01:03:25 (1512522205)
      /mnt/lustre/f130e.sanity: FIBMAP unsupported
      Filesystem type is: bd00bd0
      File size of /mnt/lustre/f130e.sanity is 67043328 (16368 blocks of 4096 bytes)
      

          Activity

            [LU-10335] Ubuntu1604 client sanity-130a: FAIL: filefrag -ves core dumped

            Gerrit Updater (gerrit) added a comment:

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33459/
            Subject: LU-10335 test: enable sanity 130 tests for Ubuntu
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 41a099f9c03c2a3ff62360433985ea5de3e52962

            Gerrit Updater (gerrit) added a comment:

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/33459
            Subject: LU-10335 test: enable sanity 130 tests for Ubuntu
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 59bab702aed6d10ae70336b8372a13e6169093ee

            James A Simmons (simmonsja) added a comment:

            Once we have patched e2fsprogs for ldiskfs support for Ubuntu this should go away.

            Andreas Dilger (adilger) added a comment:

            The patched e2fsprogs does not have this problem, only the unpatched e2fsprogs. It is good that this is included in Ubuntu 18, and it isn't clear we can do anything about Ubuntu 16 at this point.

            James A Simmons (simmonsja) added a comment:

            I just ran apt-get source e2fsprogs and checked for the fix you posted here. For Ubuntu18 the fix is there, but it's lacking in Ubuntu16. Since we have Ubuntu server support for 2.12, we will need the Lustre-specific e2fsprogs anyway.

            Andreas Dilger (adilger) added a comment:

            If the above patch has been included in the Ubuntu e2fsprogs (at least for the versions we are testing), then this ALWAYS_EXCEPT can be removed. If not, can you please file a ticket with the upstream Ubuntu bug tracker to have them backport the above patch from e2fsprogs master into their release?

            Andreas Dilger (adilger) added a comment:

            This has landed in upstream e2fsprogs for the 1.45 and 1.44.2 releases:

            commit 17a1f2c1929630e3a79e6b98168d56f96acf2e8b
            Author:     Andreas Dilger <adilger@dilger.ca>
            AuthorDate: Thu Mar 29 12:36:54 2018 -0600
            Commit:     Theodore Ts'o <tytso@mit.edu>
            CommitDate: Thu Mar 29 23:01:19 2018 -0400
            
                filefrag: avoid temporary buffer overflow
                
                If an unknown flag is present in a FIEMAP extent, it is printed as a
                hex value into a temporary buffer before adding it to the flags.  If
                that unknown flag is over 0xfff then it will overflow the temporary
                buffer.
                
                Reported-by: Sarah Liu <wei3.liu@intel.com>
                Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10335
                Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
                Signed-off-by: Theodore Ts'o <tytso@mit.edu>
            

            Add to ALWAYS_EXCEPT for Ubuntu until they get an updated e2fsprogs release with this fix and/or we install our patched e2fsprogs.

            Andreas Dilger (adilger) added a comment:

            According to https://marc.info/?l=linux-ext4&m=152010285623799&w=2 the patch was landed, but was subsequently lost from the tree:

            List: linux-ext4
            Subject: Re: [PATCH] filefrag: avoid temporary buffer overflow
            From: Theodore Ts'o <tytso () mit ! edu>
            Date: 2018-03-03 18:47:12
            Message-ID: 20180303184712.GA26224 () thunk ! org

            On Fri, Mar 02, 2018 at 09:48:28AM -0800, Darrick J. Wong wrote:
            > On Thu, Mar 01, 2018 at 01:09:46PM -0700, Andreas Dilger wrote:
            > > From: Andreas Dilger <adilger@dilger.ca>
            > >
            > > If an unknown flag is present in a FIEMAP extent, it is printed as a
            > > hex value into a temporary buffer before adding it to the flags. If
            > > that unknown flag is over 0xffff then it will overflow the temporary
            > > buffer.
            > >
            > > Reported-by: Sarah Liu <wei3.liu@intel.com>
            > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-10335
            > > Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
            >
            > Looks ok,
            > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

            Thanks, applied with the 0xfff fixup in the commit description.

            - Ted

            I've resubmitted the patch, and asked that it also be landed to the Debian maintenance branch so it will appear in Ubuntu.
            Andreas Dilger (adilger) added a comment (edited):

            The problem is that the unpatched Ubuntu e2fsprogs prints unknown flags into a temporary buffer, and Lustre always sets the 0x80000000 flag to identify network-based filesystems. Formatting that flag overflows the temporary buffer, which is only 6 bytes:

                    /* print any unknown flags as hex values */
                    for (mask = 1; fe_flags != 0 && mask != 0; mask <<= 1) {
                            char hex[6];
            
                            if ((fe_flags & mask) == 0)
                                    continue;
                            sprintf(hex, "%#04x,", mask);
                            print_flag(&fe_flags, mask, flags, hex);
                    }
            

            Any unknown flag would overflow this, since it would always have at least 4 hex digits, plus the leading 0x and a trailing NUL, so at least 7 characters printed each time. I've submitted a patch upstream for this.
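
            As a rough illustration of the size mismatch, the sketch below asks snprintf() how many bytes the "%#04x," format needs for a given mask, and shows a bounded way to format the flag. The helper names hex_flag_bytes() and format_hex_flag() are hypothetical and are not taken from e2fsprogs; the actual upstream fix may be written differently. For the Lustre flag 0x80000000 the formatted text is "0x80000000," (11 characters), so 12 bytes are needed including the trailing NUL, twice the size of hex[6].

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of e2fsprogs): number of bytes, including
 * the trailing NUL, that the "%#04x," format needs for this mask.
 * snprintf() with a NULL buffer and size 0 only computes the length.
 */
size_t hex_flag_bytes(uint32_t mask)
{
	int n = snprintf(NULL, 0, "%#04x,", (unsigned int)mask);

	assert(n > 0);
	return (size_t)n + 1;
}

/*
 * Hypothetical bounded variant of the sprintf() call in the loop above:
 * never writes more than len bytes into buf.
 * Returns 0 if the whole string fit, -1 if it was truncated or failed.
 */
int format_hex_flag(char *buf, size_t len, uint32_t mask)
{
	int n = snprintf(buf, len, "%#04x,", (unsigned int)mask);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

            Calling format_hex_flag() with a 6-byte buffer and mask 0x80000000 returns -1 instead of writing past the end of the buffer, while a 12-byte (or larger) buffer holds the full string.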

            Sarah Liu (sarah) added a comment (edited):

            I rebuilt e2fsprogs with debug symbols but could not reproduce the problem with the updated filefrag:

            root@onyx-24vm1:~# file /usr/sbin/filefrag 
            /usr/sbin/filefrag: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=6c421f2064cfcb7aff7314dcb9db4f380e7378f0, not stripped
            
            root@onyx-24vm1:~# gdb filefrag
            GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
            Copyright (C) 2016 Free Software Foundation, Inc.
            License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
            This is free software: you are free to change and redistribute it.
            There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
            and "show warranty" for details.
            This GDB was configured as "x86_64-linux-gnu".
            Type "show configuration" for configuration details.
            For bug reporting instructions, please see:
            <http://www.gnu.org/software/gdb/bugs/>.
            Find the GDB manual and other documentation resources online at:
            <http://www.gnu.org/software/gdb/documentation/>.
            For help, type "help".
            Type "apropos word" to search for commands related to "word"...
            Reading symbols from filefrag...done.
            (gdb) set args -ves /mnt/lustre/foo
            (gdb) run
            Starting program: /usr/sbin/filefrag -ves /mnt/lustre/foo
            Filesystem type is: bd00bd0
            File size of /mnt/lustre/foo is 65536 (16 blocks of 4096 bytes)
             ext:     logical_offset:        physical_offset: length:   expected: flags:
               0:        0..      15:      35346..     35361:     16:             last,0x80000000,eof
            /mnt/lustre/foo: 1 extent found
            [Inferior 1 (process 10604) exited normally]
            (gdb) quit
            

            People

              Assignee: Sarah Liu
              Reporter: Sarah Liu