
sanity test_130a: FIEMAP on 1-stripe file(/mnt/lustre/f130a.sanity) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0
    • Labels: None
    • Environment: client: lustre-master RHEL7
      server: lustre-master RHEL6
      build #2641
    • Severity: 3
    • Rank: 15784

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/bfcd5a82-3610-11e4-8a7f-5254006e85c2.

      The sub-test test_130a failed with the following error:

      FIEMAP on 1-stripe file(/mnt/lustre/f130a.sanity) failed

      Several FIEMAP tests failed, from test 130a through 130e:

      == sanity test 130a: FIEMAP (1-stripe file) ========================================================== 09:49:12 (1409935752)
      1+0 records in
      1+0 records out
      65536 bytes (66 kB) copied, 0.00196579 s, 33.3 MB/s
      Filesystem type is: bd00bd0
      File size of /mnt/lustre/f130a.sanity is 65536 (16 blocks of 4096 bytes)
       ext:     logical_offset:        physical_offset: length:   expected: flags:
         0:        0..      15:      40923..     40938:     16:             eof
      /mnt/lustre/f130a.sanity: 1 extent found
       sanity test_130a: @@@@@@ FAIL: FIEMAP on 1-stripe file(/mnt/lustre/f130a.sanity) failed 
      

          Activity

            [LU-5637] sanity test_130a: FIEMAP on 1-stripe file(/mnt/lustre/f130a.sanity) failed
            pjones Peter Jones added a comment -

            Landed for 2.11


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30391/
            Subject: LU-5637 tests: set filefrag blocksize to 1024
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 34b52c0984db4c10851a6b6bbc0fbf5b53c81234

            gerrit Gerrit Updater added a comment -

            Sergey Cheremencev (c17829@cray.com) uploaded a new patch: https://review.whamcloud.com/30391
            Subject: LU-5637 tests: set filefrag blocksize to 1024
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 32bbd9c291f594da74a41cc65d6c8905ad282501

            pjones Peter Jones added a comment -

            Removing the 2.10 fix version because, IIUC, the high occurrence of this issue has been addressed by using the correct e2fsprogs, and all that is left is the long-standing rare way of triggering this failure.

            adilger Andreas Dilger added a comment -

            It would also be possible to update the test to use "filefrag -k" to print the blocks in 1KB units (this is something we added to the upstream filefrag utility). If the test is on single-stripe files, the unpatched filefrag utility should still work, but it doesn't understand multi-stripe files. Improving upstream filefrag to support multi-device filesystems like Lustre, Btrfs, and ZFS, as well as compression, is on my long list of things to do that I just never have time for.
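A minimal sketch of the "filefrag -k" suggestion above, assuming a filefrag build that supports the -v (verbose) and -k (1024-byte output blocks) options. The helper names filefrag_cmd and run_filefrag are hypothetical and are not part of the Lustre test framework; the actual change in sanity.sh may look different.

```python
import subprocess


def filefrag_cmd(path, kib_units=True):
    """Build a filefrag command line (hypothetical helper).

    With kib_units=True the -k option is added so that extent offsets and
    lengths are printed in 1024-byte blocks, independent of the block size
    the filesystem reports.
    """
    cmd = ["filefrag", "-v"]
    if kib_units:
        cmd.append("-k")
    cmd.append(path)
    return cmd


def run_filefrag(path, kib_units=True):
    """Run filefrag and return its stdout as text (needs filefrag installed)."""
    result = subprocess.run(
        filefrag_cmd(path, kib_units),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Pinning the output unit on the command line means the test no longer depends on which e2fsprogs build chose the default block size.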

            niu Niu Yawei (Inactive) added a comment -

            Minh, could you help verify whether the wrong e2fsprogs is installed on the clients again? Thanks in advance.
            jhammond John Hammond added a comment -

            Seems like the wrong e2fsprogs is being installed again. In the recent failures, the extent length reported by filefrag is in terms of 4096 byte blocks, whereas the test expects it in terms of 1K blocks, and 16 != 64.
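The unit mismatch described above can be checked against the log in the Description. The sketch below is illustrative only: it parses the block size and the extent length from that sample output and converts the length to 1 KiB units. The function names are hypothetical, and the regular expressions assume the verbose filefrag layout shown in this ticket.

```python
import re

# Sample copied from the test log in the Description.
SAMPLE = """\
File size of /mnt/lustre/f130a.sanity is 65536 (16 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..      15:      40923..     40938:     16:             eof
"""


def block_size(output):
    """Return the block size in bytes that filefrag reports in its header."""
    match = re.search(r"blocks? of (\d+) bytes", output)
    if match is None:
        raise ValueError("no block size found in filefrag output")
    return int(match.group(1))


def extent_lengths(output):
    """Return the 'length' column of every extent row, in filefrag blocks."""
    row = re.compile(
        r"^\s*\d+:\s*\d+\.\.\s*\d+:\s*\d+\.\.\s*\d+:\s*(\d+):", re.M
    )
    return [int(n) for n in row.findall(output)]


def lengths_in_kib(output):
    """Convert each extent length to 1 KiB units using the reported block size."""
    size = block_size(output)
    return [n * size // 1024 for n in extent_lengths(output)]


print(extent_lengths(SAMPLE))   # length as printed, in 4096-byte blocks
print(lengths_in_kib(SAMPLE))   # same extent expressed in 1 KiB blocks
```

For the 65536-byte test file, the same extent is 16 blocks of 4096 bytes or 64 blocks of 1024 bytes, which is the 16 versus 64 discrepancy noted in the comment.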

            pjones Peter Jones added a comment -

            Niu

            Discussing this on the triage call, we thought that this rare issue occurring more frequently might be due to the PFL changes. Could you please investigate?

            Thanks

            Peter

            sarah Sarah Liu added a comment - - edited

            In the past 7 days (5/30 to 6/5) of lustre-review testing, sanity test_130a/b/c/d/e each failed 21 times. The error was only seen on review-ldiskfs (10 failures) and review-dne-part1 (11 failures). A total of 244 review-ldiskfs and review-dne-part1 sessions ran last week, so the failure rate is about 9% (21 of 244).


            bogl Bob Glossman (Inactive) added a comment -

            I think the proper fix is to start building e2fsprogs for sles12 in our tools/e2fsprogs builds, then install that on sles12 clients. So the fix(es) are outside of Lustre, in e2fsprogs and TEI.

            FWIW, I have built and installed the most recent 1.42.12-wc1 version of e2fsprogs locally on sles12 clients and servers without any problems.

            People

              Assignee: mdiep Minh Diep
              Reporter: maloo Maloo