  Lustre / LU-11155

hidden failures of sanity 804 test


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.12.0
    • Severity: 3

    Description

      sanity test 804 never fails in recent test runs:

      https://testing.whamcloud.com/sub_tests/query?utf8=%E2%9C%93&warn%5Bnotice%5D=&test_set_script_id=f9516376-32bc-11e0-aaee-52540025f9ae&sub_test_script_id=fb652518-c4a9-11e7-8027-52540065bddc&query_bugs=&builds=&hosts=&commit_id=&horizon=&window%5Bstart_date%5D=2018-01-01&window%5Bend_date%5D=2018-07-17&os_type_id=&distribution_type_id=&architecture_type_id=&file_system_type_id=&lustre_branch_id=24a6947e-04a9-11e1-bb5f-52540025f9af&network_type_id=&commit=Update+results&num_results=250

       

      The results are all either green (PASS) or yellow (SKIP).

      However, a closer look at the test output reveals e2fsck errors that were not caught.

      For example, the first green entry at the top:

       

      https://testing.whamcloud.com/sub_tests/6e7f8e5e-eeb1-11e7-8c23-52540065bddc

      and its full test log:

      https://testing.whamcloud.com/test_logs/6f699440-eeb1-11e7-8c23-52540065bddc/show_text

      contains e2fsck complaints about filesystem corruption:

       

      trevis-4vm4: e2fsck 1.42.13.wc6 (05-Feb-2017)
      trevis-4vm4: [QUOTA WARNING] Usage inconsistent for ID 0:actual (14385152, 401) != expected (14381056, 400)
      trevis-4vm4: [QUOTA WARNING] Usage inconsistent for ID 500:actual (278528, 1) != expected (282624, 2)
      Pass 1: Checking inodes, blocks, and sizes
      Pass 1: Memory used: 288k/136k (123k/166k), time:  0.12/ 0.04/ 0.02
      Pass 1: I/O read: 51MB, write: 0MB, rate: 410.09MB/s
      Pass 2: Checking directory structure
      Entry '..' in .../??? (8487) has deleted/unused inode 8486.  Clear? no
      
      Pass 2: Memory used: 288k/272k (77k/212k), time:  0.01/ 0.00/ 0.00
      Pass 2: I/O read: 3MB, write: 0MB, rate: 523.83MB/s
      Pass 3: Checking directory connectivity
      Peak memory: Memory used: 288k/272k (78k/211k), time:  0.14/ 0.04/ 0.03
      Unconnected directory inode 8487 (.../???)
      Connect to /lost+found? no
      
      '..' in ... (8487) is ... (8486), should be <The NULL inode> (0).
      Fix? no
      
      Pass 3: Memory used: 288k/272k (75k/214k), time:  0.00/ 0.00/ 0.00
      Pass 3: I/O read: 1MB, write: 0MB, rate: 1432.66MB/s
      Pass 4: Checking reference counts
      Inode 213 ref count is 23, should be 22.  Fix? no
      
      Inode 8487 ref count is 4, should be 3.  Fix? no
      
      Unattached inode 8505
      Connect to /lost+found? no
      
      Pass 4: Memory used: 288k/0k (70k/219k), time:  0.02/ 0.02/ 0.00
      Pass 4: I/O read: 1MB, write: 0MB, rate: 51.02MB/s
      Pass 5: Checking group summary information
      Pass 5: Memory used: 288k/0k (68k/221k), time:  0.00/ 0.00/ 0.00
      Pass 5: I/O read: 1MB, write: 0MB, rate: 290.78MB/s
      Update quota info for quota type 1? no
      
      
      lustre-MDT0000: ********** WARNING: Filesystem still has errors **********
      
      
               411 inodes used (0.05%, out of 838864)
                29 non-contiguous files (7.1%)
                 3 non-contiguous directories (0.7%)
                   # of inodes with ind/dind/tind blocks: 22/0/0
            235546 blocks used (44.93%, out of 524288)
                 0 bad blocks
                 1 large file
      
               187 regular files
               209 directories
                 0 character device files
                 0 block device files
                 0 fifos
        4294967294 links
                 6 symbolic links (6 fast symbolic links)
                 0 sockets
      ------------
               400 files
      Memory used: 288k/0k (68k/221k), time:  0.17/ 0.07/ 0.03
      I/O read: 54MB, write: 0MB, rate: 325.54MB/s
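
A log like the one above can be checked mechanically even when the sub-test is marked PASS. The following is only a minimal sketch: `find_e2fsck_problems` and the pattern list are hypothetical, chosen from the messages quoted in this log, and are not part of test-framework.sh.

```python
import re

# Hypothetical marker patterns, taken from the e2fsck messages quoted above.
E2FSCK_PROBLEM_PATTERNS = [
    re.compile(r"WARNING: Filesystem still has errors"),
    re.compile(r"\[QUOTA WARNING\]"),
    re.compile(r"(?:Unconnected directory|Unattached) inode \d+"),
    re.compile(r"ref count is \d+, should be \d+"),
    re.compile(r"has deleted/unused inode \d+"),
]


def find_e2fsck_problems(log_text):
    """Return the log lines that match any of the problem markers."""
    hits = []
    for line in log_text.splitlines():
        if any(p.search(line) for p in E2FSCK_PROBLEM_PATTERNS):
            hits.append(line.strip())
    return hits


# Usage on a short excerpt of the log quoted in this ticket.
sample = """\
Pass 4: Checking reference counts
Inode 213 ref count is 23, should be 22.  Fix? no
Unattached inode 8505
Connect to /lost+found? no
lustre-MDT0000: ********** WARNING: Filesystem still has errors **********
"""
for hit in find_e2fsck_problems(sample):
    print(hit)
```

Matching on message text is fragile across e2fsprogs versions, so this is meant for auditing existing logs rather than as a replacement for checking the e2fsck exit status.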
      

       

      Another test 804 run:

      https://testing.whamcloud.com/sub_tests/f2aca174-efe9-11e7-8c43-52540065bddc
      and test log https://testing.whamcloud.com/test_logs/f36d7cf0-efe9-11e7-8c43-52540065bddc/show_text :

      Pass 3: Checking directory connectivity
      Peak memory: Memory used: 288k/272k (79k/210k), time:  0.24/ 0.04/ 0.02
      Unconnected directory inode 8512 (.../???)
      Connect to /lost+found? no
      
      '..' in ... (8512) is ... (8511), should be <The NULL inode> (0).
      Fix? no
      
      Pass 3: Memory used: 288k/272k (76k/213k), time:  0.00/ 0.00/ 0.00
      Pass 3: I/O read: 1MB, write: 0MB, rate: 6622.52MB/s
      Pass 4: Checking reference counts
      Inode 213 ref count is 23, should be 22.  Fix? no
      
      Inode 8512 ref count is 4, should be 3.  Fix? no
      
      Unattached inode 8530
      Connect to /lost+found? no
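
To compare runs like the two quoted here, the Pass 4 reference-count messages can be parsed and intersected. This is a rough sketch under the assumption that the message format stays as shown above; `parse_ref_count_errors` is a hypothetical helper name.

```python
import re

# Matches Pass 4 lines such as "Inode 213 ref count is 23, should be 22."
REF_COUNT_RE = re.compile(r"Inode (\d+) ref count is (\d+), should be (\d+)")


def parse_ref_count_errors(log_text):
    """Map inode number -> (found, expected) for each ref count mismatch."""
    errors = {}
    for match in REF_COUNT_RE.finditer(log_text):
        inode, found, expected = (int(g) for g in match.groups())
        errors[inode] = (found, expected)
    return errors


# Pass 4 excerpts from the two test logs quoted in this ticket.
run_a = """Inode 213 ref count is 23, should be 22.  Fix? no
Inode 8487 ref count is 4, should be 3.  Fix? no"""
run_b = """Inode 213 ref count is 23, should be 22.  Fix? no
Inode 8512 ref count is 4, should be 3.  Fix? no"""

errors_a = parse_ref_count_errors(run_a)
errors_b = parse_ref_count_errors(run_b)

# Inodes reported with an identical mismatch in both runs.
common = {i: errors_a[i] for i in errors_a if errors_b.get(i) == errors_a[i]}
print(common)
```

For the two excerpts above, the only inode with the same mismatch in both runs is 213 (23 found, 22 expected); the other affected directory inode differs between runs (8487 and 8512).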
      

      I am not sure that the filesystem is actually damaged in test_804 itself, and I could not reproduce the failures locally.

      I have already filed LU-11142 for the test-framework.sh::run_e2fsck changes.
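
For context on why the exit status matters here: the e2fsck man page documents the exit code as a sum of condition bits, with 4 meaning "file system errors left uncorrected". A read-only check that answers "no" to every fix, as in the logs above, would be expected to set that bit. The sketch below only decodes the documented bits; `decode_e2fsck_status` and `has_uncorrected_errors` are hypothetical helpers and do not describe what run_e2fsck or LU-11142 actually does.

```python
# Exit status bits as documented in the e2fsck(8) man page.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def decode_e2fsck_status(status):
    """Return the list of conditions encoded in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_EXIT_BITS.items() if status & bit]


def has_uncorrected_errors(status):
    """True if the 'errors left uncorrected' bit (4) is set."""
    return bool(status & 4)


# Usage: a status of 4 is what a check that leaves errors unfixed reports.
print(decode_e2fsck_status(4))
print(has_uncorrected_errors(4))
```

Treating the status as a bitmask rather than comparing against single values keeps combined results, such as 12 (uncorrected errors plus an operational error), from slipping through.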


            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Alexander Zarochentsev (zam)