LU-13753 (Lustre)

sanityn test_51b: 'file size is 1024, should be 3145728'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.14.0

    Description

      This issue was created by maloo for liuying <emoly.liu@intel.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/abe98c66-5293-41c4-a72b-c317b11bb2e2

      test_51b failed with the following error:

      == sanityn test 51b: layout lock: glimpse should be able to restart if layout changed ================ 12:39:07 (1594039147)
      1+0 records in
      1+0 records out
      1024 bytes (1.0 kB) copied, 0.00191303 s, 535 kB/s
      fail_loc=0x1404
      1+0 records in
      1+0 records out
      1048576 bytes (1.0 MB) copied, 1.74443 s, 601 kB/s
      1024
       sanityn test_51b: @@@@@@ FAIL: file size is 1024, should be 3145728 
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanityn test_51b - file size is 1024, should be 3145728
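For context, the failing assertion can be sketched as a small shell fragment. In the failure log above, the test writes 1 KiB, arms fail_loc=0x1404 to force the glimpse/layout-change race, then a second write should extend the file to 3145728 bytes (3 × 1048576, i.e. 1 MiB written at a 2 MiB offset); the failure is that the size check still sees 1024. The sketch below reproduces only the size arithmetic on a local file; paths and structure are illustrative, not the actual sanityn.sh code, and the fail_loc/second-client steps are omitted since they require a live Lustre setup.

```shell
#!/bin/sh
# Illustrative sketch of the size check in test_51b (not the real test code).
f=$(mktemp)

# First write: 1 KiB at offset 0 (matches "1024 bytes ... copied" in the log).
dd if=/dev/zero of="$f" bs=1k count=1 conv=notrunc 2>/dev/null

# Second write: 1 MiB at a 2 MiB offset, so the file size becomes
# 2*1048576 + 1048576 = 3145728 bytes -- the value the test expects.
dd if=/dev/zero of="$f" bs=1M count=1 seek=2 conv=notrunc 2>/dev/null

size=$(stat -c %s "$f")
if [ "$size" -ne 3145728 ]; then
    echo "FAIL: file size is $size, should be 3145728"
else
    echo "PASS: size $size"
fi
rm -f "$f"
```

On Lustre, the second write happens from a different client mount while the first client's glimpse is stalled, which is why a stale 1024-byte size can be observed if the glimpse does not restart after the layout change.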

      Attachments

      Issue Links

      Activity
            anikitenko Alena Nikitenko (Inactive) made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 32286 ]
            hornc Chris Horn added a comment - +1 on master https://testing.whamcloud.com/test_sets/2b34efa6-e2f5-4797-ab11-e9e3d18082eb
            scherementsev Sergey Cheremencev added a comment - Faced it again on master(review-dne-zfs-part-5): https://testing.whamcloud.com/test_sets/e23e4426-f555-4f86-b61e-7c1051c2a974
            pjones Peter Jones made changes -
            Fix Version/s Original: Lustre 2.14.0 [ 14490 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-10934 [ LU-10934 ]
            adilger Andreas Dilger made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]

            adilger Andreas Dilger added a comment - The patch to fix test_51b has landed.
            qian_wc Qian Yingjin added a comment - - edited

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38947
            Subject: LU-10934 tests: increase timeout for sanityn test_51b
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: a00f3f63fa7e8d3ba9a32dcf4da3de83b6dcdcb3


            qian_wc Qian Yingjin added a comment - Andreas' patch https://review.whamcloud.com/#/c/38947/ resolves the issue by increasing the timeout for sanityn test_51b.
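The fix is to allow more time rather than checking once: the common pattern in the Lustre test framework is a bounded polling loop. A minimal sketch of such a loop is below; wait_for_size is a hypothetical name for illustration, not the actual sanityn.sh helper, and the one-second poll interval and demo file are assumptions.

```shell
#!/bin/sh
# Hypothetical polling helper: wait up to $3 seconds for $1 to
# reach size $2, instead of asserting the size immediately.
wait_for_size() {
    file=$1; expect=$2; max=$3
    i=0
    while [ "$i" -lt "$max" ]; do
        size=$(stat -c %s "$file" 2>/dev/null || echo 0)
        [ "$size" -eq "$expect" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Demo: a file that is already the expected size succeeds on the
# first poll without sleeping.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1k count=1 2>/dev/null
wait_for_size "$tmp" 1024 5 && echo "size reached"
rm -f "$tmp"
```

On a slow ZFS-backed node (the original log shows the 1 MiB write taking 1.74 s at 601 kB/s), a single immediate check can race the layout update, which is why widening the window fixes the intermittent failure.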
            qian_wc Qian Yingjin added a comment -

            I tested it locally with a ZFS backend and it passed:

            [root@qian tests]# FSTYPE=zfs ONLY="51b" REFORMAT="yes" sh sanityn.sh 
            qian: executing check_logdir /tmp/test_logs/1597136195
            Logging to shared log directory: /tmp/test_logs/1597136195
            qian: executing yml_node
            IOC_LIBCFS_GET_NI error 22: Invalid argument
            Client: 2.13.55.16
            MDS: 2.13.55.16
            OSS: 2.13.55.16
            excepting tests: 28
            skipping tests SLOW=no: 33a
            Stopping clients: qian /mnt/lustre (opts:-f)
            Stopping clients: qian /mnt/lustre2 (opts:-f)
            qian: executing set_hostid
            Loading modules from /root/work/STATX/lustre-release/lustre/tests/..
            detected 2 online CPUs by sysfs
            Force libcfs to create 2 CPU partitions
            quota/lquota options: 'hash_lqs_cur_bits=3'
            Formatting mgs, mds, osts
            Format mds1: lustre-mdt1/mdt1
            Format ost1: lustre-ost1/ost1
            Format ost2: lustre-ost2/ost2
            Checking servers environments
            Checking clients qian environments
            Loading modules from /root/work/STATX/lustre-release/lustre/tests/..
            detected 2 online CPUs by sysfs
            Force libcfs to create 2 CPU partitions
            Setup mgs, mdt, osts
            Starting mds1: -o localrecov  lustre-mdt1/mdt1 /mnt/lustre-mds1
            Commit the device label on lustre-mdt1/mdt1
            Started lustre-MDT0000
            Starting ost1: -o localrecov  lustre-ost1/ost1 /mnt/lustre-ost1
            Commit the device label on lustre-ost1/ost1
            Started lustre-OST0000
            Starting ost2: -o localrecov  lustre-ost2/ost2 /mnt/lustre-ost2
            Commit the device label on lustre-ost2/ost2
            Started lustre-OST0001
            Starting client: qian:  -o user_xattr,flock qian@tcp:/lustre /mnt/lustre
            Starting client qian:  -o user_xattr,flock qian@tcp:/lustre /mnt/lustre
            Started clients qian: 
            192.168.150.131@tcp:/lustre on /mnt/lustre type lustre (rw,flock,user_xattr,lazystatfs,encrypt)
            Starting client: qian:  -o user_xattr,flock qian@tcp:/lustre /mnt/lustre2
            Starting client qian:  -o user_xattr,flock qian@tcp:/lustre /mnt/lustre2
            Started clients qian: 
            192.168.150.131@tcp:/lustre on /mnt/lustre2 type lustre (rw,flock,user_xattr,lazystatfs,encrypt)
            Using TIMEOUT=20
            osc.lustre-OST0000-osc-ffff8e409657d800.idle_timeout=debug
            osc.lustre-OST0000-osc-ffff8e409657e800.idle_timeout=debug
            osc.lustre-OST0001-osc-ffff8e409657d800.idle_timeout=debug
            osc.lustre-OST0001-osc-ffff8e409657e800.idle_timeout=debug
            setting jobstats to procname_uid
            Setting lustre.sys.jobid_var from disable to procname_uid
            Waiting 90s for 'procname_uid'
            Updated after 7s: want 'procname_uid' got 'procname_uid'
            disable quota as required
            lod.lustre-MDT0000-mdtlov.mdt_hash=crush
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00664703 s, 158 MB/s
            running as uid/gid/euid/egid 500/500/500/500, groups:
             [touch] [/mnt/lustre/d0_runas_test/f7574]
            
            == sanityn test 51b: layout lock: glimpse should be able to restart if layout changed ================ 16:57:29 (1597136249)
            1+0 records in
            1+0 records out
            1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000503275 s, 2.0 MB/s
            fail_loc=0x1404
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0147413 s, 71.1 MB/s
            3145728
            Resetting fail_loc on all nodes...done.
            PASS 51b (7s)
            cleanup: ======================================================
            == sanityn test complete, duration 61 sec ============================================================ 16:57:36 (1597136256)
            Stopping clients: qian /mnt/lustre (opts:-f)
            Stopping client qian /mnt/lustre opts:-f
            Stopping clients: qian /mnt/lustre2 (opts:-f)
            Stopping client qian /mnt/lustre2 opts:-f
            
            
            [root@qian tests]# FSTYPE=zfs ONLY="51b" MDSCOUNT=2 REFORMAT="yes" sh sanityn.sh
            == sanityn test 51b: layout lock: glimpse should be able to restart if layout changed ================ 16:59:20 (1597136360)
            1+0 records in
            1+0 records out
            1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000356896 s, 2.9 MB/s
            fail_loc=0x1404
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00559126 s, 188 MB/s
            3145728
            Resetting fail_loc on all nodes...done.
            PASS 51b (7s)
            cleanup: ======================================================
            == sanityn test complete, duration 35 sec ============================================================ 16:59:27 (1597136367)
            
            

            People

              Assignee: qian_wc Qian Yingjin
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 9