
after upgrade system from 2.5.3 RHEL6 to master RHEL7, hit: iozone did not fail with EDQUOT

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.8.0
    • Labels: None
    • Environment: before upgrade: 2.5.3, RHEL6
      after upgrade: lustre-master build # 3118 RHEL7
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      + pdsh -t 120 -S -Rrsh -w onyx-27,onyx-28 '(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre"  FSTYPE=ldiskfs sh -c "runas -u quota_2usr /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.\$(hostname).\$(date +%s)")'
      onyx-28: running as uid/gid/euid/egid 60001/60001/60001/60001, groups:
      onyx-28:  [/usr/bin/iozone] [-i] [0] [-e] [-+d] [-w] [-r] [1024] [-s] [1048576] [-f] [/mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-28.onyx.hpdd.intel.com.1439497851]
      onyx-27: running as uid/gid/euid/egid 60001/60001/60001/60001, groups:
      onyx-27:  [/usr/bin/iozone] [-i] [0] [-e] [-+d] [-w] [-r] [1024] [-s] [1048576] [-f] [/mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-27.1439497851]
      	Iozone: Performance Test of File I/O
      	        Version $Revision: 3.373 $
      		Compiled for 64 bit mode.
      		Build: linux-AMD64 
      
      	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
      	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
      	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
      	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
      	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
      	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
      	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
      
      	Run began: Thu Aug 13 13:30:51 2015
      
      	Include fsync in write timing
      	>>> I/O Diagnostic mode enabled. <<<
      	Performance measurements are invalid in this mode.
      	Setting no_unlink
      	Record Size 1024 KB
      	File size set to 1048576 KB
      	Command line used: /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-28.onyx.hpdd.intel.com.1439497851
      	Output is in Kbytes/sec
      	Time Resolution = 0.000001 seconds.
      	Processor cache size set to 1024 Kbytes.
      	Processor cache line size set to 32 bytes.
      	File stride size set to 17 * record size.
                                                                  random  random    bkwd   record   stride                                   
                    KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
      	Iozone: Performance Test of File I/O
      	        Version $Revision: 3.373 $
      		Compiled for 64 bit mode.
      		Build: linux-AMD64 
      
      	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
      	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
      	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
      	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
      	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
      	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
      	             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
      
      	Run began: Thu Aug 13 13:30:51 2015
      
      	Include fsync in write timing
      	>>> I/O Diagnostic mode enabled. <<<
      	Performance measurements are invalid in this mode.
      	Setting no_unlink
      	Record Size 1024 KB
      	File size set to 1048576 KB
      	Command line used: /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-27.1439497851
      	Output is in Kbytes/sec
      	Time Resolution = 0.000001 seconds.
      	Processor cache size set to 1024 Kbytes.
      	Processor cache line size set to 32 bytes.
      	File stride size set to 17 * record size.
                                                                  random  random    bkwd   record   stride                                   
                    KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
               1048576    1024
      
      Sanity check failed. Do not deploy this filesystem in a production environment !
               1048576    1024
      
      Sanity check failed. Do not deploy this filesystem in a production environment !
      + return 44
       upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT 
      Lustre: DEBUG MARKER: upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4343:error_noexit()
        = /usr/lib64/lustre/tests/test-framework.sh:4374:error()
        = upgrade-downgrade.sh:760:iop_run_iozone()
        = upgrade-downgrade.sh:687:iop_verify_quotas()
        = upgrade-downgrade.sh:1131:clean_upgrade_downgrade()
        = upgrade-downgrade.sh:1262:main()
      

          Activity

            [LU-7006] after upgrade system from 2.5.3 RHEL6 to master RHEL7, hit: iozone did not fail with EDQUOT

            jgmitter Joseph Gmitter (Inactive) added a comment -

            As Sarah reports, this issue no longer exists on master.

            sarah Sarah Liu added a comment -

            Tried with the latest tag, 2.7.62, and didn't hit this issue when upgrading to master build #3226 (RHEL7). Instead, after downgrading from master to 2.5.5, I hit LU-7410.

            niu Niu Yawei (Inactive) added a comment -

            Look at the output of the test:

                                                                        random  random    bkwd   record   stride                                   
                          KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
                     1048576    1024
            
            Sanity check failed. Do not deploy this filesystem in a production environment !
                     1048576    1024
            
            Sanity check failed. Do not deploy this filesystem in a production environment !
            + return 44
             upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT 
            

            It seems iozone did not finish successfully, but I couldn't find "Sanity check failed. Do not deploy this filesystem in a production environment !" in the script, so I'm not sure what kind of failure caused the test to fail.

            The current iop_run_iozone() just greps for certain messages on failure:

                egrep -q "Disk quota exceeded|Error writing block" $log || \
                    { rm -f $log; error "iozone did not fail with EDQUOT"; }
            

            Maybe we should improve it to collect more information on failure? That would help us identify the exact reason for the failure.

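One possible shape for the improvement suggested above, as a minimal sketch only. The helper name check_iozone_log is hypothetical and is not part of upgrade-downgrade.sh; it assumes the iozone output has already been saved to a log file. Instead of deleting the log when the expected EDQUOT message is missing, it keeps the file, flags a "Sanity check failed" message if one is present, and prints the tail of the log.

```shell
#!/bin/bash
# Sketch only: classify an iozone log instead of deleting it on failure.
# check_iozone_log is a hypothetical helper, not part of upgrade-downgrade.sh.
check_iozone_log() {
	local log=$1

	# Expected outcome: the write was rejected because the quota was hit.
	if grep -Eq "Disk quota exceeded|Error writing block" "$log"; then
		echo "iozone failed with EDQUOT as expected"
		return 0
	fi

	# Unexpected outcome: keep the log and show what iozone printed last.
	echo "iozone did not fail with EDQUOT; log kept at $log"
	if grep -q "Sanity check failed" "$log"; then
		echo "iozone reported a sanity check failure"
	fi
	echo "last lines of the log:"
	tail -n 20 "$log"
	return 1
}
```

A caller could then report the failure with the existing error() helper only when this function returns non-zero, so the log is still available for inspection afterwards.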
            sarah Sarah Liu added a comment -

            Hi Niu,

            Any update on this ticket?

            sarah Sarah Liu added a comment -

            test script

            niu Niu Yawei (Inactive) added a comment -

            "Could you access Onyx? Here is the script I use:
            /home/w3liu/toro_home/interop/upgrade-downgrade.sh
            Please find the MDS/OST debug log attached. If you need anything else, please let me know."

            I can access Onyx, but I didn't see the directory you mentioned. If possible, could you attach the script here? That would make it easier for me to review the code, since accessing any cluster head node from here is extremely slow.

            sarah Sarah Liu added a comment -

            Hello Niu,

            Could you access Onyx? Here is the script I use:
            /home/w3liu/toro_home/interop/upgrade-downgrade.sh

            Please find the MDS/OST debug log attached. If you need anything else, please let me know.

            niu Niu Yawei (Inactive) added a comment -

            Sarah, where can I find this upgrade test script? Or could you briefly describe how it verifies quota during the upgrade? Could you also provide the OST/MDT debug logs with D_QUOTA enabled? Thanks.

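For readers unfamiliar with the request above, the following is a rough sketch of how a debug log with quota tracing might be collected on a server node. The function names, the LCTL variable, and the default output path are illustrative assumptions; the lctl subcommands used (set_param, clear, dk) are standard, but the exact procedure used for this ticket is not recorded in the thread.

```shell
#!/bin/bash
# Sketch only: collect a Lustre debug log with quota tracing enabled.
# Function names, LCTL, and the output path are illustrative assumptions.
# Intended to be run as root on each MDS/OSS node.
LCTL=${LCTL:-lctl}

# Call before reproducing the failure.
start_quota_debug() {
	"$LCTL" set_param debug=+quota   # add the quota flag (D_QUOTA) to the debug mask
	"$LCTL" clear                    # drop old entries from the kernel debug buffer
}

# Call after the failure has been reproduced.
dump_quota_debug() {
	local out=${1:-/tmp/lustre-quota-debug.log}
	"$LCTL" dk "$out"                # dump the kernel debug buffer to a file
}
```

Running start_quota_debug, then the failing iozone step, then dump_quota_debug on each server would give one log file per node to attach to the ticket.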
            sarah Sarah Liu added a comment -

            I checked the MDS and OSS and didn't see any errors there.

            adilger Andreas Dilger added a comment -

            Sarah, are there any messages on the console logs for the OSS, MDS, or client?

            People

              niu Niu Yawei (Inactive)
              sarah Sarah Liu