[LU-7006] after upgrade system from 2.5.3 RHEL6 to master RHEL7, hit: iozone did not fail with EDQUOT Created: 13/Aug/15 Updated: 12/Nov/15 Resolved: 12/Nov/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sarah Liu | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
before upgrade: 2.5.3, RHEL6 |
||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Description |
+ pdsh -t 120 -S -Rrsh -w onyx-27,onyx-28 '(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre" FSTYPE=ldiskfs sh -c "runas -u quota_2usr /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.\$(hostname).\$(date +%s)")'
onyx-28: running as uid/gid/euid/egid 60001/60001/60001/60001, groups:
onyx-28: [/usr/bin/iozone] [-i] [0] [-e] [-+d] [-w] [-r] [1024] [-s] [1048576] [-f] [/mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-28.onyx.hpdd.intel.com.1439497851]
onyx-27: running as uid/gid/euid/egid 60001/60001/60001/60001, groups:
onyx-27: [/usr/bin/iozone] [-i] [0] [-e] [-+d] [-w] [-r] [1024] [-s] [1048576] [-f] [/mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-27.1439497851]
Iozone: Performance Test of File I/O
Version $Revision: 3.373 $
Compiled for 64 bit mode.
Build: linux-AMD64
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
Run began: Thu Aug 13 13:30:51 2015
Include fsync in write timing
>>> I/O Diagnostic mode enabled. <<<
Performance measurements are invalid in this mode.
Setting no_unlink
Record Size 1024 KB
File size set to 1048576 KB
Command line used: /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-28.onyx.hpdd.intel.com.1439497851
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
Iozone: Performance Test of File I/O
Version $Revision: 3.373 $
Compiled for 64 bit mode.
Build: linux-AMD64
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
Run began: Thu Aug 13 13:30:51 2015
Include fsync in write timing
>>> I/O Diagnostic mode enabled. <<<
Performance measurements are invalid in this mode.
Setting no_unlink
Record Size 1024 KB
File size set to 1048576 KB
Command line used: /usr/bin/iozone -i 0 -e -+d -w -r 1024 -s 1048576 -f /mnt/lustre/d0.upgrade-downgrade/quota_2usr/iozone.onyx-27.1439497851
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
1048576 1024
Sanity check failed. Do not deploy this filesystem in a production environment !
1048576 1024
Sanity check failed. Do not deploy this filesystem in a production environment !
+ return 44
upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT
Lustre: DEBUG MARKER: upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4343:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4374:error()
= upgrade-downgrade.sh:760:iop_run_iozone()
= upgrade-downgrade.sh:687:iop_verify_quotas()
= upgrade-downgrade.sh:1131:clean_upgrade_downgrade()
= upgrade-downgrade.sh:1262:main()
|
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 14/Aug/15 ] |
|
Hi Niu, |
| Comment by Andreas Dilger [ 14/Aug/15 ] |
|
Sarah, are there any messages on the console logs for the OSS, MDS, or client? |
| Comment by Sarah Liu [ 14/Aug/15 ] |
|
I checked the MDS and OSS and didn't see any errors there. |
| Comment by Niu Yawei (Inactive) [ 24/Aug/15 ] |
|
Sarah, where can I find this upgrade test script? Or could you briefly describe how it verifies quota during the upgrade? Could you also provide the debug logs for the OST/MDT with D_QUOTA enabled? Thanks. |
| Comment by Sarah Liu [ 17/Sep/15 ] |
|
Hello Niu, can you access Onyx? Here is the script I use. Please find the MDS/OST debug logs attached. If you need anything else, please let me know. |
| Comment by Niu Yawei (Inactive) [ 18/Sep/15 ] |
I can access Onyx, but I didn't see the directory you mentioned. If possible, could you attach the script here? That would make it easier for me to review the code, since accessing any cluster head node from here is extremely slow. |
| Comment by Sarah Liu [ 18/Sep/15 ] |
|
test script |
| Comment by Sarah Liu [ 02/Nov/15 ] |
|
Hi Niu, any update on this ticket? |
| Comment by Niu Yawei (Inactive) [ 03/Nov/15 ] |
|
Look at the output of the test:
random random bkwd record stride
KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
1048576 1024
Sanity check failed. Do not deploy this filesystem in a production environment !
1048576 1024
Sanity check failed. Do not deploy this filesystem in a production environment !
+ return 44
upgrade-downgrade : @@@@@@ FAIL: iozone did not fail with EDQUOT
It seems iozone did not finish successfully, but I couldn't find "Sanity check failed. Do not deploy this filesystem in a production environment !" in the script, so I'm not sure what kind of failure caused the test to fail. The current iop_run_iozone() only greps the log for certain messages:
egrep -q "Disk quota exceeded|Error writing block" $log || \
    { rm -f $log; error "iozone did not fail with EDQUOT"; }
Maybe we should improve it to collect more information on failure? That would help us identify the exact failure reason. |
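One way the check in iop_run_iozone() could collect more information is sketched below. This is only an illustration, not the actual upgrade-downgrade.sh code: the helper names classify_iozone_log and check_iozone_edquot are hypothetical, and the "Sanity check failed" pattern is taken from the log output quoted in this ticket. The idea is to keep the log and print its tail when the expected EDQUOT messages are missing, rather than deleting the log before reporting the error.

```shell
#!/bin/bash
# Hypothetical helper (not part of test-framework.sh): report why an
# iozone run ended, based on the messages found in its log file.
classify_iozone_log() {
    local log=$1

    if grep -Eq "Disk quota exceeded|Error writing block" "$log"; then
        # The messages the existing check already looks for.
        echo "edquot"
    elif grep -q "Sanity check failed" "$log"; then
        # The message seen in this ticket's output.
        echo "sanity_check_failed"
    else
        echo "unknown"
    fi
}

# Hypothetical replacement for the egrep one-liner: the log is removed
# only when EDQUOT was seen; otherwise it is kept and its tail is printed
# so the real failure reason is visible in the test output.
check_iozone_edquot() {
    local log=$1
    local reason

    reason=$(classify_iozone_log "$log")
    if [ "$reason" = "edquot" ]; then
        rm -f "$log"
        return 0
    fi

    echo "iozone did not fail with EDQUOT (reason: $reason), log kept at $log" >&2
    tail -n 20 "$log" >&2
    return 1
}
```

A caller would presumably invoke it as check_iozone_edquot $log || error "iozone did not fail with EDQUOT", so the existing error path stays the same while the log survives for inspection.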
| Comment by Sarah Liu [ 11/Nov/15 ] |
|
Tried with the latest tag 2.7.62 and didn't hit this issue when upgrading to master build #3226 RHEL7; instead, after downgrading from master to 2.5.5, hit |
| Comment by Joseph Gmitter (Inactive) [ 12/Nov/15 ] |
|
As Sarah reports, this issue no longer exists on master. |