[LU-6813] sanity-benchmark test_iozone: iozone (1) failed Created: 08/Jul/15 Updated: 12/May/16 Resolved: 23/Sep/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/37318204-250a-11e5-8009-5254006e85c2.
The sub-test test_iozone failed with the following error:
iozone (1) failed
Test log. Hit this problem in multiple configs:
stride
KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
3847216 512
Sanity check failed. Do not deploy this filesystem in a production environment !
sanity-benchmark test_iozone: @@@@@@ FAIL: iozone (1) failed
MDS console; it looks like there is not enough space:
18:15:50:Lustre: DEBUG MARKER: == sanity-benchmark test iozone: iozone ============================================================== 11:04:05 (1436205845)
18:15:51:Lustre: DEBUG MARKER: /usr/sbin/lctl mark min OST has 1846488kB available, using 3847216kB file size
18:15:51:Lustre: DEBUG MARKER: min OST has 1846488kB available, using 3847216kB file size
18:15:51:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity-benchmark test_iozone: @@@@@@ FAIL: iozone \(1\) failed
18:15:51:Lustre: DEBUG MARKER: sanity-benchmark test_iozone: @@@@@@ FAIL: iozone (1) failed
18:15:51:Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2015-07-06/lustre-master-el6_6-x86_64-vs-lustre-master-sles11sp3-x86_64--full--2_12_1__3092__-70321095799900-131112/sanity-benchmark.test_iozone.debug_log.$(hostname -s).1436205847.log; |
| Comments |
| Comment by Andreas Dilger [ 08/Jul/15 ] |
|
I don't see any reports on the MDS or OSS logs that relate to out of space. It looks like this might be a data corruption issue, since there aren't errors in any of the logs. The message min OST has 1846488kB available, using 3847216kB file size is just reporting how large the iozone file is. The file is striped across all OSTs, so it shouldn't consume all of the space on even the smallest OST, since each OST will get only 1/7 of the file data (549602KB, 29% of the free space). |
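The per-OST estimate in the comment above can be rechecked with a few lines of arithmetic. This is only a sketch: the sizes are the ones quoted in the console log, the stripe count of 7 is taken from the "1/7" in the comment, and the helper name `per_ost_share_kb` is made up for illustration. It assumes the file data is spread evenly across the OSTs.

```python
# Values quoted in the console log and in the comment above.
file_size_kb = 3847216     # "using 3847216kB file size"
min_ost_free_kb = 1846488  # "min OST has 1846488kB available"
ost_count = 7              # the comment says each OST gets 1/7 of the file


def per_ost_share_kb(total_kb, stripe_count):
    """Approximate data per OST, assuming the stripes are spread evenly."""
    return total_kb // stripe_count


share_kb = per_ost_share_kb(file_size_kb, ost_count)
fraction = share_kb / min_ost_free_kb
print(f"{share_kb} KB per OST, {fraction:.1%} of the smallest OST's free space")
```

Under that assumption each OST holds well under a third of the free space on the smallest OST, which matches the conclusion that running out of space is an unlikely cause.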
| Comment by Vinayak (Inactive) [ 07/Sep/15 ] |
|
Hello Andreas, we have also hit this issue quite frequently in our testing. I am attaching the logs to this ticket. Please let me know if you need any further information from our side. |
| Comment by Zhenyu Xu [ 15/Sep/15 ] |
|
Found this with strace:
open("/mnt/lustre/d0.iozone/iozone", O_WRONLY|O_CREAT, 0) = 3 ) = 0 |
| Comment by Zhenyu Xu [ 15/Sep/15 ] |
|
The iozone test file is created by open() with mode 0100000 (S_IFREG), with no rwx permission bits in it.
00000004:00000002:1.0:1442290645.210840:0:15269:0:(mdt_open.c:1226:mdt_reint_open()) I am going to open [0x200000401:0x11:0x0]/(iozone->[0x200000401:0x15:0x0]) cr_flag=0102 mode=0100000 msg_flag=0x0 |
| Comment by Zhenyu Xu [ 15/Sep/15 ] |
|
Commit c8d5aa14e50be2a85491783f169a8f4e646b9594 changed the object create mode logic. Wang Di, can you take a look? In your commit, the MDS does not use its umask to create the object and the client does not pass the object's mode in the RPC, so the client depends on the MDS to set the new object's mode with its umask. |
| Comment by Zhenyu Xu [ 15/Sep/15 ] |
|
The client creates the file with a strange mode:
00000080:00000001:1.0:1442292147.237309:0:28484:0:(namei.c:526:ll_lookup_it()) Process entered
00000080:00200000:1.0:1442292147.237309:0:28484:0:(namei.c:533:ll_lookup_it()) VFS Op:name=iozone, dir=[0x200000401:0x1:0x0](ffff880016c0bc40), intent=open|creat
00000080:00000010:1.0:1442292147.237311:0:28484:0:(llite_lib.c:2436:ll_prep_md_op_data()) kmalloced 'op_data': 312 at ffff8800286e1a00.
00000080:00020000:1.0:1442292147.237312:0:28484:0:(namei.c:561:ll_lookup_it()) create mode 0100000
while a normal touch creates a file with a normal mode:
00000080:00000001:0.0:1442298104.737478:0:28661:0:(namei.c:526:ll_lookup_it()) Process entered
00000080:00200000:0.0:1442298104.737478:0:28661:0:(namei.c:533:ll_lookup_it()) VFS Op:name=touchfile, dir=[0x200000401:0x1:0x0](ffff880016c0bc40), intent=open|creat
00000080:00000010:0.0:1442298104.737480:0:28661:0:(llite_lib.c:2436:ll_prep_md_op_data()) kmalloced 'op_data': 312 at ffff8800352f2400.
00000080:00020000:0.0:1442298104.737482:0:28661:0:(namei.c:561:ll_lookup_it()) create mode 0100666 |
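To make the two "create mode" values in the logs easier to compare, here is a small sketch that splits each one into its file-type part and its permission bits using Python's standard `stat` module. The variable and function names are illustrative only; the two octal values are the ones from the log lines above.

```python
import stat

# Modes reported by ll_lookup_it() in the two traces above.
iozone_mode = 0o100000  # file created by iozone
touch_mode = 0o100666   # file created by a normal touch


def describe_mode(mode):
    """Return (is_regular_file, permission_bits, ls-style string)."""
    return stat.S_ISREG(mode), stat.S_IMODE(mode), stat.filemode(mode)


for name, mode in (("iozone", iozone_mode), ("touchfile", touch_mode)):
    is_reg, perms, text = describe_mode(mode)
    print(f"{name}: regular={is_reg} perms={perms:04o} {text}")
```

Both values carry the S_IFREG type; only the touch case has any permission bits set, which is the difference the logs are pointing at.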
| Comment by Di Wang [ 15/Sep/15 ] |
"Wang Di, can you give it a look? In your commit, the MDS does not use its umask to create object and client does not pass the object's mode in the RPC,"
Hmm, mdd_acl_init does use the MDS's umask to fix the mode. Thanks. |
| Comment by Zhenyu Xu [ 15/Sep/15 ] |
|
The strace of iozone shows that iozone creates the test file "iozone" without setting its file mode:
open("/mnt/lustre/d0.iozone/iozone", O_WRONLY|O_CREAT, 0) = 3
The client packs its create mode as 0100000, with no permission bits in it. In mdd_acl_init(), la->la_mode is still just 0100000, so the file is created with an empty mode and the truncate fails the permission check. |
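One way to see why a umask cannot rescue a zero create mode: under the conventional POSIX rule (requested mode AND NOT umask) a umask can only clear bits, never add them. The sketch below illustrates that rule only; it is not the Lustre mdd_acl_init() code, and `apply_umask` and the 022 umask are assumptions chosen for the example.

```python
def apply_umask(mode, umask):
    """Conventional POSIX rule: a umask only clears permission bits."""
    return mode & ~umask


example_umask = 0o022  # assumed umask, for illustration only

# Mode sent for the iozone file: S_IFREG with no permission bits.
iozone_result = apply_umask(0o100000, example_umask)
# Mode sent for the touch file: S_IFREG | 0666.
touch_result = apply_umask(0o100666, example_umask)

print(f"iozone: {iozone_result:07o}")
print(f"touch:  {touch_result:07o}")
```

With no permission bits in the request, the result stays at 0100000 whatever the umask is, consistent with la->la_mode being unchanged in the comment above.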
| Comment by Zhenyu Xu [ 16/Sep/15 ] |
|
$ git bisect bad |
| Comment by Zhenyu Xu [ 17/Sep/15 ] |
|
Wrote open-then-ftruncate test code, ran it as a normal user, and gathered the logs with and without the 6acf9333 patch. |
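The test code itself is not attached here. A minimal sketch of what an open-then-ftruncate check could look like follows, assuming the same open() flags and zero mode that strace showed for iozone. The function name `open_then_ftruncate`, the 4096-byte size, and the temporary directory are hypothetical; on an affected system the path would instead point into the Lustre mount, and the script would be run as a normal user.

```python
import os
import stat
import tempfile


def open_then_ftruncate(path, size):
    """Create path with mode 0 (as iozone does), then ftruncate it.

    Raises OSError if the truncate is rejected, which is the failure
    described in this ticket. Returns the fstat() result on success.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0)
    try:
        os.ftruncate(fd, size)
        return os.fstat(fd)
    finally:
        os.close(fd)


# Hypothetical usage on a local temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    st = open_then_ftruncate(os.path.join(tmpdir, "iozone"), 4096)
    print(f"size={st.st_size} perms={stat.S_IMODE(st.st_mode):04o}")
```

On an ordinary local filesystem the truncate is expected to succeed, because access is checked when the descriptor is opened rather than against the new file's permission bits on each later call.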
| Comment by Gerrit Updater [ 17/Sep/15 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/16462 |
| Comment by Vinayak (Inactive) [ 21/Sep/15 ] |
|
We ran test_iozone on the patch submitted by Bobi Jam and the test was successful.
File stride size set to 17 * record size.
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
926070 512 40674 66809 161970 151712 164405 61944
iozone test complete.
debug=0x33f0484
Resetting fail_loc on all nodes...done.
PASS iozone (103s)
|
| Comment by Gerrit Updater [ 22/Sep/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16462/ |
| Comment by Peter Jones [ 23/Sep/15 ] |
|
Landed for 2.8 |