[LU-10437] sanity-pfl test_8: dbench failed Created: 26/Dec/17 Updated: 29/Aug/18 Resolved: 20/Jan/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Casper | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
onyx, full interop |
||
| Issue Links: |
|
||||||||||||||||
| Severity: | 3 | ||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||
| Description |
|
session: https://testing.hpdd.intel.com/test_sessions/cb5f13e0-9177-4f4e-9a26-d197001db0c0
From test_log:
copying /usr/share/dbench/client.txt to /mnt/lustre/d8.sanity-pfl/client.txt
cp: error writing '/mnt/lustre/d8.sanity-pfl/client.txt': Invalid argument
cp: failed to extend '/mnt/lustre/d8.sanity-pfl/client.txt': Invalid argument
Trace dump:
= rundbench:55:main()
sanity-pfl: FAIL: test-framework exiting on error
sanity-pfl test_8: @@@@@@ FAIL: dbench failed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:5328:error()
= /usr/lib64/lustre/tests/sanity-pfl.sh:333:test_8()
= /usr/lib64/lustre/tests/test-framework.sh:5604:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5643:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:5490:run_test()
= /usr/lib64/lustre/tests/sanity-pfl.sh:337:main() |
| Comments |
| Comment by Jian Yu [ 08/Jan/18 ] |
|
Dmesg log on client node:
LustreError: 514:0:(lov_object.c:1220:lov_layout_change()) lustre-clilov-ffff88005daa9000: cannot apply new layout on [0x200062e21:0x2735:0x0] : rc = -22
LustreError: 514:0:(vvp_io.c:1495:vvp_io_init()) lustre: refresh file layout [0x200062e21:0x2735:0x0] error -22.
Debug log on client node:
00000080:00200000:1.0:1513687522.574612:0:514:0:(vvp_io.c:312:vvp_io_fini()) [0x200062e21:0x2735:0x0] ignore/verify layout 1/0, layout version 0 need write layout 0, restore needed 0
00020000:00020000:1.0:1513687522.578220:0:514:0:(lov_object.c:1220:lov_layout_change()) lustre-clilov-ffff88005daa9000: cannot apply new layout on [0x200062e21:0x2735:0x0] : rc = -22
00010000:00010000:1.0:1513687522.579782:0:514:0:(ldlm_lock.c:800:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(CR) ns: ?? lock: ffff88005cf42240/0xed5adfd7af8ec0c4 lrc: 3/1,0 mode: CR/CR res: ?? rrc=?? type: ??? flags: 0x10000000000000 nid: local remote: 0x4869a6ac1a9ab3d4 expref: -99 pid: 514 timeout: 0 lvb_type: 3
00010000:00010000:1.0:1513687522.579786:0:514:0:(ldlm_lock.c:873:ldlm_lock_decref_internal()) ### add lock into lru list ns: ?? lock: ffff88005cf42240/0xed5adfd7af8ec0c4 lrc: 2/0,0 mode: CR/CR res: ?? rrc=?? type: ??? flags: 0x10000000000000 nid: local remote: 0x4869a6ac1a9ab3d4 expref: -99 pid: 514 timeout: 0 lvb_type: 3
00000080:00020000:1.0:1513687522.579791:0:514:0:(vvp_io.c:1495:vvp_io_init()) lustre: refresh file layout [0x200062e21:0x2735:0x0] error -22.
00000080:00200000:1.0:1513687522.580937:0:514:0:(vvp_io.c:312:vvp_io_fini()) [0x200062e21:0x2735:0x0] ignore/verify layout 0/0, layout version -2 need write layout 0, restore needed 0
00000080:00200000:1.0:1513687522.580940:0:514:0:(file.c:1423:ll_file_io_generic()) client.txt: 2 io complete with rc: -22, result: 0, restart: 0
00000080:00200000:1.0:1513687522.580942:0:514:0:(file.c:1459:ll_file_io_generic()) client.txt: write *ppos: 16777216, pos: 16777216, ret: 0, rc: -22
Dmesg log on MDS:
LustreError: 18669:0:(mdt_lvb.c:163:mdt_lvbo_fill()) lustre-MDT0000: expected 368 actual 344. |
| Comment by Jian Yu [ 08/Jan/18 ] |
|
sanity-pfl test 15 in the same interop test session hit the same failure:
== sanity-pfl test 15: Verify component options for lfs find ========================================= 12:46:26 (1513687586)
dd: error writing '/mnt/lustre/d15.sanity-pfl/f1': Invalid argument
Debug log on client node:
00000080:00200000:1.0:1513687587.114545:0:6996:0:(vvp_io.c:312:vvp_io_fini()) [0x200062e21:0x2743:0x0] ignore/verify layout 1/0, layout version 0 need write layout 0, restore needed 0
00020000:00020000:1.0:1513687587.114768:0:6996:0:(lov_object.c:1220:lov_layout_change()) lustre-clilov-ffff88005daa9000: cannot apply new layout on [0x200062e21:0x2743:0x0] : rc = -22
00010000:00010000:1.0:1513687587.116338:0:6996:0:(ldlm_lock.c:800:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(CR) ns: ?? lock: ffff88005cf43440/0xed5adfd7af8ec4ec lrc: 3/1,0 mode: CR/CR res: ?? rrc=?? type: ??? flags: 0x10000000000000 nid: local remote: 0x4869a6ac1a9aca2b expref: -99 pid: 6996 timeout: 0 lvb_type: 3
00010000:00010000:1.0:1513687587.116341:0:6996:0:(ldlm_lock.c:873:ldlm_lock_decref_internal()) ### add lock into lru list ns: ?? lock: ffff88005cf43440/0xed5adfd7af8ec4ec lrc: 2/0,0 mode: CR/CR res: ?? rrc=?? type: ??? flags: 0x10000000000000 nid: local remote: 0x4869a6ac1a9aca2b expref: -99 pid: 6996 timeout: 0 lvb_type: 3
00000080:00020000:1.0:1513687587.116345:0:6996:0:(vvp_io.c:1495:vvp_io_init()) lustre: refresh file layout [0x200062e21:0x2743:0x0] error -22.
00000080:00200000:1.0:1513687587.117486:0:6996:0:(vvp_io.c:312:vvp_io_fini()) [0x200062e21:0x2743:0x0] ignore/verify layout 0/0, layout version -2 need write layout 0, restore needed 0
00000080:00200000:1.0:1513687587.117488:0:6996:0:(file.c:1423:ll_file_io_generic()) f1: 2 io complete with rc: -22, result: 0, restart: 0
00000080:00200000:1.0:1513687587.117489:0:6996:0:(file.c:1459:ll_file_io_generic()) f1: write *ppos: 1048576, pos: 1048576, ret: 0, rc: -22 |
| Comment by Jinshan Xiong (Inactive) [ 08/Jan/18 ] |
|
It turned out that the b2_10 branch clears neither the lcm_flags nor the lcm_padding fields when packing the layout on the server side. The corresponding code is in lod_generate_lovea():
lcm = (struct lov_comp_md_v1 *)lmm;
lcm->lcm_magic = cpu_to_le32(LOV_MAGIC_COMP_V1);
lcm->lcm_entry_count = cpu_to_le16(comp_cnt);
offset = sizeof(*lcm) + sizeof(*lcme) * comp_cnt;
LASSERT(offset % sizeof(__u64) == 0);
This confuses b2_11 clients: lcm_flags and lcm_mirror_count may contain arbitrary leftover values, so the layout does not pass the client's sanity check. I think the best way to fix this problem is to create a patch that clears the corresponding fields in b2_10. |
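The failure mode described above can be sketched in standalone C. This is a minimal illustration only, not the Lustre code or the landed patch: `struct demo_comp_hdr`, `DEMO_MAGIC`, `DEMO_ENTRY_SIZE`, and the `demo_*` functions are hypothetical, the field layout is an assumption, and endianness conversion is left out. It shows that a header packed into a buffer holding stale bytes fails a receiver-side check unless the packer zeroes the header first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a composite layout header. Field names echo
 * the comment above; the sizes and order are assumptions. */
struct demo_comp_hdr {
	uint32_t magic;
	uint32_t size;
	uint32_t layout_gen;
	uint16_t flags;
	uint16_t entry_count;
	uint16_t mirror_count;	/* treated as padding by the older packer */
	uint16_t padding1[3];
	uint64_t padding2;
};

#define DEMO_MAGIC	0x12345678u	/* placeholder, not the real magic */
#define DEMO_ENTRY_SIZE	48u		/* assumed per-component entry size */

/* Older behaviour: set only the fields the packer knows about and leave
 * everything else as whatever was already in the buffer. */
static size_t demo_pack_no_clear(struct demo_comp_hdr *h, uint16_t comp_cnt)
{
	h->magic = DEMO_MAGIC;
	h->entry_count = comp_cnt;
	return sizeof(*h) + DEMO_ENTRY_SIZE * comp_cnt;
}

/* Suggested behaviour: zero the whole header before filling it in. */
static size_t demo_pack_cleared(struct demo_comp_hdr *h, uint16_t comp_cnt)
{
	size_t offset;

	memset(h, 0, sizeof(*h));
	h->magic = DEMO_MAGIC;
	h->entry_count = comp_cnt;
	offset = sizeof(*h) + DEMO_ENTRY_SIZE * comp_cnt;
	assert(offset % sizeof(uint64_t) == 0);
	return offset;
}

/* Hypothetical receiver-side check: a newer reader interprets the bytes
 * the older packer skipped, so stale values are rejected with -EINVAL. */
static int demo_sanity_check(const struct demo_comp_hdr *h)
{
	if (h->magic != DEMO_MAGIC)
		return -EINVAL;
	if (h->flags != 0 || h->mirror_count != 0)
		return -EINVAL;
	return 0;
}
```

In this sketch, filling a header with a non-zero byte pattern and calling `demo_pack_no_clear()` makes `demo_sanity_check()` return `-EINVAL` (the same -22 seen in the client logs), while `demo_pack_cleared()` on the same dirty buffer passes.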
| Comment by Gerrit Updater [ 08/Jan/18 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/30784 |
| Comment by Gerrit Updater [ 08/Jan/18 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/30785 |
| Comment by Gerrit Updater [ 20/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30785/ |
| Comment by Peter Jones [ 20/Jan/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 02/Feb/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30784/ |