[LU-15376] sanity-benchmark test_iozone: fsync: Protocol error Created: 15/Dec/21 Updated: 04/Oct/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.8, Lustre 2.15.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Jian Yu |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | failing_tests | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/a4e5bea6-99ab-4014-9c91-53044b574fe5
test_iozone failed with the following error:
iozone (1) failed
Hit the error between 2.10.8 client and 2.12.8 server, not sure if it is a DCO issue.
Include fsync in write timing
>>> I/O Diagnostic mode enabled. <<<
Performance measurements are invalid in this mode.
Record Size 512 kB
File size set to 5503128 kB
Command line used: iozone -i 0 -i 1 -i 2 -e -+d -r 512 -s 5503128 -f /mnt/lustre/d0.iozone/iozone
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                         random   random     bkwd   record   stride
       kB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
5503128 512fsync: Protocol error
iozone: interrupted
exiting iozone
sanity-benchmark test_iozone: @@@@@@ FAIL: iozone (1) failed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:5337:error()
= /usr/lib64/lustre/tests/sanity-benchmark.sh:130:test_iozone()
= /usr/lib64/lustre/tests/test-framework.sh:5618:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5657:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:5504:run_test()
= /usr/lib64/lustre/tests/sanity-benchmark.sh:175:main()
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
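The "fsync: Protocol error" line in the iozone output has the shape of a perror-style message, which would make "Protocol error" the system's text for EPROTO. As a minimal sketch for checking whether a plain write followed by fsync hits the same errno outside of iozone, the snippet below writes a buffer, calls fsync, and reports the symbolic errno name on failure. The function name `write_and_fsync` and the probe path are hypothetical; the demo uses a temporary directory, and on a client the path would need to point at a file under the Lustre mount being tested.

```python
import errno
import os
import tempfile


def write_and_fsync(path, payload):
    """Write payload to path, fsync it, and return "ok" or "<ERRNO>: <message>"."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Sketch only: a short write is not retried here.
        os.write(fd, payload)
        # fsync is the call that failed in the iozone run (-e includes it in timing).
        os.fsync(fd)
        return "ok"
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. "EPROTO".
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return f"{name}: {os.strerror(exc.errno)}"
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical target: a temporary file standing in for a file on the mount under test.
    with tempfile.TemporaryDirectory() as tmp:
        probe = os.path.join(tmp, "probe")
        # 512 kB matches the record size used in the failing run.
        print(write_and_fsync(probe, b"x" * 512 * 1024))
```

This only exercises a single record, so it will not reproduce memory pressure or long-running I/O; it is meant as a quick way to see which errno the client returns from fsync.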
| Comments |
| Comment by Minh Diep [ 14/Jun/22 ] |
|
https://testing.whamcloud.com/test_sets/6c6114a7-2b43-42b6-bb96-7608d75045ca
seems like the client got evicted
[ 6953.344248] Lustre: DEBUG MARKER: == sanity-benchmark test iozone: iozone ============================================================== 03:31:28 (1654313488)
[ 6965.443725] Lustre: DEBUG MARKER: /usr/sbin/lctl mark min OST has 1783480kB available, using 5502664kB file size
[ 6965.658945] Lustre: DEBUG MARKER: min OST has 1783480kB available, using 5502664kB file size
[ 6981.216048] Lustre: 10335:0:(client.c:2169:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1654313510/real 0] req@ffff9e629def5b00 x1734667382522880/t0(0) o400->lustre-MDT0000-mdc-ffff9e62ba648000@10.240.38.243@tcp:12/10 lens 224/224 e 0 to 1 dl 1654313517 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[ 6981.223121] Lustre: lustre-MDT0000-mdc-ffff9e62ba648000: Connection to lustre-MDT0000 (at 10.240.38.243@tcp) was lost; in progress operations using this service will wait for recovery to complete
[ 6982.203107] kworker/u4:0: page allocation failure: order:0, mode:0x20 |
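When a console log like the one above is exported as a single run of text, the bracketed uptime stamps are enough to recover the individual entries and see how close together the events were. The following is a small sketch of that, not part of the Lustre test framework; `split_dmesg` is a hypothetical helper name, and the sample string reuses the last two entries from the log quoted in this comment.

```python
import re

# A kernel log entry starts with a bracketed uptime stamp such as "[ 6953.344248]".
# Brackets without a decimal number inside (e.g. "[sent 1654313510/real 0]") do not match.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*")


def split_dmesg(blob):
    """Split a flattened dmesg string into (uptime_seconds, message) pairs."""
    # With one capture group, re.split yields:
    # [text before first stamp, stamp1, msg1, stamp2, msg2, ...]
    parts = STAMP.split(blob)
    return [(float(parts[i]), parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


SAMPLE = (
    "[ 6981.223121] Lustre: lustre-MDT0000-mdc-ffff9e62ba648000: Connection to "
    "lustre-MDT0000 (at 10.240.38.243@tcp) was lost; in progress operations using "
    "this service will wait for recovery to complete "
    "[ 6982.203107] kworker/u4:0: page allocation failure: order:0, mode:0x20"
)

if __name__ == "__main__":
    entries = split_dmesg(SAMPLE)
    for stamp, message in entries:
        print(f"{stamp:.6f}  {message[:60]}")
    # Gap in seconds between the lost MDT connection and the allocation failure.
    print(f"gap: {entries[1][0] - entries[0][0]:.6f} s")
```

In this sample the page allocation failure is logged about one second after the connection to lustre-MDT0000 is reported lost.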
| Comment by Colin Faber [ 28/Sep/22 ] |
|
Hi yujian, can you take a look? This looks very similar to the other protocol issue you've investigated recently. Thank you! |