Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
Lustre 2.16.0, Lustre 2.15.5, Lustre 2.15.6
-
None
-
3
-
9223372036854775807
Description
This issue was created by maloo for Arshad <arshad.hussain@aeoncomputing.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/8d022068-68bb-4bbe-8686-e7928146c95e
test_2a failed with the following error:
Timeout occurred after 97 minutes, last suite running was lustre-rsync-test
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/100812 - 4.18.0-477.27.1.el8_8.x86_64 (x86_64/RHEL 8.8/2.15.59.54)
servers: https://build.whamcloud.com/job/lustre-reviews/100812 - 4.18.0-477.27.1.el8_lustre.x86_64 (ldiskfs/x86_64/RHEL 8.8/2.15.59.54)
Client logs
== lustre-rsync-test test 2a: Replicate files created by dbench. ========================================================== 18:06:53 (1702404413) CMD: trevis-127vm1 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n CMD: trevis-127vm1 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm mdd.lustre-MDT0000.changelog_mask=+hsm CMD: trevis-127vm1 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n CMD: trevis-127vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0001.changelog_mask -n CMD: trevis-127vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0001.changelog_mask=+hsm mdd.lustre-MDT0001.changelog_mask=+hsm CMD: trevis-127vm2 /usr/sbin/lctl --device lustre-MDT0001 changelog_register -n CMD: trevis-127vm1 /usr/sbin/lctl get_param mdd.lustre-MDT0002.changelog_mask -n CMD: trevis-127vm1 /usr/sbin/lctl set_param mdd.lustre-MDT0002.changelog_mask=+hsm mdd.lustre-MDT0002.changelog_mask=+hsm CMD: trevis-127vm1 /usr/sbin/lctl --device lustre-MDT0002 changelog_register -n CMD: trevis-127vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0003.changelog_mask -n CMD: trevis-127vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0003.changelog_mask=+hsm mdd.lustre-MDT0003.changelog_mask=+hsm CMD: trevis-127vm2 /usr/sbin/lctl --device lustre-MDT0003 changelog_register -n Registered 4 changelog users: 'cl3 cl3 cl3 cl3' lustre-MDT0000: Registered changelog user cl3 looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' copying required dbench files to chroot directory /mnt/lustre/d2a.lustre-rsync-test '/usr/bin/dbench' -> 'dbench' lib64/libpopt.so.0 lib64/libc.so.6 lib64/ld-linux-x86-64.so.2 running 'dbench 2 -t 60' in /mnt/lustre/d2a.lustre-rsync-test at Tue Dec 12 18:06:58 UTC 2023 waiting for dbench pid 27630 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 60 seconds with load 'client.txt' and minimum warmup 12 secs failed to create barrier semaphore 0 of 2 processes prepared for launch 0 sec 2 of 2 processes prepared for launch 0 sec releasing clients 2 670 42.34 MB/sec warmup 1 sec latency 243.676 ms 2 761 21.39 MB/sec warmup 2 sec latency 221.017 ms 2 946 14.56 MB/sec warmup 3 sec latency 191.389 ms 2 1760 13.14 MB/sec warmup 4 sec latency 162.857 ms 2 2699 12.44 MB/sec warmup 5 sec latency 143.470 ms 2 3576 12.52 MB/sec warmup 6 sec latency 159.825 ms 2 4270 11.53 MB/sec warmup 7 sec latency 155.305 ms 2 4400 10.16 MB/sec warmup 8 sec latency 190.446 ms 2 4698 9.29 MB/sec warmup 9 sec latency 184.812 ms 2 5516 9.11 MB/sec warmup 10 sec latency 167.353 ms 2 6324 9.29 MB/sec warmup 11 sec latency 211.108 ms 2 7834 2.42 MB/sec execute 1 sec latency 154.295 ms 2 7909 1.38 MB/sec execute 2 sec latency 174.146 ms 2 8094 1.21 MB/sec execute 3 sec latency 188.842 ms 2 8966 3.14 MB/sec execute 4 sec latency 153.121 ms 2 9572 3.33 MB/sec execute 5 sec latency 168.781 ms 2 10585 5.64 MB/sec execute 6 sec latency 154.108 ms 2 11361 5.79 MB/sec execute 7 sec latency 132.248 ms 2 11516 5.16 MB/sec execute 8 sec latency 147.133 ms 2 12183 5.58 MB/sec execute 9 sec latency 185.267 ms 2 13034 5.41 MB/sec execute 10 sec latency 107.650 ms 2 13900 6.16 MB/sec execute 11 sec latency 195.091 ms 2 14877 6.55 MB/sec execute 12 sec latency 145.665 ms 2 14996 6.09 MB/sec execute 13 sec latency 140.691 ms 2 15122 5.70 MB/sec execute 14 sec latency 170.249 ms 2 15733 5.90 MB/sec execute 15 sec latency 165.990 ms 2 16247 5.59 MB/sec execute 16 sec latency 204.489 ms 2 16961 5.90 MB/sec execute 17 sec latency 160.806 ms 2 17550 5.95 MB/sec execute 18 sec latency 155.253 ms 2 18422 6.15 MB/sec execute 19 sec latency 119.982 ms 2 18513 5.87 MB/sec execute 20 sec latency 157.840 ms 2 18648 5.62 MB/sec execute 21 sec latency 194.348 ms 2 19351 5.77 MB/sec execute 22 sec latency 164.213 ms 2 19408 5.52 MB/sec execute 23 sec latency 969.378 ms 2 20009 5.38 MB/sec execute 24 sec latency 421.285 ms 2 20658 5.61 MB/sec execute 25 sec latency 171.661 ms 2 21413 5.88 MB/sec execute 26 sec latency 156.010 ms 2 21959 5.74 MB/sec execute 27 sec latency 82.070 ms 2 22096 5.56 MB/sec execute 28 sec latency 141.527 ms 2 22281 5.40 MB/sec execute 29 sec latency 162.308 ms 2 22887 5.50 MB/sec execute 30 sec latency 158.106 ms 2 23316 5.35 MB/sec execute 31 sec latency 151.646 ms 2 23947 5.42 MB/sec execute 32 sec latency 177.765 ms 2 24586 5.53 MB/sec execute 33 sec latency 171.891 ms 2 25236 5.67 MB/sec execute 34 sec latency 118.584 ms 2 25591 5.53 MB/sec execute 35 sec latency 131.897 ms 2 25724 5.40 MB/sec execute 36 sec latency 136.553 ms 2 26001 5.33 MB/sec execute 37 sec latency 172.208 ms 2 26561 5.36 MB/sec execute 38 sec latency 108.752 ms 2 27081 5.27 MB/sec execute 39 sec latency 163.033 ms 2 27638 5.34 MB/sec execute 40 sec latency 175.445 ms 2 28310 5.49 MB/sec execute 41 sec latency 130.378 ms 2 28943 5.51 MB/sec execute 42 sec latency 86.489 ms 2 29153 5.39 MB/sec execute 43 sec latency 180.277 ms 2 29283 5.29 MB/sec execute 44 sec latency 148.732 ms 2 29773 5.30 MB/sec execute 45 sec latency 125.766 ms 2 30183 5.26 MB/sec execute 46 sec latency 93.603 ms 2 30553 5.19 MB/sec execute 47 sec latency 288.305 ms 2 31300 5.26 MB/sec execute 48 sec latency 125.802 ms 2 32047 5.40 MB/sec execute 49 sec latency 85.231 ms 2 32578 5.40 MB/sec execute 50 sec latency 116.088 ms 2 32775 5.31 MB/sec execute 51 sec latency 143.543 ms 2 33211 5.30 MB/sec execute 52 sec latency 133.406 ms 2 33654 5.28 MB/sec execute 53 sec latency 180.276 ms 2 34159 5.23 MB/sec execute 54 sec latency 208.215 ms 2 34715 5.27 MB/sec execute 55 sec latency 241.709 ms 2 35719 5.47 MB/sec execute 56 sec latency 119.864 ms 2 36221 5.41 MB/sec execute 57 sec latency 173.668 ms 2 36352 5.33 MB/sec execute 58 sec latency 132.522 ms 2 36769 5.33 MB/sec execute 59 sec latency 160.981 ms 2 cleanup 60 sec 0 cleanup 60 sec Operation Count AvgLat MaxLat ---------------------------------------- NTCreateX 10567 2.825 123.841 Close 7710 0.642 24.838 Rename 440 3.282 41.200 Unlink 2156 0.976 22.880 Qpathinfo 9492 0.636 38.188 Qfileinfo 1629 0.076 3.208 Qfsinfo 1719 0.018 1.539 Sfileinfo 868 2.031 39.754 Find 3625 0.296 41.384 WriteX 5127 0.449 36.828 ReadX 16023 0.009 0.055 LockX 32 0.440 0.998 UnlockX 32 0.504 1.832 Flush 745 93.845 969.365 Throughput 5.32869 MB/sec 2 clients 2 procs max_latency=969.378 ms stopping dbench in /mnt/lustre/d2a.lustre-rsync-test at Tue Dec 12 18:08:10 UTC 2023 with return code 0 clean dbench files in /mnt/lustre/d2a.lustre-rsync-test /mnt/lustre/d2a.lustre-rsync-test /mnt/lustre/d2a.lustre-rsync-test removed directory 'clients/client0/~dmtmp/PARADOX' removed directory 'clients/client0/~dmtmp/PM' removed directory 'clients/client0/~dmtmp/COREL' removed directory 'clients/client0/~dmtmp/ACCESS' removed directory 'clients/client0/~dmtmp/WORD' removed directory 'clients/client0/~dmtmp/EXCEL' removed directory 'clients/client0/~dmtmp/SEED' removed directory 'clients/client0/~dmtmp/WORDPRO' removed directory 'clients/client0/~dmtmp/PWRPNT' removed directory 'clients/client0/~dmtmp' removed directory 'clients/client0' removed directory 'clients/client1/~dmtmp/PARADOX' removed directory 'clients/client1/~dmtmp/PM' removed directory 'clients/client1/~dmtmp/COREL' removed directory 'clients/client1/~dmtmp/ACCESS' removed directory 'clients/client1/~dmtmp/WORD' removed directory 'clients/client1/~dmtmp/EXCEL' removed directory 'clients/client1/~dmtmp/SEED' removed directory 'clients/client1/~dmtmp/WORDPRO' removed directory 'clients/client1/~dmtmp/PWRPNT' removed directory 'clients/client1/~dmtmp' removed directory 'clients/client1' removed directory 'clients' removed 'client.txt' removed 'dbench' removed 'lib64/libpopt.so.0' removed 'lib64/libc.so.6' removed 'lib64/ld-linux-x86-64.so.2' /mnt/lustre/d2a.lustre-rsync-test dbench successfully finished
======= 18:06:53 \(1702404413\) [ 302.317896] Lustre: DEBUG MARKER: == lustre-rsync-test test 2a: Replicate files created by dbench. ========================================================== 18:06:53 (1702404413) [ 305.333993] Lustre: lustre-OST0005-osc-ffff895045192000: disconnect after 24s idle [ 305.335802] Lustre: Skipped 1 previous similar message [ 407.730814] Lustre: lustre-OST0000-osc-ffff895045192000: disconnect after 24s idle [ 412.850667] Lustre: lustre-OST0006-osc-ffff895045192000: disconnect after 23s idle [ 412.852206] Lustre: Skipped 6 previous similar messages [ 5749.820145] sysrq: SysRq : Changing Loglevel [ 5749.821211] sysrq: Loglevel set to 8 [ 5749.823561] sysrq: SysRq : Show State [ 5749.824341] task:systemd state:S stack: 0 pid: 1 ppid: 0 flags:0x00000000 [ 5749.825986] Call Trace: [ 5749.826560] __schedule+0x2d1/0x870 [ 5749.827342] schedule+0x55/0xf0 [ 5749.828014] schedule_hrtimeout_range_clock+0xf9/0x110 [ 5749.829063] ? ep_scan_ready_list.constprop.23+0x210/0x230 [ 5749.830174] ? __handle_mm_fault+0x453/0x6c0 [ 5749.831071] ep_poll+0x3f7/0x450 [ 5749.831764] ? wake_up_q+0x70/0x70 [ 5749.832504] do_epoll_wait+0xb8/0xe0 [ 5749.833238] __x64_sys_epoll_wait+0x1a/0x30 [ 5749.834103] do_syscall_64+0x5b/0x1b0 [ 5749.834897] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5749.835917] RIP: 0033:0x7fdbde375e87 [ 5749.836701] Code: Unable to access opcode bytes at RIP 0x7fdbde375e5d. [ 5749.837975] RSP: 002b:00007fff155034e0 EFLAGS: 00000293 ORIG_RAX: 00000000000000e8 [ 5749.839441] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fdbde375e87 [ 5749.840835] RDX: 0000000000000058 RSI: 00007fff15503520 RDI: 0000000000000004 [ 5749.842235] RBP: 00007fff15503520 R08: 0000000000000000 R09: 00007fff15503520 [ 5749.843611] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000058 [ 5749.844995] R13: 00000000ffffffff R14: ffffffffffffffff R15: 00007fff15503aa0 [ 5749.846380] task:kthreadd state:S stack: 0 pid: 2 ppid: 0 flags:0x80004000
<<Please provide additional information about the failure here>>
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
lustre-rsync-test test_2a - Timeout occurred after 97 minutes, last suite running was lustre-rsync-test
Attachments
Issue Links
- is duplicated by
-
LU-16635 lustre-rsync-test test_2a: timeout
- Resolved
- mentioned in
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...
-
Page Loading...