Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-17361

lustre-rsync-test test_2a: Timeout occurred after 97 minutes, last suite running was lustre-rsync-test

    XMLWordPrintable

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • Lustre 2.17.0
    • Lustre 2.16.0, Lustre 2.15.5, Lustre 2.15.6
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Arshad <arshad.hussain@aeoncomputing.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/8d022068-68bb-4bbe-8686-e7928146c95e

      test_2a failed with the following error:

      Timeout occurred after 97 minutes, last suite running was lustre-rsync-test
      

       

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/100812 - 4.18.0-477.27.1.el8_8.x86_64 (x86_64/RHEL 8.8/2.15.59.54)
      servers: https://build.whamcloud.com/job/lustre-reviews/100812 - 4.18.0-477.27.1.el8_lustre.x86_64 (ldiskfs/x86_64/RHEL 8.8/2.15.59.54)

      Client logs

       

      == lustre-rsync-test test 2a: Replicate files created by dbench. ========================================================== 18:06:53 (1702404413)
      CMD: trevis-127vm1 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n
      CMD: trevis-127vm1 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm
      mdd.lustre-MDT0000.changelog_mask=+hsm
      CMD: trevis-127vm1 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
      CMD: trevis-127vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0001.changelog_mask -n
      CMD: trevis-127vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0001.changelog_mask=+hsm
      mdd.lustre-MDT0001.changelog_mask=+hsm
      CMD: trevis-127vm2 /usr/sbin/lctl --device lustre-MDT0001 changelog_register -n
      CMD: trevis-127vm1 /usr/sbin/lctl get_param mdd.lustre-MDT0002.changelog_mask -n
      CMD: trevis-127vm1 /usr/sbin/lctl set_param mdd.lustre-MDT0002.changelog_mask=+hsm
      mdd.lustre-MDT0002.changelog_mask=+hsm
      CMD: trevis-127vm1 /usr/sbin/lctl --device lustre-MDT0002 changelog_register -n
      CMD: trevis-127vm2 /usr/sbin/lctl get_param mdd.lustre-MDT0003.changelog_mask -n
      CMD: trevis-127vm2 /usr/sbin/lctl set_param mdd.lustre-MDT0003.changelog_mask=+hsm
      mdd.lustre-MDT0003.changelog_mask=+hsm
      CMD: trevis-127vm2 /usr/sbin/lctl --device lustre-MDT0003 changelog_register -n
      Registered 4 changelog users: 'cl3 cl3 cl3 cl3'
      lustre-MDT0000: Registered changelog user cl3
      looking for dbench program
      /usr/bin/dbench
      found dbench client file /usr/share/dbench/client.txt
      '/usr/share/dbench/client.txt' -> 'client.txt'
      copying required dbench files to chroot directory /mnt/lustre/d2a.lustre-rsync-test
      '/usr/bin/dbench' -> 'dbench'
      lib64/libpopt.so.0
      lib64/libc.so.6
      lib64/ld-linux-x86-64.so.2
      running 'dbench 2 -t 60' in /mnt/lustre/d2a.lustre-rsync-test at Tue Dec 12 18:06:58 UTC 2023
      waiting for dbench pid 27630
      dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
      Running for 60 seconds with load 'client.txt' and minimum warmup 12 secs
      failed to create barrier semaphore 
      0 of 2 processes prepared for launch   0 sec
      2 of 2 processes prepared for launch   0 sec
      releasing clients
         2       670    42.34 MB/sec  warmup   1 sec  latency 243.676 ms
         2       761    21.39 MB/sec  warmup   2 sec  latency 221.017 ms
         2       946    14.56 MB/sec  warmup   3 sec  latency 191.389 ms
         2      1760    13.14 MB/sec  warmup   4 sec  latency 162.857 ms
         2      2699    12.44 MB/sec  warmup   5 sec  latency 143.470 ms
         2      3576    12.52 MB/sec  warmup   6 sec  latency 159.825 ms
         2      4270    11.53 MB/sec  warmup   7 sec  latency 155.305 ms
         2      4400    10.16 MB/sec  warmup   8 sec  latency 190.446 ms
         2      4698     9.29 MB/sec  warmup   9 sec  latency 184.812 ms
         2      5516     9.11 MB/sec  warmup  10 sec  latency 167.353 ms
         2      6324     9.29 MB/sec  warmup  11 sec  latency 211.108 ms
         2      7834     2.42 MB/sec  execute   1 sec  latency 154.295 ms
         2      7909     1.38 MB/sec  execute   2 sec  latency 174.146 ms
         2      8094     1.21 MB/sec  execute   3 sec  latency 188.842 ms
         2      8966     3.14 MB/sec  execute   4 sec  latency 153.121 ms
         2      9572     3.33 MB/sec  execute   5 sec  latency 168.781 ms
         2     10585     5.64 MB/sec  execute   6 sec  latency 154.108 ms
         2     11361     5.79 MB/sec  execute   7 sec  latency 132.248 ms
         2     11516     5.16 MB/sec  execute   8 sec  latency 147.133 ms
         2     12183     5.58 MB/sec  execute   9 sec  latency 185.267 ms
         2     13034     5.41 MB/sec  execute  10 sec  latency 107.650 ms
         2     13900     6.16 MB/sec  execute  11 sec  latency 195.091 ms
         2     14877     6.55 MB/sec  execute  12 sec  latency 145.665 ms
         2     14996     6.09 MB/sec  execute  13 sec  latency 140.691 ms
         2     15122     5.70 MB/sec  execute  14 sec  latency 170.249 ms
         2     15733     5.90 MB/sec  execute  15 sec  latency 165.990 ms
         2     16247     5.59 MB/sec  execute  16 sec  latency 204.489 ms
         2     16961     5.90 MB/sec  execute  17 sec  latency 160.806 ms
         2     17550     5.95 MB/sec  execute  18 sec  latency 155.253 ms
         2     18422     6.15 MB/sec  execute  19 sec  latency 119.982 ms
         2     18513     5.87 MB/sec  execute  20 sec  latency 157.840 ms
         2     18648     5.62 MB/sec  execute  21 sec  latency 194.348 ms
         2     19351     5.77 MB/sec  execute  22 sec  latency 164.213 ms
         2     19408     5.52 MB/sec  execute  23 sec  latency 969.378 ms
         2     20009     5.38 MB/sec  execute  24 sec  latency 421.285 ms
         2     20658     5.61 MB/sec  execute  25 sec  latency 171.661 ms
         2     21413     5.88 MB/sec  execute  26 sec  latency 156.010 ms
         2     21959     5.74 MB/sec  execute  27 sec  latency 82.070 ms
         2     22096     5.56 MB/sec  execute  28 sec  latency 141.527 ms
         2     22281     5.40 MB/sec  execute  29 sec  latency 162.308 ms
         2     22887     5.50 MB/sec  execute  30 sec  latency 158.106 ms
         2     23316     5.35 MB/sec  execute  31 sec  latency 151.646 ms
         2     23947     5.42 MB/sec  execute  32 sec  latency 177.765 ms
         2     24586     5.53 MB/sec  execute  33 sec  latency 171.891 ms
         2     25236     5.67 MB/sec  execute  34 sec  latency 118.584 ms
         2     25591     5.53 MB/sec  execute  35 sec  latency 131.897 ms
         2     25724     5.40 MB/sec  execute  36 sec  latency 136.553 ms
         2     26001     5.33 MB/sec  execute  37 sec  latency 172.208 ms
         2     26561     5.36 MB/sec  execute  38 sec  latency 108.752 ms
         2     27081     5.27 MB/sec  execute  39 sec  latency 163.033 ms
         2     27638     5.34 MB/sec  execute  40 sec  latency 175.445 ms
         2     28310     5.49 MB/sec  execute  41 sec  latency 130.378 ms
         2     28943     5.51 MB/sec  execute  42 sec  latency 86.489 ms
         2     29153     5.39 MB/sec  execute  43 sec  latency 180.277 ms
         2     29283     5.29 MB/sec  execute  44 sec  latency 148.732 ms
         2     29773     5.30 MB/sec  execute  45 sec  latency 125.766 ms
         2     30183     5.26 MB/sec  execute  46 sec  latency 93.603 ms
         2     30553     5.19 MB/sec  execute  47 sec  latency 288.305 ms
         2     31300     5.26 MB/sec  execute  48 sec  latency 125.802 ms
         2     32047     5.40 MB/sec  execute  49 sec  latency 85.231 ms
         2     32578     5.40 MB/sec  execute  50 sec  latency 116.088 ms
         2     32775     5.31 MB/sec  execute  51 sec  latency 143.543 ms
         2     33211     5.30 MB/sec  execute  52 sec  latency 133.406 ms
         2     33654     5.28 MB/sec  execute  53 sec  latency 180.276 ms
         2     34159     5.23 MB/sec  execute  54 sec  latency 208.215 ms
         2     34715     5.27 MB/sec  execute  55 sec  latency 241.709 ms
         2     35719     5.47 MB/sec  execute  56 sec  latency 119.864 ms
         2     36221     5.41 MB/sec  execute  57 sec  latency 173.668 ms
         2     36352     5.33 MB/sec  execute  58 sec  latency 132.522 ms
         2     36769     5.33 MB/sec  execute  59 sec  latency 160.981 ms
         2  cleanup  60 sec
         0  cleanup  60 sec
       Operation      Count    AvgLat    MaxLat
       ----------------------------------------
       NTCreateX      10567     2.825   123.841
       Close           7710     0.642    24.838
       Rename           440     3.282    41.200
       Unlink          2156     0.976    22.880
       Qpathinfo       9492     0.636    38.188
       Qfileinfo       1629     0.076     3.208
       Qfsinfo         1719     0.018     1.539
       Sfileinfo        868     2.031    39.754
       Find            3625     0.296    41.384
       WriteX          5127     0.449    36.828
       ReadX          16023     0.009     0.055
       LockX             32     0.440     0.998
       UnlockX           32     0.504     1.832
       Flush            745    93.845   969.365
      Throughput 5.32869 MB/sec  2 clients  2 procs  max_latency=969.378 ms
      stopping dbench in /mnt/lustre/d2a.lustre-rsync-test at Tue Dec 12 18:08:10 UTC 2023 with return code 0
      clean dbench files in /mnt/lustre/d2a.lustre-rsync-test
      /mnt/lustre/d2a.lustre-rsync-test /mnt/lustre/d2a.lustre-rsync-test
      removed directory 'clients/client0/~dmtmp/PARADOX'
      removed directory 'clients/client0/~dmtmp/PM'
      removed directory 'clients/client0/~dmtmp/COREL'
      removed directory 'clients/client0/~dmtmp/ACCESS'
      removed directory 'clients/client0/~dmtmp/WORD'
      removed directory 'clients/client0/~dmtmp/EXCEL'
      removed directory 'clients/client0/~dmtmp/SEED'
      removed directory 'clients/client0/~dmtmp/WORDPRO'
      removed directory 'clients/client0/~dmtmp/PWRPNT'
      removed directory 'clients/client0/~dmtmp'
      removed directory 'clients/client0'
      removed directory 'clients/client1/~dmtmp/PARADOX'
      removed directory 'clients/client1/~dmtmp/PM'
      removed directory 'clients/client1/~dmtmp/COREL'
      removed directory 'clients/client1/~dmtmp/ACCESS'
      removed directory 'clients/client1/~dmtmp/WORD'
      removed directory 'clients/client1/~dmtmp/EXCEL'
      removed directory 'clients/client1/~dmtmp/SEED'
      removed directory 'clients/client1/~dmtmp/WORDPRO'
      removed directory 'clients/client1/~dmtmp/PWRPNT'
      removed directory 'clients/client1/~dmtmp'
      removed directory 'clients/client1'
      removed directory 'clients'
      removed 'client.txt'
      removed 'dbench'
      removed 'lib64/libpopt.so.0'
      removed 'lib64/libc.so.6'
      removed 'lib64/ld-linux-x86-64.so.2'
      /mnt/lustre/d2a.lustre-rsync-test
      dbench successfully finished
      

       

      ======= 18:06:53 \(1702404413\)
      [  302.317896] Lustre: DEBUG MARKER: == lustre-rsync-test test 2a: Replicate files created by dbench. ========================================================== 18:06:53 (1702404413)
      [  305.333993] Lustre: lustre-OST0005-osc-ffff895045192000: disconnect after 24s idle
      [  305.335802] Lustre: Skipped 1 previous similar message
      [  407.730814] Lustre: lustre-OST0000-osc-ffff895045192000: disconnect after 24s idle
      [  412.850667] Lustre: lustre-OST0006-osc-ffff895045192000: disconnect after 23s idle
      [  412.852206] Lustre: Skipped 6 previous similar messages
      [ 5749.820145] sysrq: SysRq : Changing Loglevel
      [ 5749.821211] sysrq: Loglevel set to 8
      [ 5749.823561] sysrq: SysRq : Show State
      [ 5749.824341] task:systemd         state:S stack:    0 pid:    1 ppid:     0 flags:0x00000000
      [ 5749.825986] Call Trace:
      [ 5749.826560]  __schedule+0x2d1/0x870
      [ 5749.827342]  schedule+0x55/0xf0
      [ 5749.828014]  schedule_hrtimeout_range_clock+0xf9/0x110
      [ 5749.829063]  ? ep_scan_ready_list.constprop.23+0x210/0x230
      [ 5749.830174]  ? __handle_mm_fault+0x453/0x6c0
      [ 5749.831071]  ep_poll+0x3f7/0x450
      [ 5749.831764]  ? wake_up_q+0x70/0x70
      [ 5749.832504]  do_epoll_wait+0xb8/0xe0
      [ 5749.833238]  __x64_sys_epoll_wait+0x1a/0x30
      [ 5749.834103]  do_syscall_64+0x5b/0x1b0
      [ 5749.834897]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
      [ 5749.835917] RIP: 0033:0x7fdbde375e87
      [ 5749.836701] Code: Unable to access opcode bytes at RIP 0x7fdbde375e5d.
      [ 5749.837975] RSP: 002b:00007fff155034e0 EFLAGS: 00000293 ORIG_RAX: 00000000000000e8
      [ 5749.839441] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fdbde375e87
      [ 5749.840835] RDX: 0000000000000058 RSI: 00007fff15503520 RDI: 0000000000000004
      [ 5749.842235] RBP: 00007fff15503520 R08: 0000000000000000 R09: 00007fff15503520
      [ 5749.843611] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000058
      [ 5749.844995] R13: 00000000ffffffff R14: ffffffffffffffff R15: 00007fff15503aa0
      [ 5749.846380] task:kthreadd        state:S stack:    0 pid:    2 ppid:     0 flags:0x80004000

      <<Please provide additional information about the failure here>>

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      lustre-rsync-test test_2a - Timeout occurred after 97 minutes, last suite running was lustre-rsync-test

      Attachments

        Issue Links

          Activity

            People

              yujian Jian Yu
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated: