Lustre / LU-16658

performance-sanity test_6 mdsrate-lookup-10dirs - UCX ERROR

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical

    Description

      This issue was created by maloo for Cliff White <cwhite@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d2a273de-f0d8-426b-adb8-ccd3fed94347

      The client appears to hang or drop its connection while doing the initial file/directory creation.

      ===== mdsrate-lookup-10dirs.sh Test preparation: creating 10 dirs with 12650 files.
      + /usr/lib64/openmpi/bin/mdsrate --mknod --ndirs 10 --dirfmt '/mnt/lustre/mdsrate/lookup-%d' --nfiles 12650 --filefmt 'f%%d'
      + chmod 0777 /mnt/lustre
      drwxrwxrwx 4 root root 4096 Jul 19 19:11 /mnt/lustre
      + su mpiuser sh -c "/usr/lib64/openmpi/bin/mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -mca boot ssh --oversubscribe --oversubscribe -machinefile /tmp/auster.machines -np 10 /usr/lib64/openmpi/bin/mdsrate --mknod --ndirs 10 --dirfmt '/mnt/lustre/mdsrate/lookup-%d' --nfiles 12650 --filefmt 'f%%d' "
      0: onyx-91vm11.onyx.whamcloud.com starting at Tue Jul 19 19:11:39 2022
      [1658257899.971284] [onyx-91vm12:60266:0]          select.c:514  UCX  ERROR   no active messages transport to <no debug data>: posix/memory - Destination is unreachable, sysv/memory - Destination is unreachable, self/memory0 - Destination is unreachable, tcp/eth0 - Destination is unreachable, tcp/lo - Destination is unreachable
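
For triage, it can help to pull the host and the list of rejected transports out of a UCX error line like the one above. The following is a minimal sketch, not part of the test suite: the function name `parse_ucx_error` and the field names are hypothetical, and the line layout (timestamp, `[host:pid:index]`, source location, level, message) is assumed from this single log line only.

```python
import re

# Layout assumed from the one line in this report:
# [timestamp] [host:pid:index]  file.c:line  UCX  LEVEL  message
UCX_LINE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):(?P<index>\d+)\]\s+"
    r"(?P<src>\S+):(?P<lineno>\d+)\s+UCX\s+(?P<level>\w+)\s+(?P<msg>.*)$"
)

# Sample copied from the log above.
SAMPLE = (
    "[1658257899.971284] [onyx-91vm12:60266:0]          select.c:514  UCX  ERROR   "
    "no active messages transport to <no debug data>: "
    "posix/memory - Destination is unreachable, "
    "sysv/memory - Destination is unreachable, "
    "self/memory0 - Destination is unreachable, "
    "tcp/eth0 - Destination is unreachable, "
    "tcp/lo - Destination is unreachable"
)


def parse_ucx_error(line):
    """Return a dict describing a UCX log line, or None if it does not match."""
    m = UCX_LINE.match(line.strip())
    if m is None:
        return None
    # Everything after the first ": " is treated as a comma-separated list
    # of "transport/device - reason" entries (an assumption about the format).
    _, _, tail = m.group("msg").partition(": ")
    transports = {}
    for item in tail.split(", "):
        name, sep, reason = item.partition(" - ")
        if sep:
            transports[name.strip()] = reason.strip()
    return {
        "timestamp": float(m.group("ts")),
        "host": m.group("host"),
        "pid": int(m.group("pid")),
        "source": m.group("src"),
        "level": m.group("level"),
        "transports": transports,
    }


if __name__ == "__main__":
    info = parse_ucx_error(SAMPLE)
    print(info["host"], info["level"])
    for name, reason in info["transports"].items():
        print(f"  {name}: {reason}")
```

Run against the line in this report, the sketch shows that every transport UCX considered on onyx-91vm12, including `tcp/eth0` and `tcp/lo`, was reported as unreachable.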
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      performance-sanity test_6 - Timeout occurred after 610 minutes, last suite running was performance-sanity

            People

              wc-triage WC Triage
              adilger Andreas Dilger