Lustre / LU-9878

conf-sanity test_103 hangs

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      During a test run, conf-sanity test 103 hung for reasons that appear to be unrelated to the patch under test. The MDS console log up to the autotest timeout:

      00:37:23:[11569.772808] Lustre: DEBUG MARKER: == conf-sanity test 103: rename filesystem name ====================================================== 00:35:44 (1502411744)
      00:37:23:[11570.200489] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      00:37:23:[11570.200489] mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      00:37:23:[11570.200489] if [ $running -ne $mpts ]; then
      00:37:23:[11570.200489]     echo $(hostname) env are INSANE!;
      00:37:23:[11570.200489]     exit 1;
      00:37:23:[11570.200489] fi
      00:37:23:[11570.380846] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      00:37:23:[11570.380846] mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      00:37:23:[11570.380846] if [ $running -ne $mpts ]; then
      00:37:23:[11570.380846]     echo $(hostname) env are INSANE!;
      00:37:23:[11570.380846]     exit 1;
      00:37:23:[11570.380846] fi
      00:37:23:[11572.528609] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
      00:37:23:[11572.701297] Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs;
      00:37:23:[11572.701297] 			zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
      00:37:23:[11572.701297] 			zpool import -f -o cachefile=none -d /dev/lvm-Role_MDS lustre-mdt1
      00:37:23:[11573.161381] Lustre: DEBUG MARKER: zfs get -H -o value 						lustre:svname lustre-mdt1/mdt1
      00:37:23:[11573.343246] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre   		                   lustre-mdt1/mdt1 /mnt/lustre-mds1
      00:37:23:[11573.464732] Lustre: MGS: Connection restored to 9c67a2d3-a019-b56c-a070-c702d67eafeb (at 0@lo)
      00:37:23:[11573.621574] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
      00:37:23:[11573.623265] LustreError: 26027:0:(osd_oi.c:503:osd_oid()) lustre-MDT0000-osd: unsupported quota oid: 0x16
      00:37:23:[11573.771044] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
      00:37:23:[11573.956117] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
      00:37:23:[11574.492805] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-31vm9.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11574.492806] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-31vm9.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11574.609644] Lustre: DEBUG MARKER: onyx-31vm9.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11574.615286] Lustre: DEBUG MARKER: onyx-31vm9.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11574.716307] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
      00:37:23:[11574.900281] Lustre: DEBUG MARKER: zfs get -H -o value 				lustre:svname lustre-mdt1/mdt1 2>/dev/null | 				grep -E ':[a-zA-Z]{3}[0-9]{4}'
      00:37:23:[11575.084287] Lustre: DEBUG MARKER: zfs get -H -o value 				lustre:svname lustre-mdt1/mdt1 2>/dev/null | 				grep -E ':[a-zA-Z]{3}[0-9]{4}'
      00:37:23:[11575.271253] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname 		                           lustre-mdt1/mdt1 2>/dev/null
      00:37:23:[11575.456722] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
      00:37:23:[11576.599057] Lustre: MGS: Connection restored to da9cb746-6208-9aab-5381-204c8d43c685 (at 10.2.8.12@tcp)
      00:37:23:[11577.419338] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-31vm11.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11577.541160] Lustre: DEBUG MARKER: onyx-31vm11.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11578.569774] Lustre: 23937:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1502411752/real 1502411752]  req@ffff88006071cf00 x1575392887439952/t0(0) o8->lustre-OST0000-osc-MDT0000@10.2.8.12@tcp:28/4 lens 520/544 e 0 to 1 dl 1502411757 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      00:37:23:[11579.787473] Lustre: lustre-MDT0000: Connection restored to 10.2.8.12@tcp (at 10.2.8.12@tcp)
      00:37:23:[11579.788735] Lustre: Skipped 2 previous similar messages
      00:37:23:[11582.813270] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-31vm11.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11582.968662] Lustre: DEBUG MARKER: onyx-31vm11.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11585.034609] Lustre: MGS: Connection restored to caef6dd2-c1d9-1675-0f58-f295b6eee592 (at 10.2.8.4@tcp)
      00:37:23:[11585.966686] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-31vm4.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11586.090814] Lustre: DEBUG MARKER: onyx-31vm4.onyx.hpdd.intel.com: executing set_default_debug -1 all 4
      00:37:23:[11587.368362] Lustre: DEBUG MARKER: lctl get_param -n timeout
      00:37:23:[11587.588286] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20
      00:37:23:[11587.711735] Lustre: DEBUG MARKER: Using TIMEOUT=20
      00:37:23:[11587.819402] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      00:37:23:[11588.025163] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.sys.jobid_var='procname_uid'
      00:37:23:[11600.456652] Lustre: DEBUG MARKER: /usr/sbin/lctl pool_new lustre.pool1
      00:37:23:[11606.648352] Lustre: DEBUG MARKER: /usr/sbin/lctl pool_new lustre.lustre
      00:37:23:[11616.840882] Lustre: DEBUG MARKER: /usr/sbin/lctl pool_add lustre.lustre lustre-OST0000
      00:37:23:[11627.479861] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts
      00:37:23:[11627.661919] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
      00:37:23:[11629.791376] Lustre: lustre-MDT0000: Not available for connect from 10.2.8.12@tcp (stopping)
      00:37:23:[11629.792331] Lustre: Skipped 1 previous similar message
      00:37:23:[11629.874725] LustreError: 23938:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff880066b45800 x1575392887442944/t0(0) o13->lustre-OST0000-osc-MDT0000@10.2.8.12@tcp:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
      00:37:23:[11639.786607] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.2.8.12@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      00:37:23:[11639.786608] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.2.8.12@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      00:37:23:[11639.883695] Lustre: 27858:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1502411812/real 1502411812]  req@ffff880066b45800 x1575392887443040/t0(0) o251->MGC10.2.8.10@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1502411818 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
      00:37:23:[11639.903579] Lustre: server umount lustre-MDT0000 complete
      00:37:23:[11639.995624] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      00:37:23:[11647.554227] Lustre: DEBUG MARKER: tunefs.lustre --fsname=mylustre --rename=lustre -v lustre-mdt1/mdt1
      00:37:23:[11649.047260] Lustre: DEBUG MARKER: zpool export lustre-mdt1;
      00:37:23:[11649.047260] 			 zpool import -o cachefile=none -d /dev/lvm-Role_MDS lustre-mdt1 mylustre-mdt1
      00:37:23:[11654.460485] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      00:37:23:[11654.460485] mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      00:37:23:[11654.460485] if [ $running -ne $mpts ]; then
      00:37:23:[11654.460485]     echo $(hostname) env are INSANE!;
      00:37:23:[11654.460485]     exit 1;
      00:37:23:[11654.460485] fi
      00:37:23:[11654.675636] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      00:37:23:[11654.675636] mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      00:37:23:[11654.675636] if [ $running -ne $mpts ]; then
      00:37:23:[11654.675636]     echo $(hostname) env are INSANE!;
      00:37:23:[11654.675636]     exit 1;
      00:37:23:[11654.675636] fi
      00:37:23:[11655.293057] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
      00:37:23:[11655.497269] Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs;
      00:37:23:[11655.497269] 			zpool list -H mylustre-mdt1 >/dev/null 2>&1 ||
      00:37:23:[11655.497269] 			zpool import -f -o cachefile=none -d /dev/lvm-Role_MDS mylustre-mdt1
      00:37:23:[11655.680567] Lustre: DEBUG MARKER: zfs get -H -o value 						lustre:svname mylustre-mdt1/mdt1
      01:36:09:********** Timeout by autotest system **********
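
      For triage, the hang point can be narrowed down by finding the last DEBUG MARKER logged before the autotest timeout banner. In the log above that is the `zfs get -H -o value lustre:svname mylustre-mdt1/mdt1` marker at [11655.680567], after which nothing is logged until the timeout at 01:36:09. The following is only a minimal sketch of that search, assuming the console log format shown here; the function name `last_marker_before_timeout` and the regular expression are illustrative and are not part of the Lustre test framework.

```python
import re

# Matches the first line of a console entry such as:
#   00:37:23:[11655.293057] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
# Continuation lines of a multi-line marker do not repeat the
# "Lustre: DEBUG MARKER:" prefix, so only the first line is captured.
MARKER_RE = re.compile(
    r"\[\s*(?P<ktime>\d+\.\d+)\]\s+Lustre: DEBUG MARKER: (?P<cmd>.*)$"
)
TIMEOUT_TEXT = "Timeout by autotest system"


def last_marker_before_timeout(lines):
    """Return (kernel_time, command) for the last DEBUG MARKER seen before
    the timeout banner (or before end of input if there is no banner).
    Returns None when no marker is found."""
    last = None
    for line in lines:
        if TIMEOUT_TEXT in line:
            break
        match = MARKER_RE.search(line)
        if match:
            last = (float(match.group("ktime")), match.group("cmd").strip())
    return last


# Three lines taken from the end of the log in this ticket
# (whitespace inside the zfs command collapsed for readability).
sample = [
    "00:37:23:[11655.293057] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1",
    "00:37:23:[11655.680567] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname mylustre-mdt1/mdt1",
    "01:36:09:********** Timeout by autotest system **********",
]
print(last_marker_before_timeout(sample))
```

      The same function could be fed a full console log with `open(path)` as the line source; the path would depend on where the autotest results are stored.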
      
            People

              Assignee: wc-triage WC Triage
              Reporter: brian Brian Murrell (Inactive)