Lustre / LU-643

runracer lockup

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • None
    • None
    • 3
    • 6563

    Description

      Running the runracer test ends up hanging on our machines.

      == runracer test 1: racer on clients: spoon01,spoon02,spoon03,spoon06,spoon07,spoon08,spoon09,spoon14,spoon15,spoon16,spoon17,spoon18,spoon19,spoon22,spoon23,spoon24,spoon25,spoon26,spoon27,spoon28,spoon29,spoon30,spoon31,spoon33,spoon34,spoon35,spoon36,spoon37,spoon38,spoon39,spoon40,spoon41 DURATION=120 ====================================================================================================== 10:40:49 (1314369649)
      TIMERPID=9594
      RACERPID= 9596
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=149160
      file_create: SIZE=17544
      file_create: SIZE=67352
      file_create: SIZE=61496
      file_create: SIZE=30424
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=161096
      file_create: SIZE=98816
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=170752
      file_create: SIZE=218840
      file_create: SIZE=87640
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=133560
      file_create: SIZE=137520
      file_create: SIZE=108440
      file_create: SIZE=19608
      file_create: SIZE=237256
      file_create: SIZE=257904
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=55016
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=99512
      file_create: SIZE=47928
      file_create: SIZE=125936
      file_create: SIZE=205856
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=78664
      file_create: SIZE=139200
      file_create: SIZE=204696
      file_create: SIZE=225752
      file_create: SIZE=253000
      file_create: SIZE=190752
      file_create: SIZE=109752
      file_create: SIZE=227960
      file_create: SIZE=216624
      file_create: SIZE=227512
      file_create: SIZE=211040
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=106928
      file_create: SIZE=259688
      file_create: SIZE=17168
      file_create: SIZE=29408
      file_create: SIZE=39400
      file_create: SIZE=77144
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=113376
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_createfile_create: SIZE=161328
      file_create: SIZE=5248
      file_create: SIZE=102216
      file_create: SIZE=102440
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=38592
      file_create: SIZE=180296
      file_create: SIZE=76256
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=240672
      Running /chexport/users/jsimmons/tests/racer/racer.sh for 120 seconds. CTRL-C to exit
      file_create: SIZE=7592
      file_create: SIZE=161672
      file_create: SIZE=143384
      file_create: SIZE=189600
      file_create: SIZE=185664
      file_create: SIZE=63744
      file_create: SIZE=190152

      ....
      spoon02: file_create.sh: no process killed
      spoon03: file_create.sh: no process killed
      spoon02: dir_create.sh: no process killed
      spoon02: file_rm.sh: no process killed
      spoon03: file_rm.sh: no process killed
      spoon02: file_rename.sh: no process killed
      spoon15: file_create.sh: no process killed
      spoon03: file_rename.sh: no process killed
      spoon02: file_link.sh: no process killed
      spoon03: file_link.sh: no process killed
      spoon02: file_symlink.sh: no process killed
      spoon07: file_create.sh: no process killed
      ....
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      racer cleanup
      sleeping 5 sec ...
      racer cleanup
      sleeping 5 sec ...

      Waited 50, rc=1 sleeping 40 sec ...
      Waited 50, rc=1 sleeping 40 sec ...
      Waited 110, rc=1 root 22229 0.0 0.0 8688 392 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 22230 0.0 0.0 8688 388 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 22239 0.0 0.0 8688 392 ? D 10:42 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 22299 0.0 0.0 6052 680 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Waited 110, rc=2 root 6617 0.1 0.0 10740 1164 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 21713 0.0 0.0 8688 388 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21732 0.0 0.0 8688 392 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21801 0.0 0.0 6056 720 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Waited 110, rc=2 root 6611 0.1 0.0 10740 1160 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 6620 0.1 0.0 10740 1168 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 21932 0.0 0.0 8688 392 ? D 10:42 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21939 0.0 0.0 8688 388 ? D 10:42 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21943 0.0 0.0 8688 384 ? D 10:42 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21997 0.0 0.0 6056 720 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Waited 110, rc=2 root 6683 0.1 0.0 10740 1164 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 21542 0.0 0.0 8688 388 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21601 0.0 0.0 6056 724 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      ...
      root 6626 0.1 0.0 10740 1164 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 21440 0.0 0.0 8688 388 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21446 0.0 0.0 8688 388 ? D 10:41 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 21536 0.0 0.0 6056 736 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Waited 110, rc=1 root 24540 0.0 0.0 8688 392 ? D 10:42 0:00 /bin/bash ./file_concat.sh /lustre/barry/racer 20
      root 24625 0.0 0.0 6056 732 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Waited 110, rc=1 root 6604 0.1 0.0 10740 1164 ? D 10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 6624 0.1 0.0 10740 1164 ? D spoon14: df: `/lustre/barry/racer': Cannot send after transport endpoint shutdown

      10:40 0:00 /bin/bash ./dir_create.sh /lustre/barry/racer 20
      root 21597 0.0 0.0 6056 724 ? S 10:44 0:00 grep -E file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat
      Filesystem 1K-blocks Used Available Use% Mounted on
      10.37.248.61@o2ib1:/lustre
      44095408 5241296 36610248 13% /lustre/barry
      Filesystem 1K-blocks Used Available Use% Mounted on

      Then it hangs.

      I attached the MDS kernel logs.
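
      For anyone triaging a similar hang, the following is a minimal sketch, not part of the original report, that pulls out the racer scripts stuck in uninterruptible sleep (STAT beginning with "D") from `ps aux`-style output like the listing above. The function name racer_dstate is hypothetical; the script-name pattern is copied from the grep shown in the log, and the field positions assume the standard `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND).

```shell
# Sketch: list racer scripts in uninterruptible sleep (STAT starts with "D").
# Reads `ps aux`-style lines on stdin, so it works on live output or saved logs.
# Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
racer_dstate() {
    awk '
    $8 ~ /^D/ && /(file_create|dir_create|file_rm|file_rename|file_link|file_symlink|file_list|file_concat)\.sh/ {
        # Print PID, state, then the full command line (fields 11 onward).
        printf "%s %s", $2, $8
        for (i = 11; i <= NF; i++) printf " %s", $i
        printf "\n"
    }'
}
```

      On a client this could be run as `ps aux | racer_dstate`, or against a saved copy of the output with `racer_dstate < saved_ps_output.txt` (file name illustrative). The `grep -E` lines in the listing are in state S and are therefore skipped.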

          People

            rread Robert Read (Inactive)
            simmonsja James A Simmons