  1. Lustre
  2. LU-6092

Segmentation fault and bus error during racer

Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • None
    • None
    • 3
    • 16956

    Description

      Although there is no kernel panic, a number of bus errors and segmentation faults occur during racer, even with a single MDT on current master.

      == racer test 1: racer on clients: testnode DURATION=300 == 23:16:46 (1420615006)
      racers pids: 5216 5217
      ./file_exec.sh: line 12:  6093 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12:  8638 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 19954 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 29676 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 46760 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 51388 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 76169 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 96465 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 101751 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 103864 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 113629 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 118052 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 121051 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12:  8462 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 11135 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 11357 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 51034 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 60066 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 60784 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 68772 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 78361 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 96118 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 97719 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 102173 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 107360 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 116832 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 120715 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ^C./file_exec.sh: line 12: 122723 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 128525 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 17964 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 29243 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 44950 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 46846 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 50790 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 72941 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 84007 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 104448 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 107871 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 113919 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 12260 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 38650 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 39337 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 39855 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 48581 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 52954 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 57474 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 58930 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 65759 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 82054 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 94833 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 103710 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ^C./file_exec.sh: line 12:  8509 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 20436 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 35678 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 43675 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 43916 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 57146 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 62439 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 72712 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 75564 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 78896 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 85944 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      
      ^C./file_exec.sh: line 12: 112681 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 117736 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 130348 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 12633 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 22754 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 26036 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 26263 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 28503 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 31072 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 39816 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 51044 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 57162 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 57836 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      ./file_exec.sh: line 12: 62113 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
      file_create.sh: no process killed
      dir_create.sh: no process killed
      file_rm.sh: no process killed
      file_create.sh: no process killed
      file_rename.sh: no process killed
      file_link.sh: no process killed
      dir_create.sh: no process killed
      file_rm.sh: no process killed
      file_symlink.sh: no process killed
      file_list.sh: no process killed
      file_rename.sh: no process killed
      file_concat.sh: no process killed
      file_link.sh: no process killed
      file_exec.sh: no process killed
      file_symlink.sh: no process killed
      file_list.sh: no process killed
      file_chown.sh: no process killed
      file_chmod.sh: no process killed
      file_concat.sh: no process killed
      file_exec.sh: no process killed
      file_mknod.sh: no process killed
      file_truncate.sh: no process killed
      file_chown.sh: no process killed
      file_delxattr.sh: no process killed
      file_chmod.sh: no process killed
      file_mknod.sh: no process killed
      file_getxattr.sh: no process killed
      file_truncate.sh: no process killed
      file_setxattr.sh: no process killed
      file_delxattr.sh: no process killed
      file_getxattr.sh: no process killed
      file_setxattr.sh: no process killed
      Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
      racer cleanup
      sleeping 5 sec ...
      there should be NO racer processes:
      USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      Filesystem           1K-blocks  Used Available Use% Mounted on
      testnode@tcp:/lustre    374928 54024    297708  16% /mnt/lustre2
      We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
      Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
      racer cleanup
      sleeping 5 sec ...
      there should be NO racer processes:
      USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      Filesystem           1K-blocks  Used Available Use% Mounted on
      testnode@tcp:/lustre    374928 54024    297708  16% /mnt/lustre
      We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
      test_1 returned 129
      FAIL 1 (313s)
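
      For a quick tally of how the file_exec.sh children died, console output like the above can be piped through a small filter. This is only a sketch: summarize_crashes is a hypothetical helper, not part of racer, and it assumes the console output has been saved to a file.

```shell
#!/bin/bash
# Hypothetical helper (not part of racer): tally the messages bash prints
# when a file_exec.sh child is killed by SIGBUS or SIGSEGV.
# Reads console output on stdin; prints one "<count> <message>" line per kind.
summarize_crashes() {
	# Keep only the signal description from each line, then count each kind.
	grep -oE 'Bus error|Segmentation fault' | sort | uniq -c |
		awk '{ n = $1; $1 = ""; sub(/^ /, ""); printf "%d %s\n", n, $0 }'
}

# Usage (the log file name is an assumption):
#   summarize_crashes < racer-console.log
```

      The filter only counts messages; it does not distinguish runs that dumped core from those that did not.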
      

      I ran racer with the following patch applied:

      [root@testnode tests]# git diff
      diff --git a/lustre/tests/racer/file_create.sh b/lustre/tests/racer/file_create.sh
      index e615365..62eb8bb 100755
      --- a/lustre/tests/racer/file_create.sh
      +++ b/lustre/tests/racer/file_create.sh
      @@ -9,7 +9,7 @@ OSTCOUNT=${OSTCOUNT:-$(lfs df $DIR 2> /dev/null | grep -c OST)}
       while /bin/true ; do 
              file=$((RANDOM % MAX))
              # $RANDOM is between 0 and 32767, and we want $blockcount in 64kB units
      -       blockcount=$((RANDOM * MAX_MB / 32 / 64))
      +       blockcount=$((RANDOM % 4))
              stripecount=$((RANDOM % (OSTCOUNT + 1)))
              [ $OSTCOUNT -gt 0 ] &&
                      lfs setstripe -c $stripecount $DIR/$file 2> /dev/null
      diff --git a/lustre/tests/racer/racer.sh b/lustre/tests/racer/racer.sh
      index deef18e..3ed624e 100755
      --- a/lustre/tests/racer/racer.sh
      +++ b/lustre/tests/racer/racer.sh
      @@ -17,7 +17,7 @@ file_list file_concat file_exec file_chown file_chmod file_mknod file_truncate \
       file_delxattr file_getxattr file_setxattr"
       
       if [ $MDSCOUNT -gt 1 ]; then
      -       RACER_PROGS="${RACER_PROGS} dir_remote dir_migrate"
      +       RACER_PROGS="${RACER_PROGS} dir_remote"
       fi
       
       racer_cleanup()
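
      The first hunk of the patch shrinks the files that file_create.sh writes. A rough comparison of the two expressions can be sketched as follows. MAX_MB is assumed to be 8 here (its value is not shown in this report), and old_blockcount and new_blockcount are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Sketch only: compare the block counts (in 64kB units, per the comment in
# file_create.sh) produced by the original and the patched expression.
old_blockcount() {	# $1 = a $RANDOM value (0..32767), $2 = MAX_MB
	echo $(( $1 * $2 / 32 / 64 ))
}
new_blockcount() {	# $1 = a $RANDOM value (0..32767)
	echo $(( $1 % 4 ))
}

MAX_MB=8	# assumed value, not taken from this report
echo "old max: $(old_blockcount 32767 $MAX_MB) blocks"
echo "new max: $(new_blockcount 32767) blocks"
```

      Under that assumption the original expression tops out at 127 blocks (just under 8 MB), while the patched one never exceeds 3 blocks (192 kB).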
      
      

            People

              wc-triage WC Triage
              di.wang Di Wang