  1. Lustre
  2. LU-6095

racer.sh should propagate $TRUNCATE to remote clients

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.7.0
    • Lustre 2.7.0
    • 3
    • 16970

    Description

      While running racer with 2 clients, I noticed that $TRUNCATE (set by init_test_env()) was not defined when file_truncate.sh was executed on the remote clients. It needs to be added to the list of variables propagated in test_1():

              for rdir in $RDIRS; do
                      do_nodes $clients "DURATION=$DURATION MDSCOUNT=$MDSCOUNT \
                                         $racer $rdir $NUM_RACER_THREADS" &
                      pid=$!
                      rpids="$rpids $pid"
              done
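
A minimal sketch of the propagation, not the actual patch. The do_nodes function below is a hypothetical local stand-in that runs the command string in a fresh shell with a nearly empty environment, roughly as a remote shell would. The node list, the TRUNCATE path, and the use of printenv in place of racer.sh are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for do_nodes(): run the command string in a fresh
# shell with an almost empty environment instead of on remote nodes.
do_nodes() {
    nodes=$1
    shift
    env -i PATH="$PATH" sh -c "$*"
}

clients="client1,client2"      # placeholder node list
DURATION=5
MDSCOUNT=1
TRUNCATE=/usr/bin/truncate     # placeholder; normally set by init_test_env()
racer="printenv TRUNCATE"      # stand-in for racer.sh: report what it sees

# Before: TRUNCATE is not on the command line, so the other side sees nothing.
before=$(do_nodes $clients "DURATION=$DURATION MDSCOUNT=$MDSCOUNT $racer") || true

# After: TRUNCATE is passed the same way as DURATION and MDSCOUNT.
after=$(do_nodes $clients "DURATION=$DURATION MDSCOUNT=$MDSCOUNT \
    TRUNCATE=$TRUNCATE $racer")

echo "before='$before' after='$after'"
```

Passing the variable on the command line keeps the change local to test_1() and follows the pattern already used for DURATION and MDSCOUNT.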
      

      In file_truncate.sh we should test that $TRUNCATE is set, exists and is executable before continuing

      DIR=$1
      MAX=$2
      
      while true; do
              file=$DIR/$((RANDOM % MAX))
              $TRUNCATE $file $RANDOM 2> /dev/null
      done
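
The check suggested above could look roughly like the following. This is a sketch under assumptions: check_truncate is a hypothetical helper name, not something that exists in the Lustre test scripts, and the candidate paths are only there to exercise the three cases (unset, missing, usable).

```shell
#!/bin/sh
# check_truncate is a hypothetical helper, not part of the Lustre test scripts.
# It succeeds only if $TRUNCATE is set and names an executable regular file.
check_truncate() {
    if [ -z "$TRUNCATE" ]; then
        echo "TRUNCATE is not set" >&2
        return 1
    fi
    if [ ! -f "$TRUNCATE" ] || [ ! -x "$TRUNCATE" ]; then
        echo "TRUNCATE='$TRUNCATE' does not exist or is not executable" >&2
        return 1
    fi
    return 0
}

# Exercise the helper with an empty value, a missing path, and a real binary
# (sh is used only because it is certain to exist and be executable).
for candidate in "" /nonexistent/truncate "$(command -v sh)"; do
    TRUNCATE=$candidate
    if check_truncate 2>/dev/null; then
        echo "TRUNCATE='$TRUNCATE': ok"
    else
        echo "TRUNCATE='$TRUNCATE': rejected"
    fi
done
```

In file_truncate.sh, a call such as "check_truncate || exit 1" placed before the while loop would stop the script early instead of letting it run with an empty command.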
      

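      With $TRUNCATE unset, the command line expands to "$file $RANDOM", so the racer file itself is executed with a random number as its argument. The net result of this is a lot of copies of sleep, each sleeping for $RANDOM seconds: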

      root     13252     1  0 25226   548   1 10:59 ?        00:00:00   /mnt/lustre/racer/11 19675
      root     13505     1  0 25226   544   1 10:59 ?        00:00:00   /mnt/lustre2/racer/11 21574
      root     13815     1  0 25226   544   1 10:59 ?        00:00:00   /mnt/lustre2/racer/11 21401
      root     14293     1  0 25226   548   1 10:59 ?        00:00:00   /mnt/lustre2/racer/10 10467
      root     19131     1  0 25226   548   1 11:00 ?        00:00:00   /mnt/lustre/racer/13 12387
      root     21592     1  0 25226   532   0 11:00 ?        00:00:00   /mnt/lustre2/racer3/16 4617
      root     21860     1  0 25226   532   0 11:00 ?        00:00:00   /mnt/lustre/racer3/16 21599
      root     27685     1  0 25226   544   0 11:00 ?        00:00:00   /mnt/lustre2/racer1/11 16802
      root     31564     1  0 25226   544   1 11:00 ?        00:00:00   /mnt/lustre/racer1/2 11079
      root      5097     1  0 25226   536   1 11:00 ?        00:00:00   /mnt/lustre2/racer2/13 25438
      root     16030     1  0 25226   544   0 11:01 ?        00:00:00   /mnt/lustre2/racer2/14 21194
      root     21334     1  0 25226   532   0 11:01 ?        00:00:00   /mnt/lustre/racer2/3 29716
      root     25177     1  0 25226   536   1 11:01 ?        00:00:00   /mnt/lustre2/racer1/5 15303
      root     11551     1  0 25226   532   1 11:02 ?        00:00:00   /mnt/lustre2/racer3/11 27347
      root     12460     1  0 25226   544   1 11:02 ?        00:00:00   /mnt/lustre/racer2/2 26261
      root     14345     1  0 25226   536   0 11:02 ?        00:00:00   /mnt/lustre2/racer3/7 16972
      root     15573     1  0 25226   548   1 11:02 ?        00:00:00   /mnt/lustre2/racer2/15 13108
      root     16148     1  0 25226   536   0 11:02 ?        00:00:00   /mnt/lustre/racer3/7 13826
      root     17143     1  0 25226   544   1 11:03 ?        00:00:00   /mnt/lustre/racer1/3 15848
      root     11075     1  0 25226   548   0 11:04 ?        00:00:00   /mnt/lustre/racer3/18 19787
      root     27160     1  0 25226   548   1 11:07 ?        00:00:00   /mnt/lustre/racer2/5 21942
      root     10983     1  0 25226   532   0 11:09 ?        00:00:00   /mnt/lustre2/racer1/16 31928
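
The process names in the listing follow from ordinary shell word expansion: an unset, unquoted variable disappears from the command line, and the next word becomes the command. A small illustration, assuming basename as a harmless stand-in for an executable racer file and 19675 as a sample value of $RANDOM:

```shell
#!/bin/sh
unset TRUNCATE

# Stand-in for an executable racer file (in the report these were copies of
# sleep). basename simply prints its argument back.
file=$(command -v basename)

# Same shape as the line in file_truncate.sh. With TRUNCATE unset, the shell
# runs "$file 19675", so the file is executed and 19675 is its argument.
out=$($TRUNCATE $file 19675)
echo "executed $file with argument: $out"
```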
      

      In turn this makes it impossible for racer to complete successfully.

      The other subscripts should be audited for similar issues.

          People

            jhammond John Hammond
            jhammond John Hammond