Lustre / LU-14849

sanity test 30d sporadically fails (Lustre 2.14)

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.14.0
    • 2 MDS, 2 OSS, and 2 clients, with four 1T disks for each OSS. The test framework (test-framework.sh) configures each disk as a ZFS pool containing an OST dataset, created using mkfs.lustre with --device-size=400000 (defined in local.sh).
    • 3

    Description

      sanity test 30d (see code below) is a new test in Lustre 2.14. It runs a 128M dd ten times, each time writing to the same file name and removing the file afterward. It fails occasionally with "no space left on device". We suspect the failure is related to a large file (128M) being repeatedly created and deleted in a small dataset (400M), causing a space reclamation problem on the drive. For now we plan to increase the size of the OST dataset in each pool to 1G to mitigate the problem.

      We have the following questions:

      1. How should the OST dataset be sized for Lustre tests? We could make it much larger, but we are concerned that this would cause problems for other Lustre tests. For example, some tests may be expected to fail with a "no space left" error, and with a very large dataset it may take a long time to reach that failure state.

      2. Although dd repeatedly writes to a relatively small dataset, the file is still within the space limit by a wide margin. Is the out-of-space error then reasonable behavior?
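      The margin in question 2 can be put into numbers. The sketch below is only illustrative: it assumes that --device-size is given in KB, that the whole file lands on a single OST dataset, and that metadata overhead is ignored. The function names are hypothetical and not part of test-framework.sh. It shows how many 128M objects fit in the dataset if space from earlier deletions has not been released yet.

```shell
#!/bin/bash
# Illustrative arithmetic only. Assumptions: sizes are in KB, the file is
# stored on one OST dataset, and filesystem overhead is ignored.

# Number of whole files of size f_kb that fit in a dataset of dev_kb.
copies_that_fit() {
	local dev_kb=$1 f_kb=$2
	echo $((dev_kb / f_kb))
}

# Percentage of the dataset still free after writing one file.
margin_pct() {
	local dev_kb=$1 f_kb=$2
	echo $(((dev_kb - f_kb) * 100 / dev_kb))
}

file_kb=$((128 * 1024))	# one 128M dd output file

echo "400000 KB dataset: $(copies_that_fit 400000 $file_kb) copies," \
	"$(margin_pct 400000 $file_kb)% free after one file"
echo "1G dataset: $(copies_that_fit $((1024 * 1024)) $file_kb) copies"
```

      Under these assumptions a single file leaves about two thirds of the dataset free, but only three unreclaimed copies fit in total, so a short delay in releasing deleted objects would be enough to exhaust the space within the ten iterations.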

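      test_30d() {
              # Copy dd into the Lustre mount so that the binary itself runs from Lustre.
              cp $(which dd) $DIR || error "failed to copy dd to $DIR/dd"
              for i in {1..10}; do
                      # Write a 128M file in the background.
                      $DIR/dd bs=1M count=128 if=/dev/zero of=$DIR/$tfile &
                      local PID=$!
                      sleep 1
                      # Clear the client's MDT lock LRU while dd is still running.
                      $LCTL set_param ldlm.namespaces.*MDT*.lru_size=clear
                      wait $PID || error "executing dd from Lustre failed"
                      # Remove the file before the next iteration recreates it.
                      rm -f $DIR/$tfile
              done
              rm -f $DIR/dd
      }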
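
      One way to check the space-reclamation hypothesis is to wait after each rm until the filesystem reports enough free space again, and see whether the failure disappears. The helper below is a minimal sketch under that assumption; wait_for_free_kb is a hypothetical name, not an existing test-framework.sh function, and it relies only on the standard df utility.

```shell
#!/bin/bash
# Hypothetical diagnostic helper (not part of test-framework.sh): wait until
# at least need_kb KB are available on the filesystem that holds dir, or give
# up after timeout seconds. Returns 0 on success, 1 on timeout.
wait_for_free_kb() {
	local dir=$1 need_kb=$2 timeout=${3:-30}
	local avail_kb i

	for ((i = 0; i <= timeout; i++)); do
		# With -P and -k, df prints a header line followed by one line
		# whose fourth column is the available space in 1K blocks.
		avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
		[[ -n "$avail_kb" ]] && ((avail_kb >= need_kb)) && return 0
		((i < timeout)) && sleep 1
	done
	return 1
}
```

      In the test loop this could be called right after rm -f $DIR/$tfile, for example as wait_for_free_kb $DIR $((128 * 1024)). Note that df on a Lustre mount reports space aggregated over all OSTs, so a single full OST may not show up here; lfs df gives a per-OST view.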
      
      

          People

            wc-triage WC Triage
            xiaolinzang Xiaolin Zang