LU-17732: Sanity 30d fails sporadically due to overly small dataset size

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Severity: 3

    Description

      test_30d() {
          # run dd from a copy stored on the Lustre mount itself
          cp $(which dd) $DIR || error "failed to copy dd to $DIR/dd"

          for i in {1..10}; do
              # write 128 x 1M in the background
              $DIR/dd bs=1M count=128 if=/dev/zero of=$DIR/$tfile &
              local PID=$!
              sleep 1
              # clear the lock LRU of every namespace matching *MDT*
              $LCTL set_param ldlm.namespaces.*MDT*.lru_size=clear
              wait $PID || error "executing dd from Lustre failed"
              rm -f $DIR/$tfile
          done

          rm -f $DIR/dd
      }
       
      Each iteration therefore writes 128 blocks of 1M (128 MiB).  The test log shows an out-of-space error in the 5th iteration:
       
      ldlm.namespaces.lustre-MDT0000-lwp-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-MDT0000-mdc-ffff88847bdc0800.lru_size=clear
      ldlm.namespaces.lustre-MDT0001-mdc-ffff88847bdc0800.lru_size=clear
      ldlm.namespaces.lustre-MDT0001-osp-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0000-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0001-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0002-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0003-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0004-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0005-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0006-osc-MDT0000.lru_size=clear
      ldlm.namespaces.lustre-OST0007-osc-MDT0000.lru_size=clear
      ldlm.namespaces.mdt-lustre-MDT0000_UUID.lru_size=clear
      /mnt/lustre/dd: error writing '/mnt/lustre/f30d.sanity': No space left on device
      67+0 records in
      66+0 records out
      69206016 bytes (69 MB, 66 MiB) copied, 1.38986 s, 49.8 MB/s
       sanity test_30d: @@@@@@ FAIL: executing dd from Lustre failed
        Trace dump:
        = /usr/lib/lustre/tests/test-framework.sh:6273:error()
        = /usr/lib/lustre/tests/sanity.sh:3144:test_30d()
        = /usr/lib/lustre/tests/test-framework.sh:6576:run_one()
        = /usr/lib/lustre/tests/test-framework.sh:6623:run_one_logged()
        = /usr/lib/lustre/tests/test-framework.sh:6450:run_test()
        = /usr/lib/lustre/tests/sanity.sh:3150:main()
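
The sizes in the log above can be cross-checked with plain shell arithmetic. The following is a minimal sketch for illustration only; the variable names are hypothetical and do not appear in sanity.sh.

```shell
#!/bin/bash
# Illustrative arithmetic only; variable names are hypothetical.
mib=$((1024 * 1024))

# Bytes requested per iteration: bs=1M count=128
per_iter_bytes=$((128 * mib))

# Bytes dd reported before failing: 66 full records of 1 MiB
written_bytes=$((66 * mib))

# Bytes written across all 10 iterations if every dd completed
total_bytes=$((10 * per_iter_bytes))

echo "per iteration:         $per_iter_bytes bytes"
echo "written before ENOSPC: $written_bytes bytes"
echo "all iterations:        $total_bytes bytes"
```

The second value equals the 69206016 bytes reported by dd, so the failing iteration stopped a little over halfway through its 128 MiB. Because the test removes the file after each iteration, the cumulative figure is only an upper bound on space use, not what the device must hold at once.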
       
      This is the size of the file system AFTER the test suite:
       
      df /mnt/lustre
      Filesystem     1K-blocks    Used Available Use% Mounted on
      /dev/sdb1       32711388 6270568  25058032  21% /

      Depending on the environment, the timing of this test may vary, which can cause it to run out of space (OOS) because only a 400M device is provisioned.  Simply bumping this to 1G provides enough room that this rare failure no longer recurs.

      Patch to be sent shortly.
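
For anyone reproducing the failure, it can help to confirm how much space is available before each dd run. The sketch below is not the submitted patch and not part of test-framework.sh; `have_space_kb` is a hypothetical helper, and the fallback to `/tmp` when `$DIR` is unset is an assumption made so the snippet runs standalone.

```shell
#!/bin/bash
# Hypothetical helper, not part of test-framework.sh.
# Succeeds if the filesystem holding directory $1 has at least $2 KiB available.
have_space_kb() {
	local dir=$1
	local need_kb=$2
	local avail_kb

	# df -P prints one line per filesystem in POSIX format; -k uses 1024-byte units.
	# Field 4 of the second line is the available space.
	avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
	[ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# 128 MiB per dd run, expressed in KiB
need_kb=$((128 * 1024))
if have_space_kb "${DIR:-/tmp}" "$need_kb"; then
	echo "enough space for one iteration"
else
	echo "less than ${need_kb} KiB available"
fi
```

A check like this only reports the space visible at the moment it runs, so it is a diagnostic aid rather than a fix; the change proposed in this ticket is to provision a larger device.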


        Activity

          Gerrit Updater added a comment:

          "Ellis Wilson <elliswilson@microsoft.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54762
          Subject: LU-17732 tests: Sanity 30d fails sporadically
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 4769d4adb5691f3b38a2a8be325e1a64849554cf

          People

            Assignee: Ellis Wilson (elliswilson)
            Reporter: Ellis Wilson (elliswilson)
            Votes: 0
            Watchers: 2
