conf-sanity test_82a: getstripe -c wrong: found 2, expected 3

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0, Lustre 2.11.0, Lustre 2.10.5
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/9313c1e0-b06f-11e5-bf32-5254006e85c2.

      The sub-test test_82a failed with the following error:

      /usr/bin/lfs getstripe -c /mnt/lustre/d82a.conf-sanity/f82a.conf-sanity-1 wrong: found 2, expected 3
      

      This looks related to one of the OSTs running short of precreated objects, so that OST is skipped rather than blocking the create. The MDS should allow at most 1/4 of the requested stripes to be skipped when their OSTs have no objects, rather than blocking the create indefinitely. However, it appears that this functionality was broken by the change from LOV to LOD. In this case all 3 OST objects are required, since 3 * 1/4 < 1 and so no whole stripe may be skipped yet.
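
The arithmetic behind the 1/4 rule can be sketched as follows. This is only an illustration of the rule as stated in this ticket, not the Lustre code: the function name `min_required_stripes` is hypothetical, and rounding the skippable count down to a whole stripe is an assumption taken from the "3 * 1/4 < 1" remark above.

```python
def min_required_stripes(requested: int) -> int:
    """Smallest acceptable stripe count if at most 1/4 may be skipped.

    Hypothetical helper: integer (floor) division is an assumption,
    chosen because the ticket says no whole stripe can be skipped
    when 3 * 1/4 < 1.
    """
    if requested < 1:
        raise ValueError("requested stripe count must be at least 1")
    max_skipped = requested // 4  # whole stripes that may be skipped
    return requested - max_skipped


if __name__ == "__main__":
    for count in (1, 3, 4, 8):
        print(count, "requested ->", min_required_stripes(count), "required")
```

Under this reading, a 3-stripe request tolerates no skipped stripe, so a file with only 2 stripes should not have been created in test_82a.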

      lod_qos_prep_create() does not set flags = LOV_USES_DEFAULT_STRIPE when a filesystem-wide default striping is used, as was done in the original qos_prep_create(), and as such lod_alloc_qos() requires that all requested stripes be allocated. lod_alloc_qos() will fall back to lod_alloc_rr() with -EAGAIN if these cannot be allocated. lod_alloc_rr() returns success if at least one OST object was allocated, which doesn't seem correct when a large number of stripes was requested, and it isn't clear why lod_alloc_rr() doesn't wait for the OSTs to come online and allocate the requested number of objects.
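
The control flow described above can be mimicked with a toy model to show how a 3-stripe request ends up with 2 stripes. This is a sketch under stated assumptions, not the LOD implementation: the `*_model` functions are hypothetical stand-ins, the way the flag relaxes the requirement (the 1/4 rule) is inferred from this ticket, and the error returned when no OST has objects is a placeholder.

```python
import errno


def alloc_qos_model(available, requested, uses_default_stripe):
    """Toy stand-in for the lod_alloc_qos() behavior described in the ticket.

    available: one bool per OST, True if it has precreated objects.
    Assumption: with the default-stripe flag set, up to 1/4 of the
    requested stripes may be skipped; otherwise all are required.
    """
    allocated = min(sum(available), requested)
    required = requested - requested // 4 if uses_default_stripe else requested
    if allocated < required:
        return -errno.EAGAIN  # caller falls back to round-robin
    return allocated


def alloc_rr_model(available, requested):
    """Toy stand-in for lod_alloc_rr(): succeeds with at least one object."""
    allocated = min(sum(available), requested)
    if allocated < 1:
        return -errno.ENOSPC  # placeholder error code, an assumption
    return allocated


def create_model(available, requested, uses_default_stripe):
    """Try the QOS model first, then fall back on -EAGAIN."""
    rc = alloc_qos_model(available, requested, uses_default_stripe)
    if rc == -errno.EAGAIN:
        rc = alloc_rr_model(available, requested)
    return rc


if __name__ == "__main__":
    # One of three OSTs has no precreated objects and the flag is not set.
    print(create_model([True, True, False], 3, False))
```

In this model the fallback path accepts any non-zero stripe count, so the create succeeds with fewer stripes than requested whether or not the flag is set. That matches the "found 2, expected 3" symptom.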

      Also, it looks like the check for lod_qos_is_usable() could be moved to the start of lod_alloc_qos() instead of after the pools are checked, since it doesn't use any of the pool information anyway.

      Info required for matching: conf-sanity 82a

          Activity

            [LU-7631] conf-sanity test_82a: getstripe -c wrong: found 2, expected 3
            pjones Peter Jones added a comment - Landed for 2.11

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27441/
            Subject: LU-7631 tests: wait_osts_up waits for MDS precreates
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: edb0fb241bb5e0cc95c240ed977abf7f234ee045

            sbuisson Sebastien Buisson (Inactive) added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/48700424-56f8-11e7-8a1b-5254006e85c2

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/27441
            Subject: LU-7631 tests: wait_osts_up waits for MDS precreates
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: cb7c2473cdbc2e375182e3d7de1b0fbfa6b0865a

            yong.fan nasf (Inactive) added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/cf4a46ea-4252-11e7-bc6c-5254006e85c2

            adilger Andreas Dilger added a comment - I can't see why this test is formatting a new filesystem. It should be able to run with any existing filesystem, and this should also avoid the failure, since there will not be a startup issue with the OSTs not being ready.
            mdiep Minh Diep added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/d91b538c-2568-11e7-9de9-5254006e85c2
            yong.fan nasf (Inactive) added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/bac95502-ead8-11e6-af25-5254006e85c2
            yong.fan nasf (Inactive) added a comment - +1 on master: https://testing.hpdd.intel.com/test_sets/2bde2290-be65-11e6-9f18-5254006e85c2
            yujian Jian Yu added a comment - One more failure on master branch in review-dne-part-1 test session: https://testing.hpdd.intel.com/test_sets/01a4a8fc-a446-11e6-a980-5254006e85c2
            rhenwood Richard Henwood (Inactive) added a comment - Another recent failure on Master with review-dne-part-1: https://testing.hpdd.intel.com/test_sets/a9d74cae-057d-11e6-b5f1-5254006e85c2

            People

              jamesanunez James Nunez (Inactive)
              maloo Maloo