Details

    • Technical task
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.0
    • Lustre 2.4.0, Lustre 2.5.0
    • 4278

    Description

      Currently, grant is still inflated if backend block size > page size (that's the case with zfs osd).
      OBD_CONNECT_GRANT_PARAM was added to address this, and we need to develop the code in osc and ofd to implement support for this feature.
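
      As a rough illustration of the inflation described above, the sketch below computes how much larger a grant becomes when it is accounted per backend block rather than per client page. The helper name grant_inflation_factor is hypothetical, and the 128 KiB block and 4 KiB page sizes are assumptions taken from the (128/4) factor quoted in the comments on this ticket.

```shell
# Hypothetical helper (not part of Lustre): ratio between a grant
# accounted in backend blocks and one accounted in client pages.
# $1 = backend block size in bytes, $2 = client page size in bytes
grant_inflation_factor() {
	echo $(( $1 / $2 ))
}

# Assumed sizes: 128 KiB backend block, 4 KiB client page
grant_inflation_factor $((128 * 1024)) $((4 * 1024))   # prints 32
```

      With equal block and page sizes the factor is 1, which is the case where no inflation occurs.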

          Activity

            [LU-2049] add support for OBD_CONNECT_GRANT_PARAM

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25853/
            Subject: LU-2049 grant: Fix grant interop with pre-GRANT_PARAM clients
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 03f24e6f786459b3dd8a37ced7fb3842b864613d

            gerrit Gerrit Updater added a comment.

            Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/25853
            Subject: LU-2049 grant: Fix grant interop with pre-GRANT_PARAM clients
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 15044478ce5f96a6bc80d8209e7fa9fed3f1a8a0

            gerrit Gerrit Updater added a comment.

            After enabling grant checking:
            https://testing.hpdd.intel.com/test_sets/275100e8-5ff2-11e6-b5b1-5254006e85c2

            All the tests that were checked failed.

            utopiabound Nathaniel Clark added a comment.

            Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/21619
            Subject: LU-2049 tests: FOR TEST ONLY GRANT_CHECK
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 64d3edb56be5b8af451a7b6947aad623fccf01ca

            gerrit Gerrit Updater added a comment.
            charr Cameron Harr added a comment -

            We started hitting the symptoms of LU-7510 over the last couple of weeks, and this patch is marked as a fix for it. We're running a 2.5-5 branch. Of our 80 OSTs, 32 were ~90% full and the rest were closer to 65% full. Deactivating the fuller OSTs appears to have worked around the issue for now, though we think it's starting to happen on a sister file system.

            It doesn't appear that there was a test in the last patch to verify that the new grant code is working properly. I haven't looked in detail at whether it is practical to write a test, but that should at least be given a few minutes' attention before closing the bug.

            adilger Andreas Dilger added a comment.

            It looks like everything has landed for this. Can this bug be resolved?

            utopiabound Nathaniel Clark added a comment.
            green Oleg Drokin added a comment -

            I filed LU-7803 for a potential interop issue that I am experiencing now.

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/7793/
            Subject: LU-2049 grant: add support for OBD_CONNECT_GRANT_PARAM
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: bd1e41672c974b97148b65115185a57ca4b7bbde

            gerrit Gerrit Updater added a comment.

            The main goal of this patch is to reduce the grant over-provisioning for clients that do not understand large blocks on ZFS. It would be useful to run a manual test, or better to write a sanity subtest that compares the grant on the client with the grant on the server to ensure they roughly match rather than being inflated by a factor of (128/4).

            For ZFS OSTs the test should be skipped if this feature is not available on the OSC file:

                    # facet_fstype reports the backend type in lower case
                    if [ "$(facet_fstype ost1)" = "zfs" ] &&
                       ! $LCTL get_param osc.$FSNAME-OST0000*.import |
                            grep -q "connect_flags:.*grant_param"; then
                            skip "grant_param not available" && return
                    fi
            
            adilger Andreas Dilger added a comment.
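
            The "roughly match" comparison suggested in the comment above could be sketched as follows. This is only an illustration and not code from any of the patches on this ticket: the function name grants_roughly_match and the default 10% tolerance are assumptions, and the two grant values are passed in as plain byte counts rather than read from a live file system.

```shell
# Hypothetical helper: succeed if the server-side grant is within a
# given percentage of the client-side grant.
# $1 = client grant in bytes, $2 = server grant in bytes,
# $3 = tolerance in percent (assumed default: 10)
grants_roughly_match() {
	gm_diff=$(( $2 - $1 ))
	# take the absolute difference
	[ "$gm_diff" -lt 0 ] && gm_diff=$(( -gm_diff ))
	# diff / client <= tolerance / 100, kept in integer arithmetic
	[ $(( gm_diff * 100 )) -le $(( $1 * ${3:-10} )) ]
}
```

            In an actual sanity subtest the two inputs would presumably be read with lctl get_param on the client and on the OSS. A server grant inflated by a factor of 32 relative to the client grant would fail this check by a wide margin.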

            Patch http://review.whamcloud.com/7793 needs to be refreshed and landed.

            adilger Andreas Dilger added a comment.

            People

              utopiabound Nathaniel Clark
              johann Johann Lombardi (Inactive)