sanity test_270a failed with O_DIRECT on ARM

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.12.0

    Description

      The issue was reported by John Hammond. In test 270a, writing to the DoM file may fail with -ENOSPC, most probably due to grants. If test 270a is run in a loop, the first several runs pass but all later runs seem to fail.

          Activity

            [LU-11527] sanity test_270a failed with O_DIRECT on ARM
            pjones Peter Jones added a comment -

            Landed for 2.12


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33633/
            Subject: LU-11527 tests: fix sanity 270a for ARM
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 26eb3969f10168314fccfaeff0bfa275f4ea1b90

            adilger Andreas Dilger added a comment -

            The above patch fixes the problem with O_DIRECT writes in test_270a on ARM, where O_DIRECT I/O needs to be aligned to PAGE_SIZE=64KB. There is a separate problem with O_DIRECT and grant usage, which is being addressed by patch https://review.whamcloud.com/9454 "LU-4664 clio: consume grant for sync write".
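The alignment constraint can be illustrated with a minimal sketch. It assumes that an O_DIRECT write needs both its file offset and its length to be multiples of the client page size; the function names and the 4 KiB example size are hypothetical and are not taken from sanity.sh or from the patch.

```python
import os


def is_direct_io_aligned(offset: int, length: int, page_size: int) -> bool:
    """Return True if both offset and length are multiples of page_size."""
    return offset % page_size == 0 and length % page_size == 0


def round_up_to_page(length: int, page_size: int) -> int:
    """Round length up to the next multiple of page_size (ceiling division)."""
    return -(-length // page_size) * page_size


if __name__ == "__main__":
    # Page size of the machine running this script (64 KiB on the ARM
    # clients discussed in this ticket, commonly 4 KiB on x86_64).
    page_size = os.sysconf("SC_PAGE_SIZE")
    write_size = 4096  # hypothetical write size, for illustration only
    print("page size:", page_size)
    print("aligned as-is:", is_direct_io_aligned(0, write_size, page_size))
    print("aligned size:", round_up_to_page(write_size, page_size))
```

Under this assumption, a write size that is fine with 4 KiB pages is no longer aligned once PAGE_SIZE is 64 KiB, so a test that hard-codes the smaller size would need to derive it from the page size instead.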

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33633
            Subject: LU-11527 tests: fix sanity 270a for ARM
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8d83e2fef87c2668e03bd2d701eed7c9cc2d6970

            tappro Mikhail Pershin added a comment -

            So in the context of this ticket I propose to modify the test and remove 'oflag=direct', which is not really needed. As for the general issue with direct I/O vs. grants, I'd first ask around whether there is already a ticket about that.

            hongchao.zhang Hongchao Zhang added a comment -

            This issue does not exist only with DoM; the behavior is the same on an OST.
            The cause is that a direct write (with "oflag=direct") ignores the grant on the client side, while the grant of the client keeps increasing alongside the I/O. If the remaining disk space on the MDT or OST, after subtracting the total grant given to clients, is not enough (which is easier to reach in local tests on VMs), the write fails with -ENOSPC.

            There is no error (ran 200 times in a loop) after deleting "oflag=direct" from test 270a in sanity.sh, but with "oflag=direct" the test triggers -ENOSPC in fewer than 5 runs.
            The error is also reproduced by writing a normal file with "dd" + "oflag=direct".
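The accounting described in this comment can be sketched as a toy model. This is not Lustre's actual grant code: the class, the function, the numbers, and the rule that every reply hands out a fixed amount of extra grant are all assumptions made for illustration. It only shows how space promised to a client as grant can make a direct write fail with -ENOSPC while the target still has free blocks.

```python
import errno
from typing import Optional


class ToyTarget:
    """Toy MDT/OST: free space plus space promised to clients as grant."""

    def __init__(self, free_bytes: int) -> None:
        self.free = free_bytes  # bytes not yet written
        self.granted = 0        # bytes reserved for clients as grant

    def direct_write(self, size: int, grant_per_reply: int) -> int:
        """Write without consuming client grant; return 0 or -ENOSPC."""
        # A direct write can only use space that is not already granted.
        if self.free - self.granted < size:
            return -errno.ENOSPC
        self.free -= size
        # Assumption: each reply carries more grant, capped by the space
        # that is still ungranted.
        self.granted += min(grant_per_reply, self.free - self.granted)
        return 0


def writes_until_enospc(free_bytes: int, size: int, grant_per_reply: int,
                        max_writes: int = 1000) -> Optional[int]:
    """Count successful direct writes before the first -ENOSPC."""
    target = ToyTarget(free_bytes)
    for done in range(max_writes):
        if target.direct_write(size, grant_per_reply) < 0:
            return done
    return None  # never failed within max_writes


if __name__ == "__main__":
    # Hypothetical sizes: 100 units free, 10 units per write.
    print("no grant growth:", writes_until_enospc(100, 10, 0))
    print("grant grows by 30 per reply:", writes_until_enospc(100, 10, 30))
```

In this model the run without grant growth fails only once the target is actually full, while the run with growing grant fails after a few writes, which mirrors the "first several passes, then -ENOSPC" pattern reported in the ticket. The model does not cover file deletion between runs or grant being returned to the server.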

            People

              hongchao.zhang Hongchao Zhang
              tappro Mikhail Pershin