[LU-11527] sanity test_270a failed with O_DIRECT on ARM Created: 16/Oct/18  Updated: 13/Nov/18  Resolved: 13/Nov/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Major
Reporter: Mikhail Pershin Assignee: Hongchao Zhang
Resolution: Fixed Votes: 0
Labels: DoM2, arm

Issue Links:
Related
is related to LU-4664 sync write should consume grant on cl... Resolved
is related to LU-11597 sanityn test 16a failed with direct I/O Resolved
is related to LU-11200 Centos 8 arm64 server support Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The issue was reported by John Hammond: in test 270a, writing to the DoM file may fail with -ENOSPC, most probably due to grants. If test 270a is run in a loop, it passes the first several times, but all later runs seem to fail.



 Comments   
Comment by Hongchao Zhang [ 26/Oct/18 ]

This issue is not specific to DoM; the same behavior occurs on OSTs.
The cause is that a direct write (with "oflag=direct") ignores the grant on the client side, while the client's grant
is increased alongside the I/O. If the remaining disk space on the MDT or OST (after subtracting the total grant given to clients)
is not enough (which is easier to reach in local tests on VMs), the write fails with -ENOSPC.
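
The mechanism described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Lustre's actual grant code: the class `ToyTarget`, its fields, and the fixed `grant_step` are hypothetical names chosen for illustration. It assumes that a direct write must be satisfied from space not already promised to clients as grant, and that each successful write leaves the client holding a little more grant.

```python
import errno


class ToyTarget:
    """Toy stand-in for an MDT/OST; not Lustre's real grant accounting."""

    def __init__(self, free_bytes, grant_step):
        self.free_bytes = free_bytes    # space left on the backing disk
        self.total_granted = 0          # space promised to clients as grant
        self.grant_step = grant_step    # assumed extra grant handed out per write

    def direct_write(self, nbytes):
        """Direct write: ignores client-held grant, so it may only use
        space that is not already promised to clients."""
        if self.free_bytes - self.total_granted < nbytes:
            return -errno.ENOSPC
        self.free_bytes -= nbytes
        # Assumption: the reply grows the client's grant alongside the I/O.
        self.total_granted += self.grant_step
        return nbytes

    def unlink(self, nbytes):
        """Deleting the test file frees disk space, but the grant stays out."""
        self.free_bytes += nbytes


def run_loop(target, nbytes, iterations):
    """Write and delete a file repeatedly, returning each write's result."""
    results = []
    for _ in range(iterations):
        rc = target.direct_write(nbytes)
        results.append(rc)
        if rc > 0:
            target.unlink(nbytes)
    return results


if __name__ == "__main__":
    # Small "VM-sized" target: 10 MiB free, 1 MiB writes, 2 MiB grant per write.
    mib = 1 << 20
    print(run_loop(ToyTarget(10 * mib, 2 * mib), mib, 8))
```

With these made-up numbers the first five iterations succeed and every later one returns -ENOSPC, which mirrors the pass-then-fail pattern seen when test 270a is run in a loop.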

There is no error (over 200 runs in a loop) after removing "oflag=direct" from test 270a in sanity.sh,
but with "oflag=direct" -ENOSPC is triggered in fewer than 5 runs.
The error is also reproduced by writing a normal file with "dd" + "oflag=direct".

Comment by Mikhail Pershin [ 27/Oct/18 ]

So, in the context of this ticket, I propose modifying the test to remove 'oflag=direct', which is not really needed. As for the general issue of direct I/O vs. grants, I'd first ask around whether there is already a ticket about it.

Comment by Gerrit Updater [ 09/Nov/18 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33633
Subject: LU-11527 tests: fix sanity 270a for ARM
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8d83e2fef87c2668e03bd2d701eed7c9cc2d6970

Comment by Andreas Dilger [ 09/Nov/18 ]

The above patch fixes the problem with O_DIRECT writes in test_270a on ARM, since O_DIRECT I/O there needs to be aligned to PAGE_SIZE=64KB.  There is a separate problem with O_DIRECT and grant usage, which is being addressed by patch https://review.whamcloud.com/9454 "LU-4664 clio: consume grant for sync write".
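
To illustrate the alignment requirement, here is a small Python sketch (not the code from the patch) that rounds an I/O size up to a multiple of the page size reported by the kernel. The helper name `align_up` is hypothetical; on an ARM kernel built with 64KB pages, `os.sysconf("SC_PAGE_SIZE")` would report 65536.

```python
import os


def align_up(nbytes, alignment):
    """Round nbytes up to the next multiple of alignment (alignment > 0)."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Ceiling division via negated floor division, then scale back up.
    return -(-nbytes // alignment) * alignment


if __name__ == "__main__":
    # 4096 on typical x86_64 systems, 65536 on kernels built with 64KB pages.
    page_size = os.sysconf("SC_PAGE_SIZE")
    for size in (4096, 65536, 100000):
        print(size, "->", align_up(size, page_size))
```

Under this assumption, a 4096-byte direct write that is acceptable with 4KB pages would have to grow to 65536 bytes to stay page-aligned with 64KB pages.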

Comment by Gerrit Updater [ 13/Nov/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33633/
Subject: LU-11527 tests: fix sanity 270a for ARM
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 26eb3969f10168314fccfaeff0bfa275f4ea1b90

Comment by Peter Jones [ 13/Nov/18 ]

Landed for 2.12
