[LU-12404] conf-sanity test 69 fails with 'create file after reformat' Created: 07/Jun/19 Updated: 25/Nov/19 Resolved: 10/Sep/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0, Lustre 2.12.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | Sergey Cheremencev |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
conf-sanity test_69 fails with 'create file after reformat'.

Looking at the client test log from the failure https://testing.whamcloud.com/test_sets/efc1357e-8895-11e9-8c65-52540065bddc, we see the following error:

Starting client: trevis-18vm4.trevis.whamcloud.com: -o user_xattr,flock trevis-18vm11@tcp:/lustre /mnt/lustre
CMD: trevis-18vm4.trevis.whamcloud.com mkdir -p /mnt/lustre
CMD: trevis-18vm4.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-18vm11@tcp:/lustre /mnt/lustre
touch: cannot touch '/mnt/lustre/d69.conf-sanity/f69.conf-sanity-last': No space left on device
conf-sanity test_69: @@@@@@ FAIL: create file after reformat

This looks like LU-8158, but this is happening for non-SLES clients. Looking at the OST (vm6) console log, we see:

[37858.976026] Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 3 sec
[37859.170354] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 3 sec
[37862.106450] LustreError: 30205:0:(ofd_dev.c:1709:ofd_create_hdl()) lustre-OST0000: unable to precreate: rc = -28
[37879.498553] Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_69: @@@@@@ FAIL: create file after reformat
[37879.685068] Lustre: DEBUG MARKER: conf-sanity test_69: @@@@@@ FAIL: create file after reformat

A different ofd_create_hdl() error is seen in LU-8158, but the root cause could be the same.

We've started seeing this test fail with this ofd_create_hdl() error since 2019-05-27, Lustre version 2.12.53.62. Here are links to a few of the failed test session logs: |
| Comments |
| Comment by Andreas Dilger [ 27/Jun/19 ] |
|
I suspect that this problem is caused by patch https://review.whamcloud.com/33833, which was committed 2019-05-25 and affects exactly the number of objects created after reformat that test_69() is verifying:
commit d07d9c5ed0aa1d6614944c7d1e0ca55cba301dc4
Author: Sergey Cheremencev <c17829@cray.com>
AuthorDate: Fri Aug 24 17:03:45 2018 +0300
Commit: Oleg Drokin <green@whamcloud.com>
CommitDate: Sat May 25 04:55:51 2019 +0000
LU-11760 ofd: formatted OST recognition change
Modern system is fast enough to create above
100 000(5 * OST_MAX_PRECREATE) objects during commit interval.
Increase the difference between MDS last_used ID
and OST LAST_ID to 500 000 to avoid gaps after OST failover.
The problem is that if the OST filesystem does not have enough free inodes to store an extra 500k objects at recovery time, and the OST has previously created more objects than this, then the OST will run out of space during this test. |
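To make the arithmetic behind this suspicion concrete, here is a minimal Python sketch. It is a simplified model, not Lustre code. It assumes that a gap between the MDS last_used ID and the OST LAST_ID above the limit is treated as a reformatted OST (nothing to catch up), while a gap at or below the limit makes the OST precreate the missing objects. The function names, the object IDs, and the free-inode count are hypothetical; only the 100 000 and 500 000 limits come from the commit message.

```python
# Simplified model of the "formatted OST recognition" threshold.
# Assumption: this is an illustration of the reasoning in the comment,
# not the actual ofd_create_hdl() logic.

# Derived from the commit message: 100 000 == 5 * OST_MAX_PRECREATE.
OST_MAX_PRECREATE = 100_000 // 5
OLD_GAP_LIMIT = 5 * OST_MAX_PRECREATE  # limit before patch 33833
NEW_GAP_LIMIT = 500_000                # limit after patch 33833


def catch_up_count(mds_last_used: int, ost_last_id: int, gap_limit: int) -> int:
    """Objects the OST would have to precreate to close the gap.

    Returns 0 when there is no gap, or when the gap exceeds gap_limit
    (assumed here to mean "OST was reformatted, do not catch up").
    """
    gap = mds_last_used - ost_last_id
    if gap <= 0 or gap > gap_limit:
        return 0
    return gap


def runs_out_of_inodes(mds_last_used: int, ost_last_id: int,
                       gap_limit: int, free_inodes: int) -> bool:
    """True if catching up needs more objects than the OST has free inodes."""
    return catch_up_count(mds_last_used, ost_last_id, gap_limit) > free_inodes


if __name__ == "__main__":
    # Hypothetical numbers: MDS remembers 150 000 objects, the OST was just
    # reformatted (LAST_ID near 0) and has 100 000 free inodes.
    for name, limit in (("old", OLD_GAP_LIMIT), ("new", NEW_GAP_LIMIT)):
        print(name, runs_out_of_inodes(150_000, 1, limit, 100_000))
```

With these made-up numbers, the old limit classifies the gap as a reformat, while the new limit asks the OST to precreate more objects than it has inodes for, which would be consistent with the rc = -28 (ENOSPC) seen in the OST console log.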
| Comment by Andreas Dilger [ 28/Jun/19 ] |
|
I put a more detailed comment on how to fix this in |
| Comment by Sergey Cheremencev [ 28/Jun/19 ] |
|
Suggest to leave this open. And do revert of https://review.whamcloud.com/#/c/33833/ with |
| Comment by Patrick Farrell (Inactive) [ 10/Sep/19 ] |
|
Fixed under |