Details
Bug
Resolution: Fixed
Blocker
Lustre 2.18.0
Description
This issue was created by maloo for Chris Horn <chorn@ddn.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/1c2cb8b1-f461-4895-a156-80c6421b32d6
test_170 failed with the following error:
Import failed 134
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/124865 - 6.12.0-124.8.1.el10_1.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/124865 - 6.12.0-124.49.1_lustre.el10.x86_64
Looks like a latent bug that is exposed by https://review.whamcloud.com/65055 ("LU-20000 lnetctl: bad CPTs/tunables during import").
while (fgets(buf, len, input) != NULL) {
        ...
        buf += strlen(buf);
        len -= strlen(buf); // <-- strlen(buf) is now 0
}
buf is advanced first, so by the time strlen(buf) is evaluated on the next line, buf already points at the NUL terminator and strlen() returns 0. len is therefore never decremented while buf keeps walking forward through the allocation.
yaml_blk is malloc'd with st.st_size bytes (lnetctl.c:5639). On each iteration, fgets(buf, len, …) is called with len still equal to the full original size, even though buf has moved past the previously written bytes. With _FORTIFY_SOURCE + __builtin_dynamic_object_size, glibc sees that the remaining writable space at buf is smaller than len and aborts with:
*** buffer overflow detected ***: terminated Aborted (core dumped)
That's why lnetctl import --old-api core-dumps as soon as the very first sub-test feeds it a multi-line YAML doc, and why test_170 fails immediately at line 2521 ("Different CPTs").
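One possible way to avoid the problem is to measure the line once, before buf moves, and use that same value for both updates. The following is a minimal sketch only, not the actual lnetctl.c change: read_all_lines() and its parameters are hypothetical names made up for illustration, and the caller is assumed to pass the real capacity of the block.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the real lnetctl code): append every line of
 * `input` to `blk`, which has room for `size` bytes, and return the
 * number of bytes stored (excluding the trailing NUL).
 */
static size_t read_all_lines(FILE *input, char *blk, size_t size)
{
	char *buf = blk;
	size_t len = size;

	assert(input != NULL && blk != NULL);

	if (size == 0)
		return 0;
	blk[0] = '\0';

	/* len > 1: fgets() needs room for one character plus the NUL */
	while (len > 1 && fgets(buf, (int)len, input) != NULL) {
		/* measure once, while buf still points at the new line */
		size_t n = strlen(buf);

		buf += n;	/* next write starts at the NUL terminator */
		len -= n;	/* remaining space shrinks by the same amount */
	}

	return (size_t)(buf - blk);
}
```

With this ordering, the len passed to fgets() always matches the space actually left at buf, so a fortified fgets() has nothing to object to on a multi-line input.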
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-lnet test_170 - Import failed 134