[LU-16977] Ofd_access_log_reader use fraction option wrongly Created: 24/Jul/23 Updated: 06/Sep/23 Resolved: 06/Sep/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | Lustre 2.16.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Alexandre Ioffe | Assignee: | Alexandre Ioffe |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
When ofd_access_log_reader runs with batch-fraction=100 (the default), the batch fraction calculation is wrong.

Details: in the fragment below from lustre/utils/ofd_access_batch.c, the array sa is accessed beyond its upper bound when aa->fraction=100:

void *alr_sort_and_print_thread(void *arg)
...
sa = calloc(nr, sizeof(*sa));
...
i = nr * aa->fraction / 100;

With aa->fraction=100, i equals nr, which is one past the last valid index (nr - 1) of the nr-element array. This is unlikely to cause a segfault, but the I/O operation count threshold for the printed ALRs may be chosen incorrectly. As a result, ofd_access_log_reader may send an incomplete set of ALRs when the fraction is 100%. |
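To make the arithmetic concrete, here is a minimal sketch of the index expression quoted in the description, next to a clamped variant. Both function names are hypothetical and exist only for illustration. The clamp is one possible guard, not the change that was merged for this ticket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the expression from the report:
 * i = nr * fraction / 100. For fraction == 100 this returns nr,
 * which is not a valid index into an array of nr elements. */
size_t raw_threshold_index(size_t nr, unsigned int fraction)
{
	return nr * fraction / 100;
}

/* Hypothetical clamped variant (an assumption, not the upstream fix):
 * the result is always within 0 .. nr - 1. */
size_t clamped_threshold_index(size_t nr, unsigned int fraction)
{
	size_t i;

	assert(nr > 0 && fraction <= 100);
	i = nr * fraction / 100;

	return i < nr ? i : nr - 1;
}
```

For nr = 8 and fraction = 100, the raw expression yields 8 while the last valid index is 7. For any fraction below 100 the two functions agree.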
| Comments |
| Comment by Gerrit Updater [ 25/Jul/23 ] |
|
"Alexandre Ioffe <aioffe@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51754 |
| Comment by Alexandre Ioffe [ 25/Jul/23 ] |
|
As bzzz suggested, if the fraction is 100% we do not need filtering at all, and that includes the sort function. The bug fix therefore turns into an improvement: we eliminate the CPU-intensive sort. |
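The idea in this comment can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: the function print_threshold, its parameters, and the rule of keeping the busiest nr * fraction / 100 entries (at least one) are all hypothetical. The point is only that a fraction of 100 returns early, before any allocation or sort.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Order unsigned long counters from largest to smallest. */
static int cmp_desc(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x < y) - (x > y);
}

/* Hypothetical helper: return the smallest I/O operation count an entry
 * needs in order to be printed. A return value of 0 means "print all". */
unsigned long print_threshold(const unsigned long *ops, size_t nr,
			      unsigned int fraction)
{
	unsigned long *sa;
	unsigned long threshold;
	size_t keep;

	assert(ops != NULL || nr == 0);

	/* Fraction of 100%: nothing is filtered, so skip the sort entirely. */
	if (nr == 0 || fraction >= 100)
		return 0;

	/* Assumed policy: keep the busiest entries, at least one of them. */
	keep = nr * fraction / 100;
	if (keep == 0)
		keep = 1;

	sa = calloc(nr, sizeof(*sa));
	if (sa == NULL)
		return 0; /* fall back to printing everything */

	memcpy(sa, ops, nr * sizeof(*sa));
	qsort(sa, nr, sizeof(*sa), cmp_desc);
	threshold = sa[keep - 1]; /* keep - 1 < nr because fraction < 100 */
	free(sa);

	return threshold;
}
```

Because the early return covers fraction = 100, the index keep - 1 is only computed when keep is strictly less than nr, so the out-of-bounds case from the description cannot arise in this sketch.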
| Comment by Gerrit Updater [ 06/Sep/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51754/ |
| Comment by Peter Jones [ 06/Sep/23 ] |
|
Landed for 2.16 |