[LU-16977] Ofd_access_log_reader use fraction option wrongly Created: 24/Jul/23  Updated: 06/Sep/23  Resolved: 06/Sep/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Alexandre Ioffe Assignee: Alexandre Ioffe
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When ofd_access_log_reader has batch-fraction=100 (default), the calculation of the batch fraction is wrong.

Details:

In the fragment below, the array sa is accessed beyond its upper bound when

aa->fraction=100

because sa holds nr elements (valid indices 0 to nr-1), while the computed index i equals nr.

lustre\utils\ofd_access_batch.c

void *alr_sort_and_print_thread(void *arg)

...

    sa = calloc(nr, sizeof(*sa));

...

    i = nr * aa->fraction / 100;
    cut = sa[i];

This is unlikely to cause a segfault, but the IO ops counter threshold for the printed ALRs might be chosen wrongly. As a result, ofd_access_log_reader may send an incomplete set of ALRs when the fraction is 100%.
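
To make the off-by-one concrete, here is a minimal sketch of the index arithmetic from the fragment above. The helper name cut_index is hypothetical and does not appear in ofd_access_batch.c; it only reproduces the expression nr * aa->fraction / 100.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the Lustre source): computes the same
 * index as "i = nr * aa->fraction / 100" in the fragment above.
 */
static size_t cut_index(size_t nr, unsigned int fraction)
{
	return nr * fraction / 100;
}
```

For an array of 10 entries the valid indices are 0 to 9. With fraction=50 the helper returns 5, which is in range, but with fraction=100 it returns 10, one past the last element.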



 Comments   
Comment by Gerrit Updater [ 25/Jul/23 ]

"Alexandre Ioffe <aioffe@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51754
Subject: LU-16977 utils: access_log_reader accesses beyond batch array
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 449e90c813e1c925797c09e09937022f6df12626

Comment by Alexandre Ioffe [ 25/Jul/23 ]

As bzzz suggested, if the fraction is 100% we do not need filtering at all, and that includes the sort function. The bug fix therefore turns into an improvement: we eliminate the CPU-intensive sort function.
Thanks a lot bzzz 
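
The idea in this comment could look roughly like the sketch below. This is not the merged patch: the helper pick_cut, the descending sort order, and the convention that a threshold of 0 means "print everything" are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Descending comparison for qsort(); the ordering is an assumption. */
static int cmp_desc(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *)a;
	unsigned long long y = *(const unsigned long long *)b;

	return (x < y) - (x > y);
}

/*
 * Hypothetical helper (not from the patch): returns the minimum count an
 * entry needs in order to be printed. The array is sorted in place only
 * when filtering is actually requested.
 */
static unsigned long long pick_cut(unsigned long long *counts, size_t nr,
				   unsigned int fraction)
{
	/* Nothing to filter: skip the sort and accept every entry. */
	if (nr == 0 || fraction >= 100)
		return 0;

	qsort(counts, nr, sizeof(*counts), cmp_desc);

	/* fraction < 100 guarantees nr * fraction / 100 < nr. */
	return counts[nr * fraction / 100];
}
```

With fraction below 100 the index always stays within the array, and with fraction at 100 neither the sort nor the array lookup happens.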

Comment by Gerrit Updater [ 06/Sep/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51754/
Subject: LU-16977 utils: access_log_reader accesses beyond batch array
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: bf5e5a89f9f4680c42f768b8474a3ea0bc014b54

Comment by Peter Jones [ 06/Sep/23 ]

Landed for 2.16
