[LU-6174] do_div() silently truncates divisor to uint32_t Created: 29/Jan/15  Updated: 25/Oct/23  Resolved: 25/Oct/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Major
Reporter: Isaac Huang (Inactive) Assignee: James A Simmons
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-8137 fix llverdev to be able to write and ... Resolved
is related to LU-9887 sanity-lfsck test_9a: FAIL: (4) Got s... Resolved
is related to LU-6163 64-bit divides should use do_div() Resolved
Severity: 3
Rank (Obsolete): 17275

 Description   

The Linux do_div() silently truncates divisor to uint32_t, even on x86_64:

 
25 # define do_div(n,base) ({                                      \
26         uint32_t __base = (base);

Any code that uses 64-bit divisor will end up with incorrect result whenever the higher 32 bits aren't all zero. And when the lower 32 bits are all zero, it will end up with a division by zero error.

It happens on both x86_64 and i686. I've verified it on a x86_64 VM.

So any code that looks like the following is BAD:

 
uint64_t foo, bar;
......
if (bar != 0)
    do_div(foo, bar);

There seemed to be plenty of such code in Lustre, e.g.:

 
void fld_cache_fini(struct fld_cache *cache)
        __u64 pct;
......
        if (cache->fci_stat.fst_count > 0) {
                pct = cache->fci_stat.fst_cache * 100;
                do_div(pct, cache->fci_stat.fst_count);

I'd suggest to go through and verify all callers of do_div(). In addition, it'd be good to add a warning on do_div() in our patch checker script.



 Comments   
Comment by Oleg Drokin [ 02/Feb/15 ]

Did some existing bug in lustre prompted this? Is there a patch? where?

Comment by Isaac Huang (Inactive) [ 06/Feb/15 ]

I just happened to notice that the divisor was silently truncated, and grep'ed through Lustre code and found a few vulnerable places. There's no existing bug as far as I know, but it does look like something that may cause us trouble in the future.

Comment by James A Simmons [ 07/Mar/16 ]

If the divisor is u64 then the correct function to use is div64_u64_rem() which is in math64.h. We will have to do a code audit for this.

Comment by Gerrit Updater [ 13/Nov/19 ]

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/36749
Subject: LU-6174 nrs: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d6c660becccb9264c67055eca1dfb3046d4b53b4

Comment by Gerrit Updater [ 13/Nov/19 ]

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/36750
Subject: LU-6174 osd-ldiskfs: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e85a22ff25f77bc1e840b979389cab16f16c7583

Comment by Gerrit Updater [ 13/Nov/19 ]

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/36751
Subject: LU-6174 obd: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e1f43c1c33933337c0c5ed395546504f51cc7f10

Comment by Gerrit Updater [ 14/Dec/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36751/
Subject: LU-6174 obd: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e8f793f620f40eff540b2b49ecab00c0b9de3fc5

Comment by Gerrit Updater [ 17/Mar/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36749/
Subject: LU-6174 nrs: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c80319213c6dc4ac9826a1bd10c75373e08db837

Comment by Gerrit Updater [ 24/Mar/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36750/
Subject: LU-6174 osd-ldiskfs: perform proper division
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ed2c88b78c88b13f9cbcb3430909350915363ae5

Comment by Peter Jones [ 24/Mar/20 ]

Seems to be complete for 2.14

Comment by James A Simmons [ 24/Mar/20 ]

Only thing I can see left is the lov / lod handling.

Comment by Gerrit Updater [ 30/Mar/20 ]

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/38101
Subject: LU-6174 lod: remove incorrect ll_do_div64
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6883cec5d4806489d36260184bc64ed42d27689a

Comment by James A Simmons [ 01/Apr/20 ]

Just lov layer needs to be fixed but it needs a more complex solution.

Comment by Gerrit Updater [ 07/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38101/
Subject: LU-6174 lod: remove incorrect ll_do_div64
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 12ed98d4ef294b78393bc9acb2aefd570b263855

Comment by Gerrit Updater [ 13/Jul/20 ]

James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/39343
Subject: LU-6174 lov: use standard Linux 64 divison macros
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cb42ed3f095bb4f7e09b7826651d47b1aaee91f9

Comment by Andreas Dilger [ 08/Apr/21 ]

The new test_54c in patch https://review.whamcloud.com/43230 "LU-8137 utils: fix llverdev for use on regular files" should exercise the do_div() code reasonably, since it is creating a 2000-stripe file and writing at offsets over 16TB * OST_COUNT, but I don't know yet how long that test will take to run (it may need to be SLOW).

Comment by Andreas Dilger [ 09/Apr/21 ]

It probably makes sense to further improve the LU-8137 test to explicitly use both small and a large file offsets (and a constant timestamp for the whole test run) with a widely-striped file (overstriped with "C 2000") to cover critical offsets within the file (e.g. 0, 16000GB, ..., 1024 * 16000GB, 2000 * 16000GB with corresponding -size values that cover the interesting region), without having to exhaustively write a huge file. All of the writes should be done first, then the reads in a separate step, to catch cases where offsets are miscalculated and alias/modulo to a smaller offset.

Comment by Gerrit Updater [ 18/Jan/22 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46164
Subject: LU-6174 nrs: perform proper division
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 231695666b25f725fa7363a2f4a7d4af532a338d

Comment by Gerrit Updater [ 25/Oct/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/39343/
Subject: LU-6174 lov: use standard Linux 64 divison macros
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6eee4ea5b63e592eb3eb0cc5a91f787226a863e2

Comment by Peter Jones [ 25/Oct/23 ]

Is this body of work now complete - or is more work needed?

Comment by James A Simmons [ 25/Oct/23 ]

It is finally complete!!!!

Generated at Sat Feb 10 01:57:54 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.