[LU-14580] Lustre 2.12.6 performance regression Created: 02/Apr/21  Updated: 04/Jul/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.6
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Stephen Champion Assignee: Neil Brown
Resolution: Unresolved Votes: 0
Labels: performance
Environment:

16 Socket Superdome Flex w/8 EDR IB interfaces


Issue Links:
Related
is related to LU-14055 Write performance regression caused b... Resolved
is related to LU-12820 obd_dirty_transit_pages is always zer... Resolved
Severity: 2
Rank (Obsolete): 9223372036854775807

 Description   

Between Lustre 2.12.5 and 2.12.6 releases, we have a major performance regression on large systems with multiple interfaces.

On a 16 socket Superdome Flex with 8 EDR interfaces functioning as a client of a filesystem capable of > 100 GB/s, using Lustre 2.12.5, we are measuring up to 50 GB/s. With Lustre 2.12.6, the same test is down to 13 GB/s.

Reverting commit ID f92c7a161242c478658af09159a127bc21cba611 restores performance.



 Comments   
Comment by Stephen Champion [ 02/Apr/21 ]

FWIW, the test we are using is IOR w/ POSIX Direct IO:
+ mpirun sdf-1 240 /jet/home/champios/ior/bin/ior -i 3 -t 16m -b 13120m -s 1 -a POSIX --posix.odirect -e -F -w -k -E -D 30 -o /ocean/neocortex/tests/IOR-files/iorfile

Comment by Peter Jones [ 02/Apr/21 ]

Thanks for the heads-up Steve!

Comment by John Hammond [ 02/Apr/21 ]

schamp, instead of reverting the entire f92c7a1 change, would it be possible to revert just the following hunk and rerun your benchmark?

diff --git a/lustre/include/obd.h b/lustre/include/obd.h
index 2b5fb97..8545fd4 100644
--- a/lustre/include/obd.h
+++ b/lustre/include/obd.h
@@ -204,7 +204,6 @@ struct client_obd {
        /* the grant values are protected by loi_list_lock below */
        unsigned long            cl_dirty_pages;      /* all _dirty_ in pages */
        unsigned long            cl_dirty_max_pages;  /* allowed w/o rpc */
-       unsigned long            cl_dirty_transit;    /* dirty synchronous */
        unsigned long            cl_avail_grant;   /* bytes of credit for ost */
        unsigned long            cl_lost_grant;    /* lost credits (trunc) */
        /* grant consumed for dirty pages */
Comment by Neil Brown [ 03/Apr/21 ]

This looks to be a badly ported patch.

Before the patch, osc_consume_write_grant() increments obd_dirty_pages.  After the patch it doesn't.

There are two callers of osc_consume_write_grant().

One of them, in osc_enter_cache_try(), now increments obd_dirty_pages before calling osc_consume_write_grant().

The other, in osc_queue_sync_pages(), hasn't been updated. Presumably it needs an increment too, though there is already an increment after the call.

So the patch makes a behaviour change which wasn't intended.  I don't know what the correct behaviour is.

@mpershin you did the backport: can you check it?

 

Comment by Mikhail Pershin [ 08/Apr/21 ]

Sure, I am checking that. These changes are actually a combination of two patches: the first is Ia047affc33fb9277e6c28a8f6d7d088c385b51a8 and the second is the already-referenced f92c7a161242c478658af09159a127bc21cba611.

Their ports to the 2.12 base differ slightly from the master branch, so I am checking whether any functionality was broken.

Comment by Mikhail Pershin [ 08/Apr/21 ]

I don't see problems with the patch itself. The increment in osc_consume_write_grant() was removed because it is now done by atomic_long_add_return() outside that call, in both places where it is called. But maybe the patch "LU-12687 osc: consume grants for direct I/O" itself causes the slowdown? Grants are now taken for Direct IO as well, so this may be related to running short of grants or something similar. Are there any complaints about grants on the client during the IOR run?

Comment by Stephen Champion [ 17/May/21 ]

@John Hammond Good idea! I tried re-introducing the padding, with no effect on x86_64. Architectures with 128-byte cache lines may see different results.

Comment by Peter Jones [ 28/May/21 ]

Did you get any updated results on this Steve?

Comment by Patrick Farrell [ 14/Jun/21 ]

I'm pretty confident this and LU-14055 are the same issue, so I'm snagging both of these.

I'm going to leave this one open for now, but I think LU-14055 is the better place to continue the discussion.

Generated at Sat Feb 10 03:10:58 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.