[LU-14580] Lustre 2.12.6 performance regression Created: 02/Apr/21 Updated: 04/Jul/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Stephen Champion | Assignee: | Neil Brown |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | performance |
| Environment: | 16 Socket Superdome Flex w/8 EDR IB interfaces |
| Issue Links: |
| Severity: | 2 |
| Description |
|
Between the Lustre 2.12.5 and 2.12.6 releases, we have a major performance regression on large systems with multiple interfaces. On a 16-socket Superdome Flex with 8 EDR interfaces acting as a client of a filesystem capable of > 100 GB/s, we measure up to 50 GB/s with Lustre 2.12.5. With Lustre 2.12.6, the same test drops to 13 GB/s. Reverting commit f92c7a161242c478658af09159a127bc21cba611 restores performance. |
| Comments |
| Comment by Stephen Champion [ 02/Apr/21 ] |
|
FWIW, the test we are using is IOR w/ POSIX Direct IO: |
| Comment by Peter Jones [ 02/Apr/21 ] |
|
Thanks for the heads-up Steve! |
| Comment by John Hammond [ 02/Apr/21 ] |
|
@schamp instead of reverting the entire f92c7a1 change, would it be possible to just revert the following hunk and rerun your benchmark?
diff --git a/lustre/include/obd.h b/lustre/include/obd.h
index 2b5fb97..8545fd4 100644
--- a/lustre/include/obd.h
+++ b/lustre/include/obd.h
@@ -204,7 +204,6 @@ struct client_obd {
/* the grant values are protected by loi_list_lock below */
unsigned long cl_dirty_pages; /* all _dirty_ in pages */
unsigned long cl_dirty_max_pages; /* allowed w/o rpc */
- unsigned long cl_dirty_transit; /* dirty synchronous */
unsigned long cl_avail_grant; /* bytes of credit for ost */
unsigned long cl_lost_grant; /* lost credits (trunc) */
/* grant consumed for dirty pages */
|
| Comment by Neil Brown [ 03/Apr/21 ] |
|
This looks to be a badly ported patch. Before the patch, osc_consume_write_grant() increments obd_dirty_pages; after the patch it doesn't. There are two callers of osc_consume_write_grant(). One of them, in osc_enter_cache_try(), now increments obd_dirty_pages before calling osc_consume_write_grant(). The other, in osc_queue_sync_pages(), hasn't been updated. Presumably it needs an increment, though there is already an increment after the call. So the patch makes a behaviour change which wasn't intended. I don't know what the correct behaviour is. @mpershin you did the backport: can you check it?
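A minimal sketch of the contract change being described (the function and counter names come from this ticket; the signature, bodies, and extern declaration are assumptions, not the actual Lustre source):
extern atomic_long_t obd_dirty_pages;           /* global dirty-page counter */
/* Before the port: the helper accounted the page itself. */
static void osc_consume_write_grant(struct client_obd *cli,
                                    struct brw_page *pga)
{
        atomic_long_inc(&obd_dirty_pages);      /* this increment was removed by the port */
        cli->cl_dirty_pages++;
        /* ... grant bookkeeping ... */
}
/* After the port the increment above is gone, so each caller must bump
 * obd_dirty_pages itself.  osc_enter_cache_try() was updated to do so
 * before the call; osc_queue_sync_pages() was not, and only has the
 * pre-existing increment after the call, so the behaviour on that path
 * is no longer what it was before the patch. */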
|
| Comment by Mikhail Pershin [ 08/Apr/21 ] |
|
Sure, I am checking that. These changes are actually a combination of two patches: the first ID is Ia047affc33fb9277e6c28a8f6d7d088c385b51a8 and the next one is the already-referenced f92c7a161242c478658af09159a127bc21cba611. Their ports to the 2.12 base were a bit different from the master branch, so I am checking whether some functionality was broken. |
| Comment by Mikhail Pershin [ 08/Apr/21 ] |
|
I don't see problems with the patch itself. The increment in osc_consume_write_grant() was removed because it is now done by atomic_long_add_return() outside that call, and that is done in both places where it is called. But maybe the patch " |
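The pattern being described would look roughly like this (a loose sketch, not the exact osc_cache.c call sites; the helper name, the limit argument, and the return code are made up for illustration):
/* Hypothetical helper: account a batch of pages in one atomic step and
 * get the new total back, so osc_consume_write_grant() no longer needs
 * its own increment. */
static int account_dirty_pages(long nr_pages, long max_dirty_pages)
{
        long dirty = atomic_long_add_return(nr_pages, &obd_dirty_pages);

        if (dirty > max_dirty_pages) {
                /* over the limit: back the reservation out again */
                atomic_long_sub(nr_pages, &obd_dirty_pages);
                return -EDQUOT;
        }
        return 0;
}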
| Comment by Stephen Champion [ 17/May/21 ] |
|
@John Hammond Good idea! I tried re-introducing the padding, to no effect on x86_64. Architectures with 128-byte cache lines may give different results. |
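For context, removing cl_dirty_transit shifts every later member of struct client_obd up by sizeof(unsigned long), so the padding experiment presumably looked something like this (illustrative only; cl_pad_unused is a made-up placeholder name):
struct client_obd {
        /* ... */
        unsigned long cl_dirty_pages;      /* all _dirty_ in pages */
        unsigned long cl_dirty_max_pages;  /* allowed w/o rpc */
        unsigned long cl_pad_unused;       /* placeholder where cl_dirty_transit
                                            * used to sit, so later fields keep
                                            * their old offsets and cache-line
                                            * placement */
        unsigned long cl_avail_grant;      /* bytes of credit for ost */
        unsigned long cl_lost_grant;       /* lost credits (trunc) */
        /* ... */
};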
| Comment by Peter Jones [ 28/May/21 ] |
|
Did you get any updated results on this Steve? |
| Comment by Patrick Farrell [ 14/Jun/21 ] |
|
I'm pretty confident this and I'm going to leave this one open for now, but I think |