  Lustre / LU-16691

optimize ldiskfs prealloc (PA) under random read workloads

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.16.0
    • Affects Versions: Lustre 2.16.0, Lustre 2.15.2

    Description

      In some cases, ldiskfs block allocation can consume a large amount of CPU and cause OST threads to block for long periods:

      crmd[16542]:  notice: High CPU load detected: 261.019989
      crmd[16542]:  notice: High CPU load detected: 258.720001
      crmd[16542]:  notice: High CPU load detected: 265.029999
      crmd[16542]:  notice: High CPU load detected: 270.309998
      
       INFO: task ll_ost00_027:20788 blocked for more than 90 seconds.
       ll_ost00_027    D ffff92242eda9080     0 20788      2 0x00000080
       Call Trace:
       schedule+0x29/0x70
       wait_transaction_locked+0x85/0xd0 [jbd2]
       add_transaction_credits+0x278/0x310 [jbd2]
       start_this_handle+0x1a1/0x430 [jbd2]
       jbd2__journal_start+0xf3/0x1f0 [jbd2]
       __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
       osd_trans_start+0x1e7/0x570 [osd_ldiskfs]
       ofd_trans_start+0x75/0xf0 [ofd]
       ofd_attr_set+0x586/0xb00 [ofd]
       ofd_setattr_hdl+0x31d/0x960 [ofd]
       tgt_request_handle+0xb7e/0x1700 [ptlrpc]
       ptlrpc_server_handle_request+0x253/0xbd0 [ptlrpc]
       ptlrpc_main+0xc09/0x1c30 [ptlrpc]
      

      Perf stats show that a large amount of CPU time is used in preallocation:

      Samples: 86M of event 'cycles', 4000 Hz, Event count (approx.): 25480688920 lost: 0/0 drop: 0/0
      Overhead  Shared Object               Symbol
        23,81%  [kernel]                    [k] _raw_qspin_lock
        21,90%  [kernel]                    [k] ldiskfs_mb_use_preallocated
        20,16%  [kernel]                    [k] __raw_callee_save___pv_queued_spin_unlock
        15,46%  [kernel]                    [k] ldiskfs_mb_normalize_request
         1,21%  [kernel]                    [k] rwsem_spin_on_owner
         0,98%  [kernel]                    [k] native_write_msr_safe
         0,54%  [kernel]                    [k] apic_timer_interrupt
         0,51%  [kernel]                    [k] ktime_get
      

      Attachments

        Issue Links

          Activity

            [LU-16691] optimize ldiskfs prealloc (PA) under random read workloads
            pjones Peter Jones added a comment -

            Landed for 2.16


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50481/
            Subject: LU-16691 ldiskfs: limit length of per-inode prealloc list
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b16c9333a00802faea419dfe6fbb013c4477c9c6

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50481/ Subject: LU-16691 ldiskfs: limit length of per-inode prealloc list Project: fs/lustre-release Branch: master Current Patch Set: Commit: b16c9333a00802faea419dfe6fbb013c4477c9c6

            "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50481
            Subject: LU-16691 ldiskfs: limit preallocation list
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: acf2f540db47d223e6999e5923aec8549be52d0b

            gerrit Gerrit Updater added a comment - "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50481 Subject: LU-16691 ldiskfs: limit preallocation list Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: acf2f540db47d223e6999e5923aec8549be52d0b
            bzzz Alex Zhuravlev added a comment (edited) -

            https://lore.kernel.org/all/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com/ - this one looks simple enough
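
            For reference, the core of that approach is small: count the preallocations on the per-inode list and trim the list once it grows past a tunable cap, so later allocations only ever walk a bounded number of entries. A rough sketch of that shape follows; the names i_prealloc_count, i_prealloc_lock, s_mb_max_inode_prealloc and ldiskfs_mb_trim_inode_pa() are illustrative placeholders modeled on the ext4 patch above, not necessarily what the actual change uses.

            /*
             * Sketch only: cap the per-inode preallocation list.  The field and
             * helper names are illustrative, loosely modeled on the ext4 patch
             * linked above, and do not claim to match the landed code.
             */
            static void ldiskfs_mb_add_inode_pa(struct ldiskfs_allocation_context *ac,
                                                struct ldiskfs_prealloc_space *pa)
            {
                    struct ldiskfs_inode_info *ei = LDISKFS_I(ac->ac_inode);
                    struct ldiskfs_sb_info *sbi = LDISKFS_SB(ac->ac_sb);

                    /* link the new PA onto the per-inode list, as today */
                    spin_lock(&ei->i_prealloc_lock);
                    list_add_rcu(&pa->pa_inode_list, &ei->i_prealloc_list);
                    ei->i_prealloc_count++;
                    spin_unlock(&ei->i_prealloc_lock);

                    /*
                     * Once the list exceeds the cap, discard the oldest unused PAs
                     * so that ldiskfs_mb_use_preallocated() and
                     * ldiskfs_mb_normalize_request() take a bounded number of
                     * pa_lock spinlocks per allocation.
                     */
                    if (ei->i_prealloc_count > sbi->s_mb_max_inode_prealloc)
                            ldiskfs_mb_trim_inode_pa(ac->ac_inode);
            }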

            adilger Andreas Dilger added a comment -

            Looking at the flame graphs, I suspect that something may be wrong with the preallocation (PA): for example too many PA regions, or something else making these functions slow. According to the flame graph oss07.perf.svg, for each call to ldiskfs_mb_new_blocks() a large amount of time is spent in _raw_spin_lock(), ldiskfs_mb_normalize_request(), and ldiskfs_mb_use_preallocated().

            ldiskfs_fsblk_t ldiskfs_mb_new_blocks(handle_t *handle,
                                            struct ldiskfs_allocation_request *ar, int *errp)
            {
                    :
                    :
                    if (!ldiskfs_mb_use_preallocated(ac)) {
                            ac->ac_op = LDISKFS_MB_HISTORY_ALLOC;
                            ldiskfs_mb_normalize_request(ac, ar);
            repeat:
                            /* allocate space in core */
                            *errp = ldiskfs_mb_regular_allocator(ac);
            

            so these heavy functions run before ldiskfs_mb_regular_allocator() is even called. There is a loop in ldiskfs_mb_use_preallocated() that repeatedly takes a spinlock, but it does not appear to find a usable PA, since the function ends up returning "0" and ldiskfs_mb_normalize_request() is called anyway:

            ldiskfs_mb_use_preallocated(struct ldiskfs_allocation_context *ac)
            {
                    /* first, try per-file preallocation */
                    list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
                            :
                            /* found preallocated blocks, use them */
                            spin_lock(&pa->pa_lock);
                            if (pa->pa_deleted == 0 && pa->pa_free) {
                                    :
                                    /* this branch is never taken */
                                    :
                                    return 1;
                            }
                            spin_unlock(&pa->pa_lock);
                    }
                    :
                    /*
                     * search for the prealloc space that is having
                     * minimal distance from the goal block.                
                     */             
                    for (i = order; i < PREALLOC_TB_SIZE; i++) {
                            list_for_each_entry_rcu(pa, &lg->lg_prealloc_list[i],
                                                    pa_inode_list) {
                                    spin_lock(&pa->pa_lock);
                                    if (pa->pa_deleted == 0 &&
                                        pa->pa_free >= ac->ac_o_ex.fe_len) {
                    
                                            cpa = ldiskfs_mb_check_group_pa(goal_block,
                                                                            pa, cpa);
                                    }
                                    spin_unlock(&pa->pa_lock);
                            }
            

            and then in ldiskfs_mb_normalize_request() it looks like the same PA lists are walked again and the same locks are contended:

            ldiskfs_mb_normalize_request(struct ldiskfs_allocation_context *ac,
                                            struct ldiskfs_allocation_request *ar)
            {
                     :
                    list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
                            ldiskfs_lblk_t pa_end;
            
                            if (pa->pa_deleted)
                                    continue;
                            spin_lock(&pa->pa_lock);
                            :
                            /* lots of checks */
                            :
                            spin_unlock(&pa->pa_lock);
                    }
            }
            

            By all rights, since these PA lists are per-inode, there shouldn't be much lock contention, but it seems to fit the pattern shown by the flame graphs. Unfortunately, it isn't possible to know whether the slow threads were all accessing a single file or different files.

            I think it makes sense to backport either https://patchwork.ozlabs.org/project/linux-ext4/list/?series=346731 to ldiskfs, or at least the prealloc list fixed limit patch https://lore.kernel.org/all/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com/ to prevent the PA list from getting too long...
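
            To make the fixed-limit idea concrete, the trimming side amounts to roughly the sketch below: walk the per-inode list and mark anything beyond the cap as deleted so its unused blocks can be given back later. This is only an illustration of the approach in the patches above, with assumed names; a real implementation also has to unlink the PA and actually release its blocks, which is omitted here.

            /*
             * Illustrative sketch of trimming the per-inode PA list to a fixed
             * cap.  Simplified: a real implementation must also unlink the PA
             * and free its unused blocks back to the group buddy bitmap.
             */
            static void ldiskfs_mb_trim_inode_pa(struct inode *inode)
            {
                    struct ldiskfs_inode_info *ei = LDISKFS_I(inode);
                    struct ldiskfs_sb_info *sbi = LDISKFS_SB(inode->i_sb);
                    struct ldiskfs_prealloc_space *pa;
                    int count = 0;

                    rcu_read_lock();
                    list_for_each_entry_rcu(pa, &ei->i_prealloc_list, pa_inode_list) {
                            spin_lock(&pa->pa_lock);
                            /* keep the first s_mb_max_inode_prealloc live entries,
                             * mark the rest deleted so they stop being scanned */
                            if (pa->pa_deleted == 0 &&
                                ++count > sbi->s_mb_max_inode_prealloc)
                                    pa->pa_deleted = 1;
                            spin_unlock(&pa->pa_lock);
                    }
                    rcu_read_unlock();
            }

            With a cap of a few hundred entries (the upstream ext4 patch exposes it as a sysfs tunable), both of the list walks visible in the flame graph stay bounded no matter how long the workload runs.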


            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: adilger Andreas Dilger
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved: