[LU-18463] mdtest performance degradation after LU-16973 - Whamcloud Community JIRA

Gerrit Updater added a comment - 15/Jan/25 1:02 PM

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57771
Subject: LU-18463 ptlrpc: removing cfs_flush_fput idle
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 449c576282d6aca5dea50e97837b05210483ee68

Peter Jones made changes - 20/Dec/24 10:16 PM

Link

Original: This issue is related to JFC-21 [ JFC-21 ]

Peter Jones made changes - 16/Dec/24 3:25 PM

Fix Version/s		New: Lustre 2.17.0 [ 16192 ]
Resolution		New: Fixed [ 1 ]
Status	Original: Open [ 1 ]	New: Resolved [ 5 ]

Peter Jones added a comment - 16/Dec/24 3:25 PM

Merged for 2.17

Gerrit Updater added a comment - 16/Dec/24 8:03 AM

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/57073/
Subject: LU-18463 ptlrpc: removing cfs_flush_fput idle
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6a3d7634a1fa0f7af14a8288cdd595bd6f8579eb

Andreas Dilger made changes - 29/Nov/24 8:23 PM

Link

New: This issue is related to LU-18152 [ LU-18152 ]

Andreas Dilger made changes - 29/Nov/24 8:05 PM

Link

New: This issue is related to LU-16973 [ LU-16973 ]

Alexander Boyko added a comment - 26/Nov/24 11:54 AM - edited

adilger

Was any investigation done here to see why this thread is not "idle"? Was cfs_flush_delayed_fput() just called too often, or was this not correctly detecting the idle state?

With cfs_flush_delayed_fput() inside ptlrpc, we dedicate one RPC thread to flushing. After a period of high load, that thread can be unavailable for general request processing for a long time. For example, mdtest generates heavy I/O, then does a sync and waits until all threads are done; the MDT ptlrpc becomes idle for a moment, one thread starts to flush, and mdtest starts the second phase of the test. Also, all 200+ ptlrpc_main() threads would call flush_delayed_fput() when idle, but only one of them does all the work. It is better to keep ptlrpc_main() as small as possible.

The cause of the performance regression was somewhat complicated. The most important factor is a context switch during request processing.
During testing we saw that a good mdt_reint_unlink takes only 115 microseconds, and a bad one about ten times more:
mdt_reint_unlink[120822407.0/1048657=115.0]
mdt_reint_unlink[1170551560.0/1048657=1116.0]
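
As a quick check of the two counters above, here is a minimal sketch that recomputes the per-call time and the slowdown. It assumes the lines have the layout name[total_microseconds/sample_count=average]; that layout is inferred from the numbers themselves, not from any documented format, and `parse_stat` is a hypothetical helper.

```python
import re

# Assumed layout (inferred, not documented): name[total_us/count=average_us]
STAT_RE = re.compile(r"(\w+)\s*\[([\d.]+)/(\d+)=([\d.]+)\]")


def parse_stat(line):
    """Return (name, total_us, count, reported_avg_us) for one stats line."""
    m = STAT_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized stats line: {line!r}")
    name, total, count, avg = m.groups()
    return name, float(total), int(count), float(avg)


good = parse_stat("mdt_reint_unlink[120822407.0/1048657=115.0]")
bad = parse_stat("mdt_reint_unlink[1170551560.0/1048657=1116.0]")

# Recompute the averages instead of trusting the truncated values.
good_avg = good[1] / good[2]
bad_avg = bad[1] / bad[2]
slowdown = bad_avg / good_avg

print(f"good: {good_avg:.1f} us, bad: {bad_avg:.1f} us, ratio: {slowdown:.2f}x")
```

Under that assumption the sample count is the same in both runs, so the ratio of the totals is also the ratio of the averages, which is where "about ten times" comes from.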
Next is a profile of a bad mdt_reint_unlink that took 23751 microseconds; most of that time was spent in a context switch during lu_object_free.

16) <...>-2682981  |  ....              |  mdt_reint_unlink [mdt]() {
 16) <...>-2682981  |  ....  0.051 us    |    ktime_get();
 16) <...>-2682981  |  ....              |    ldlm_request_cancel [ptlrpc]() {
 16) <...>-2682981  |  ....  0.060 us    |      req_capsule_get_size [ptlrpc]();
 16) <...>-2682981  |  ....  0.040 us    |      lustre_msg_get_flags [ptlrpc]();
 16) <...>-2682981  |  ....  0.271 us    |      __ldlm_handle2lock [ptlrpc]();
 16) <...>-2682981  |  ....  0.030 us    |      ldlm_resource_getref [ptlrpc]();
 16) <...>-2682981  |  ....  0.060 us    |      mdt_lvbo_update [mdt]();
 16) <...>-2682981  |  ....  2.314 us    |      ldlm_lock_cancel [ptlrpc]();
 16) <...>-2682981  |  ....  0.671 us    |      ldlm_lock_put [ptlrpc]();
 16) <...>-2682981  |  ....  0.160 us    |      ldlm_reprocess_all [ptlrpc]();
 16) <...>-2682981  |  ....  0.621 us    |      ldlm_resource_putref [ptlrpc]();
 16) <...>-2682981  |  ....  6.602 us    |    } /* ldlm_request_cancel [ptlrpc] */
 16) <...>-2682981  |  ....              |    mdt_object_find [mdt]() {
 16) <...>-2682981  |  ....  0.691 us    |      lu_object_find [obdclass]();
 16) <...>-2682981  |  ....  0.921 us    |    } /* mdt_object_find [mdt] */
 16) <...>-2682981  |  ....              |    mdt_lock_pdo_init [mdt]() {
 16) <...>-2682981  |  ....  0.030 us    |      full_name_hash();
 16) <...>-2682981  |  ....  0.290 us    |    } /* mdt_lock_pdo_init [mdt] */
 16) <...>-2682981  |  ....              |    mdt_reint_object_lock [mdt]() {
 16) <...>-2682981  |  ....+ 11.602 us   |      mdt_object_lock_internal [mdt]();
 16) <...>-2682981  |  ....+ 11.822 us   |    } /* mdt_reint_object_lock [mdt] */
 16) <...>-2682981  |  ....              |    mdt_version_get_check_save [mdt]() {
 16) <...>-2682981  |  ....  0.411 us    |      mdt_obj_version_get.isra.35 [mdt]();
 16) <...>-2682981  |  ....  0.051 us    |      lustre_msg_get_flags [ptlrpc]();
 16) <...>-2682981  |  ....  0.130 us    |      mdt_version_save [mdt]();
 16) <...>-2682981  |  ....  1.152 us    |    } /* mdt_version_get_check_save [mdt] */
 16) <...>-2682981  |  ....              |    mdt_lookup_version_check [mdt]() {
 16) <...>-2682981  |  ....  5.911 us    |      mdd_lookup [mdd]();
 16) <...>-2682981  |  ....  0.050 us    |      lustre_msg_get_flags [ptlrpc]();
 16) <...>-2682981  |  ....  6.372 us    |    } /* mdt_lookup_version_check [mdt] */
 16) <...>-2682981  |  ....              |    mdt_object_find [mdt]() {
 16) <...>-2682981  |  ....  0.371 us    |      lu_object_find [obdclass]();
 16) <...>-2682981  |  ....  0.531 us    |    } /* mdt_object_find [mdt] */
 16) <...>-2682981  |  ....              |    lu_object_find_slice [obdclass]() {
 16) <...>-2682981  |  ....  0.200 us    |      lu_object_find_at [obdclass]();
 16) <...>-2682981  |  ....  0.551 us    |    } /* lu_object_find_slice [obdclass] */
 16) <...>-2682981  |  ....              |    osd_xattr_get [osd_ldiskfs]() {
 16) <...>-2682981  |  ....  0.030 us    |      lu_context_key_get [obdclass]();
 16) <...>-2682981  |  ....  2.064 us    |      __vfs_getxattr();
 16) <...>-2682981  |  ....  2.535 us    |    } /* osd_xattr_get [osd_ldiskfs] */
 16) <...>-2682981  |  ....              |    lu_object_put [obdclass]() {
 16) <...>-2682981  |  ....  0.250 us    |      __wake_up();
 16) <...>-2682981  |  ....  0.662 us    |    } /* lu_object_put [obdclass] */
 16) <...>-2682981  |  ....  0.030 us    |    mdt_lock_reg_init [mdt]();
 16) <...>-2682981  |  ....              |    mdt_reint_striped_lock [mdt]() {
 16) <...>-2682981  |  ....  3.787 us    |      mdt_reint_object_lock [mdt]();
 16) <...>-2682981  |  ....  0.371 us    |      lu_object_find_slice [obdclass]();
 16) <...>-2682981  |  ....  0.692 us    |      osd_xattr_get [osd_ldiskfs]();
 16) <...>-2682981  |  ....  0.130 us    |      lu_object_put [obdclass]();
 16) <...>-2682981  |  ....  6.012 us    |    } /* mdt_reint_striped_lock [mdt] */
 16) <...>-2682981  |  ....              |    mdt_version_get_save [mdt]() {
 16) <...>-2682981  |  ....  0.040 us    |      lustre_msg_get_flags [ptlrpc]();
 16) <...>-2682981  |  ....  0.120 us    |      mdt_obj_version_get.isra.35 [mdt]();
 16) <...>-2682981  |  ....  0.090 us    |      mdt_version_save [mdt]();
 16) <...>-2682981  |  ....  0.741 us    |    } /* mdt_version_get_save [mdt] */
 16) <...>-2682981  |  ....              |    mutex_lock() {
16) <...>-2682981  |  ....  0.040 us    |      _cond_resched();
 16) <...>-2682981  |  ....  0.220 us    |    } /* mutex_lock */
 16) <...>-2682981  |  ....              |    mdd_unlink [mdd]() {
 16) <...>-2682981  |  ....  0.040 us    |      mdd_env_info [mdd]();
 16) <...>-2682981  |  ....  0.040 us    |      mdd_env_info [mdd]();
 16) <...>-2682981  |  ....  0.040 us    |      mdd_env_info [mdd]();
 16) <...>-2682981  |  ....  0.601 us    |      mdd_la_get [mdd]();
 16) <...>-2682981  |  ....  0.200 us    |      mdd_la_get [mdd]();
 16) <...>-2682981  |  ....  1.472 us    |      mdd_hsm_archive_exists [mdd]();
 16) <...>-2682981  |  ....  5.941 us    |      mdd_unlink_sanity_check [mdd]();
 16) <...>-2682981  |  ....  1.393 us    |      mdd_trans_create [mdd]();
 16) <...>-2682981  |  ....  0.131 us    |      mdd_env_info [mdd]();
 16) <...>-2682981  |  ....  0.110 us    |      dt_try_as_dir [obdclass]();
 16) <...>-2682981  |  ....  1.182 us    |      lod_declare_delete [lod]();
 16) <...>-2682981  |  ....  0.211 us    |      lod_declare_ref_del [lod]();
 16) <...>-2682981  |  ....  0.882 us    |      lod_declare_attr_set [lod]();
 16) <...>-2682981  |  d...  ==========> |
 16) <...>-2682981  |  d...  3.336 us    |      do_IRQ();
 16) <...>-2682981  |  d...  <========== |
 16) <...>-2682981  |  ....  0.270 us    |      lod_declare_ref_del [lod]();
 16) <...>-2682981  |  ....  0.110 us    |      lod_declare_ref_del [lod]();
 16) <...>-2682981  |  ....  1.413 us    |      lod_declare_attr_set [lod]();
 16) <...>-2682981  |  ....  6.422 us    |      mdd_declare_finish_unlink [mdd]();
 16) <...>-2682981  |  ....  0.050 us    |      mdd_declare_changelog_store [mdd]();
 16) <...>-2682981  |  ....  6.422 us    |      mdd_trans_start [mdd]();
 16) <...>-2682981  |  ....  0.380 us    |      mdd_write_lock [mdd]();
 16) <...>-2682981  |  ....  6.833 us    |      __mdd_index_delete [mdd]();
 16) <...>-2682981  |  ....  3.086 us    |      lod_ref_del [lod]();
 16) <...>-2682981  |  ....  1.523 us    |      lod_ref_del [lod]();
 16) <...>-2682981  |  ....  0.200 us    |      mdd_la_get [mdd]();
 16) <...>-2682981  |  ....  1.864 us    |      mdd_update_time [mdd]();
 16) <...>-2682981  |  ....  1.453 us    |      mdd_update_time [mdd]();
 16) <...>-2682981  |  ....+ 17.032 us   |      mdd_finish_unlink [mdd]();
 16) <...>-2682981  |  ....  0.120 us    |      mdd_la_get [mdd]();
 16) <...>-2682981  |  ....  0.160 us    |      mdd_write_unlock [mdd]();
 16) <...>-2682981  |  ....  0.080 us    |      mdd_changelog_ns_store [mdd]();
 16) <...>-2682981  |  ....  8.286 us    |      mdd_trans_stop [mdd]();
 16) <...>-2682981  |  ....+ 79.349 us   |    } /* mdd_unlink [mdd] */
 16) <...>-2682981  |  ....  0.030 us    |    mutex_unlock();
 16) <...>-2682981  |  ....              |    mdt_handle_last_unlink [mdt]() {
 16) <...>-2682981  |  ....  0.211 us    |      req_capsule_server_get [ptlrpc]();
 16) <...>-2682981  |  ....  0.191 us    |      mdt_pack_attr2body [mdt]();
 16) <...>-2682981  |  ....  0.862 us    |    } /* mdt_handle_last_unlink [mdt] */
 16) <...>-2682981  |  ....  0.050 us    |    ktime_get();
 16) <...>-2682981  |  ....              |    mdt_counter_incr [mdt]() {
 16) <...>-2682981  |  ....  0.100 us    |      lprocfs_counter_add [obdclass]();
 16) <...>-2682981  |  ....  0.111 us    |      lprocfs_counter_add [obdclass]();
 16) <...>-2682981  |  ....  0.060 us    |      lustre_msg_get_jobid [ptlrpc]();
 16) <...>-2682981  |  ....  0.060 us    |      lprocfs_job_stats_log [obdclass]();
 16) <...>-2682981  |  ....  1.012 us    |    } /* mdt_counter_incr [mdt] */
 16) <...>-2682981  |  ....              |    mdt_reint_striped_unlock [mdt]() {
 16) <...>-2682981  |  d...  ==========> |
 16) <...>-2682981  |  d...  3.476 us    |      do_IRQ();
 16) <...>-2682981  |  d...  <========== |
 16) <...>-2682981  |  ....  0.410 us    |      mdt_object_unlock [mdt]();
 16) <...>-2682981  |  ....  4.599 us    |    } /* mdt_reint_striped_unlock [mdt] */
 16) <...>-2682981  |  ....              |    lu_object_put [obdclass]() {
 16) <...>-2682981  |  ....  0.030 us    |      _raw_spin_lock();
 16) <...>-2682981  |  ....  0.070 us    |      osd_object_release [osd_ldiskfs]();
 16) <...>-2682981  |  ....  0.030 us    |      lod_object_release [lod]();
 16) <...>-2682981  |  ....  0.030 us    |      lu_fid_hash [obdclass]();
 16) <...>-2682981  |  ....  0.030 us    |      _raw_spin_lock_bh();
 16) <...>-2682981  |  ....  0.070 us    |      _raw_spin_unlock_bh();
 16) <...>-2682981  |  ....              |      lu_object_free.isra.36 [obdclass]() {
 16) <...>-2682981  => <...>-2682997
 16) <...>-2683002  => <...>-2682981
 16) <...>-2682981  |  ....* 23614.85 us |      } /* lu_object_free.isra.36 [obdclass] */
 16) <...>-2682981  |  ....* 23617.11 us |    } /* lu_object_put [obdclass] */
 16) <...>-2682981  |  ....              |    mdt_object_unlock [mdt]() {
 16) <...>-2682981  |  ....  3.196 us    |      mdt_save_lock [mdt]();
 16) <...>-2682981  |  ....  0.401 us    |      mdt_save_lock [mdt]();
 16) <...>-2682981  |  ....  3.927 us    |    } /* mdt_object_unlock [mdt] */
 16) <...>-2682981  |  ....              |    lu_object_put [obdclass]() {
 16) <...>-2682981  |  ....  0.040 us    |      _raw_spin_lock();
 16) <...>-2682981  |  ....  0.070 us    |      osd_object_release [osd_ldiskfs]();
 16) <...>-2682981  |  ....  0.030 us    |      lod_object_release [lod]();
 16) <...>-2682981  |  ....  1.022 us    |    } /* lu_object_put [obdclass] */
 16) <...>-2682981  |  ....* 23751.36 us |  } /* mdt_reint_unlink [mdt] */
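
To put a number on how much of the request was lost in lu_object_free, the sketch below pulls the durations out of a few lines copied from the trace above. The parsing is an assumption based only on the column layout visible in this output (three columns separated by '|'), and `parse_durations` is a hypothetical helper, not part of any tracing tool.

```python
import re

DURATION_RE = re.compile(r"(\d+(?:\.\d+)?) us")
NAME_RE = re.compile(r"[A-Za-z_][\w.]*")


def parse_durations(trace_text):
    """Yield (function_name, duration_us) for each line that reports a duration."""
    for line in trace_text.splitlines():
        fields = line.split("|")
        if len(fields) != 3:
            continue  # context-switch markers ("=>") have no '|' columns
        dur = DURATION_RE.search(fields[1])
        name = NAME_RE.search(fields[2])
        if dur and name:
            yield name.group(0), float(dur.group(1))


# Lines copied from the trace above; names are unique in this excerpt,
# so a plain dict is enough (later duplicates would overwrite earlier ones).
sample = """\
 16) <...>-2682981  |  ....              |    lu_object_put [obdclass]() {
 16) <...>-2682981  |  ....  0.030 us    |      _raw_spin_lock();
 16) <...>-2682981  |  ....              |      lu_object_free.isra.36 [obdclass]() {
 16) <...>-2682981  => <...>-2682997
 16) <...>-2683002  => <...>-2682981
 16) <...>-2682981  |  ....* 23614.85 us |      } /* lu_object_free.isra.36 [obdclass] */
 16) <...>-2682981  |  ....* 23617.11 us |    } /* lu_object_put [obdclass] */
 16) <...>-2682981  |  ....* 23751.36 us |  } /* mdt_reint_unlink [mdt] */
"""

durations = dict(parse_durations(sample))
total = durations["mdt_reint_unlink"]
freed = durations["lu_object_free.isra.36"]
share = freed / total

print(f"lu_object_free: {freed:.2f} of {total:.2f} us ({share:.1%})")
print(f"everything else: {total - freed:.2f} us")
```

The remainder outside lu_object_free is on the order of the 115 microsecond "good" case, which fits the reading that the context switch alone accounts for the slow request.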

In addition, to improve this we applied:

ext4-dquot-commit-speedup.patch
sysctl -w kernel.sched_cfs_bandwidth_slice_us=50000
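
To confirm that the sysctl change is in effect on a server, a small sketch like the following can read the value back. It assumes the usual mapping of sysctl names to files under /proc/sys; `read_slice_us` is a hypothetical helper, and the file may be absent on kernels built without CFS bandwidth control.

```python
from pathlib import Path

# procfs file corresponding to kernel.sched_cfs_bandwidth_slice_us
SLICE_PATH = Path("/proc/sys/kernel/sched_cfs_bandwidth_slice_us")


def read_slice_us(path=SLICE_PATH):
    """Return the current slice in microseconds, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


current = read_slice_us()
if current is None:
    print("sched_cfs_bandwidth_slice_us is not readable on this system")
else:
    print(f"sched_cfs_bandwidth_slice_us = {current}")
```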

It would also be good to move the last object put outside of mdt_handler and call it after the reply, and perhaps add a yield() after request processing.

Peter Jones made changes - 19/Nov/24 5:30 PM

Link

New: This issue is related to JFC-21 [ JFC-21 ]

Gerrit Updater added a comment - 19/Nov/24 4:42 PM

"Alexander Boyko <alexander.boyko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57073
Subject: LU-18463 ptlrpc: removing cfs_flush_fput idle
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 312255b0fe3c7ffce9279078166a4cf5c9f5158c
