There's some profiling information collected during testing in automatic mode. I hope it can be useful.
There's no significant difference in open(O_CREAT) processing time, but close() seems to have become slower:
LU-11164:
- open/close 5948 (time 1551093589.39 total 10.00 last 594.79)
- open/close 10000 (time 1551093596.20 total 16.82 last 594.22)
- open/close 15658 (time 1551093606.20 total 26.82 last 565.79)
total: 20000 open/close in 34.55 seconds: 578.94 ops/second
ll_atomic_open:22765503.0:20000:1138.0 ll_file_release:9271202.0:20000:463.0 ll_inode_revalidate:850446.0:40005:21.0 ll_file_open:326223.0:20000:16.0
lmv_intent_lock:21263116.0:60005:354.0
mdc_enqueue_base:20668025.0:20002:1033.0 mdc_close:8846957.0:20000:442.0 mdc_intent_lock:21006833.0:60005:350.0
ldlm_cli_enqueue:19807126.0:20002:990.0 ldlm_handle_enqueue0:12404488.0:20002:620.0 ldlm_cli_enqueue_fini:870123.0:20002:43.0
ptlrpc_queue_wait:27100485.0:40004:677.0 ptlrpc_server_handle_request:19790070.0:66468:297.0
mdt_reint_open:11116636.0:20000:555.0 mdt_close:2967136.0:20000:148.0 mdt_intent_getattr:147.0:2:73.0
mdd_create:5972754.0:20000:298.0 mdd_create_object:2055115.0:20000:102.0 mdd_trans_start:1641342.0:60000:27.0 mdd_trans_stop:1128474.0:60000:18.0 mdd_update_time:131643.0:20000:6.0 mdd_changelog_ns_store:11204.0:20000:0.0
ldiskfs_create_inode:1199474.0:20000:59.0
pre-LU-11164:
- open/close 6245 (time 1551094193.65 total 10.00 last 624.45)
- open/close 10000 (time 1551094199.67 total 16.02 last 623.75)
- open/close 15804 (time 1551094209.68 total 26.02 last 580.35)
total: 20000 open/close in 33.23 seconds: 601.83 ops/second
ll_atomic_open:22203052.0:20000:1110.0 ll_file_release:8576370.0:20000:428.0 ll_inode_revalidate:823214.0:40005:20.0 ll_file_open:321143.0:20000:16.0
lmv_intent_lock:20684255.0:60005:344.0
mdc_enqueue_base:20090571.0:20002:1004.0 mdc_close:8149427.0:20000:407.0 mdc_intent_lock:20431136.0:60005:340.0
ldlm_cli_enqueue:19175889.0:20002:958.0 ldlm_handle_enqueue0:12598163.0:20002:629.0 ldlm_cli_enqueue_fini:908190.0:20002:45.0
ptlrpc_queue_wait:25720604.0:40004:642.0 ptlrpc_server_handle_request:19790636.0:41798:473.0
mdt_reint_open:11314481.0:20000:565.0 mdt_close:2930872.0:20000:146.0 mdt_intent_getattr:157.0:2:78.0
mdd_create:6212753.0:20000:310.0 mdd_create_object:2052026.0:20000:102.0 mdd_trans_start:1929037.0:60000:32.0 mdd_trans_stop:1121551.0:60000:18.0 mdd_update_time:130001.0:20000:6.0 mdd_changelog_ns_store:9368.0:20000:0.0
ldiskfs_create_inode:1182995.0:20000:59.0
open times look similar (1138.0 us vs 1110.0 us per operation, total 22.7 mln us vs 22.2 mln us), but close times have changed (463 us vs 428 us, 10% difference, total 9.2 mln us vs 8.5 mln us).
The results stay similar for the neighbor commits.
I'm not an expert in this patch, but isn't it performing lu_env_init()/lu_env_fini() in ptlrpc_set_wait() when this sequence isn't really needed?
This observation seems to be backed by the posted profiling data:
mdt_close() execution times are equal with and without the patch: 146/148 us, the differences between ll_file_release() and mdc_close() execution times are equal too: 463-442=21=428-407.
Apparently, the observed performance drop happens between mdc_close() and mdt_close() which smoothly correlates with the possible lu_env_init()/lu_env_fini() overhead in ptlrpc_set_wait().