[LU-6381] replace global dq_state_lock/dq_list_lock with per-sb spinlocks and per-sb hash table. Created: 18/Mar/15 Updated: 20/Jul/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Di Wang | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
[3/18/15, 11:01:00 AM] Andreas Dilger: The other thought I had was to replace the global dq_state_lock and dq_list_lock with per-sb spinlocks and a per-sb hash table, and to use a separate lock, instead of dq_list_lock, for the quota format calls |
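Sketched roughly, the per-sb idea might look like the following (hypothetical struct and function names, not an actual patch; today dq_state_lock, dq_list_lock, and the dquot hash table are single globals in fs/quota/dquot.c shared by every mounted filesystem):

#include <linux/hash.h>
#include <linux/list.h>
#include <linux/quota.h>
#include <linux/spinlock.h>
#include <linux/user_namespace.h>

/* Hypothetical per-superblock quota state replacing the globals. */
struct sb_quota_state {
	spinlock_t dq_state_lock;	/* quota on/off state for this sb only */
	spinlock_t dq_list_lock;	/* this sb's dquot lists and hash */
	struct hlist_head *dquot_hash;	/* per-sb hash, replacing the global one */
	unsigned int dq_hash_bits;
};

/* Hypothetical lookup: the caller holds qs->dq_list_lock, so lookups
 * contend only with other users of the same filesystem, and the sb no
 * longer needs to be part of the hash key. */
static struct dquot *sb_find_dquot(struct sb_quota_state *qs, struct kqid qid)
{
	unsigned int h = hash_32(from_kqid(&init_user_ns, qid),
				 qs->dq_hash_bits);
	struct dquot *dquot;

	hlist_for_each_entry(dquot, &qs->dquot_hash[h], dq_hash)
		if (qid_eq(dquot->dq_id, qid))
			return dquot;
	return NULL;
}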
| Comments |
| Comment by James A Simmons [ 18/Mar/15 ] |
|
Will this need a special kernel patch, or will it be handled at the Lustre level? |
| Comment by Di Wang [ 18/Mar/15 ] |
|
I believe the change will be in the kernel patch. |
| Comment by Jodi Levi (Inactive) [ 18/Mar/15 ] |
|
Niu, |
| Comment by Andreas Dilger [ 18/Mar/15 ] |
|
The scalability of dqget() is poor because it takes two global spinlocks in code that is shared across all mounted filesystems. Several things could be done to improve this: replace the global dq_state_lock and dq_list_lock with per-sb spinlocks, replace the single global dquot hash table with a per-sb hash table, and use a separate lock for the quota format calls.
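For reference, here is a condensed (paraphrased, not verbatim) view of the 3.x-era dqget() fast path in fs/quota/dquot.c, showing how every lookup on every mounted filesystem funnels through the same two global spinlocks:

struct dquot *dqget(struct super_block *sb, struct kqid qid)
{
	/* single global hash, keyed by (sb, qid) */
	unsigned int hashent = hashfn(sb, qid);
	struct dquot *dquot;

	spin_lock(&dq_list_lock);	/* global: serializes lookups for ALL sbs */
	spin_lock(&dq_state_lock);	/* global: only to re-check quota is active */
	if (!sb_has_quota_active(sb, qid.type)) {
		spin_unlock(&dq_state_lock);
		spin_unlock(&dq_list_lock);
		return NULL;
	}
	spin_unlock(&dq_state_lock);

	dquot = find_dquot(hashent, sb, qid);	/* walks the global dquot_hash */
	/* ... on a miss: drop dq_list_lock, allocate, retake, insert ... */
	spin_unlock(&dq_list_lock);
	/* ... wait for any pending I/O and read the dquot in ... */
	return dquot;
}

With per-sb locks and a per-sb hash table, only threads operating on the same filesystem would contend here.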
|
| Comment by Andreas Dilger [ 18/Mar/15 ] |
|
Please also post patches to the upstream linux-fsdevel mailing list for review and feedback. |
| Comment by Niu Yawei (Inactive) [ 19/Mar/15 ] |
Andreas, do we have performance & oprofile data for multiple MDTs on the same MDS? I didn't find it in IU-4 (there are only numbers for multiple MDTs with quota disabled). It would be interesting to compare it against the data with quota disabled, and even better to compare multiple MDTs vs. a single MDT (with quota enabled), so we can verify whether there are other bottlenecks besides these two global locks. I'm not sure if it's proper to ask Nathan from IU to run the test for us?

An interesting observation is that the unlink test wasn't affected by the global quota locks the way the mknod test was, even though unlink calls dqput(), which takes dq_list_lock. See the oprofile of the unlink test (64 threads, 32 mnt, single MDT, Lustre 2.6):

CPU: Intel Architectural Perfmon, speed 3292.01 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000

vma              samples  %       image name   app name     symbol name
ffffffff811be860 831572   4.7333  vmlinux      vmlinux      __find_get_block_slow
0000000000032bd0 523313   2.9787  obdclass.ko  obdclass.ko  class_handle2object
ffffffff811701c0 328672   1.8708  vmlinux      vmlinux      kmem_cache_free
ffffffff811bf060 319529   1.8188  vmlinux      vmlinux      __find_get_block
000000000001f6c0 288861   1.6442  libcfs.ko    libcfs.ko    cfs_percpt_lock
0000000000031cd0 264236   1.5040  obdclass.ko  obdclass.ko  lprocfs_counter_add
0000000000050530 234202   1.3331  obdclass.ko  obdclass.ko  lu_context_key_get
ffffffff81058e10 209186   1.1907  vmlinux      vmlinux      task_rq_lock
ffffffff8128f490 165519   0.9421  vmlinux      vmlinux      memset
ffffffff811708f0 150077   0.8542  vmlinux      vmlinux      kfree

The oprofile for the mknod test looks like (64 threads, 32 mnt, Lustre 2.6):

CPU: Intel Architectural Perfmon, speed 3292.01 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000

vma              samples  %        image name   app name     symbol name
ffffffff811eb7f0 4744016  21.8990  vmlinux      vmlinux      dqput
ffffffff811eb180 3570862  16.4836  vmlinux      vmlinux      dquot_mark_dquot_dirty
ffffffff811ecb70 2488686  11.4881  vmlinux      vmlinux      dqget
ffffffff811be860 436818   2.0164   vmlinux      vmlinux      __find_get_block_slow
0000000000002630 383431   1.7700   ldiskfs.ko   ldiskfs.ko   ldiskfs_check_dir_entry
ffffffff811bf060 297226   1.3720   vmlinux      vmlinux      __find_get_block
0000000000026690 147702   0.6818   ldiskfs.ko   ldiskfs.ko   ldiskfs_dx_find_entry
0000000000050530 147313   0.6800   obdclass.ko  obdclass.ko  lu_context_key_get
000000000000a6a0 130593   0.6028   jbd2.ko      jbd2.ko      jbd2_journal_add_journal_head
ffffffff81058e10 121861   0.5625   vmlinux      vmlinux      task_rq_lock
0000000000031cd0 121519   0.5609   obdclass.ko  obdclass.ko  lprocfs_counter_add

And there is actually another global quota lock, dq_data_lock, which is taken on every inode/block allocation and deletion, but I'm not quite sure why contention on this lock is negligible (as shown by the oprofile data). |
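As a possible explanation for the dq_data_lock observation, here is a condensed (paraphrased, not verbatim) contrast from the same 3.x-era fs/quota/dquot.c: dq_data_lock is taken very frequently but only around a short arithmetic update, while dqput() holds dq_list_lock across list/state checks and may drop and retake it in a loop:

/* dq_data_lock: hot but held only for a few counter increments */
int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
{
	struct dquot **dquots = inode->i_dquot;	/* per-inode dquot pointers */
	int cnt;
	/* ... warning/limit checks elided ... */
	spin_lock(&dq_data_lock);
	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
		if (dquots[cnt])
			dquot_incr_space(dquots[cnt], number);	/* plain arithmetic */
	inode_incr_space(inode, number, 0);
	spin_unlock(&dq_data_lock);
	/* ... */
	return 0;
}

/* dq_list_lock: taken by dqput() on every release, and retaken in a
 * loop whenever the dquot must be written back before being freed */
void dqput(struct dquot *dquot)
{
we_slept:
	spin_lock(&dq_list_lock);
	if (atomic_read(&dquot->dq_count) > 1) {
		atomic_dec(&dquot->dq_count);
		spin_unlock(&dq_list_lock);
		return;
	}
	if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && dquot_dirty(dquot)) {
		spin_unlock(&dq_list_lock);
		dquot->dq_sb->dq_op->write_dquot(dquot);
		goto we_slept;
	}
	/* ... final release path, also under dq_list_lock ... */
}

So the short hold time of dq_data_lock, rather than its frequency, may be why it barely shows up in the profiles.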