Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version: Lustre 2.1.1
- 3
- 4587
Description
We have two MDT service threads using 100% CPU on a production MDS. I can't get a backtrace with the crash utility because the threads never yield the CPU, but oprofile shows them spinning in cfs_hash_for_each_relax(). At the same time we are seeing client hangs and high lock cancellation rates on the OSTs.
samples   %        image name  app name   symbol name
4225020   33.0708  libcfs.ko   libcfs.ko  cfs_hash_for_each_relax
3345225   26.1843  libcfs.ko   libcfs.ko  cfs_hash_hh_hhead
532409    4.1674   ptlrpc.ko   ptlrpc.ko  ldlm_cancel_locks_for_export_cb
307199    2.4046   ptlrpc.ko   ptlrpc.ko  lock_res_and_lock
175349    1.3725   vmlinux     vmlinux    native_read_tsc
151989    1.1897   ptlrpc.ko   ptlrpc.ko  ldlm_del_waiting_lock
136679    1.0698   libcfs.ko   libcfs.ko  cfs_hash_rw_lock
109269    0.8553   jbd2.ko     jbd2.ko    journal_clean_one_cp_list
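For context, the kind of spin suggested by this profile can be illustrated with a small, self-contained C sketch. This is not the actual Lustre code; toy_hash, for_each_relax() and cancel_cb() are made-up names, and the assumption is only the general pattern of a "relaxed" hash walk: the lock is dropped around each callback, and the walk restarts whenever the table changes underneath it, so a steady stream of concurrent add/remove traffic (such as lock cancellation racing with new enqueues) can keep the walk restarting forever at 100% CPU.

/*
 * Illustrative sketch only -- NOT the actual Lustre implementation.  It mimics
 * the general shape of a "relaxed" hash walk (cf. cfs_hash_for_each_relax):
 * the bucket lock is conceptually dropped around each callback, and if the
 * table changed in the meantime the walk restarts from the beginning.  When
 * the callback (or another thread) mutates the table as fast as it is
 * scanned, the walk restarts indefinitely and the thread spins on the CPU.
 */
#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 8

struct node { int key; struct node *next; };

struct toy_hash {
    struct node  *bucket[NBUCKETS];
    unsigned long version;          /* bumped on every add/remove */
};

static void hash_add(struct toy_hash *h, int key)
{
    struct node *n = malloc(sizeof(*n));
    n->key = key;
    n->next = h->bucket[key % NBUCKETS];
    h->bucket[key % NBUCKETS] = n;
    h->version++;
}

static void hash_del(struct toy_hash *h, int key)
{
    struct node **pp = &h->bucket[key % NBUCKETS];
    while (*pp != NULL) {
        if ((*pp)->key == key) {
            struct node *victim = *pp;
            *pp = victim->next;
            free(victim);
            h->version++;
            return;
        }
        pp = &(*pp)->next;
    }
}

/*
 * Walk every entry, calling cb() with the "lock" dropped.  If the version
 * changed underneath us, restart from bucket 0 -- this restart loop is what
 * shows up as a CPU-bound spin when the table never stops changing.
 */
static unsigned long for_each_relax(struct toy_hash *h,
                                    void (*cb)(struct toy_hash *, int),
                                    unsigned long max_restarts)
{
    unsigned long restarts = 0;
again:
    for (int b = 0; b < NBUCKETS; b++) {
        for (struct node *n = h->bucket[b]; n != NULL; n = n->next) {
            unsigned long v = h->version;
            cb(h, n->key);                    /* table may change here */
            if (h->version != v) {
                if (++restarts >= max_restarts)
                    return restarts;          /* give up so the demo ends */
                goto again;                   /* rescan from the start */
            }
        }
    }
    return restarts;
}

static int next_key = 1000;

/* Cancel (remove) the current entry while "another thread" enqueues a new one. */
static void cancel_cb(struct toy_hash *h, int key)
{
    hash_del(h, key);
    hash_add(h, next_key++);    /* steady enqueue rate keeps the walk restarting */
}

int main(void)
{
    struct toy_hash h = { 0 };

    for (int i = 0; i < 100; i++)
        hash_add(&h, i);

    printf("walk restarted %lu times before giving up\n",
           for_each_relax(&h, cancel_cb, 1000000));
    return 0;
}

Compiled with any C99 compiler, the demo reports on the order of a million restarts before the artificial cap stops it; without the cap the walk would never finish, which is consistent with threads stuck in the hash-walk symbol rather than in the cancellation callback itself.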
Attachments
Issue Links
- is duplicated by LU-1087: mdt thread spinning out of control (Resolved)
Trackbacks
- Changelog 2.1: Changes from version 2.1.2 to version 2.1.3. Server support for kernels: 2.6.18-308.13.1.el5 (RHEL5), 2.6.32-279.2.1.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-308.13.1.el5 (RHEL5), 2.6.32-279.2.1....