Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- None
- Affects Version/s: Lustre 2.1.0
- None
- 3
- 6428
Description
The load average on the MDS of a classified production 2.1 filesystem jumped to over 400, and top showed the mdt_rdpg_* threads each using 4-7% CPU time. This may simply have been a pathological workload, but we are wondering whether something like an overly contended lock in ldiskfs is involved.
Most of the stacks looked like this:
__cond_resched
_cond_resched
ifind_fast
iget_locked
ldiskfs_iget
? generic_detach_inode
osd_iget
osd_ea_fid_get
osd_it_ea_rec
mdd_readpage
cml_readpage
mdt_readpage
? mdt_unpack_req_pack_rep
mdt_handle_common
? lustre_msg_get_transno
mdt_readpage_handle
ptlrpc_main
child_rip
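In these stacks osd_ea_fid_get() is reached from the directory iterator and then calls down through osd_iget() into ldiskfs_iget(), i.e. the FID for each directory entry is being recovered by looking up the entry's inode rather than being read out of the dirent itself. The standalone sketch below is illustrative only (the structures, names and FID values are made up; it is not the actual osd-ldiskfs code). It is meant to show why that fallback turns a readdir over a directory whose entries carry no embedded FID into one inode lookup, and hence one pass through iget_locked()/ifind_fast(), per entry:

/*
 * Userspace model only -- not the real osd-ldiskfs code.  Entries written
 * by Lustre 1.8 carry no FID, so the server must open each inode to
 * recover the FID; 2.1-format entries already have the FID in the dirent.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct lu_fid {
    unsigned long long f_seq;
    unsigned int       f_oid;
    unsigned int       f_ver;
};

struct model_dirent {
    const char   *name;
    unsigned long ino;
    bool          has_fid;  /* FID embedded in the entry (2.x format)? */
    struct lu_fid fid;      /* valid only when has_fid is true */
};

/* Stand-in for the expensive path: in the kernel this is where
 * ldiskfs_iget()/iget_locked()/ifind_fast() show up in the stacks,
 * followed by reading the FID out of the inode's xattr. */
static struct lu_fid fid_from_inode(unsigned long ino)
{
    struct lu_fid fid = { 0x200000400ULL, (unsigned int)ino, 0 };
    return fid;
}

/* Per-record step of the directory iterator: take the FID from the
 * dirent if it is there, otherwise fall back to an inode lookup. */
static struct lu_fid record_fid(const struct model_dirent *ent,
                                int *inode_lookups)
{
    if (ent->has_fid)
        return ent->fid;
    ++*inode_lookups;
    return fid_from_inode(ent->ino);
}

int main(void)
{
    const struct model_dirent dir[] = {
        { "created-after-upgrade", 10, true,  { 0x200000401ULL, 1, 0 } },
        { "created-on-1.8-a",      11, false, { 0, 0, 0 } },
        { "created-on-1.8-b",      12, false, { 0, 0, 0 } },
    };
    int inode_lookups = 0;

    for (size_t i = 0; i < sizeof(dir) / sizeof(dir[0]); i++) {
        struct lu_fid fid = record_fid(&dir[i], &inode_lookups);
        printf("%-22s fid [%#llx:%#x:%#x]\n",
               dir[i].name, fid.f_seq, fid.f_oid, fid.f_ver);
    }
    printf("inode lookups: %d of %zu entries\n",
           inode_lookups, sizeof(dir) / sizeof(dir[0]));
    return 0;
}

With entries written in the 2.x format the has_fid branch is taken and the iterator never has to touch the inode cache, which is consistent with the comment below about files created after the upgrade.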
Attachments
Activity
Resolution | New: Fixed
Status | Original: In Progress | New: Resolved
Comment | I cannot find any particular reason not to allow the FID to be stored in the dirent when upgrading from Lustre 1.8 to Lustre 2.1. In the Lustre 2.1 implementation, a newly created file should have its FID stored in the dirent of its parent directory. This can be verified as follows: 1) create a new directory on the upgraded system; 2) create new files under that directory; 3) list the new directory. So what is the original stack from: listing old files created before the upgrade, or files created after it?
Status | Original: Open | New: In Progress
Assignee | Original: WC Triage [wc-triage] | New: Lai Siyao [laisiyao]
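The three verification steps in the comment above can be driven from a client with ordinary POSIX calls; the sketch below does only that. The mount point /mnt/lustre and the test directory name are placeholders, and the program does not examine the dirents themselves; confirming that the FID really was written into the new entries still has to be done against the MDT backend.

/*
 * Driver for the three verification steps from the comment, run on a
 * Lustre client.  The mount point is a placeholder; this only exercises
 * the create/list workload, it does not inspect the on-disk dirents.
 */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define TEST_DIR "/mnt/lustre/fid-in-dirent-test"   /* placeholder path */

int main(void)
{
    char path[256];

    /* 1) create a new directory on the upgraded filesystem */
    if (mkdir(TEST_DIR, 0755) != 0) {
        perror("mkdir");
        return EXIT_FAILURE;
    }

    /* 2) create new files under the new directory */
    for (int i = 0; i < 8; i++) {
        snprintf(path, sizeof(path), TEST_DIR "/file.%d", i);
        int fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0644);
        if (fd < 0) {
            perror(path);
            return EXIT_FAILURE;
        }
        close(fd);
    }

    /* 3) list the new directory; on the server this should go through the
     *    same mdt_readpage/osd_it_ea_rec path as the stacks above, but for
     *    these new entries the FID should already be in the dirent */
    DIR *dir = opendir(TEST_DIR);
    if (dir == NULL) {
        perror("opendir");
        return EXIT_FAILURE;
    }
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL)
        puts(ent->d_name);
    closedir(dir);

    return EXIT_SUCCESS;
}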