Details
-
Improvement
-
Resolution: Fixed
-
Major
Description
It has been frequently documented that programs (e.g. Fortran programs) that write into the same file are slow. The reason is that every time a file is opened for write in Fortran, the O_CREAT flag is added to the open. This makes the locking much less cooperative: when the name is the same, essentially every thread is bottlenecked on the same lock while the opens are processed, because we currently decide the locking mode based on the open flags only.
A similar problem exists for other kinds of creates, such as mkdir.
For the open-create case it's in mdt_reint_open():
again:
        lh = &info->mti_lh[MDT_LH_PARENT];
        mdt_lock_pdo_init(lh,
                          (create_flags & MDS_OPEN_CREAT) ? LCK_PW : LCK_PR,
                          &rr->rr_name);

        parent = mdt_object_find(info->mti_env, mdt, rr->rr_fid1);
        if (IS_ERR(parent))
                GOTO(out, result = PTR_ERR(parent));

        result = mdt_object_lock(info, parent, lh, MDS_INODELOCK_UPDATE);
Similar code for mkdir/mknod/... in mdt_create:
        lh = &info->mti_lh[MDT_LH_PARENT];
        mdt_lock_pdo_init(lh, LCK_PW, &rr->rr_name);
        rc = mdt_object_lock(info, parent, lh, MDS_INODELOCK_UPDATE);
It looks like we should be able to do a lockless lookup of the name in the parent (internal fs locking should ensure that the parent does not disappear from under us), then lock the parent in the desired mode (PR if the name already exists, PW if it has to be created) and repeat the lookup. If the file has disappeared by then and we hold the parent in PR mode, we need to drop the PR lock, reobtain the lock in PW mode and try again.
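The proposed sequence can be sketched in user space as follows. This is a toy model under stated assumptions, not Lustre code: `toy_dir`, `toy_lock()`, `toy_open_create()`, `toy_unlink_all()` and `race_hook` are hypothetical names, the "locks" are plain counters, and the hook only stands in for a concurrent unlink that lands between the lockless lookup and the lock grant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two parent lock modes (PR = read, PW = write). */
enum lock_mode { MODE_PR, MODE_PW };

#define DIR_MAX 8

/* Toy directory: a small table of names plus lock statistics. */
struct toy_dir {
    const char *names[DIR_MAX];
    size_t count;
    int pr_locks;                         /* PR acquisitions so far */
    int pw_locks;                         /* PW acquisitions so far */
    int held;                             /* locks currently held */
    void (*race_hook)(struct toy_dir *);  /* one-shot simulated racing change */
};

static bool toy_lookup(const struct toy_dir *dir, const char *name)
{
    for (size_t i = 0; i < dir->count; i++)
        if (strcmp(dir->names[i], name) == 0)
            return true;
    return false;
}

static void toy_insert(struct toy_dir *dir, const char *name)
{
    assert(dir->count < DIR_MAX);
    dir->names[dir->count++] = name;
}

static void toy_lock(struct toy_dir *dir, enum lock_mode mode)
{
    /* Let a simulated competitor run once before the lock is granted. */
    if (dir->race_hook != NULL) {
        void (*hook)(struct toy_dir *) = dir->race_hook;
        dir->race_hook = NULL;
        hook(dir);
    }
    if (mode == MODE_PR)
        dir->pr_locks++;
    else
        dir->pw_locks++;
    dir->held++;
}

static void toy_unlock(struct toy_dir *dir)
{
    dir->held--;
}

/* Simulated concurrent unlink: empties the directory. */
void toy_unlink_all(struct toy_dir *dir)
{
    dir->count = 0;
}

/* Open-with-create; returns the mode the operation finally completed under. */
enum lock_mode toy_open_create(struct toy_dir *dir, const char *name)
{
    /* Step 1: lockless lookup, used only to guess the lock mode. */
    enum lock_mode mode = toy_lookup(dir, name) ? MODE_PR : MODE_PW;

    for (;;) {
        /* Step 2: lock the parent in the guessed mode and look up again. */
        toy_lock(dir, mode);
        if (toy_lookup(dir, name))
            break;                  /* name exists: plain open */
        if (mode == MODE_PW) {
            toy_insert(dir, name);  /* safe to create under PW */
            break;
        }
        /* Step 3: name vanished while we held only PR; retry with PW. */
        toy_unlock(dir);
        mode = MODE_PW;
    }
    toy_unlock(dir);
    return mode;
}
```

In this model an open of an existing name only ever takes the PR lock, and the PW lock is taken only when the name is missing at lockless-lookup time or disappears before the PR lock is granted.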
The risk remains that if many threads do this for a nonexistent file, some number of them will still be stuck on the same ldlm lock, but hopefully fewer than before.