[LU-146] executable files on NFS share failed with "Text file busy" when executed Created: 18/Mar/11 Updated: 03/Oct/11 Due: 31/Mar/11 Resolved: 03/Oct/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | Lustre 1.8.7 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Lai Siyao | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Use lustre as exported filesystem for NFSd. |
||
| Story Points: | 2 |
| Severity: | 3 |
| Bugzilla ID: | 24,437 |
| Epic: | NFS, export, ldlm |
| Rank (Obsolete): | 9725 |
| Description |
|
We have exported the Lustre filesystem over NFS from one of the Lustre clients (say c1). We mount The following is the error message we get when we try the above. [root@sfsclient8 testfs]# ./hi_world_copy The same problem will occur if this is a compiled binary as well. |
| Comments |
| Comment by Oleg Drokin [ 18/Mar/11 ] |
|
There is a patch in bz 24437 for that, though I am not entirely happy with it. |
| Comment by Build Master (Inactive) [ 22/Mar/11 ] |
|
Integrated in Lai Siyao : cb21d418e2a56413e47dd70304cd87334902a6a4
|
| Comment by Build Master (Inactive) [ 23/Mar/11 ] |
|
Integrated in Lai Siyao : d0d5945f8ed6d15bf028a26d02f09cfc69abcf4f
|
| Comment by Build Master (Inactive) [ 23/Mar/11 ] |
|
Integrated in Lai Siyao : ebea6122a9feca28fb82de066d432dd829a02704
|
| Comment by Peter Jones [ 25/Mar/11 ] |
|
Lai, could you please attach this patch to the bz ticket so that Oracle can land it upstream? Thanks, Peter |
| Comment by Lai Siyao [ 25/Mar/11 ] |
|
Yes, Peter. |
| Comment by Peter Jones [ 29/Mar/11 ] |
|
Lai, landing permission has been granted by Oracle for this change. Can you please send the patch to lustre-gate-18@sun.com? Thanks, Peter |
| Comment by Peter Jones [ 30/Mar/11 ] |
|
Lai, Oracle have landed this fix upstream for 1.8.6. Does the same change need to be made to master? Peter |
| Comment by Oleg Drokin [ 30/Mar/11 ] |
|
No, I checked and 2.x is fine, it already always gets the lock. |
| Comment by Peter Jones [ 30/Mar/11 ] |
|
Great then we can resolve this one |
| Comment by Cory Spitz [ 12/Aug/11 ] |
|
Vladimir S. thinks that the fix landed in 1.8.6 is breaking lock ordering. See bz 24437 comment #50. |
| Comment by Peter Jones [ 12/Aug/11 ] |
|
Lai, can you please look into the reported issues with this patch as your top priority? Thanks, Peter |
| Comment by Oleg Drokin [ 12/Aug/11 ] |
|
When inspecting the original patch, the lock ordering did not match what I thought would be happening (and did not match what I originally did, I think). Anyway, even without that patch the race is there too, just narrower. The proper way to address all of this is to drop the second child lock acquisition all the way down and instead add all the conditions before the original mds_get_parent_child_locked() call, which will do the proper ordering of the parent/child locks; we just need to extend the cases where we request the child lock there. |
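The idea of taking the parent and child locks in one fixed order can be illustrated with a minimal, generic sketch. It is not the Lustre code: `struct res`, its `id` field, and `lock_order()` are hypothetical names, and the "lower id first" rule is an assumption chosen for illustration. The actual ordering rule inside mds_get_parent_child_locked() is not shown in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lockable resource; not a Lustre type. */
struct res {
    uint64_t id;    /* unique identifier, used only to define a global order */
};

/*
 * Decide the order in which two resources should be locked.
 * Assumed rule for this sketch: the resource with the lower id goes first.
 * Because the result does not depend on which argument is the parent and
 * which is the child, two threads locking the same pair always agree on
 * the order, which rules out an AB/BA deadlock between them.
 */
static void lock_order(const struct res *parent, const struct res *child,
                       const struct res *out[2])
{
    assert(parent != NULL && child != NULL);
    assert(parent->id != child->id);    /* ids must be distinct */

    if (parent->id < child->id) {
        out[0] = parent;
        out[1] = child;
    } else {
        out[0] = child;
        out[1] = parent;
    }
}
```

A caller would lock `out[0]` first and `out[1]` second. Deciding the order in one place is what lets every code path that needs both locks follow the same rule.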
| Comment by Lai Siyao [ 17/Aug/11 ] |
|
Review is on http://review.whamcloud.com/#change,1259 |
| Comment by Vladimir V. Saveliev [ 02/Sep/11 ] |
|
With this patch (http://review.whamcloud.com/#change,1259), racer fails with the LBUG below on the MDS: 2011-09-02 07:10:12 LustreError: 3959:0:(handler.c:2512:mds_intent_policy()) ASSERTION(new_lock != More details in https://bugzilla.lustre.org/show_bug.cgi?id=24525#c13 |
| Comment by Lai Siyao [ 02/Sep/11 ] |
|
Good catch. The patch may skip fetching the child lock if it doesn't find the child inode the first time; see mds_get_parent_child_locked() (mds_reint.c line 1649):
if (inode == NULL) {
1649 child_lockh = NULL;
1650 goto retry_locks;
1651 }
child_lockh should be set to NULL only for (it_op == IT_OPEN && !(flags & MDS_OPEN_LOCK)). I will commit a patch for review soon. |
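The condition stated above can be written as a small predicate. This is a sketch only: `may_skip_child_lock()`, `check_skip_rule()`, and `IT_OTHER` are hypothetical names, and the numeric values given to `IT_OPEN` and `MDS_OPEN_LOCK` are placeholders, not the definitions from the Lustre headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; the real definitions
 * live in the Lustre headers and may differ. */
#define IT_OPEN        0x0001u
#define IT_OTHER       0x0002u   /* hypothetical: any intent that is not an open */
#define MDS_OPEN_LOCK  0x0100u

/*
 * Return true only in the one case where the comment says child_lockh
 * may be set to NULL: an open intent that did not ask for an open lock.
 */
static bool may_skip_child_lock(uint32_t it_op, uint64_t flags)
{
    return it_op == IT_OPEN && !(flags & MDS_OPEN_LOCK);
}

/* Walk the truth table of the predicate. */
static void check_skip_rule(void)
{
    assert(may_skip_child_lock(IT_OPEN, 0));               /* open, no lock requested */
    assert(!may_skip_child_lock(IT_OPEN, MDS_OPEN_LOCK));  /* open, lock requested */
    assert(!may_skip_child_lock(IT_OTHER, 0));             /* not an open */
    assert(!may_skip_child_lock(IT_OTHER, MDS_OPEN_LOCK));
}
```

Under this reading, the unconditional `child_lockh = NULL` in the quoted snippet would be guarded by such a check, so that every other case still fetches the child lock on retry.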
| Comment by Build Master (Inactive) [ 03/Oct/11 ] |
|
Integrated in Johann Lombardi : bec818434c27bb390b4c8866e73d1afb0dd9e884
|
| Comment by Peter Jones [ 03/Oct/11 ] |
|
Landed for 1.8.7. Not needed for 2.x. |