[LU-1101] Incorrect permission handling when creating existing directories Created: 14/Feb/12 Updated: 06/Nov/13 Resolved: 06/Nov/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Marek Magrys | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | None | ||
| Environment: |
Lustre 2.1 on clients and servers, Scientific Linux 5 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Bugzilla ID: | 23459 | ||||||||
| Rank (Obsolete): | 4020 | ||||||||
| Description |
|
Lustre seems to handle permissions on mkdir incorrectly in some cases. This issue makes it hard (or impossible) to use the Torque scheduler directly on top of a Lustre filesystem. This is in fact a copy of Bugzilla bug #23459, which we reported some time ago for the 1.8 branch; however, it looks like the bug is still present in 2.1. All the symptoms described in Bugzilla are identical, and the reproducer code provided by Lukasz Flis still triggers the issue. |
| Comments |
| Comment by Marek Magrys [ 14/Feb/12 ] |
|
To clarify:
Feb 14 13:56:23 n6-4-16 pbs_mom: LOG_ERROR::Permission denied (13) in TMakeTmpDir, Unable to make job transient directory: /mnt/lustre/scratch/jobs/18555647.batch.grid.cyf-kr.edu.pl
An example output of the reproducer:
[b14flis@n6-4-16 repro]$ ./a.out /mnt/lustre/scratch/jobs/
Iteration: 2 ERROR: inconsistency detected: previous rc: 13 vs current rc: 0 |
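The reproducer itself is not included in this ticket. As a rough illustration of the kind of check its output suggests, the following Python sketch calls mkdir repeatedly on a directory that already exists and reports the first iteration at which the result changes. The names `mkdir_errno` and `find_inconsistency` are hypothetical and this is not Lukasz Flis's original code; it only assumes that a consistent filesystem returns the same result on every call.

```python
import errno
import os


def mkdir_errno(path):
    """Return 0 if os.mkdir succeeds, otherwise the errno it failed with."""
    try:
        os.mkdir(path)
    except OSError as exc:
        return exc.errno
    return 0


def find_inconsistency(path, iterations=100):
    """Call mkdir repeatedly on a path that is assumed to exist already.

    Returns None if every call reports the same result, otherwise a tuple
    (iteration, previous_rc, current_rc) for the first change observed.
    """
    previous = mkdir_errno(path)
    for iteration in range(2, iterations + 1):
        current = mkdir_errno(path)
        if current != previous:
            return (iteration, previous, current)
        previous = current
    return None
```

On a local filesystem, `find_inconsistency` on an existing directory would typically return `None`, because each call fails with the same error (`errno.EEXIST`). A result such as `(2, 13, 0)` would correspond to the message in the reproducer output above.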
| Comment by Lukasz Flis [ 27/Feb/12 ] |
|
Hi,
One of our users running the Quantum Espresso application hit the bug today.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The strace dump showed that the mkdir result was:
After doing a stat on the directory before invoking the application, the problem disappeared:
Cheers, |
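The workaround described in this comment (stat the directory before the application runs) could be folded into a small helper along the following lines. This is a sketch only: `ensure_dir` is a hypothetical name, and whether checking with stat before mkdir avoids the problem in every case on affected clients is an assumption based on the observation above, not something verified here.

```python
import os
import stat


def ensure_dir(path, mode=0o700):
    """Return True if path was created, False if it already was a directory."""
    try:
        info = os.stat(path)  # look the path up first, as in the workaround
    except FileNotFoundError:
        info = None

    if info is not None:
        if not stat.S_ISDIR(info.st_mode):
            raise NotADirectoryError(path)
        return False  # already present, so mkdir is skipped entirely

    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # Something created the path between the stat and the mkdir.
        return False
    return True
```

Skipping mkdir when the directory is already present means the caller never depends on which error code mkdir reports for an existing path.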
| Comment by Kit Westneat (Inactive) [ 27/Feb/12 ] |
|
To answer the last question in the bugzilla report, the code that causes this bug was added here as an MDS optimization: |
| Comment by Lukasz Flis [ 11/Apr/12 ] |
|
Hi, Just to update: we have tested this, and it appears not to be a problem with 2.2.0 clients. |
| Comment by Lukasz Flis [ 18/Jun/12 ] |
|
Hello, 2.2.0 clients are not usable for us yet (one unreported LBUG). Is there any plan to include a fix for this issue in the upcoming 2.1.2? |
| Comment by Andreas Dilger [ 06/Nov/13 ] |
|
Closing as a duplicate of |