[LU-4185] Incorrect permission handling when creating existing directories at ICHEC Created: 29/Oct/13  Updated: 03/Nov/17  Resolved: 06/Dec/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.4, Lustre 2.5.0, Lustre 2.6.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: Rajeshwaran Ganesan Assignee: Zhenyu Xu
Resolution: Fixed Votes: 2
Labels: None

Attachments: File reproducer.c    
Issue Links:
Duplicate
is duplicated by LU-1101 ncorrect permission handling when cre... Resolved
Related
is related to LU-8019 Openlock breakage Resolved
Severity: 3
Rank (Obsolete): 11322

 Description   

One of our customers is hitting what appears to be LU-1101. They are also running torque, here is the message from the customer:

We use torque as resource manager and job start fails because of mkdir() returns EPERM. The only workaround we have for this is to not set the tmpdir in torque, which is suboptimal. Other applications might suffer the same bug. Quantum espresso is reported to fail as well (in the lustre bug report) and we have users running it.

The bug was originally reported in bz 23459:
https://bugzilla.lustre.org/show_bug.cgi?id=23459

The patch which appears to be the root cause was added as part of bz 18534:
https://bugzilla.lustre.org/show_bug.cgi?id=18534

Thanks.



 Comments   
Comment by Peter Jones [ 29/Oct/13 ]

Bobijam

Could you please assist with this one?

Thanks

Peter

Comment by Zhenyu Xu [ 30/Oct/13 ]

This seems relate to certain kernel version, I tried linux-2.6.32-358.23.2.el6 with current master, which also has the same ll_lookup_nd code as in 2.1.4 lustre, and I've tried both reproducer c code in bz23459, both test passed without the error, all of the mkdir return the expected errno 17 (EEXIST).

What kernel are you using? I'll try it with 2.1.4 lustre code.

Comment by Kit Westneat (Inactive) [ 30/Oct/13 ]

It looks like it's actually 2.1.5 on the servers, but 2.1.4 on the clients. Here are the kernel versions:
mds: 2.6.32-279.19.1.el6_lustre.es52.x86_64
client: 3.0.74-0.6.6-default

client:
lustre-client-modules-kmp-default-2.1.4_3.0.74_0.6.8-5.1

mds:
lustre-modules-2.1.5-ddn1.0_2.6.32_279.19.1.el6_lustre.es52.x86_64_ES.x86_64

Comment by Zhenyu Xu [ 30/Oct/13 ]

can you collect -1 logs of the client and the mds?

Comment by Zhenyu Xu [ 31/Oct/13 ]

I downloaded 3.0.74 kernel and failed build 2.1.4 lustre client upon it. (We don't support 3.0 kernel on 2.1.x lustre yet). Can you teach me how to build it?

Also I'd still like to have -1 logs of the client and the mds to do log analysis.

Thank you.

Comment by Kit Westneat (Inactive) [ 31/Oct/13 ]

I think it's actually the SLES11 SP2 kernel:
https://jira.hpdd.intel.com/browse/LU-3292

We are working on getting the lnet.debug=-1 logs.

Comment by Enrico Tagliavini - ICHEC [ 01/Nov/13 ]

Hi I'm the ICHEC Systems Administrator reporting the problem. I want to point out the directory torque tries to create doesn't exist:

The lustre filesystem on the client is mounted at /ichec/work and the torque tmpdir is set to /ichec/work/scratch/tmp/ which means torque will try to create /ichec/work/scratch/tmp/$JOB_ID which is unique, thus not existing. The C code snipped failing from the version of torque we are using (4.2.5) should be:

part = strtok(path, "/");

if (part == NULL)

{ rc = -1; goto done; }

(part - 1) = '/'; / leading '/' */

while ((part = strtok(NULL, "/")) != NULL)
{
if (mkdir(path, mode) == -1)
{
if (errno != EEXIST)

{ rc = errno; goto done; }

}

*(part - 1) = '/';
}

/* very last component */

if (mkdir(path, mode) == -1)
{
if (errno != EEXIST)

{ rc = errno; goto done; }

}

This code tries to create the whole tree first, in case it doesn't exist. In our case only the very last component is the only one not existing.

Note that if I tried too the reproduces from LU #1101 and I failed reproducing the bug with it for now. The only way to be 100% sure the bug will show itself is to reboot the node and submit a torque job.

Comment by Enrico Tagliavini - ICHEC [ 01/Nov/13 ]

Update: I found a way to reproduce the issue using the reproducer code from the original bz entry, and I was wrong about the direcotry triggering the problem: it the already existing one. Sorry for the confusion. I guess I run the command the wrong way in my previous attempts.

enrico@r3i1n14:~/mkdirbug/reproduce> ./reproducer /ichec/work/scratch/tmp/
Iteration: 1
mkdir(/ichec/work/scratch, 493) : Permission denied
mkdirtree /ichec/work/scratch/tmp/1804289383 : failed: rc=13 (Permission denied)

Python can reproduce this easily as well:

enrico@r3i1n14:~/mkdirbug/reproduce> python
Python 2.6.8 (unknown, May 29 2012, 22:30:44)
[GCC 4.3.4 [gcc-4_3-branch revision 152973]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from os import mkdir
>>> mkdir('/ichec/work/scratch/tmp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 13] Permission denied: '/ichec/work/scratch/tmp'
>>> mkdir('/ichec/work/scratch/tmp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 13] Permission denied: '/ichec/work/scratch/tmp'
>>> from os import stat
>>> stat('/ichec/work/scratch/tmp')
posix.stat_result(st_mode=17407, st_ino=144122436235821058, st_dev=2903758926L, st_nlink=5, st_uid=0, st_gid=0, st_size=4096, st_atime=1383316840, st_mtime=1383316845, st_ctime=1383316845)
>>> mkdir('/ichec/work/scratch/tmp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 17] File exists: '/ichec/work/scratch/tmp'

Comment by Bob Glossman (Inactive) [ 01/Nov/13 ]

I'm trying to reproduce but can't get it to happen. No matter what I do I only see errno 17, EEXIST. I never get errno 13, EACCESS. I've tried both sles11sp2 clients and centos6.4 clients and see the same behavior.

What are the permissions in all the intermediate dirs in your path? I note that all your stats show st_uid=0, st_gid=0. I assume that means you are running and creating all your dirs as root?

If you could just be a little more clear of the exact sequence of cmds to reproduce the problem it would be a big help.

Comment by Enrico Tagliavini - ICHEC [ 01/Nov/13 ]

Sorry, following are more details. If you need something else ask

  1. directories permissions after the example
  2. node just booted

enrico@r3i0n0:~> mount | grep 'type lustre'
10.148.100.60@o2ib:10.148.100.61@o2ib:/home on /ichec/home type lustre (rw,_netdev,localflock)
10.148.100.60@o2ib:10.148.100.61@o2ib:/work on /ichec/work type lustre (rw,_netdev,localflock)
enrico@r3i0n0:~> # note it happens without localflock as well
enrico@r3i0n0:~> pwd
/ichec/home/staff/enrico
enrico@r3i0n0:~> python
Python 2.6.8 (unknown, May 29 2012, 22:30:44)
[GCC 4.3.4 [gcc-4_3-branch revision 152973]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from os import mkdir
>>> mkdir('/ichec/work/scratch/tmp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 13] Permission denied: '/ichec/work/scratch/tmp'

>>> from os import stat
>>> stat('/ichec/work/scratch/tmp')
posix.stat_result(st_mode=17407, st_ino=144122436235821058, st_dev=2903758926L, st_nlink=4, st_uid=0, st_gid=0, st_size=4096, st_atime=1383319699, st_mtime=1383320272, st_ctime=1383320272)
>>> mkdir('/ichec/work/scratch/tmp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 17] File exists: '/ichec/work/scratch/tmp'

This is the simplest way to reproduce the issue.

Permission of the tmpdir
enrico@r3i0n0:~> ls -ld /ichec/work/scratch/tmp
drwxrwxrwt 4 root root 4096 Nov 1 15:37 /ichec/work/scratch/tmp
enrico@r3i0n0:~> ls -ld /ichec/work/scratch/
drwxr-xr-x 3 root root 4096 Oct 29 14:21 /ichec/work/scratch/
enrico@r3i0n0:~> ls -ld /ichec/work/
drwxr-xr-x 10 root root 4096 Oct 24 15:46 /ichec/work/
enrico@r3i0n0:~> ls -ld /ichec/
drwxr-xr-x 4 root root 61 Sep 30 17:14 /ichec/
enrico@r3i0n0:~> ls -ld /
drwxr-xr-x 24 root root 4096 Oct 22 11:20 /

Using the C program reproducer (updated version, more readable) from the original BZ:

enrico@r3i0n1:~/mkdirbug/reproduce> pwd
/ichec/home/staff/enrico/mkdirbug/reproduce
enrico@r3i0n1:~/mkdirbug/reproduce> ./reproducer /ichec/work/scratch/tmp/
Iteration: 1
mkdir(/ichec/work/scratch, 493) : Permission denied
mkdirtree /ichec/work/scratch/tmp/1804289383 : failed: rc=13 (Permission denied)
Iteration: 2
doing stat before creating directory

ERROR: inconsistency detected: previous rc: 13 vs current rc: 0

The error doesn't happen if you are executing the program from the tmpdir (obviously):
enrico@r3i0n2:/ichec/work/scratch/tmp> pwd
/ichec/work/scratch/tmp
enrico@r3i0n2:/ichec/work/scratch/tmp> ./reproducer /ichec/work/scratch/tmp/
Iteration: 1
Iteration: 2
doing stat before creating directory
Iteration: 3
Iteration: 4
doing stat before creating directory

Thank you
Enrico

Comment by Bob Glossman (Inactive) [ 01/Nov/13 ]

Could you please attach the source of your C program reproducer? I might find that easier to work with and a stronger guarantee that I'm trying the exact same thing as you.

I note your mount flags show _netdev. I'm not familiar with that option. What is it? Doubt it's related but I want to pin down all the diffs between you and me.

Comment by Enrico Tagliavini - ICHEC [ 01/Nov/13 ]

This is the source code for the reproducer

Comment by Bob Glossman (Inactive) [ 01/Nov/13 ]

Thank you for your reproducer. Seems to run fine for me, no error:

# ./reproducer /mnt/lustre/tmp
Iteration: 1
Iteration: 2
doing stat before creating directory
Iteration: 3
Iteration: 4
doing stat before creating directory
Iteration: 5
Iteration: 6
doing stat before creating directory
Iteration: 7
Iteration: 8
doing stat before creating directory
Iteration: 9

I am running as root. I am running on a current sles11sp2 client. Will try rolling back to a 2.1 lustre version client to see if the problem is version specific.

Comment by Kit Westneat (Inactive) [ 01/Nov/13 ]

Here are the debug=-1 logs that requested previously:
http://ddntsr.com/ftp/2013-11-01-SR-27367_r3i1n15.dk-log.tar.gz
http://ddntsr.com/ftp/2013-11-01-SR-27367_mkd2.dk-log.tar.gz

Is there any reason to think that it's not the same issue identified by Hongchao in bz 23459? It appears that the potentially incorrect namedata lookup shortcut is still in llite/namei.c.

Comment by Zhenyu Xu [ 01/Nov/13 ]

I think that patch can cure this issue. Just want to understand further.

Comment by Kit Westneat (Inactive) [ 01/Nov/13 ]

Ok cool, just want to make sure we are all on the same page. I was looking at the client dk and I think that code path is hit here:

00000080:00002000:0.0:1383306592.567518:0:4982:0:(dcache.c:116:ll_dcompare()) found name tmp(ffff880840b64500) flags 0x20e010 refc 1
00000080:00000001:0.0:1383306592.567519:0:4982:0:(dcache.c:123:ll_dcompare()) Process leaving (rc=1 : 1 : 1)
00000080:00000001:0.0:1383306592.567520:0:4982:0:(namei.c:571:ll_lookup_nd()) Process entered
00000080:00000001:0.0:1383306592.567521:0:4982:0:(dcache.c:199:ll_set_dd()) Process entered
00000080:00002000:0.0:1383306592.567521:0:4982:0:(dcache.c:204:ll_set_dd()) ldd on dentry tmp (ffff88083d4152c0) parent ffff88084c088180 inode           (null) refc 1
00000080:00000010:0.0:1383306592.567522:0:4982:0:(dcache.c:209:ll_set_dd()) kmalloced 'lld': 104 at ffff88084c721dc0.
00000080:00000001:0.0:1383306592.567523:0:4982:0:(dcache.c:222:ll_set_dd()) Process leaving (rc=0 : 0 : 0)
00000080:00000001:0.0:1383306592.567523:0:4982:0:(namei.c:593:ll_lookup_nd()) Process leaving (rc=0 : 0 : 0)

I can't tell if it's returning EPERM because of that though.

Comment by Zhenyu Xu [ 04/Nov/13 ]

I think it could relates to the discretionary access control of the fs, if the running task does not has CAP_DAC_OVERRIDE capability (i.e. override DAC access), this issue could happen.

int generic_permission(struct inode *inode, int mask,
                int (*check_acl)(struct inode *inode, int mask))
{
        int ret;

        /*
         * Do the basic POSIX ACL permission checks.
         */
        ret = acl_permission_check(inode, mask, check_acl);
        if (ret != -EACCES)
                return ret;

        /*
         * Read/write DACs are always overridable.
         * Executable DACs are overridable if at least one exec bit is set.
         */
        if (!(mask & MAY_EXEC) || execute_ok(inode))
                if (capable(CAP_DAC_OVERRIDE))
                        return 0;

        /*
         * Searching includes executable on directories, else just read.
         */
        mask &= MAY_READ | MAY_WRITE | MAY_EXEC;
        if (mask == MAY_READ || (S_ISDIR(inode->i_mode) && !(mask & MAY_WRITE)))
                if (capable(CAP_DAC_READ_SEARCH))
                        return 0;

        return -EACCES;
}
Comment by Zhenyu Xu [ 04/Nov/13 ]

with patch in bz 23459, the -EEXIST would be detected before this permission checking.

Comment by Zhenyu Xu [ 04/Nov/13 ]

Hi Oleg,

Could you take a look, I am still confused what the root cause of the -EACCES (I think its from the checking of the parent of the last dir dentry, but I don't know what the relationship it has with the existing dir, if we are to create non-existing dir, the same permission checking path still will go through).

Comment by Kit Westneat (Inactive) [ 04/Nov/13 ]

I think the issue is that the application code (Torque) uses mkdir to check for existence, even if it does not have permission to create the directory. Lustre believes that the directory doesn't exist, even though it does, so it does a permissions check to see if Torque can create the directory, which fails. These applications require that mkdir return EEXIST before checking permissions.

Bob, I think you need to run the reproducer as a non-privileged user in order for it to work. The directory should exist, but the user should not have permission to create it. The expected error is EEXIST, but because of the bug you will get EACCESS. Here are the instructions for how to use the reproducer from the bugzilla ticket linked earlier:

Short instruction:
to reproduce the bug:

  1. as root
    mkdir <lustre_root>/test -m755
    chown root:root <lustre_root>/test
    mkdir <lustre_root>/scratch -m1777
  2. as user (su - someuser)
    gcc lustre-test.c -o test
    ./test <lustre_root>/scratch/
Comment by Kit Westneat (Inactive) [ 07/Nov/13 ]

Any updates? Should we try the patch from bz 23459?

Comment by Zhenyu Xu [ 07/Nov/13 ]

yes, I think you can try that patch. But I'm not sure to include it in the source tree, since it would kill a little open & create performance, while just to return different error code for creating existing directory scenario.

Let's try it first, to see whether it can handle the issue which torque meets.

Comment by Kit Westneat (Inactive) [ 07/Nov/13 ]

Ok, sounds good. I wonder what the performance hit is, is there any automated mdtest'ing done? I'll let you know how it goes.

Comment by Patrick Farrell (Inactive) [ 12/Nov/13 ]

Just a quick note - We're seeing this as well at Cray, both on 2.4.1 on SLES and master on CentOS.

I've got a simpler reproducer as well.

As root:
mkdir test
cd test
mkdir test2
su [another_user]
mkdir test2
^-- This hits EPERM, when it should be EEXIST.

If you then do ls -l and try again, you get EEXIST as expected.
(By the way, this hints at a work around that our application code is planning to try - stat the directory to be 'created' first.)

I'll test the suggested patch against master.

Comment by Bob Glossman (Inactive) [ 12/Nov/13 ]

Patrick,
Using your simple command line reproducer I can see the problem with a 2.4.1 sles11sp2 client, but not with a current master sles11sp2 client. I have to assume it was fixed somewhere along the way since 2.4.1

Comment by Patrick Farrell (Inactive) [ 12/Nov/13 ]

Bob,

Interesting! The version of master I tested against was slightly out of date. I'll rebuild and test again with the latest.

For what it's worth, my testing of master was on CentOS.

Also, I forgot to specify, but the SLES I used with 2.4.1 was SLES11SP2 (as in your test).

Comment by Patrick Farrell (Inactive) [ 12/Nov/13 ]

Bob,

I just tested with a fresh pull of master on CentOS and I'm still seeing the issue as described.
Perhaps you accidentally stated the directory when you were testing on master? Or there could be some interaction between master and SLES11SP2.

The patch from bz 23459 does fix my reproducer. Still, as noted about, there's a possible performance impact.

Comment by Bob Glossman (Inactive) [ 12/Nov/13 ]

Patrick,

You are entirely correct. I must have mistyped something during my master test. Reran the experiment. Now seeing the problem reproduce with both 2.4.1 and master sles11sp2 clients. EPERM reported instead of EEXIST in both cases.

Comment by Bob Glossman (Inactive) [ 12/Nov/13 ]

Have also confirmed the same failure can be seen with current master on sles11sp3 clients as well. EPERM instead of EEXIST reported there too. I believe this is good evidence that the problem really is in lustre client code, is still there in current master, and the specific distro or kernel version doesn't matter.

Comment by Zhenyu Xu [ 13/Nov/13 ]

The patch is tracking at http://review.whamcloud.com/8257, but I'm not sure to include it in the source tree, since it would kill a little open & create performance, while just to return a different error code for creating existing directory scenario.

Comment by Enrico Tagliavini - ICHEC [ 13/Nov/13 ]

Hi Zhenyu,

mkdir() of an existing directory return EEXIST on common filesystems, whatever the permission is. I have no idea if there is a standard mandating that or it is just common usage, but if lustre is going to make something different it will break a lot of workflows, for the good of performances. A lot of applications will need to add a stat before the creation useful only when lustre is the underlaying file system, leading to a performance hit anyway.

From the performance PoV this is indeed not the ideal solution, but if you are trying to make a directory I think you really want to know first if it exists already, before knowing if you have the right permission to create it or not.

Kind regards
Enrico

Comment by Patrick Farrell (Inactive) [ 29/Aug/14 ]

One additional thing about this that bothers me:
It seems very undesirable that the value returned for an operation in a directory would change depending on whether or not you've read the directory contents recently.

It would be nice to have this addressed:
Is this issue that Intel expects to fix, either with the patch already suggested, or some other way, or is this a minor divergence from POSIX that Intel's planning to keep in order to achieve better performance?

Comment by Patrick Farrell (Inactive) [ 16/Jul/15 ]

Yuck - I recommend abandoning this change. I just took a closer look at the performance impact, and it's bad.

Below are single node mdtest results. The directory creation is what's interesting - As that is create without open, which is the case we're optimizing away (and the patch would undo that optimization). File creates are create+open, and as expected, the performance doesn't change.
I don't think sacrificing 30-50% of our create performance is worth fixing a quite obscure bit of aberrant behavior.

Without the fix in place:
SUMMARY: (of 10 iterations)
Operation Max Min Mean Std Dev
--------- — — ---- -------
Directory creation: 1799.957 1636.083 1736.426 55.974
Perf_Data mdtest 1 Dir_create 1736.426 ops/sec
File creation : 1253.821 985.786 1130.595 65.472
Perf_Data mdtest 1 File_create 1130.595 ops/sec
Tree creation : 1968.233 1540.891 1771.710 126.015
Perf_Data mdtest 1 Tree_create 1771.710 ops/sec

With the fix in place, same node, same system:
SUMMARY: (of 10 iterations)
Operation Max Min Mean Std Dev
--------- — — ---- -------
Directory creation: 1252.500 1027.566 1196.570 61.338
Perf_Data mdtest 1 Dir_create 1196.570 ops/sec
File creation : 1182.596 1070.822 1118.642 35.296
Tree creation : 1009.946 864.270 919.236 43.143
Perf_Data mdtest 1 Tree_create 919.236 ops/sec

Like I said, 30-50% drop in performance. Yuck.

Comment by David Trudgian [ 15/Feb/16 ]

Hi,

We are a DDN customer, who recently hit this issue with (I believe) 2 packages.

Firstly, running MATLAB 2014b and later we receive 'Permission Denied' errors when using MATLAB's built-in mkdir function to create a nested directory structure on our lustre fs. Using MATLAB's system command to call out to shell 'mkdir -p' always works without issue. Mathworks investigated for us, but indicated they can't support lustre. MATLAB's in-built mkdir works prior to 2014a, so (after coming across the 2nd example) I think it's related to this LU-4185, I've contacted them again with some more info - to see if EPERM instead of EEXIST makes sense. (I'm guessing it does since MATLAB is reporting a Permission Denied error for the failing mkdir calls).

The second occurrence thing that led me here was failing output directory creation for the cufflinks bioinformatics tool on the lustre fs. Code is available, and I was able to take their implementation of mkpath (for recursive directory creation) and see that it failed trying to run mkdir on an existing path - receiving EPERM not EEXIST. The relevant code is here:

https://github.com/cole-trapnell-lab/cufflinks/blob/master/src/common.cpp

Search mkpath, currently line 262.

I can make a patch to cufflinks to run a stat before attempting a mkdir, rather than just relying on mkdir returning EEXIST. However, it would be nice to know if there is ongoing work to try and address this.

Thanks.

Comment by David Trudgian [ 15/Feb/16 ]

Issue ref for the cufflinks package we have the problem with: https://github.com/cole-trapnell-lab/cufflinks/issues/53

Comment by Zhenyu Xu [ 01/Mar/16 ]

updated the http://review.whamcloud.com/8257 to address this issue while not impacting the performance.

Comment by Zhenyu Xu [ 01/Mar/16 ]

Though the updated patch can solve this issue, it raises another issue which is that it will ignore ACL permission failure opon directories.

Comment by Zhenyu Xu [ 01/Mar/16 ]

Updated the patch which adds a client proc entry to control whether bypass this optimization, by default, this optimization is turned on. And with special applications which only check EEXIST return value of mkdir, we can turn it off by "lctl set_param" command.

$ lctl set_param llite.<client_uuid>.create_no_open_optimization=0

Comment by Andreas Dilger [ 02/Jun/16 ]

It would be worthwhile to check if patch http://review.whamcloud.com/20354 "LU-8019 llite: Restore proper opencache operations" fixes this problem.

Comment by Oleg Drokin [ 06/Jul/16 ]

I tried to read the comments to get a better handle on this issue.

The reproducer from Cray is basically:
in a place that user1 owns and user2 does not have write permissions: create a directory as user1 ; then with cold caches try to create same directory as user2 - that fails with EPERM because we do not know that directory already exists. (I imagine if user1 created a non-directory then we get the same issue too).

This seems to be a pretty narrow case, though.
So the other EPERM cases - do they happen when all processes have write permissions in the dir where we are trying to create? Was that source of the failures distilled to a testcase oreven better - understood yet?

Comment by Oleg Drokin [ 06/Jul/16 ]

I guess if somebody confirm that the case is distilled to the apps that basically get a work path in the form of /mnt/lustre/homes/user/somedir/app_folder where user only has write permissions from user onward, but the app does the equivalent of:
mkdir /mnt ; mkdir /mnt/lustre ; mkdir /mnt/lustre/homes ; mkdir /mnt/lustre/homes/user ; .... and stops on !-EEXIST
Then we can probably add a lot more cheaper workaround in the ll_lookup_nd in the form of:
if this CREATE with but NOT OPEN ( == good proxy this is mkdir, mknod and so on)
AND if this user does NOT have write permissions in the parent, THEN go on with the lookup.

This way we do not penalize all the important fast cases being:
1. User has write permissions - great, we'll do create and if it fails - that would be on the server with the correct error.
2. User does not have write permissions, but the directory exists - that's ok too, we go to the server anyway, get our dentry and never get into create in the end.

The
3. User does not have permissions and directory does not exist - this will get slower, we jump from 0 to 1 RPC in this case. But I assume this is unimportant enough that nobody would notice or care?

Comment by Gerrit Updater [ 06/Jul/16 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/21169
Subject: LU-4185 llite: Revise create with no open optimization.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 571495cff60cab20d687402524eb051474b3e32b

Comment by Patrick Farrell (Inactive) [ 06/Jul/16 ]

Oleg - I think you've captured the problem as I understand end users are reporting it (as well as the Cray reproducer).

One question - Which Gerrit link are we using? Bobi Jam has updated to a patch like yours as well.

Comment by Oleg Drokin [ 06/Jul/16 ]

Bobi's patch has a test which is better than mine that does not.

Comment by Oleg Drokin [ 06/Jul/16 ]

Patrick: It would be great if you can verify the current patch (Either one) is ok in majority of operations like mdtest and such.

Comment by Oleg Drokin [ 07/Jul/16 ]

BTW, interesting thing, apparently NFS has the same problem as Lustre and I have not heard any complaints.
Time to fill some more bugs, I guess.

[green@fedora1 crash]$ mkdir aaa 
mkdir: cannot create directory 'aaa': Permission denied
[green@fedora1 crash]$ mkdir lost+found
mkdir: cannot create directory 'lost+found': Permission denied
[green@fedora1 crash]$ ls -ld lost+found
drwx------ 2 root root 16384 May 25  2013 lost+found
[green@fedora1 crash]$ mkdir lost+found
mkdir: cannot create directory 'lost+found': File exists
Comment by Oleg Drokin [ 09/Jul/16 ]

After some discussion with the kernel guys, it looks like EEXIST is not a requirement, so the applications that depend on it are broken.

See http://marc.info/?l=linux-kernel&m=146803277310677&w=2 and http://marc.info/?l=linux-nfs&m=146803381011004&w=2

So I guess it's a good idea to file bugreports with those apps even if we have this also adjusted in Lustre.

Comment by Gerrit Updater [ 05/Oct/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/8257/
Subject: LU-4185 llite: Revise create with no open optimization
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a2d5b2e83c0a512a3ea59698e8481621ab5856c2

Comment by Joseph Gmitter (Inactive) [ 06/Dec/16 ]

In discussion with Bobijam, this issue is resolved with the landing of https://review.whamcloud.com/#/c/8257/

Generated at Sat Feb 10 01:40:26 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.