Reproduction case:
1 Client - CentOS 6.3 - Lustre 2.1.3-2.6.32_279.2.1.el6.x86_64
gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC)
1 MDS - CentOS 6.3 - Lustre 2.1.3-2.6.32_279.2.1.el6_lustre.gc46c389.x86_64 (1 mds/mgs partition)
1 OSS - CentOS 6.3 - Lustre 2.1.3-2.6.32_279.2.1.el6_lustre.gc46c389.x86_64 (60 osts on loop devices)
All TCP interconnect
All Server partitions created & mounted with standard autotest tools (auster -c PATH/TO/custom.sh sanity.sh --only MOUNT)
In mounted FS:
cd /mnt/lustre
mkdir 58
lfs setstripe -c 58 58
cd 58
cat << EOF > test.c
#include <stdio.h>
int
main(int argc, char *argv[])
{
return 0;
}
EOF
gcc test.c
./a.out
Expected Results:
Should run w/o error. For directories with a stripe width less than 54, there is no error.
Actual Results:
For stripe widths of 54 and larger (at least up to 60) the following error results:
./a.out: Text file busy
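To locate the threshold, the steps above can be repeated across a range of stripe widths. This is only a rough sketch: the function names try_width and sweep, and the LFS and CC override variables, are hypothetical helpers and not part of the report; only `lfs setstripe -c` and the gcc build come from the steps above. A failed run is reported as FAIL for any reason, not only "Text file busy".

```shell
#!/bin/sh
# Hypothetical knobs (not from the report): allow lfs and the compiler
# to be overridden, e.g. for a dry run on a non-Lustre filesystem.
LFS=${LFS:-lfs}
CC=${CC:-gcc}

# try_width ROOT COUNT
# Create ROOT/COUNT, set its stripe count to COUNT, build the trivial
# program there and run it. Prints "COUNT ok" or "COUNT FAIL".
try_width() {
    dir=$1/$2
    mkdir -p "$dir" || return 1
    "$LFS" setstripe -c "$2" "$dir" || return 1
    printf 'int main(void) { return 0; }\n' > "$dir/test.c"
    "$CC" -o "$dir/a.out" "$dir/test.c" || return 1
    if "$dir/a.out" 2>/dev/null; then
        echo "$2 ok"
    else
        # Any non-zero exit lands here, including "Text file busy".
        echo "$2 FAIL"
    fi
}

# sweep ROOT FIRST LAST
# Run try_width for every stripe count from FIRST to LAST inclusive.
sweep() {
    n=$2
    while [ "$n" -le "$3" ]; do
        try_width "$1" "$n"
        n=$((n + 1))
    done
}
```

Usage, assuming the mount point from this report: `sweep /mnt/lustre 50 60`. If the behaviour matches what is described here, widths below 54 should print ok and widths from 54 up should print FAIL.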
Other Test Results:
- If the filesystem is remounted then all a.out's will run correctly.
- If any a.out is copied (either w/in or between directories, regardless of stripe width) it will run fine
- If a "bad" a.out is moved, it will still elicit the same error
With 2 Clients (both CentOS 6.3 - Lustre 2.1.3-2.6.32_279.2.1.el6.x86_64)
1) First client generates "bad" a.out
2) Second client mounts FS
3) 2nd client gets error running a.out
4) First client unmounts FS
5) 2nd Client no longer gets error running a.out
WAG of cause:
When compiling test.c, client 1 somehow keeps a writecount ref (mot_writecount > 0) after compilation finishes. This ref is only cleaned up when the client fully disconnects.
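For comparison, the same error message can be produced on a local Linux filesystem by holding a file open for writing while trying to execute it. This is only an analogy for the guess above: it exercises the local kernel's write-count check, not Lustre's mot_writecount, and the function name etxtbsy_demo is hypothetical. Behaviour may differ on kernels or sandboxes that do not enforce this check.

```shell
#!/bin/sh
# etxtbsy_demo DIR
# Returns 0 if executing a file fails while a write descriptor is held
# on it and succeeds once that descriptor is closed.
etxtbsy_demo() {
    bin=$1/busy_demo
    printf '#!/bin/sh\nexit 0\n' > "$bin" || return 2
    chmod +x "$bin" || return 2

    exec 3>>"$bin"        # hold a write reference, like a leaked open would
    "$bin" 2>/dev/null    # exec is expected to fail with "Text file busy"
    held=$?
    exec 3>&-             # drop the write reference
    "$bin"                # exec is expected to succeed now
    released=$?

    echo "while held: $held, after release: $released"
    [ "$held" -ne 0 ] && [ "$released" -eq 0 ]
}
```

Usage: `etxtbsy_demo /tmp`. The "after release" step mirrors the observation above that the error goes away once the first client unmounts.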
Patches picked to branch