Details
- Bug
- Resolution: Fixed
- Major
- Lustre 1.8.7
- None
- RHEL5/x86_64
- 3
- 4852
Description
mdtest in a shared directory fails with 50 clients and MDTEST_NFILES=1024; it also fails with 100 clients and MDTEST_NFILES=256. The failure is silent: no Lustre errors or other server messages are logged. It is consistent and repeatable, and it occurs only when mdtest is run in shared-directory mode.
Failure example:
000: Command line used: /opt/mdtest-1.8.3/bin/mdtest -d /p/l_wham/white215/hyperion.3867/mdtest -i3 -n1024
000: Path: /p/l_wham/white215/hyperion.3867
000: FS: 87.0 TiB Used FS: 0.0% Inodes: 546.9 Mi Used Inodes: 2.1%
000:
000: 400 tasks, 409600 files/directories
000: 10/14/2011 01:17:15: Process 0(hyperion319): FAILED in create_remove_items_helper, unable to remove directory: No such file or directory
srun: mvapich: 2011-10-14T01:17:15: ABORT from MPI rank 0 [on hyperion319]
000: [0] [MPI Abort by user] Aborting Program!
000: [0:hyperion319] Abort: MPI_Abort() code: 1, rank 0, MPI Abort by user Aborting program ! at line 99 in file mpid_init.c
000: slurmd[hyperion319]: *** STEP 1219936.0 KILLED AT 2011-10-14T01:17:15 WITH SIGNAL 9 ***
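The access pattern behind this failure (many tasks creating and then removing their own entries under one shared parent directory) can be approximated without mdtest or MPI. The following is a minimal sketch, not mdtest's implementation: the function names and the `task<rank>.dir<i>` naming scheme are hypothetical, and it uses threads on a single node, so it only exercises the multi-client case if the same script is started on several clients against the same Lustre directory.

```python
import errno
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def create_remove_items(shared_dir, rank, n_items):
    """Create n_items directories for one task in shared_dir, then remove them.

    Returns a list of (path, errno name) for every removal that failed.
    """
    # Hypothetical naming scheme: unique per task, so tasks never touch
    # each other's entries and every rmdir is expected to succeed.
    names = [os.path.join(shared_dir, f"task{rank}.dir{i}") for i in range(n_items)]
    for name in names:
        os.mkdir(name)

    failures = []
    for name in names:
        try:
            os.rmdir(name)
        except OSError as exc:
            # ENOENT here would match "unable to remove directory:
            # No such file or directory" in the log above.
            failures.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return failures


def run_shared_dir_test(shared_dir, n_tasks, n_items):
    """Run n_tasks concurrent create/remove tasks in one shared directory."""
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        futures = [
            pool.submit(create_remove_items, shared_dir, rank, n_items)
            for rank in range(n_tasks)
        ]
        return [failure for fut in futures for failure in fut.result()]


if __name__ == "__main__":
    # A temporary local directory stands in for the Lustre path here.
    with tempfile.TemporaryDirectory() as shared_dir:
        failures = run_shared_dir_test(shared_dir, n_tasks=8, n_items=256)
        print(f"{len(failures)} removal failures")
        for path, err in failures:
            print(f"  {err}: {path}")
```

On a healthy file system every removal should succeed, since each task only removes directories it created itself. Any ENOENT reported would point to the same symptom as the mdtest abort.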
Issue Links
- Trackbacks
- Lustre 1.8.7-wc1 release testing tracker Lustre 1.8.7-wc1 RC1 Tag: v187WC1RC1 Build:
- Changelog 1.8 version 1.8.7-wc1 Support for networks: socklnd any kernel supported by Lustre, qswlnd Qsnet kernel modules 5.20 and later, openiblnd IbGold 1.8.2, o2iblnd OFED 1.3, 1.4.1, 1.4.2, 1.5.1, 1.5.2, 1.5.3.1 and 1.5.3.2 gmlnd GM 2.1....
- Changelog 2.1 Changes from version 2.1.0 to version 2.1.1 Server support for kernels: 2.6.18-274.12.1.el5 (RHEL5) 2.6.32-220.el6 (RHEL6) Client support for unpatched kernels: 2.6.18-274.12.1.el5 (RHEL5) 2.6.32-220.el6 (RHEL6) 2.6.32.360....
- Changelog 2.2 version 2.2.0 Support for networks: o2iblnd OFED 1.5.4 Server support for kernels: 2.6.32-220.4.2.el6 (RHEL6) Client support for unpatched kernels: 2.6.18-274.18.1.el5 (RHEL5) 2.6.32-220.4.2.el6 (RHEL6) 2.6.32.360....