[LU-1689] Test failure on test suite mmp, subtest test_8 Created: 29/Jul/12 Updated: 04/Jan/13 Resolved: 16/Aug/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.1.3 |
| Fix Version/s: | Lustre 2.3.0, Lustre 2.1.3 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 4494 |
| Description |
|
This issue was created by maloo for Li Wei <liwei@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/33a228d8-d877-11e1-ba66-52540035b04c. The sub-test test_8 failed with the following error:
== mmp test 8: mount during e2fsck =================================================================== 22:27:31 (1343453251)
Running e2fsck on the device /dev/lvm-MDS/P1 on mds1...
Mounting /dev/lvm-MDS/P1 on mds1...
CMD: client-27vm3 e2fsck -fy /dev/lvm-MDS/P1
client-27vm3: e2fsck 1.42.3.wc1 (28-May-2012)
CMD: client-27vm3 mkdir -p /mnt/mds1
CMD: client-27vm3 test -b /dev/lvm-MDS/P1
Starting mds1: -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
CMD: client-27vm3 mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
client-27vm3: mount.lustre: mount /dev/dm-0 at /mnt/mds1 failed: Device or resource busy
Start of /dev/lvm-MDS/P1 on mds1 failed 16
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Setting filetype for entry 'last_rcvd' in / (2) to 1.
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
lustre-MDT0000: ***** FILE SYSTEM WAS MODIFIED *****
lustre-MDT0000: 122/122650624 files (11.5% non-contiguous), 15460665/61316096 blocks
Running e2fsck on the device /dev/lvm-OSS/P1 on ost1...
CMD: client-27vm4 e2fsck -fy /dev/lvm-OSS/P1
client-27vm4: e2fsck 1.42.3.wc1 (28-May-2012)
Mounting /dev/lvm-OSS/P1 on ost1...
CMD: client-27vm4 mkdir -p /mnt/ost1
CMD: client-27vm4 test -b /dev/lvm-OSS/P1
Starting ost1: /dev/lvm-OSS/P1 /mnt/ost1
CMD: client-27vm4 mkdir -p /mnt/ost1; mount -t lustre /dev/lvm-OSS/P1 /mnt/ost1
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
lustre-OST0000: 100/1981440 files (1.0% non-contiguous), 243716/33790976 blocks
CMD: client-27vm4 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/1.4-gcc/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin::/sbin NAME=autotest_config sh rpc.sh set_default_debug \"0x33f0404\" \" 0xffb7e3ff\" 32
CMD: client-27vm4 e2label /dev/lvm-OSS/P1 2>/dev/null
Started lustre-OST0000
mmp test_8: @@@@@@ FAIL: mount /dev/lvm-OSS/P1 on ost1 should fail
Dumping lctl log to /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.*.1343453321.log
CMD: client-27vm3,client-27vm4,client-27vm5,client-27vm6.lab.whamcloud.com /usr/sbin/lctl dk > /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.debug_log.\$(hostname -s).1343453321.log;
dmesg > /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.dmesg.\$(hostname -s).1343453321.log
CMD: client-27vm4 grep -c /mnt/ost1' ' /proc/mounts
Stopping /mnt/ost1 (opts:) on client-27vm4
CMD: client-27vm4 umount -d /mnt/ost1
CMD: client-27vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
mmp test_8: @@@@@@ FAIL: test_8 failed with 2
Dumping lctl log to /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.*.1343453348.log
CMD: client-27vm3,client-27vm4,client-27vm5,client-27vm6.lab.whamcloud.com /usr/sbin/lctl dk > /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.debug_log.\$(hostname -s).1343453348.log;
dmesg > /logdir/test_logs/2012-07-27/lustre-reviews-el6-x86_64-el5-x86_64__7972__-7f9bca970cd0/mmp.test_8.dmesg.\$(hostname -s).1343453348.log
Info required for matching: mmp 8 |
| Comments |
| Comment by Keith Mannthey (Inactive) [ 30/Jul/12 ] |
|
I failed on mmp 8 as well. My code base is master + a small unrelated test change.
CMD: client-26vm3 e2fsck -fy /dev/lvm-MDS/P1
CMD: client-26vm3 e2label /dev/lvm-MDS/P1 2>/dev/null
client-26vm3: e2fsck 1.42.3.wc1 (28-May-2012)
client-26vm3: e2fsck: Cannot continue, aborting.
client-26vm3:
client-26vm3: /dev/lvm-MDS/P1 is in use.
Started lustre-MDT0000
mmp test_8: @@@@@@ FAIL: mount /dev/lvm-MDS/P1 on mds1 should fail
This looks like a similar failure just on the MDS rather than the OST. |
| Comment by Peter Jones [ 02/Aug/12 ] |
|
Minh is going to look into this one |
| Comment by nasf (Inactive) [ 03/Aug/12 ] |
|
Another failure instance: https://maloo.whamcloud.com/test_sets/40afbf52-dd33-11e1-85a8-52540035b04c |
| Comment by Minh Diep [ 03/Aug/12 ] |
|
I haven't been able to reproduce this manually. Perhaps this is related to the VMs we are using in the lab. |
| Comment by Sarah Liu [ 03/Aug/12 ] |
|
Hit this error when running a manual test on the tag-2.2.92 OFED build with physical nodes. |
| Comment by nasf (Inactive) [ 03/Aug/12 ] |
|
Another failure: https://maloo.whamcloud.com/test_sets/7be7c896-ddae-11e1-85a8-52540035b04c |
| Comment by nasf (Inactive) [ 03/Aug/12 ] |
|
Another similar failure: https://maloo.whamcloud.com/test_sets/2a7e6362-dd8f-11e1-85a8-52540035b04c |
| Comment by Minh Diep [ 05/Aug/12 ] |
|
This is a timing issue.
CMD: fat-intel-3vm3 e2fsck -fy /dev/lvm-MDS/P1
The mount command was executed before the e2fsck. Hence, the e2fsck failed instead of the mount command failing as expected. |
| Comment by Jian Yu [ 07/Aug/12 ] |
|
Yes, it's a timing issue. We have to update the script to make sure that the mount operation is really performed during e2fsck. |
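The ordering requirement described in the two comments above can be sketched as follows. This is only an illustration under stated assumptions, not the patch from the review system: `run_during` is a hypothetical helper name, the 1-second start-up delay is arbitrary, and `sleep` and `exit 16` stand in for e2fsck and a mount that fails with status 16 (the status reported in the log in the description). The idea is to confirm that the background command is still running before starting the foreground command, and to report explicitly when the two never overlapped.

```shell
#!/bin/bash
# Hypothetical helper, not taken from the Lustre test scripts.
# Starts bg_cmd in the background, confirms it is still running, and only then
# runs fg_cmd. Returns fg_cmd's exit status, or 99 if bg_cmd had already
# finished, in which case the two commands never overlapped.
run_during() {
    local bg_cmd=$1 fg_cmd=$2 bg_pid fg_rc=0

    bash -c "$bg_cmd" &
    bg_pid=$!

    # Give the background command a moment to start (the 1-second delay is an
    # arbitrary assumption), then check that the process still exists.
    sleep 1
    if ! kill -0 "$bg_pid" 2>/dev/null; then
        wait "$bg_pid" || true
        return 99
    fi

    # Run the foreground command while the background command is still alive.
    bash -c "$fg_cmd" || fg_rc=$?

    # Reap the background command before reporting the foreground status.
    wait "$bg_pid" || true
    return "$fg_rc"
}
```

For example, `run_during "sleep 3" "exit 16"` returns 16, while `run_during "true" "exit 0"` returns 99 because the background command ended before the foreground command could start. A test built this way could treat 99 as "the race was lost, retry" rather than as a pass or a failure of the check itself.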
| Comment by Jinshan Xiong (Inactive) [ 08/Aug/12 ] |
|
This ticket is blocking Maloo tests quite often. I've hit it 3 times in my testing. |
| Comment by Minh Diep [ 11/Aug/12 ] |
| Comment by Jian Yu [ 13/Aug/12 ] |
|
Another instance on Lustre 2.1.3 RC1: |
| Comment by Peter Jones [ 13/Aug/12 ] |
|
Yujian, Minh is on vacation this week, so could you please take care of revising the latest fix for this issue? Thanks, Peter |
| Comment by Jian Yu [ 14/Aug/12 ] |
|
The patch in http://review.whamcloud.com/#change,3569 was updated. |
| Comment by Zhenyu Xu [ 14/Aug/12 ] |
|
When the patch passes testing, please port it to b2_1 as well; the b2_1 branch autotest also suffers from the same issue. |
| Comment by Jian Yu [ 14/Aug/12 ] |
|
Patch for b2_1 branch: http://review.whamcloud.com/#change,3643 |
| Comment by Peter Jones [ 16/Aug/12 ] |
|
Landed for 2.3 |
| Comment by Jian Yu [ 21/Aug/12 ] |
|
The issue still occurred: The new patch is in |