[LU-544] 1.8<->2.1 interop: conf-sanity test 56: mkfs.lustre FAIL: Journal size too big for filesystem. Created: 28/Jul/11 Updated: 29/Aug/11 Resolved: 29/Aug/11 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.7 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre Clients: Lustre Servers: |
||
| Severity: | 3 |
| Rank (Obsolete): | 4915 |
| Description |
|
conf-sanity test 56 failed as follows:
== test 56: check big indexes == 10:16:05
fat-amd-1-ib:
fat-amd-1-ib: mkfs.lustre FATAL: Unable to build fs /dev/sdb5 (256)
fat-amd-1-ib:
fat-amd-1-ib: mkfs.lustre FATAL: mkfs failed 256
Permanent disk data:
Target: lustre-MDTffff
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x75
(MDT MGS needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr,acl
Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0
device size = 19085MB
formatting backing filesystem ldiskfs on /dev/sdb5
target name lustre-MDTffff
4k blocks 10000
options -J size=16 -I 512 -i 2048 -q -O dirdata,uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre-MDTffff -J size=16 -I 512 -i 2048 -q -O dirdata,uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F /dev/sdb5 10000
Journal size too big for filesystem.
Permanent disk data:
Target: lustre-OST03e8
Index: 1000
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.4.132@o2ib sys.timeout=20
device size = 19085MB
formatting backing filesystem ldiskfs on /dev/sdb5
target name lustre-OST03e8
4k blocks 10000
options -I 256 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre-OST03e8 -I 256 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E lazy_journal_init -F /dev/sdb5 10000
Writing CONFIGS/mountdata
Permanent disk data:
Target: lustre-OST2710
Index: 10000
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x62
(OST first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.168.4.132@o2ib sys.timeout=20
device size = 19085MB
formatting backing filesystem ldiskfs on /dev/sdb6
target name lustre-OST2710
4k blocks 10000
options -I 256 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre-OST2710 -I 256 -q -O extents,uninit_bg,dir_nlink,huge_file,flex_bg -G 256 -E lazy_journal_init -F /dev/sdb6 10000
Writing CONFIGS/mountdata
start mds service on fat-amd-1-ib
Starting mds: -o user_xattr,acl /dev/sdb5 /mnt/mds
fat-amd-1-ib: mount.lustre: mount /dev/sdb5 at /mnt/mds failed: Invalid argument
fat-amd-1-ib: This may have multiple causes.
fat-amd-1-ib: Are the mount options correct?
fat-amd-1-ib: Check the syslog for more info.
mount -t lustre /dev/sdb5 /mnt/mds
Start of /dev/sdb5 on mds failed 22
start ost1 service on fat-amd-2-ib
Starting ost1: /dev/sdb5 /mnt/ost1
fat-amd-2-ib: mount.lustre: mount /dev/sdb5 at /mnt/ost1 failed: Input/output error
fat-amd-2-ib: Is the MGS running?
mount -t lustre /dev/sdb5 /mnt/ost1
Start of /dev/sdb5 on ost1 failed 5
start ost2 service on fat-amd-2-ib
Starting ost2: /dev/sdb6 /mnt/ost2
fat-amd-2-ib: mount.lustre: mount /dev/sdb6 at /mnt/ost2 failed: Input/output error
fat-amd-2-ib: Is the MGS running?
mount -t lustre /dev/sdb6 /mnt/ost2
Start of /dev/sdb6 on ost2 failed 5
conf-sanity test_56: @@@@@@ FAIL: Unable to start second ost
Dumping lctl log to /home/yujian/test_logs/2011-07-27/072321/conf-sanity.test_56.*.1311787010.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/072321/conf-sanity-1311787010.tar.bz2
mount lustre on /mnt/lustre.....
Starting client: fat-amd-3-ib: -o user_xattr,acl,flock fat-amd-1-ib@o2ib:/lustre /mnt/lustre
mount.lustre: mount fat-amd-1-ib@o2ib:/lustre at /mnt/lustre failed: Cannot send after transport endpoint shutdown
conf-sanity test_56: @@@@@@ FAIL: Unable to mount client
Dumping lctl log to /home/yujian/test_logs/2011-07-27/072321/conf-sanity.test_56.*.1311787075.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/072321/conf-sanity-1311787075.tar.bz2
ioctl on /mnt/lustre for getting connect flags failed: Inappropriate ioctl for device (25)
conf-sanity test_56: @@@@@@ FAIL: quotacheck has failed
Dumping lctl log to /home/yujian/test_logs/2011-07-27/072321/conf-sanity.test_56.*.1311787107.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/072321/conf-sanity-1311787107.tar.bz2
Stopping clients: client-12-ib,client-13-ib,fat-amd-3-ib /mnt/lustre (opts:)
Stopping clients: client-12-ib,client-13-ib,fat-amd-3-ib /mnt/lustre2 (opts:)
Stopping clients: client-12-ib,client-13-ib,fat-amd-3-ib /mnt/lustre (opts:)
Stopping clients: client-12-ib,client-13-ib,fat-amd-3-ib /mnt/lustre2 (opts:)
Formatting mgs, mds, osts
Resetting fail_loc on all nodes...done.
FAIL (192s)
Maloo report: https://maloo.whamcloud.com/test_sets/cddb6528-b8c4-11e0-8bdf-52540025f9af |
| Comments |
| Comment by Oleg Drokin [ 28/Jul/11 ] |
|
I am not sure what interop problem you suspect here. Looking at the mkfs output and command line, we run mkfs on 10000 4k blocks (about 40 MB) and request a 16 MB journal, which is of course too big. |
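The arithmetic behind that observation can be checked with a few lines of shell. This is only a sketch using the values from the mkfs_cmd line in the log; the helper name journal_blocks is made up for illustration, and the calculation ignores the space taken by inode tables and other metadata.

```shell
#!/bin/bash
# Values taken from the mkfs_cmd line in the log above.
blocks=10000        # filesystem size in 4k blocks
block_size=4096     # bytes per block (-b 4096)
journal_mb=16       # requested journal size (-J size=16)

# Hypothetical helper: convert a journal size in MB to filesystem blocks.
journal_blocks() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

fs_bytes=$(( blocks * block_size ))
jblocks=$(journal_blocks "$journal_mb" "$block_size")

echo "filesystem: ${fs_bytes} bytes in ${blocks} blocks"
echo "journal:    ${jblocks} blocks, $(( jblocks * 100 / blocks ))% of the filesystem"
```

With these inputs the journal alone would need 4096 of the 10000 blocks, about 40% of the device, before any inode tables are allocated.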
| Comment by Andreas Dilger [ 29/Jul/11 ] |
|
It is probably unhappy with the 2.x mkfs.lustre because the MDS has a higher inode ratio than in 1.8, so more space in the filesystem is consumed by inodes (about 1/3 of all blocks, compared to only 1/8). I don't think this is a real defect, however, since the default mkfs.lustre code should handle this properly. One option is to use a smaller journal, "-J size=8" instead of "-J size=16". It seems a large journal is needed for this test, because otherwise the large OST index causes problems by producing a large transaction size (see https://bugzilla.lustre.org/show_bug.cgi?id=17931). My calculations show that an 8MB journal should be enough for about 16000 stripes. |
| Comment by Jian Yu [ 29/Jul/11 ] |
|
After looking into the conf-sanity.sh test script on b1_8 and master branches, I found on master branch the MDSSIZE and OSTSIZE were specified as:
# use small MDS + OST size to speed formatting time
# do not use too small MDSSIZE/OSTSIZE, which affect the default jouranl size
MDSSIZE=200000
OSTSIZE=200000
While on b1_8 branch, they were:
# use small MDS + OST size to speed formatting time
MDSSIZE=40000
OSTSIZE=40000
And for test 56, --mkfsoptions='\"-J size=16\"' was used on both the branches to reformat the MDT. The change of the size on master branch was made by the following commit:
commit 234209056398f32c1f1ac301bfc3d1bee9582730
Author: Fan Yong <Yong.Fan@sun.com>
Date: Tue Apr 27 10:51:27 2010 +0800
b=22614 enlarge MDSSIZE/OSTSIZE to increase default journal size for conf-sanity
1) enlarge MDSSIZE/OSTSIZE to increase default journal size for conf-sanity
2) journal handler error process in lustre_commit_dquot
i=robert.read
i=landen
diff --git a/lustre/tests/conf-sanity.sh b/lustre/tests/conf-sanity.sh
index fbe96bb..448aa09 100644
--- a/lustre/tests/conf-sanity.sh
+++ b/lustre/tests/conf-sanity.sh
@@ -32,8 +32,9 @@ if [ -n "$MDSSIZE" ]; then
STORED_MDSSIZE=$MDSSIZE
fi
# use small MDS + OST size to speed formatting time
-MDSSIZE=40000
-OSTSIZE=40000
+# do not use too small MDSSIZE/OSTSIZE, which affect the default jouranl size
+MDSSIZE=200000
+OSTSIZE=200000
So, could I just increase the value of MDSSIZE and OSTSIZE on b1_8 branch to fix the issue in this ticket? |
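To compare the two settings, the sketch below converts each MDSSIZE value to 4k blocks and shows what share a 16MB journal would take. It assumes MDSSIZE is given in kilobytes, which is consistent with MDSSIZE=40000 producing the "4k blocks 10000" line in the failing log; the function names are hypothetical and are not part of conf-sanity.sh.

```shell
#!/bin/bash
# Assumption: MDSSIZE/OSTSIZE are in kilobytes, so 40000 KB corresponds to
# the "4k blocks 10000" reported in the failing log.
size_kb_to_4k_blocks() {
    echo $(( $1 / 4 ))
}

# Share of the filesystem (percent, rounded down) taken by a journal of
# the given size in MB.
journal_share_pct() {
    local size_kb=$1 journal_mb=$2
    echo $(( journal_mb * 1024 * 100 / size_kb ))
}

for size_kb in 40000 200000; do
    echo "MDSSIZE=${size_kb}: $(size_kb_to_4k_blocks "$size_kb") blocks," \
         "16MB journal = $(journal_share_pct "$size_kb" 16)%"
done
```

Under that assumption the b1_8 value gives 10000 blocks with the journal at about 40% of the device, while the master value gives 50000 blocks with the journal at about 8%.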
| Comment by Andreas Dilger [ 29/Jul/11 ] |
|
I would prefer to reduce the journal size to 8MB rather than increase the OST/MDT filesystem size, so long as the tests are still passing. We could probably also speed up mke2fs further by passing "-E lazy_itable_init", which is OK for temp filesystems like conf-sanity. |
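As a rough illustration of that suggestion, the sketch below prints an mke2fs command line combining the smaller journal with lazy inode table initialization. The helper scratch_mkfs_cmd and the image path are hypothetical, and this is not the content of the patches referenced in this ticket; it only prints the command (a dry run) rather than formatting anything.

```shell
#!/bin/bash
# Hypothetical helper: print the mke2fs command for a scratch test target.
# Usage: scratch_mkfs_cmd <device-or-image> <4k-blocks> <journal-MB>
scratch_mkfs_cmd() {
    local dev=$1 blocks=$2 journal_mb=$3
    # -J size=N            journal size in megabytes
    # -E lazy_itable_init  defer inode table zeroing, acceptable for
    #                      throwaway test filesystems
    echo "mke2fs -j -b 4096 -J size=${journal_mb} -E lazy_itable_init -q -F ${dev} ${blocks}"
}

# Dry run with the block count from the failing log and an 8MB journal.
scratch_mkfs_cmd /tmp/scratch.img 10000 8
```

An 8MB journal corresponds to 2048 4k blocks, about 20% of a 10000-block device instead of about 40%.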
| Comment by Jian Yu [ 01/Aug/11 ] |
|
Patch for b1_8 branch is in: http://review.whamcloud.com/1171. |
| Comment by Build Master (Inactive) [ 09/Aug/11 ] |
|
Integrated in Johann Lombardi : da3ed4dbc862a47b791a027f2be63236bc69ebae
|
| Comment by Jian Yu [ 11/Aug/11 ] |
Since the MDSSIZE defined in conf-sanity.sh on master branch is large enough for using '-J size=16' in conf-sanity test 56, I did not reduce the journal size to 8MB while creating the patch for master branch. I found that passing "-E lazy_itable_init" to mke2fs to speed up Lustre filesystem formatting time in the whole conf-sanity test suite is more reasonable than just speeding up test 56, so I made a patch for master branch as follows: http://review.whamcloud.com/1210. Please review. |
| Comment by Jian Yu [ 12/Aug/11 ] |
|
A similar patch for the b1_8 branch is in http://review.whamcloud.com/1223. Please review. |
| Comment by Build Master (Inactive) [ 15/Aug/11 ] |
|
Integrated in Oleg Drokin : 3d3e2a1674b943821236c430f9494c42b2c60de6
|
| Comment by Build Master (Inactive) [ 29/Aug/11 ] |
|
Integrated in Johann Lombardi : 33516d58ff7694b1bc35ba5348b3e76caf8945ee
|
| Comment by Jian Yu [ 29/Aug/11 ] |
|
Patch has been merged on b1_8 and master branches. The issue was fixed. |