[LU-12406] conf-sanity test 111 fails with ‘add mds1 failed with new params’ Created: 07/Jun/19 Updated: 05/Dec/22 Resolved: 05/Dec/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
conf-sanity test_111 fails with ‘add mds1 failed with new params’. Looking at the client test_log, we see that the problem is with the --mkfsoptions flag in the mkfs command:
CMD: trevis-18vm11 mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=2400000 --mkfsoptions=\"-O large_dir -i 1048576 -O ea_inode -E lazy_itable_init\" --reformat /dev/mapper/mds1_flakey
trevis-18vm11: mkfs.lustre: don't specify multiple -O options
trevis-18vm11:
trevis-18vm11: mkfs.lustre FATAL: mkfs failed 22
trevis-18vm11: mkfs.lustre: exiting with 22 (Invalid argument)
Permanent disk data:
Target: lustre:MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x65
(MDT MGS first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: sys.timeout=20 mdt.identity_upcall=/usr/sbin/l_getidentity
device size = 2048MB
conf-sanity test_111: @@@@@@ FAIL: add mds1 failed with new params
It looks like mkfs.lustre does not accept multiple ‘-O’ flags in --mkfsoptions. This may be an issue with how the test forms the --mkfsoptions list. Looking in the client test_log, we also see that test_115 suffers from the same issue, but there the test is skipped instead of being reported as a failure:
== conf-sanity test 115: Access large xattr with inodes number over 2TB ============================== 02:17:13 (1559787433)
Stopping clients: trevis-18vm4.trevis.whamcloud.com,trevis-18vm5 /mnt/lustre (opts:)
CMD: trevis-18vm4.trevis.whamcloud.com,trevis-18vm5 running=\$(grep -c /mnt/lustre' ' /proc/mounts);
if [ \$running -ne 0 ] ; then
echo Stopping client \$(hostname) /mnt/lustre opts:;
…
CMD: trevis-18vm11 mkfs.lustre --mgsnode=trevis-18vm11@tcp --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions=\"-O ea_inode -E lazy_itable_init\" --device-size=3298534883328 --mkfsoptions='-O lazy_itable_init,ea_inode,^resize_inode,meta_bg -i 1024' --mgs --reformat /dev/loop0
trevis-18vm11:
trevis-18vm11: mkfs.lustre FATAL: Unable to build fs /dev/loop0 (256)
trevis-18vm11:
trevis-18vm11: mkfs.lustre FATAL: mkfs failed 256
Permanent disk data:
Target: lustre:MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x65
(MDT MGS first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=10.9.4.221@tcp sys.timeout=20 mdt.identity_upcall=/usr/sbin/l_getidentity
device size = 3145728MB
formatting backing filesystem ldiskfs on /dev/loop0
target name lustre:MDT0000
kilobytes 3221225472
options -i 1024 -J size=4096 -I 1024 -q -O lazy_itable_init,ea_inode,^resize_inode,meta_bg,dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000 -i 1024 -J size=4096 -I 1024 -q -O lazy_itable_init,ea_inode,^resize_inode,meta_bg,dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/loop0 3221225472k
Invalid filesystem option set: lazy_itable_init,ea_inode,^resize_inode,meta_bg,dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg
SKIP: conf-sanity test_115 format large MDT failed
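As a rough illustration of what folding the ‘-O’ flags together could look like for the test_111 option string above, here is a minimal shell sketch. The function name merge_O_opts is hypothetical and is not part of test-framework.sh or mkfs.lustre. It assumes the options are whitespace-separated and that each ‘-O’ is followed by its feature list as a separate word; it does not handle a fused form such as ‘-Ofeature’.

```shell
# Hypothetical helper (not from the Lustre tree): fold every "-O <list>"
# pair in an option string into a single "-O a,b,c". All other options
# keep their relative order and are placed after the merged -O.
merge_O_opts() {
    local features="" rest="" take_next=0 word
    for word in $1; do
        if [ "$take_next" -eq 1 ]; then
            # This word is the feature list that followed a -O flag.
            features="${features:+$features,}$word"
            take_next=0
        elif [ "$word" = "-O" ]; then
            take_next=1
        else
            rest="${rest:+$rest }$word"
        fi
    done
    if [ -n "$features" ]; then
        rest="-O $features${rest:+ $rest}"
    fi
    printf '%s\n' "$rest"
}

merge_O_opts "-O large_dir -i 1048576 -O ea_inode -E lazy_itable_init"
# prints: -O large_dir,ea_inode -i 1048576 -E lazy_itable_init
```

The merged string carries a single ‘-O’, which is the form the "don't specify multiple -O options" check asks for. Whether the merge belongs in the test, in the test framework, or in mkfs.lustre itself is a separate question.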
It looks like this started failing on 2019-06-03 with Lustre 2.12.53.104. Logs for failed test sessions with this test failure are at |
| Comments |
| Comment by Jian Yu [ 15/Aug/19 ] |
|
+1 on master branch: https://testing.whamcloud.com/test_sets/496cea80-bf2a-11e9-a2b6-52540065bddc |
| Comment by Andreas Dilger [ 30/Aug/19 ] |
|
This seems very similar to I'm not totally satisfied with that solution, since it only fixes the test-framework to merge multiple options and not mkfs.lustre itself. That said, maybe a similar fix is needed for this test as well? I see this test failing a lot on Oleg's test hardware, so it would be nice to clear that up, even if the solution is sub-optimal. |
| Comment by Andreas Dilger [ 30/Aug/19 ] |
|
It looks like this problem is at least avoided by patch https://review.whamcloud.com/35358. While that doesn't completely solve the problem (it would still be an error if large_dir was removed from fs_mkfs_opts), it would essentially never be hit again and this ticket could be closed as "Won't fix". |
| Comment by Andreas Dilger [ 30/Aug/19 ] |
|
Note that the 35482 patch doesn't fix this because it is removing duplicates inside mkfs_opts(), while test_111() is adding them afterward. |
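The ordering problem described here can be sketched in shell. Everything below is illustrative: count_O_flags is a hypothetical helper, and the split of the option string into an "already merged" part and an "appended afterward" part is an assumption made for the example, not taken from the actual test code.

```shell
# Hypothetical helper: count how many separate -O flags appear in a
# whitespace-separated option string.
count_O_flags() {
    local n=0 word
    for word in $1; do
        if [ "$word" = "-O" ]; then
            n=$((n + 1))
        fi
    done
    printf '%s\n' "$n"
}

# Assumed stand-in for an option string that has already been de-duplicated.
merged="-O large_dir -i 1048576"
# Assumed stand-in for options appended after de-duplication has run.
final="$merged -O ea_inode -E lazy_itable_init"

count_O_flags "$merged"   # prints: 1
count_O_flags "$final"    # prints: 2
```

Because the second ‘-O’ is added after the merge step has finished, the final string still trips the multiple ‘-O’ check.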
| Comment by Andreas Dilger [ 05/Dec/22 ] |
|
conf-sanity test_111 passing regularly on master, when not skipped because of SLOW=no |