Details
Bug
Resolution: Unresolved
Minor
None
Lustre 2.12.0
3
9223372036854775807
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/0a61138c-4746-11e8-960d-52540065bddc
test_66 failed with the following error:
start mgsmds failed
env:
server: 2.10.3
client: master tag-2.11.51
MDS console
[24309.817277] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == conf-sanity test 66: replace nids ================================================================= 04:18:51 \(1524457131\)
[24310.002846] Lustre: DEBUG MARKER: == conf-sanity test 66: replace nids ================================================================= 04:18:51 (1524457131)
[24310.158582] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null ||
[24310.158582] /usr/sbin/lctl lustre_build_version 2>/dev/null ||
[24310.158582] /usr/sbin/lctl --version 2>/dev/null | cut -d' ' -f2
[24310.481269] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
[24310.783544] Lustre: DEBUG MARKER: modprobe dm-flakey;
[24310.783544] dmsetup targets | grep -q flakey
[24311.084083] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
[24311.378500] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
[24311.675770] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
[24311.969396] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
[24312.264602] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1
[24312.433912] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[24312.674767] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[24312.979662] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
[24313.575096] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24313.575831] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24313.763080] Lustre: DEBUG MARKER: trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24313.767795] Lustre: DEBUG MARKER: trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24313.931273] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
[24314.233192] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]\{3}[0-9]\{4}'
[24314.532763] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]\{3}[0-9]\{4}'
[24314.838122] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
[24315.139638] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
[24317.489027] Lustre: MGS: Regenerating lustre-OST0000 log by user request: rc = 0
[24318.567869] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-12vm11.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24318.757830] Lustre: DEBUG MARKER: trevis-12vm11.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24326.565614] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids
[24326.865961] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24327.169266] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre-OST0000.osc.active='0'
[24327.321799] Lustre: Permanently deactivating lustre-OST0000
[24327.479666] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24327.778343] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24329.076476] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24330.379271] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24331.690338] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24332.996291] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24334.302332] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24335.602968] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24336.559906] Lustre: setting import lustre-OST0000_UUID INACTIVE by administrator request
[24336.909924] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.lustre-OST0000-osc-MDT0000.active
[24337.220621] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000 10.9.4.143@tcp
[24337.370502] LustreError: 7981:0:(mgs_llog.c:1460:mgs_replace_nids()) Only MGS is allowed to be started
[24337.372748] LustreError: 7981:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -115
[24337.811166] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000 10.9.4.143@tcp
[24337.971349] LustreError: 8054:0:(mgs_llog.c:1460:mgs_replace_nids()) Only MGS is allowed to be started
[24337.973676] LustreError: 8054:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -115
[24345.426539] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000 10.9.4.143@tcp
[24345.578944] LustreError: 8127:0:(mgs_llog.c:1460:mgs_replace_nids()) Only MGS is allowed to be started
[24345.581284] LustreError: 8127:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -115
[24345.743910] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
[24346.044583] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
[24358.425214] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
[24358.425214] lctl dl | grep ' ST ' || true
[24358.731570] Lustre: DEBUG MARKER: modprobe dm-flakey;
[24358.731570] dmsetup targets | grep -q flakey
[24359.064984] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
[24359.376549] Lustre: DEBUG MARKER: modprobe dm-flakey;
[24359.376549] dmsetup targets | grep -q flakey
[24359.674059] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
[24359.975842] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
[24360.272450] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
[24360.568454] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
[24360.862648] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o nosvc /dev/mapper/mds1_flakey /mnt/lustre-mds1
[24361.035392] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[24361.209524] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[24361.511499] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
[24362.074864] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24362.077336] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24362.260453] Lustre: DEBUG MARKER: trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24362.261203] Lustre: DEBUG MARKER: trevis-12vm12.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[24362.428567] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
[24362.735065] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
[24363.034376] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
[24363.329492] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000
[24363.623910] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-WRONG0000 10.9.4.143@tcp
[24363.772100] LustreError: 9837:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -22
[24363.922467] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000 wrong nids list
[24364.221357] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-OST0000 10.9.4.143@tcp
[24364.521716] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-MDT0000
[24364.818065] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-MDT0000 wrong nids list
[24365.116828] Lustre: DEBUG MARKER: /usr/sbin/lctl replace_nids lustre-MDT0000 10.9.4.144@tcp
[24365.454742] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
[24365.755903] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
[24372.077997] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
[24372.077997] lctl dl | grep ' ST ' || true
[24372.401400] Lustre: DEBUG MARKER: modprobe dm-flakey;
[24372.401400] dmsetup targets | grep -q flakey
[24372.747264] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
[24373.052891] Lustre: DEBUG MARKER: modprobe dm-flakey;
[24373.052891] dmsetup targets | grep -q flakey
[24373.353748] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
[24373.650075] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
[24373.949670] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
[24374.246051] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
[24374.540157] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1
[24374.710196] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[24374.773860] LustreError: 11091:0:(obd_config.c:1229:class_process_config()) no device for: lustre-OST0000-osc-MDT0000
[24374.776291] LustreError: 11091:0:(obd_config.c:1682:class_config_llog_handler()) MGC10.9.4.144@tcp: cfg command failed: rc = -22
[24374.780108] Lustre: cmd=cf00f 0:lustre-OST0000-osc-MDT0000 1:osc.active=0
[24374.780108]
[24374.783827] LustreError: 15b-f: MGC10.9.4.144@tcp: The configuration from log 'lustre-MDT0000'failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
[24374.788357] LustreError: 11047:0:(obd_mount_server.c:1373:server_start_targets()) failed to start server lustre-MDT0000: -22
[24374.790899] LustreError: 11047:0:(obd_mount_server.c:1866:server_fill_super()) Unable to start targets: -22
[24374.793252] Lustre: Failing over lustre-MDT0000
[24380.850375] LustreError: 11047:0:(obd_mount.c:1506:lustre_fill_super()) Unable to mount (-22)
[24381.005775] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
[24381.380273] Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_66: @@@@@@ FAIL: start mgsmds failed
[24381.559591] Lustre: DEBUG MARKER: conf-sanity test_66: @@@@@@ FAIL: start mgsmds failed
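For triage, the return codes in the console log above can be tallied and decoded mechanically. The following is a minimal sketch, not part of the Lustre test framework: the helper name `summarize_rc` and the `ERRNO_NAMES` table are hypothetical, the table assumes Linux errno numbering (22 is EINVAL, 115 is EINPROGRESS), and the sample lines are copied from the log above. It only handles LustreError lines that end in the `rc = -N` form, so messages such as "Unable to mount (-22)" are skipped.

```python
import re
from collections import Counter

# Assumption: Linux errno numbering for the two codes seen in this log.
ERRNO_NAMES = {22: "EINVAL", 115: "EINPROGRESS"}

# Matches e.g.
# "LustreError: 7981:0:(mgs_handler.c:1085:mgs_iocontrol()) ... rc = -115"
RC_RE = re.compile(r"LustreError: .*?\((\w+\.c):\d+:(\w+)\(\)\).*?rc = (-\d+)")


def summarize_rc(lines):
    """Count (function, errno name) pairs for LustreError lines with 'rc = -N'."""
    counts = Counter()
    for line in lines:
        m = RC_RE.search(line)
        if m:
            code = -int(m.group(3))
            # Fall back to the bare number when the code is not in the table.
            counts[(m.group(2), ERRNO_NAMES.get(code, str(code)))] += 1
    return counts


# Sample lines taken verbatim from the MDS console log above.
sample = [
    "[24337.372748] LustreError: 7981:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -115",
    "[24363.772100] LustreError: 9837:0:(mgs_handler.c:1085:mgs_iocontrol()) MGS: error replacing nids: rc = -22",
    "[24374.776291] LustreError: 11091:0:(obd_config.c:1682:class_config_llog_handler()) MGC10.9.4.144@tcp: cfg command failed: rc = -22",
]

for (func, name), n in sorted(summarize_rc(sample).items()):
    print(f"{func}: {name} x{n}")
```

Run against the full console log, this separates the -115 returns from `replace_nids` issued while the MDT was still running from the -22 returns seen with the invalid target name and during the final config log processing.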
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_66 - start mgsmds failed