Details
-
Bug
-
Resolution: Won't Fix
-
Major
-
before upgrade: lustre-master #3226 RHEL6.7
after upgrade: lustre-b2_5_fe #62 RHEL6.6
-
3
Description
1. Upgrade the system from 2.5.5 (RHEL6.6) to master (RHEL6.7): PASS
2. Downgrade the system from master (RHEL6.7) to 2.5.5 (RHEL6.6): FAIL
Mounting the MDS failed after the downgrade:
Lustre: DEBUG MARKER: == upgrade-downgrade End == 15:01:41 (1447110101)
LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: MGC10.2.4.47@tcp: Connection restored to MGS (at 0@lo)
Lustre: lustre-MDT0000: used disk, loading
LustreError: 12684:0:(mdt_recovery.c:263:mdt_server_data_init()) lustre-MDT0000: unsupported incompat filesystem feature(s) 400
LustreError: 12684:0:(obd_config.c:572:class_setup()) setup lustre-MDT0000 failed (-22)
LustreError: 12684:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.2.4.47@tcp: cfg command failed: rc = -22
Lustre: cmd=cf003 0:lustre-MDT0000 1:lustre-MDT0000_UUID 2:0 3:lustre-MDT0000-mdtlov 4:f
LustreError: 15b-f: MGC10.2.4.47@tcp: The configuration from log 'lustre-MDT0000'failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
LustreError: 15c-8: MGC10.2.4.47@tcp: The configuration from log 'lustre-MDT0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 12589:0:(obd_mount_server.c:1254:server_start_targets()) failed to start server lustre-MDT0000: -22
LustreError: 12589:0:(obd_mount_server.c:1737:server_fill_super()) Unable to start targets: -22
LustreError: 12589:0:(obd_mount_server.c:847:lustre_disconnect_lwp()) lustre-MDT0000-lwp-MDT0000: Can't end config log lustre-client.
LustreError: 12589:0:(obd_mount_server.c:1422:server_put_super()) lustre-MDT0000: failed to disconnect lwp. (rc=-2)
LustreError: 12589:0:(obd_config.c:619:class_cleanup()) Device 5 not setup
Lustre: 12589:0:(client.c:1943:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1447110105/real 1447110105] req@ffff8808352bac00 x1517404919169064/t0(0) o251->MGC10.2.4.47@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1447110111 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: server umount lustre-MDT0000 complete
LustreError: 12589:0:(obd_mount.c:1330:lustre_fill_super()) Unable to mount (-22)
Lustre: DEBUG MARKER: Using TIMEOUT=100
Lustre: DEBUG MARKER: upgrade-downgrade : @@@@@@ FAIL: NAME=ncli not mounted
LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: MGC10.2.4.47@tcp: Connection restored to MGS (at 0@lo)
Lustre: lustre-MDT0000: used disk, loading
LustreError: 13112:0:(mdt_recovery.c:263:mdt_server_data_init()) lustre-MDT0000: unsupported incompat filesystem feature(s) 400
LustreError: 13112:0:(obd_config.c:572:class_setup()) setup lustre-MDT0000 failed (-22)
LustreError: 13112:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.2.4.47@tcp: cfg command failed: rc = -22
Lustre: cmd=cf003 0:lustre-MDT0000 1:lustre-MDT0000_UUID 2:0 3:lustre-MDT0000-mdtlov 4:f
LustreError: 15b-f: MGC10.2.4.47@tcp: The configuration from log 'lustre-MDT0000'failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
LustreError: 15c-8: MGC10.2.4.47@tcp: The configuration from log 'lustre-MDT0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 13025:0:(obd_mount_server.c:1254:server_start_targets()) failed to start server lustre-MDT0000: -22
LustreError: 13025:0:(obd_mount_server.c:1737:server_fill_super()) Unable to start targets: -22
LustreError: 13025:0:(obd_mount_server.c:847:lustre_disconnect_lwp()) lustre-MDT0000-lwp-MDT0000: Can't end config log lustre-client.
LustreError: 13025:0:(obd_mount_server.c:1422:server_put_super()) lustre-MDT0000: failed to disconnect lwp. (rc=-2)
LustreError: 13025:0:(obd_config.c:619:class_cleanup()) Device 5 not setup
Lustre: 13025:0:(client.c:1943:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1447110256/real 1447110256] req@ffff88081d67dc00 x1517404919169104/t0(0) o251->MGC10.2.4.47@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1447110262 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: server umount lustre-MDT0000 complete
LustreError: 13025:0:(obd_mount.c:1330:lustre_fill_super()) Unable to mount (-22)
Lustre: DEBUG MARKER: Using TIMEOUT=100
Lustre: DEBUG MARKER: upgrade-downgrade : @@@@@@ FAIL: NAME=ncli not mounted
[root@onyx-25 ~]#
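The first error in each mount attempt is "unsupported incompat filesystem feature(s) 400", and the later -22 (EINVAL) failures follow from it. The check behind such a message can be illustrated with a minimal Python sketch. It assumes that the value 400 is printed in hexadecimal and that the server refuses any on-disk incompat bit outside the set it knows. The mask SUPPORTED_INCOMPAT_EXAMPLE and both function names are hypothetical and are not taken from the Lustre source.

```python
import re

# Hypothetical mask of incompat bits an older server understands.
# Illustrative only; the real flag values are defined in the Lustre source.
SUPPORTED_INCOMPAT_EXAMPLE = 0x3FF

# Matches the message text seen in the log above.
_FEATURE_RE = re.compile(
    r"unsupported incompat filesystem feature\(s\) ([0-9a-fA-F]+)"
)


def parse_unsupported_mask(log_line):
    """Return the reported feature mask as an int, or None if absent.

    Assumption: the number in the message is hexadecimal.
    """
    match = _FEATURE_RE.search(log_line)
    return int(match.group(1), 16) if match else None


def unsupported_bits(on_disk, supported=SUPPORTED_INCOMPAT_EXAMPLE):
    """Return the on-disk incompat bits that are not in the supported set."""
    return on_disk & ~supported


line = ("LustreError: 12684:0:(mdt_recovery.c:263:mdt_server_data_init()) "
        "lustre-MDT0000: unsupported incompat filesystem feature(s) 400")
mask = parse_unsupported_mask(line)
print(hex(mask))                           # mask reported by the older server
print(hex(unsupported_bits(0x401)))        # known bit 0x1 plus unknown bit 0x400
```

Under these assumptions, a flag written to disk while the newer release was running persists across the downgrade, so the older server keeps rejecting the target on every mount attempt, which matches the two identical failures in the log.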
The complete test case is:
1. Format and set up the system with 1 MDS (1 MDT), 1 OSS (1 OST) and 2 clients running Lustre 2.5.5 on RHEL6.6; create some data.
2. Shut down the whole system and umount all nodes.
3. Upgrade the whole system to b2_8/build #8; wipe only the boot disk and leave the data disk untouched.
4. Remount the whole system and check the data; everything works fine.
5. Shut down the whole system again and umount all nodes.
6. As an additional step, remount the MDS with the abort_recovery option.
7. Umount the MDS again.
8. Downgrade all servers and clients to 2.5.5 again without touching the data disk.
9. Mounting the MDS fails as shown above.
Please see the attached files for more logs. 'before' means before the downgrade; 'after' means after the downgrade.