Details
- Bug
- Resolution: Cannot Reproduce
- Major
- None
- None
- 3
- 14209
Description
A production Lustre cluster, "porter", was upgraded from 2.4.0-28chaos to lustre-2.4.2-11chaos today. The OSTs now will not start.
# porter1 /root > /etc/init.d/lustre start
Stopping snmpd: [ OK ]
Shutting down cerebrod: [ OK ]
Mounting porter1/lse-ost0 on /mnt/lustre/local/lse-OST0001
mount.lustre: mount porter1/lse-ost0 at /mnt/lustre/local/lse-OST0001 failed: Input/output error
Is the MGS running?
# porter1 /root >
Lustre: Lustre: Build Version: 2.4.2-11chaos-11chaos--PRISTINE-2.6.32-431.17.2.1chaos.ch5.2.x86_64
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.115.67@o2ib10 (no target)
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.38@o2ib7 (no target)
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.101@o2ib7 (no target)
LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff881026873800 x1470103660003336/t0(0) o253->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 4768/4768 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 5426:0:(obd_mount_server.c:1140:server_register_target()) lse-OST0001: error registering with the MGS: rc = -5 (not fatal)
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.116.205@o2ib5 (no target)
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.114.162@o2ib5 (no target)
LustreError: Skipped 19 previous similar messages
LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff881026873800 x1470103660003340/t0(0) o101->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.162@o2ib7 (no target)
LustreError: Skipped 23 previous similar messages
LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff881026873800 x1470103660003344/t0(0) o101->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
LustreError: 15c-8: MGC172.19.1.165@o2ib100: The configuration from log 'lse-OST0001' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 5426:0:(obd_mount_server.c:1273:server_start_targets()) failed to start server lse-OST0001: -5
Lustre: lse-OST0001: Unable to start target: -5
LustreError: 5426:0:(obd_mount_server.c:865:lustre_disconnect_lwp()) lse-MDT0000-lwp-OST0001: Can't end config log lse-client.
LustreError: 5426:0:(obd_mount_server.c:1442:server_put_super()) lse-OST0001: failed to disconnect lwp. (rc=-2)
LustreError: 5426:0:(obd_mount_server.c:1472:server_put_super()) no obd lse-OST0001
Lustre: server umount lse-OST0001 complete
LustreError: 5426:0:(obd_mount.c:1290:lustre_fill_super()) Unable to mount (-5)
# porter1 /root > lctl ping 172.19.1.165@o2ib100 # <-- MGS NID
12345-0@lo
12345-172.19.1.165@o2ib100
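The return codes in the console log are negated Linux errno values, so -5 is EIO (the same "Input/output error" that mount.lustre reports) and -2 is ENOENT. As a rough aid for reading such logs, the sketch below decodes them with Python's standard `errno` and `os` modules. The helper name `decode_rc` and its regular expression are hypothetical, not part of any Lustre tool, and the pattern only covers the `rc = -N`, `(rc=-N)`, and `(-N)` forms seen in this report.

```python
import errno
import os
import re

# Hypothetical pattern: matches "rc = -5", "(rc=-2)", and "(-5)" as they
# appear in this report. Other log formats are not covered.
RC_PATTERN = re.compile(r"rc\s*=\s*-(\d+)|\(-(\d+)\)")


def decode_rc(line):
    """Return (errno name, message) for the first negative rc in a log line.

    Returns None when the line carries no return code in a recognized form.
    """
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1) or match.group(2))
    # errno.errorcode maps numbers to symbolic names on the local platform.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    samples = [
        "lse-OST0001: error registering with the MGS: rc = -5 (not fatal)",
        "lse-OST0001: failed to disconnect lwp. (rc=-2)",
        "lustre_fill_super()) Unable to mount (-5)",
    ]
    for sample in samples:
        print(decode_rc(sample), "<-", sample)
```

The message text comes from the local C library, so it is only meaningful when the script runs on the same kind of system that produced the log.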
Issue Links
- is related to LU-2887 sanity-quota test_12a: slow due to ZFS VMs sharing single disk (Resolved)