  Lustre / LU-5148

OSTs won't mount following upgrade to 2.4.2

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • None
    • None
    • 3
    • 14209

    Description

      A production Lustre cluster, "porter", was upgraded from 2.4.0-28chaos to lustre-2.4.2-11chaos today. The OSTs now will not start.

      # porter1 /root > /etc/init.d/lustre start
      Stopping snmpd:                                            [  OK  ]
      Shutting down cerebrod:                                    [  OK  ]
      Mounting porter1/lse-ost0 on /mnt/lustre/local/lse-OST0001
      mount.lustre: mount porter1/lse-ost0 at /mnt/lustre/local/lse-OST0001 failed: Input/output error
      Is the MGS running?
      # porter1 /root > 
      
      Lustre: Lustre: Build Version: 2.4.2-11chaos-11chaos--PRISTINE-2.6.32-431.17.2.1chaos.ch5.2.x86_64
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.115.67@o2ib10 (no target)
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.38@o2ib7 (no target)
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.101@o2ib7 (no target)
      LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired   req@ffff881026873800 x1470103660003336/t0(0) o253->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 4768/4768 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
      LustreError: 5426:0:(obd_mount_server.c:1140:server_register_target()) lse-OST0001: error registering with the MGS: rc = -5 (not fatal)
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.116.205@o2ib5 (no target)
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.114.162@o2ib5 (no target)
      LustreError: Skipped 19 previous similar messages
      LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired   req@ffff881026873800 x1470103660003340/t0(0) o101->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
      LustreError: 137-5: lse-OST0002_UUID: not available for connect from 192.168.120.162@o2ib7 (no target)
      LustreError: Skipped 23 previous similar messages
      LustreError: 5426:0:(client.c:1053:ptlrpc_import_delay_req()) @@@ send limit expired   req@ffff881026873800 x1470103660003344/t0(0) o101->MGC172.19.1.165@o2ib100@172.19.1.165@o2ib100:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
      LustreError: 15c-8: MGC172.19.1.165@o2ib100: The configuration from log 'lse-OST0001' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      LustreError: 5426:0:(obd_mount_server.c:1273:server_start_targets()) failed to start server lse-OST0001: -5
      Lustre: lse-OST0001: Unable to start target: -5
      LustreError: 5426:0:(obd_mount_server.c:865:lustre_disconnect_lwp()) lse-MDT0000-lwp-OST0001: Can't end config log lse-client.
      LustreError: 5426:0:(obd_mount_server.c:1442:server_put_super()) lse-OST0001: failed to disconnect lwp. (rc=-2)
      LustreError: 5426:0:(obd_mount_server.c:1472:server_put_super()) no obd lse-OST0001
      Lustre: server umount lse-OST0001 complete
      LustreError: 5426:0:(obd_mount.c:1290:lustre_fill_super()) Unable to mount  (-5)
      
      # porter1 /root > lctl ping 172.19.1.165@o2ib100 # <-- MGS NID
      12345-0@lo
      12345-172.19.1.165@o2ib100
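
      For reference, a minimal sketch of gathering a Lustre debug trace around the failing mount; the debug flags and output path here are illustrative choices, not part of the original report:

      lctl ping 172.19.1.165@o2ib100      # LNet-level check of the MGS NID (succeeds above)
      lctl set_param debug=+rpctrace      # add RPC tracing to the debug mask
      lctl set_param debug=+config        # add configuration-log tracing
      mount -t lustre porter1/lse-ost0 /mnt/lustre/local/lse-OST0001
      lctl dk /tmp/lse-OST0001-mount.dk   # dump the kernel debug log for the attempt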
      

      Attachments

        Issue Links

          Activity

            [LU-5148] OSTs won't mount following upgrade to 2.4.2

            nedbass Ned Bass (Inactive) added a comment -

            Closing as stale.

            jfc John Fuchs-Chesney (Inactive) added a comment - edited

            Hello Ned,

            Do you have any update for us on this elderly ticket? Has this issue been resolved by use of later versions, for example?

            We would like to mark it as resolved, if you have no objection.

            Thanks,
            ~ jfc.


            cliffw Cliff White (Inactive) added a comment -

            I attempted to reproduce this on Hyperion by starting with 2.4.0 and upgrading to 2.4.2 after running some IO tests. I could not reproduce the failure; however, I was using the Whamcloud 2.4.2 release, which may be different.

            green Oleg Drokin added a comment -

            Looking at the stack traces in the logs, it seems every thread is either blocked waiting on the transaction commit inside ZFS, or blocked on a semaphore held by somebody that is waiting on the transaction commit.

            So to me it really looks like some sort of in-ZFS wait. There's no dump of all thread stacks here, so I wonder if you have one that shows whether there is a Lustre-induced deadlock of some sort above ZFS?

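            For reference, the all-thread stack dump asked for above can be captured with the standard kernel sysrq facility; the output path is only illustrative:

            echo 1 > /proc/sys/kernel/sysrq    # make sure sysrq is enabled
            echo t > /proc/sysrq-trigger       # dump the stacks of all tasks to the kernel log
            dmesg > /tmp/all-task-stacks.txt   # save the dump for attachment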

            hongchao.zhang Hongchao Zhang added a comment -

            This debug patch changes the IR (Imperative Recovery) operations on the MGS to update asynchronously. If the issue does not occur again with it applied, we can isolate the problem as related to slow synchronization in ZFS, just as in the problem shown in LU-2887, and then try to create corresponding patches to fix it.
            Thanks.

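            As a side note, if slow ZFS transaction-group commits are the suspect, their timing can be watched on the OSS while reproducing; the pool name comes from the log above, and the kstat path assumes a ZFS-on-Linux build that exposes it:

            cat /proc/spl/kstat/zfs/porter1/txgs   # per-txg open/quiesce/sync times for the pool
            zpool iostat -v porter1 5              # per-vdev I/O statistics every 5 seconds
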
            pjones Peter Jones added a comment -

            Hongchao

            Could you please elaborate as to how this patch works?

            Thanks

            Peter


            hongchao.zhang Hongchao Zhang added a comment -

            Hi,

            Could you please try the debug patch at http://review.whamcloud.com/#/c/10869/ to check whether this issue occurs again?

            Thanks very much!

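            For anyone revisiting this, a rough sketch of pulling a change from the Whamcloud Gerrit and building it; the clone URL, change ref, and patchset number (1) are assumptions based on the usual Gerrit layout, so check the change page for the exact download command:

            git clone https://review.whamcloud.com/fs/lustre-release
            cd lustre-release
            git fetch https://review.whamcloud.com/fs/lustre-release refs/changes/69/10869/1
            git checkout FETCH_HEAD
            sh autogen.sh && ./configure && make rpms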

            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: nedbass Ned Bass (Inactive)
              Votes: 0
              Watchers: 8
