Details
Task
Resolution: Not a Bug
Major
None
Lustre 2.4.0
11227
Description
We are working on upgrading our Lustre servers from 1.8.7 to Lustre 2.4.1 with the data in place.
Attachments
Issue Links
is related to: LU-4761 console quota message spew (Resolved)
Activity
The clients are either directly attached via Ethernet or sit behind LNET routers that bridge between Ethernet and InfiniBand. The client host shown in the example is on Ethernet and uses the following lnet.conf:
options lnet networks=tcp(bond0)
The compute nodes on InfiniBand use:
options lnet networks="o2ib(ib1)" routes="tcp 1 10.7.104.54@o2ib; tcp 2 10.7.104.[54-101]@o2ib" dead_router_check_interval=60 live_router_check_interval=60
The preferred router is listed first, and then the group of routers.
The routers use:
options lnet networks="tcp(bond0),o2ib(ib1)" forwarding="enabled"
The Lustre servers use:
options lnet networks="tcp(bond0)" routes="o2ib 192.168.230.[63-78]@tcp 192.168.230.[95-126]@tcp" dead_router_check_interval=60 live_router_check_interval=60
Do we know the network topology? Are there any routers between the client and the servers? Can we get the LNET configuration for the client, routers, and servers?
The LNET failure also repeats between the MDS and an OSS.
It was spotted via a message about a slow reply in dmesg:
[root@monkey-mds-10-3 ~]# dmesg
Lustre: 11080:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1394570829/real 1394570829] req@ffff880638534000 x1462310351923796/t0(0) o8->monkey-OST0065-osc@172.25.32.111@tcp:28/4 lens 400/544 e 0 to 1 dl 1394570885 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 11080:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 34 previous similar messages
Lustre: MGS: Client 3d8c91ee-addb-8c84-31d6-137714d50e7e (at 172.25.32.111@tcp) reconnecting
Lustre: Skipped 3 previous similar messages
[root@monkey-mds-10-3 ~]# lctl ping 172.25.32.111@tcp
failed to ping 172.25.32.111@tcp: Input/output error
[root@monkey-mds-10-3 ~]# ping -c 4 172.25.32.111
PING 172.25.32.111 (172.25.32.111) 56(84) bytes of data.
64 bytes from 172.25.32.111: icmp_seq=1 ttl=64 time=0.223 ms
64 bytes from 172.25.32.111: icmp_seq=2 ttl=64 time=0.124 ms
64 bytes from 172.25.32.111: icmp_seq=3 ttl=64 time=0.146 ms
64 bytes from 172.25.32.111: icmp_seq=4 ttl=64 time=0.151 ms
--- 172.25.32.111 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.124/0.161/0.223/0.037 ms
[root@monkey-mds-10-3 ~]#
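Note that lctl ping exercises the LNET/socklnd connection rather than raw IP, which is why it can return an I/O error while ICMP ping succeeds. As a hedged next step (assuming the 2.4-era /proc/sys/lnet interface), the local interface and peer state for the unreachable NID can be inspected on the MDS:
cat /proc/sys/lnet/nis                          # local LNET interfaces and their credits
cat /proc/sys/lnet/peers | grep 172.25.32.111   # state and credits for the failing peer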
The problem also shows up as packets stuck in the TCP send queue. All of these servers use the TCP LNET driver.
[root@monkey-mds-10-3 ~]# !181
netstat -at | grep ^tcp | grep -v '^tcp 0 0 '
tcp 0 1984 money-mds-10-3-10g.lus:1023 money-oss-2-1-10g.lustr:988 ESTABLISHED
[root@monkey-mds-10-3 ~]# netstat -at | grep ^tcp | grep -v '^tcp 0 0 '
tcp 0 2344 money-mds-10-3-10g.lus:1023 money-oss-2-1-10g.lustr:988 ESTABLISHED
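To confirm the Send-Q is actually stuck rather than momentarily busy, one option (not from the original report) is to sample the same netstat filter repeatedly and watch whether the queued byte count ever drains:
# re-run the filter every 5 seconds; a Send-Q that only grows suggests the peer
# has stopped acknowledging data on this socklnd connection
watch -n 5 "netstat -at | grep ^tcp | grep -v '^tcp 0 0 '"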
On one of our file systems we are seeing two problems. The first is repeated messages about being unable to clean up orphans. Note that this occurs after shutting down all clients, resetting the Lustre servers (including a lustre_rmmod), and mounting the MDT with the abort_recov option.
Mar 11 13:13:22 monkey-mds-10-3 kernel: LustreError: 11286:0:(osp_precreate.c:737:osp_precreate_cleanup_orphans()) monkey-OST0025-osc: cannot cleanup orphans: rc = -11
Mar 11 13:13:22 monkey-mds-10-3 kernel: LustreError: 11286:0:(osp_precreate.c:737:osp_precreate_cleanup_orphans()) Skipped 3 previous similar messages
Mar 11 13:15:43 monkey-mds-10-3 kernel: LustreError: 11234:0:(osp_precreate.c:737:osp_precreate_cleanup_orphans()) monkey-OST000b-osc: cannot cleanup orphans: rc = -11
Mar 11 13:15:43 monkey-mds-10-3 kernel: LustreError: 11234:0:(osp_precreate.c:737:osp_precreate_cleanup_orphans()) Skipped 2 previous similar messages
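For reference, rc = -11 is -EAGAIN: the OSP device on the MDS cannot complete the orphan-cleanup RPC because its connection to the OST is not established. The same checks used elsewhere in this ticket can be run from the MDS side (assuming lctl dl -t reports the target NID for the OSP/OSC device, as it does for the client OSC shown later):
lctl dl -t | grep OST0025        # find the device for the affected OST and the NID it should connect to
lctl ping <nid_from_lctl_dl>     # substitute the NID reported above to test LNET reachability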
The second issue is a hung connection between a client and an OSS, with LNET pings failing even though ICMP pings succeed.
[ux455215@oasis-dm4 temp_project]$ cd turbulence/tars/
[ux455215@oasis-dm4 tars]$ ls
DD0000_1024ppm.tar  DD0074_1024ppm.tar  DD0098_1024ppm.tar  ts_1024ppmDD0110.tar  ts_1024ppmDD0134.tar  ts_1024ppmDD0154.tar  ts_1024ppmDD0178.tar  ts_1024ppmDD0202.tar  ts_1024ppmDD0226.tar
DD0060_1024ppm.tar  DD0078_1024ppm.tar  DD0102_1024ppm.tar  ts_1024ppmDD0114.tar  ts_1024ppmDD0138.tar  ts_1024ppmDD0158.tar  ts_1024ppmDD0182.tar  ts_1024ppmDD0206.tar
DD0064_1024ppm.tar  DD0082_1024ppm.tar  DD0106_1024ppm.tar  ts_1024ppmDD0118.tar  ts_1024ppmDD0142.tar  ts_1024ppmDD0162.tar  ts_1024ppmDD0186.tar  ts_1024ppmDD0210.tar
DD0068_1024ppm.tar  DD0086_1024ppm.tar  dsets-to-grab.txt   ts_1024ppmDD0122.tar  ts_1024ppmDD0146.tar  ts_1024ppmDD0166.tar  ts_1024ppmDD0190.tar  ts_1024ppmDD0214.tar
DD0070_1024ppm.tar  DD0090_1024ppm.tar  dsets-to-untar.txt  ts_1024ppmDD0126.tar  ts_1024ppmDD0148.tar  ts_1024ppmDD0170.tar  ts_1024ppmDD0194.tar  ts_1024ppmDD0218.tar
DD0072_1024ppm.tar  DD0094_1024ppm.tar  dsets-to-untar.txt~ ts_1024ppmDD0130.tar  ts_1024ppmDD0150.tar  ts_1024ppmDD0174.tar  ts_1024ppmDD0198.tar  ts_1024ppmDD0222.tar
[ux455215@oasis-dm4 tars]$ ls -l
...hangs...
[root@oasis-dm4 ~]# lctl ping 172.25.33.103@tcp
failed to ping 172.25.33.103@tcp: Input/output error
[root@oasis-dm4 ~]# ping -c 4 172.25.33.103
PING 172.25.33.103 (172.25.33.103) 56(84) bytes of data.
64 bytes from 172.25.33.103: icmp_seq=1 ttl=63 time=0.109 ms
64 bytes from 172.25.33.103: icmp_seq=2 ttl=63 time=0.159 ms
64 bytes from 172.25.33.103: icmp_seq=3 ttl=63 time=0.155 ms
64 bytes from 172.25.33.103: icmp_seq=4 ttl=63 time=0.158 ms
[root@monkey-oss-16-3 ~]# lctl ping 192.168.230.54@tcp
failed to ping 192.168.230.54@tcp: Input/output error
[root@monkey-oss-16-3 ~]# ping -c 4 192.168.230.54
PING 192.168.230.54 (192.168.230.54) 56(84) bytes of data.
64 bytes from 192.168.230.54: icmp_seq=1 ttl=63 time=0.153 ms
64 bytes from 192.168.230.54: icmp_seq=2 ttl=63 time=0.175 ms
64 bytes from 192.168.230.54: icmp_seq=3 ttl=63 time=0.175 ms
64 bytes from 192.168.230.54: icmp_seq=4 ttl=63 time=0.175 ms
dmesg from client:
Lustre: Mounted monkey-client
Lustre: 24401:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1394569229/real 1394569229] req@ffff8806189e8c00 x1462311819674200/t0(0) o8->monkey-OST0078-osc-ffff880c3428b800@172.25.33.103@tcp:28/4 lens 400/544 e 0 to 1 dl 1394569234 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 24401:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1394569254/real 1394569254] req@ffff8806354eb400 x1462311819674824/t0(0) o8->monkey-OST0078-osc-ffff880c3428b800@172.25.33.103@tcp:28/4 lens 400/544 e 0 to 1 dl 1394569264 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 24401:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
Lustre: 24401:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1394569279/real 1394569279] req@ffff88063566ac00 x1462311819675332/t0(0) o8->monkey-OST0078-osc-ffff880c3428b800@172.25.33.103@tcp:28/4 lens 400/544 e 0 to 1 dl 1394569294 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 24401:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
[root@oasis-dm4 ~]# lctl dl | grep 0078
123 UP osc monkey-OST0078-osc-ffff880c3428b800 d81c416b-007d-7139-df5b-77b081b31fb4 5
[root@oasis-dm4 ~]# lctl dl -t | grep 0078
123 UP osc monkey-OST0078-osc-ffff880c3428b800 d81c416b-007d-7139-df5b-77b081b31fb4 5 172.25.33.103@tcp
dmesg on OSS:
LDISKFS-fs (sdf): mounted filesystem with ordered data mode. quota=off. Opts:
Lustre: 28444:0:(ofd_dev.c:221:ofd_process_config()) For interoperability, skip this ost.quota_type. It is obsolete.
Lustre: monkey-OST0078: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
Lustre: monkey-OST0078: Will be in recovery for at least 2:30, or until 1 client reconnects
Lustre: monkey-OST0078: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.
Lustre: monkey-OST0078: deleting orphan objects from 0x0:12305142 to 0x0:12305281
/var/log/messages on OSS (172.25.32.253@tcp is the MDS):
Mar 11 13:05:11 monkey-oss-16-3 kernel: LustreError: 137-5: monkey-OST0078_UUID: not available for connect from 172.25.32.253@tcp (no target)
Mar 11 13:05:11 monkey-oss-16-3 kernel: LustreError: Skipped 2 previous similar messages
Mar 11 13:07:25 monkey-oss-16-3 kernel: LustreError: 137-5: monkey-OST0018_UUID: not available for connect from 172.25.32.253@tcp (no target)
Mar 11 13:07:25 monkey-oss-16-3 kernel: LustreError: Skipped 4 previous similar messages
Minh, can we escalate this? After our work last night, it seems this is highly reproducible. Also, I don't think this is an LNET routing issue, since the other Lustre file system we tested from these clients performed as expected.
An update: we can reproduce the failure from a single host attached via Ethernet using the TCP LNET driver. This eliminates the routers as a direct cause.
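For the record, the single-host reproduction is essentially the following sequence (the mount point and directory are hypothetical; the fsname comes from the client logs above, and the MGS NID assumes the MGS is co-located with the MDS at 172.25.32.253@tcp):
mount -t lustre 172.25.32.253@tcp:/monkey /mnt/monkey   # mount from a tcp-only Ethernet client
ls /mnt/monkey/<some_dir>                               # plain ls (MDS-only metadata) returns
ls -l /mnt/monkey/<some_dir>                            # ls -l hangs on files striped over the affected OST
lctl ping 172.25.33.103@tcp                             # LNET ping of that OSS fails while ICMP succeeds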
Here are Mahidhar's notes on what he observed.