[LU-6702] shutting down OSTs in parallel with MDT(s) Created: 09/Jun/15  Updated: 12/Jun/15

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question/Request Priority: Major
Reporter: Brian Murrell (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Blocker

 Description   

When shutting down OSTs and MDTs in parallel, we see some OSTs that shut down quite quickly:

Jun  9 10:45:56 eagle-8.eagle.hpdd.intel.com kernel: Lustre: Failing over testfs-OST002e
Jun  9 10:45:56 eagle-8.eagle.hpdd.intel.com kernel: Lustre: server umount testfs-OST002e complete
Jun  9 10:45:57 eagle-8.eagle.hpdd.intel.com kernel: Lustre: Failing over testfs-OST0002
Jun  9 10:45:57 eagle-8.eagle.hpdd.intel.com kernel: Lustre: server umount testfs-OST0002 complete

And yet in other cases, some OSTs get hung up while being shut down, seemingly on timeouts to the MDT:

Jun  9 10:45:57 eagle-18.eagle.hpdd.intel.com kernel: Lustre: Failing over testfs-OST000c
Jun  9 10:45:58 eagle-18.eagle.hpdd.intel.com kernel: LustreError: 137-5: testfs-OST000c_UUID: not available for connect from 10.100.4.47@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
Jun  9 10:45:58 eagle-18.eagle.hpdd.intel.com kernel: LustreError: Skipped 52 previous similar messages
Jun  9 10:45:58 eagle-18.eagle.hpdd.intel.com kernel: Lustre: server umount testfs-OST000c complete
Jun  9 10:46:00 eagle-18.eagle.hpdd.intel.com kernel: Lustre: Failing over testfs-OST0038
Jun  9 10:46:00 eagle-18.eagle.hpdd.intel.com kernel: Lustre: server umount testfs-OST0038 complete
Jun  9 10:46:29 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1585:0:(client.c:1920:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1433871982/real 1433871982]  req@ffff8802d3ce1800 x1497165981090008/t0(0) o400->testfs-MDT0000-lwp-OST0022@10.100.4.2@tcp:12/10 lens 224/224 e 0 to 1 dl 1433871989 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Jun  9 10:46:29 eagle-18.eagle.hpdd.intel.com kernel: Lustre: testfs-MDT0000-lwp-OST004d: Connection to testfs-MDT0000 (at 10.100.4.2@tcp) was lost; in progress operations using this service will wait for recovery to complete
Jun  9 10:46:29 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1585:0:(client.c:1920:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
Jun  9 10:47:14 eagle-18.eagle.hpdd.intel.com kernel: LustreError: 137-5: testfs-OST000c_UUID: not available for connect from 10.100.4.54@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
Jun  9 10:47:14 eagle-18.eagle.hpdd.intel.com kernel: LustreError: Skipped 35 previous similar messages
Jun  9 10:47:55 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1433872064/real 1433872064]  req@ffff880101e93800 x1497165981090052/t0(0) o38->testfs-MDT0000-lwp-OST0022@10.100.4.1@tcp:12/10 lens 400/544 e 0 to 1 dl 1433872075 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Jun  9 10:47:55 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
Jun  9 10:49:44 eagle-18.eagle.hpdd.intel.com kernel: LustreError: 137-5: testfs-OST0023_UUID: not available for connect from 10.100.4.54@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
Jun  9 10:49:44 eagle-18.eagle.hpdd.intel.com kernel: LustreError: Skipped 77 previous similar messages
Jun  9 10:51:05 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1433872239/real 1433872239]  req@ffff88028aab4800 x1497165981090128/t0(0) o38->testfs-MDT0000-lwp-OST0022@10.100.4.1@tcp:12/10 lens 400/544 e 0 to 1 dl 1433872265 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Jun  9 10:51:05 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
Jun  9 10:54:47 eagle-18.eagle.hpdd.intel.com kernel: LustreError: 137-5: testfs-OST004e_UUID: not available for connect from 10.100.4.33@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
Jun  9 10:54:47 eagle-18.eagle.hpdd.intel.com kernel: LustreError: 137-5: testfs-OST000c_UUID: not available for connect from 10.100.4.33@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
Jun  9 10:54:47 eagle-18.eagle.hpdd.intel.com kernel: LustreError: Skipped 167 previous similar messages
Jun  9 10:56:20 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1433872539/real 1433872539]  req@ffff8803250dc800 x1497165981090224/t0(0) o38->testfs-MDT0000-lwp-OST0022@10.100.4.1@tcp:12/10 lens 400/544 e 0 to 1 dl 1433872580 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Jun  9 10:56:20 eagle-18.eagle.hpdd.intel.com kernel: Lustre: 1582:0:(client.c:1920:ptlrpc_expire_one_request()) Skipped 11 previous similar messages
Jun  9 11:01:01 eagle-18.eagle.hpdd.intel.com kernel: Lustre: Failing over testfs-OST0022
Jun  9 11:01:01 eagle-18.eagle.hpdd.intel.com kernel: Lustre: server umount testfs-OST004d complete
Jun  9 11:01:01 eagle-18.eagle.hpdd.intel.com kernel: Lustre: Skipped 1 previous similar message

Apparently (if my log reading is not too rusty) the OSTs that got hung up while being stopped were timing out trying to communicate with the MDT, presumably because the MDT beat those OSTs to the stopped state. Is my analysis here accurate? If so, a couple of questions:

What is this connection from the OST to the MDT being used for?

Is this a connection that the OST initiates to the MDT or vice versa?

I had always understood that the ideal order for shutting down Lustre was to shut down the MDT(s) first and then the OST(s), so as not to leave the MDT up and running handing out references to OSTs that are no longer up and able to service requests. If that understanding is correct, how does it square with the timeouts trying to shut down an OST after the MDT is down?
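
For reference, the shutdown ordering I had in mind is roughly the following (hostnames and mount points are illustrative only, not our actual configuration):

    # stop the MDT first (mount point shown is hypothetical)
    ssh mds1 'umount -t lustre /mnt/mdt'
    # then stop the OSTs; these can be unmounted in parallel across the OSSes
    for oss in oss1 oss2; do
        ssh $oss 'umount -a -t lustre' &
    done
    wait

In the case that produced the logs above, the MDT and OST umounts were issued at the same time rather than in that order.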



 Comments   
Comment by Andreas Dilger [ 10/Jun/15 ]

Brian,
You are right that shutting down the MDS first is probably best. I think shutting the MDS and OSS down at the same time causes some RPCs to be accepted but dropped rather than rejected outright.

The OSS->MDS connection is needed for quota and FLDB service, and is separate from the MDS->OSS connection.
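
On the OSS side that connection shows up as the "-lwp-" (light-weight proxy) device, e.g. testfs-MDT0000-lwp-OST0022 in the logs above. A quick way to see which of these exist on an OSS (just a sketch; the device names depend on the filesystem):

    # list the locally configured Lustre devices; the "-lwp-" entries are the
    # OST-side connections to the MDT used for quota and FLDB lookups
    lctl dl | grep -i lwp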
