Lustre / LU-8045

MDT fails to allow client mounts if one MDT is not connected

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.8.0
    • Environment: TOSS 2 (RHEL 6.7 based)
      kernel 2.6.32-573.22.1.1chaos.ch5.4.x86_64
      Lustre 2.8.0+patches 2.8-llnl-preview1
      zfs-0.6.5.4-1.ch5.4.x86_64
      1 MGS - separate server
      40 MDTs - each on separate server
      10 OSTs - each on separate server
    • Severity: 3

    Description

      See LU-8044
      With many MDTs, if MDT0000 cannot connect with one of the other MDTs (perhaps only on initial startup, I don't know), MDT0000 appears to ignore connection requests from clients.

      It seems as if MDT0000 ought to be able to allow mounts, and the filesystem should simply function without the apparently broken MDT.
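      A sketch of the scenario (the MGS NID, fsname, and mount point are
      illustrative):

      # One MDS (other than MDT0000's) is down; everything else started
      # normally.  The client mount blocks while its MDC imports cycle
      # between CONNECTING and DISCONN (see the import state output in
      # the comments below).
      mount -t lustre 172.16.0.1@o2ib:/lustre /mnt/lustre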


        Activity

          di.wang Di Wang added a comment -
          > It looks to me like the code requires that all MDTs successfully connect with each other before any of them will accept connections from clients. Not just the first time they are started, but any time.
          

          Actually, it does not require all MDTs to be connected, but it does require that an MDT's config log has been executed before that MDT can accept connection requests. Sorry, I did not make that clear in my last comment.

          > Suppose that there is a power outage and all the MDSs go down, and when power is restored one does not come up (not counting MDT0000 which is of course special). Why not accept connections on the MDTs that are up? Depending on how the namespace is distributed across MDTs, it may be possible to do work.
          

          Yes, this example does make sense. But if the user knows that one or more MDTs cannot come back, they need to manually deactivate those MDTs on the clients and the other MDTs (not doing so is probably what caused the failure reported in this ticket):

          lctl --device xxx-mdc-xxxx deactivate
          

          Then the recovery efforts waiting on these MDTs will be stopped, and the recovering MDTs will be able to accept connections from clients; of course, clients will only be able to access files on the restored MDTs. Sorry again, I may have given unclear information in my last comment.

          There are even test cases for this in conf-sanity.sh (70c and 70d); please check. Thanks.
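          For example, on a client (the MDC device name below is
          illustrative; the real names can be listed with "lctl dl"):

          # find the MDC device for the MDT that cannot come back
          lctl dl | grep mdc
          # deactivate it, e.g. MDT0002 of filesystem "lustre"
          lctl --device lustre-MDT0002-mdc-ffff880fc4ec5400 deactivate

          The matching tests can be run with, for example,
          ONLY="70c 70d" sh conf-sanity.sh from lustre/tests.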


          ofaaland Olaf Faaland added a comment -

          Di,

          It looks to me like the code requires that all MDTs successfully connect with each other before any of them will accept connections from clients. Not just the first time they are started, but any time.

          If I am correct, then I would say that yes, it is an important issue. Suppose that there is a power outage and all the MDSs go down, and when power is restored one does not come up (not counting MDT0000 which is of course special). Why not accept connections on the MDTs that are up? Depending on how the namespace is distributed across MDTs, it may be possible to do work.

          But maybe I'm mistaken about some of that. If so, let me know.

          thanks,
          Olaf

          di.wang Di Wang added a comment -

          Well, in the current implementation, a target only allows connections (obd_no_conn is set to 0) once prepare succeeds at the end of server_start_targets(). I am guessing that the disconnected MDTs block the prepare or configuration process (see server_start_targets()), so clients cannot connect to the MDT. I am not sure how easy this would be to fix. Is this an important issue?

          ofaaland Olaf Faaland added a comment -

          Yes, thank you Peter.

          pjones Peter Jones added a comment -

          That ok Olaf?

          ofaaland Olaf Faaland added a comment -

          The issue summary I wrote is wrong; it seems to me that it's any MDT, not just MDT0000. I don't have the ability to change ticket summaries, so if one of you Intel folks could fix it, that would be great.

          ofaaland Olaf Faaland added a comment - - edited

          All the clients are unable to connect to the MDTs; the imports on the client show repeated connection attempts, even though all but one MDT seems to have started normally.

          Here is one example:

          ==> ./mdc/lustre-MDT0001-mdc-ffff880fc4ec5400/state <==
          current_state: DISCONN
          state_history:
           - [ 1461090634, CONNECTING ]
           - [ 1461090634, DISCONN ]
           - [ 1461090659, CONNECTING ]
           - [ 1461090659, DISCONN ]
           - [ 1461090684, CONNECTING ]
           - [ 1461090684, DISCONN ]
           - [ 1461090709, CONNECTING ]
           - [ 1461090709, DISCONN ]
           - [ 1461090734, CONNECTING ]
           - [ 1461090734, DISCONN ]
           - [ 1461090759, CONNECTING ]
           - [ 1461090759, DISCONN ]
           - [ 1461090784, CONNECTING ]
           - [ 1461090784, DISCONN ]
           - [ 1461090809, CONNECTING ]
           - [ 1461090809, DISCONN ]
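
          The same information can be dumped for all MDC imports at once;
          the parameter name matches the /proc path above:

          lctl get_param mdc.*.state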
          

          People

            Assignee:
            laisiyao Lai Siyao
            Reporter:
            ofaaland Olaf Faaland
            Votes:
            0
            Watchers:
            5
