Lustre / LU-4971

sanity-scrub test_2: ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version/s: Lustre 2.6.0
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.4
    • Labels: None
    • Severity: 3
    • 13763

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite run:
      http://maloo.whamcloud.com/test_sets/598caeaa-cd2c-11e3-b548-52540035b04c
      https://maloo.whamcloud.com/test_sets/fc94ece0-a552-11e3-9fee-52540035b04c

      The sub-test test_2 failed with the following error:

      test failed to respond and timed out

      Info required for matching: sanity-scrub 2

    Attachments

    Issue Links

    Activity

            [LU-4971] sanity-scrub test_2: ldlm_lock2desc()) ASSERTION( lock->l_policy_data.l_inodebits.bits == (MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT) ) failed

            "Even though we drop the incompatible ibits, we still need to prevent the user from using the non-initialised export; otherwise there may be other potential bugs."

            That is something that should never happen, IMHO.

            di.wang Di Wang (Inactive) added a comment

            Thanks James!

            yong.fan nasf (Inactive) added a comment
            jamesanunez James Nunez (Inactive) added a comment - edited

            I've tested the patch at http://review.whamcloud.com/#/c/10958/ (version 2) and, after running sanity-scrub three times, I have not seen this assertion.

            The patch fixes the crash/error for the configuration I've been testing with.


            Even though we drop the incompatible ibits, we still need to prevent the user from using the non-initialised export; otherwise there may be other potential bugs.

            yong.fan nasf (Inactive) added a comment

            Because the key issue is that we are checking un-initialized export flags.

            Yes, but that check is for whether the server supports the inodebits lock. But since a 2.6 client will only connect to servers with inodebits enabled, why do we need this check? TBH, I really do not like adding "connection wait logic" above ptlrpc, which might bring us trouble or even worse.

            di.wang Di Wang (Inactive) added a comment

            Wangdi, I have verified the race conditions locally. But I am not sure whether removing the incompatible bits is a suitable solution, because the key issue is that we are checking un-initialized export flags.

            yong.fan nasf (Inactive) added a comment

            Ah, I see. Hmm, I still think removing that incompatible inodebits handling is a better choice, or can we do something like this:

            [root@testnode lustre-release]# git diff lustre/ldlm/
            diff --git a/lustre/ldlm/ldlm_request.c b/lustre/ldlm/ldlm_request.c
            index 76f329d..2c0d05d 100644
            --- a/lustre/ldlm/ldlm_request.c
            +++ b/lustre/ldlm/ldlm_request.c
            @@ -909,7 +909,7 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp,
                                          OBD_CONNECT_IBITS))
                                             lock->l_policy_data.l_inodebits.bits =
                                                     MDS_INODELOCK_LOOKUP |
            -                                        MDS_INODELOCK_UPDATE;
            +                                        MDS_INODELOCK_UPDATE | MDS_INODELOCK_LAYOUT;
                                     else
                                             lock->l_policy_data = *policy;
                             }
            
            di.wang Di Wang (Inactive) added a comment

            No, for the single-MDT case there is no such race condition, because the mount process first calls md_getstatus() on the ROOT object, which is NOT an ldlm_enqueue RPC. Then, when mount returns to user space, it is guaranteed that the connection between the client and MDT0 has been established successfully.

            yong.fan nasf (Inactive) added a comment

            Hmm, I think this is a problem for a single-MDT FS too, i.e. ldlm_cli_enqueue should not use the export connection flags before it makes sure the connection is established?
            This inodebits connect flag was added even before 1.6, I guess? At least this inodebits interop code was added in 2007:

            commit 113303973ec9f8484eb2355a1a6ef3c4c7fd6a56
            Author: nathan <nathan>
            Date:   Sat Feb 10 06:33:41 2007 +0000
            
                land b1_5 onto HEAD
            

            I think that code uses the export connection flags in the wrong way, so we could probably just remove it to save some trouble here, since I do not think we plan to let a 2.6 Lustre client connect to a server without inodebits support (1.4?).

            di.wang Di Wang (Inactive) added a comment

            James, would you please verify the new patch? Thanks!
            http://review.whamcloud.com/#/c/10958/

            yong.fan nasf (Inactive) added a comment

            I have run the related tests locally hundreds of times, but cannot reproduce the failure myself. So I had to go through all the related code. Finally, I found that it may not be related to data corruption, but looks more like a race condition between the asynchronous connection and the synchronous lock enqueue operation. Consider the following scenario:

            1) There are two MDTs in the system. When the client mounts, it will trigger an MDS_CONNECT request to each MDT, but the connect requests are handled asynchronously. The mount process will go ahead once the connection between the client and MDT0 is established, but will not wait for the connection between the client and MDT1 to be established.

            2) After the client mount returns, the application tries to "stat $MOUNT/a/b"; the directory "a" resides on MDT0, and "b" resides on MDT1. So the client will first send a lock_enqueue RPC to MDT0.

            3) The RPC handler on MDT0 finds that the target object "b" resides on the remote MDT1, then replies telling the client to talk to MDT1.

            4) The client gets MDT0's reply, then calls lmv_intent_remote() to forward the request to MDT1 via MDC1, but at that time the connection between the client and MDT1 has not been established yet.

            5) For most other RPCs, the above case is not a problem, because the client-side RPC sender (in ptlrpc) will handle all kinds of connection issues. But ldlm_enqueue is somewhat different: before handing the ldlm_enqueue RPC to the sender (to ptlrpc), the sponsor checks the export connect flags, which will not be set until the connection between the client and the MDT has been established. So at that point the sponsor of the ldlm_enqueue request reads "0" from the export connection flags, and then triggers the ASSERTION described in this ticket.

            So it is not a bug specific to OI scrub, but a more general issue for the whole Lustre framework. Because sanity-scrub accesses some cross-MDT objects immediately after the client mount (without any wait), it reproduces the failure more easily.

            As for the fix, I do not want to change the general connection/LDLM framework, because I am afraid of introducing potential bugs. Instead, the patch makes the ldlm_enqueue request wait for the connection to be established in such a race case.

            yong.fan nasf (Inactive) added a comment

            People

              Assignee: yong.fan nasf (Inactive)
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 9
