[LU-5557] enqueue and reint RPCs are not tracked in MDS stats

Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.7.0
    • Affects Versions: Lustre 2.6.0, Lustre 2.5.2, Lustre 2.4.3
    • 15496

    Description

      The MDS stats proc file /proc/fs/lustre/mds/MDS/mdt/stats does not track the LDLM_ENQUEUE and MDS_REINT RPCs.
      This class of RPCs covers most of the "modifying" RPCs on the MDS, so the file mostly displays RPCs that "read" data from the MDT device rather than those that "write" to it.

      $ cat /proc/fs/lustre/mds/MDS/mdt/stats
      snapshot_time             1409239309.161365 secs.usecs
      req_waittime              182 samples [usec] 17 420 19191 2604647
      req_qdepth                182 samples [reqs] 0 1 3 3
      req_active                182 samples [reqs] 1 3 251 403
      req_timeout               182 samples [sec] 1 10 209 479
      reqbuf_avail              463 samples [bufs] 64 64 29632 1896448
      ldlm_ibits_enqueue        5 samples [reqs] 1 1 5 5
      mds_getattr               1 samples [usec] 83 83 83 6889
      mds_connect               6 samples [usec] 20 197 439 54031
      mds_getstatus             1 samples [usec] 76 76 76 5776
      mds_statfs                2 samples [usec] 74 95 169 14501
      obd_ping                  167 samples [usec] 12 130 5875 249977
      

      This class of RPCs has been explicitly blacklisted in the code for a very long time:

      +++ b/lustre/ptlrpc/service.c
      @@ -2110,7 +2110,7 @@ put_conn:
               if (likely(svc->srv_stats != NULL && request->rq_reqmsg != NULL)) {
                       __u32 op = lustre_msg_get_opc(request->rq_reqmsg);
                       int opc = opcode_offset(op);
                       if (opc > 0 && !(op == LDLM_ENQUEUE || op == MDS_REINT)) {
                               LASSERT(opc < LUSTRE_MAX_OPCODES);
                               lprocfs_counter_add(svc->srv_stats,
                                                   opc + EXTRA_MAX_OPCODES,
      

      Is there a specific reason to prevent this?

      Could we consider enabling them?

      Attachments

        Activity

          [LU-5557] enqueue and reint RPCs are not tracked in MDS stats

          adegremont Aurelien Degremont (Inactive) added a comment -

          Could we consider this for 2.5.4?
          pjones Peter Jones added a comment -

          Landed for 2.7
          jhammond John Hammond added a comment -

          Please see http://review.whamcloud.com/11924 for the reint stats.
          jhammond John Hammond added a comment -

          > Similarly, while LDLM_ENQUEUE today is commonly used for open (along with an open intent), it may be used for other kinds of locking operations on the MDS (e.g. re-enqueuing a lock in revalidate after it has been cancelled due to a conflict) as well as extent locks on the OSS. I don't think it would be possible to change LDLM_ENQUEUE to MDS_OPEN as a result.

          Then MDS_ENQUEUE_OPEN.

          adilger Andreas Dilger added a comment -

          I also recently found http://review.whamcloud.com/342, which fixes up some of this same code.

          adilger Andreas Dilger added a comment -

          Once upon a time, there was a filesystem named InterMezzo that allowed clients to disconnect from the server while using, and optionally modifying, their locally cached copy of the data. When the client reconnected to the server, it would reintegrate the log of changes it had made locally to bring the server copy back in sync with the client. The thought for Lustre was to allow clients to eventually do the same thing.

          Initially, Lustre clients would only send individual reintegration records to the MDT to change the metadata, but in the future it would be possible to reintegrate a series of changes efficiently, allowing writeback caching (WBC) clients and/or disconnected operation. In that case, the type of any individual operation isn't known in advance, and there may in fact be multiple different operations sent in the same RPC. Hence there is only the MDS_REINT RPC type, instead of separate RPC handlers for each update type. That said, it would be possible to send different RPC types for statistical purposes and have all of the RPC handlers be the same piece of code.

          Similarly, while LDLM_ENQUEUE today is commonly used for open (along with an open intent), it may be used for other kinds of locking operations on the MDS (e.g. re-enqueuing a lock in revalidate after it has been cancelled due to a conflict) as well as extent locks on the OSS. I don't think it would be possible to change LDLM_ENQUEUE to MDS_OPEN as a result.
          jhammond John Hammond added a comment -

          I did. I'll restore it and look at addressing your comments.

          On this subject, is it in our long-term interest to replace these jumbo opcodes (MDS_REINT and LDLM_ENQUEUE) with specific opcodes (MDS_OPEN, MDS_CREATE, MDS_UNLINK, ...)? It has been pointed out that this would make RPC traces much more useful. I'm not sure what "reint" means, and I don't think that knowing would help anything.

          adilger Andreas Dilger added a comment -

          I think John already has a patch to fix this.

          People

            jhammond John Hammond
            adegremont Aurelien Degremont (Inactive)
            Votes: 1
            Watchers: 7
