Lustre / LU-4533

rpc_stats histogram does not support max_rpcs_in_flight greater than 31

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.5.0
    • 3
    • 12395

    Description

      The "rpcs in flight" histogram displayed by reading the proc file /proc/fs/lustre/osc/*/rpc_stats does not show values higher than 31. When max_rpcs_in_flight is set to a value greater than 31, the histogram should include rows for "rpcs in flight" values above 31. Instead, every RPC sent while 31 or more RPCs are in flight is counted in the last bucket, the row labeled 31, as in the following output:

                              read                    write
      rpcs in flight        rpcs   % cum % |       rpcs   % cum %
      0:                       0   0   0   |          0   0   0
      1:                     504   5   5   |        621  30  30
      2:                     330   3   8   |        405  20  51
      3:                     337   3  12   |          1   0  51
      4:                     349   3  16   |          1   0  51
      5:                     338   3  19   |          1   0  51
      6:                     325   3  23   |          1   0  51
      7:                     327   3  26   |          1   0  51
      8:                     324   3  30   |          1   0  51
      9:                     307   3  33   |          1   0  51
      10:                    306   3  36   |          1   0  51
      11:                    306   3  40   |          1   0  51
      12:                    301   3  43   |          1   0  51
      13:                    291   3  46   |          1   0  51
      14:                    283   3  49   |          1   0  51
      15:                    278   2  52   |          2   0  51
      16:                    276   2  55   |          1   0  51
      17:                    270   2  58   |          1   0  51
      18:                    270   2  61   |          1   0  52
      19:                    266   2  63   |          1   0  52
      20:                    265   2  66   |          1   0  52
      21:                    262   2  69   |          2   0  52
      22:                    263   2  72   |          4   0  52
      23:                    262   2  75   |          3   0  52
      24:                    263   2  77   |          1   0  52
      25:                    262   2  80   |          2   0  52
      26:                    261   2  83   |          1   0  52
      27:                    260   2  86   |          1   0  52
      28:                    259   2  89   |          3   0  52
      29:                    256   2  91   |          2   0  53
      30:                    256   2  94   |          1   0  53
      31:                    512   5 100   |        939  46 100
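
The pile-up in row 31 is what one would expect if the tally routine clamps out-of-range values to the last bucket. The following is a minimal user-space sketch of that behavior, not the Lustre source: the names hist_sketch and hist_tally are hypothetical, and the spinlock that the kernel struct carries is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define OBD_HIST_MAX 32

/* Hypothetical user-space stand-in for obd_histogram (no oh_lock). */
struct hist_sketch {
	unsigned long buckets[OBD_HIST_MAX];
};

/*
 * Count one sample. Assumption: any value at or beyond OBD_HIST_MAX
 * is clamped into the last bucket, which matches the output above.
 */
static void hist_tally(struct hist_sketch *h, unsigned int value)
{
	assert(h != NULL);
	if (value >= OBD_HIST_MAX)
		value = OBD_HIST_MAX - 1;
	h->buckets[value]++;
}
```

Under this assumption, tallying 31, 40, and 255 all increment bucket 31, so the row labeled 31 really means "31 or more".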
      

      According to the current version of the Lustre manual, the valid range for max_rpcs_in_flight is 1 to 256, so the histogram should be able to represent every value in that range. The maximum value for max_rpcs_in_flight is determined by this preprocessor macro:

      #define OSC_MAX_RIF_MAX         256
      

      The size of the obd_histogram struct is determined by a preprocessor macro as well:

      /* if we find more consumers this could be generalized */
      #define OBD_HIST_MAX 32
      struct obd_histogram {
              spinlock_t      oh_lock;
              unsigned long   oh_buckets[OBD_HIST_MAX];
      };
      

      The "rpcs in flight" histogram appears to need the most buckets of any obd_histogram user, so defining OBD_HIST_MAX as OSC_MAX_RIF_MAX would be a sufficient fix. However, this would grow the bucket array of every obd_histogram from 32 to 256 entries, roughly a factor of 8 in size. I'm not sure yet whether that increase is significant.
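
To put a number on the size increase, here is a small user-space sketch of the arithmetic. It only considers the oh_buckets array (the lock is ignored), and the helper names and the _OLD/_NEW macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define OSC_MAX_RIF_MAX  256
#define OBD_HIST_MAX_OLD 32               /* current bucket count */
#define OBD_HIST_MAX_NEW OSC_MAX_RIF_MAX  /* proposed bucket count */

/* Bytes used by a bucket array of unsigned long with nbuckets entries. */
static size_t hist_bucket_bytes(size_t nbuckets)
{
	return nbuckets * sizeof(unsigned long);
}

/* Extra bytes per histogram when growing from old_n to new_n buckets. */
static size_t hist_growth_bytes(size_t old_n, size_t new_n)
{
	assert(new_n >= old_n);
	return hist_bucket_bytes(new_n) - hist_bucket_bytes(old_n);
}
```

Assuming an 8-byte unsigned long, the bucket array grows from 256 bytes to 2048 bytes, which is 1792 extra bytes for each obd_histogram instance.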

      Another option would be to generalize the obd_histogram struct to use a flexible array for oh_buckets, but this would require a lot more work, and all obd_histogram structures would need to be dynamically allocated.
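
The flexible-array option could look roughly like the sketch below. This is a user-space illustration under stated assumptions, not a proposed patch: hist_flex, hist_flex_alloc, and hist_flex_tally are hypothetical names, calloc stands in for a kernel allocator, and locking is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable-size histogram; bucket count is chosen at allocation. */
struct hist_flex {
	unsigned int  nbuckets;
	unsigned long buckets[];   /* C99 flexible array member */
};

/* Allocate a zeroed histogram with nbuckets buckets; NULL on failure. */
static struct hist_flex *hist_flex_alloc(unsigned int nbuckets)
{
	struct hist_flex *h;

	if (nbuckets == 0)
		return NULL;
	h = calloc(1, sizeof(*h) + nbuckets * sizeof(h->buckets[0]));
	if (h != NULL)
		h->nbuckets = nbuckets;
	return h;
}

/* Count one sample, clamping out-of-range values to the last bucket. */
static void hist_flex_tally(struct hist_flex *h, unsigned int value)
{
	assert(h != NULL);
	if (value >= h->nbuckets)
		value = h->nbuckets - 1;
	h->buckets[value]++;
}
```

With this layout, only the "rpcs in flight" histograms would pay for the larger bucket count (for example, OSC_MAX_RIF_MAX + 1 buckets to cover 0 through 256), while other histograms could keep 32. The cost is the one noted above: each histogram becomes a separate allocation that must be freed.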

            People

              wc-triage WC Triage
              haasken Ryan Haasken