  1. Lustre
  2. LU-11422

Make LNet Selftest post Health backward compatible

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.12.0
    • Labels:
      None
    • Severity:
      3
    • Rank (Obsolete):
      9223372036854775807

      Description

      Since the LNet Health feature landed, LNet Selftest is no longer backward compatible: lnet-selftest cannot be run between peers of different versions (Lustre 2.12 and pre-2.12 Lustre). This should be fixed.

      The LNet Health feature added new health-related statistics to struct lnet_counters (patch https://review.whamcloud.com/32949 "LU-9120 lnet: add global health statistics"). Because struct srpc_stat_reply embeds struct lnet_counters, the layout of the selftest stat reply sent on the wire changed as well. The structures now look like this, with the added fields marked by "+":

       struct srpc_stat_reply {
              __u32                   str_status;
              struct lst_sid          str_sid;
              struct sfw_counters     str_fw; 
              struct srpc_counters    str_rpc;
              struct lnet_counters    str_lnet;
       } WIRE_ATTR;
      
       struct lnet_counters {
              __u32   msgs_alloc;
              __u32   msgs_max;
      +       __u32   rst_alloc;
              __u32   errors;
              __u32   send_count;
              __u32   recv_count;
              __u32   route_count;
              __u32   drop_count;
      +       __u32   resend_count;
      +       __u32   response_timeout_count;
      +       __u32   local_interrupt_count;
      +       __u32   local_dropped_count;
      +       __u32   local_aborted_count;
      +       __u32   local_no_route_count;
      +       __u32   local_timeout_count;
      +       __u32   local_error_count;
      +       __u32   remote_dropped_count;
      +       __u32   remote_error_count;
      +       __u32   remote_timeout_count;
      +       __u32   network_timeout_count;
              __u64   send_length;
              __u64   recv_length;
              __u64   route_length;
              __u64   drop_length; 
      } WIRE_ATTR;
      

       

      Amir's idea:
      "What we can do is make a copy of the structure which is similar to the older one. And in the post health selftest we can have a translation function which takes the new structure and copies the relevant fields to the old one. This way selftest remains backwards compatible"

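The suggestion above could be sketched roughly as follows. This is only an illustration of the idea, not the patch that resolved this ticket. The names lnet_counters_legacy and lnet_counters_to_legacy are hypothetical, uint32_t/uint64_t stand in for the kernel's __u32/__u64 so the snippet builds in user space, and WIRE_ATTR is assumed to be the GCC/Clang packed attribute.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: WIRE_ATTR packs the struct (GCC/Clang attribute syntax). */
#define WIRE_ATTR __attribute__((packed))

/* Hypothetical copy of the pre-health counters layout, i.e. the struct
 * from the description without the fields marked "+". */
struct lnet_counters_legacy {
	uint32_t msgs_alloc;
	uint32_t msgs_max;
	uint32_t errors;
	uint32_t send_count;
	uint32_t recv_count;
	uint32_t route_count;
	uint32_t drop_count;
	uint64_t send_length;
	uint64_t recv_length;
	uint64_t route_length;
	uint64_t drop_length;
} WIRE_ATTR;

/* Post-health layout, as listed in the description. */
struct lnet_counters {
	uint32_t msgs_alloc;
	uint32_t msgs_max;
	uint32_t rst_alloc;
	uint32_t errors;
	uint32_t send_count;
	uint32_t recv_count;
	uint32_t route_count;
	uint32_t drop_count;
	uint32_t resend_count;
	uint32_t response_timeout_count;
	uint32_t local_interrupt_count;
	uint32_t local_dropped_count;
	uint32_t local_aborted_count;
	uint32_t local_no_route_count;
	uint32_t local_timeout_count;
	uint32_t local_error_count;
	uint32_t remote_dropped_count;
	uint32_t remote_error_count;
	uint32_t remote_timeout_count;
	uint32_t network_timeout_count;
	uint64_t send_length;
	uint64_t recv_length;
	uint64_t route_length;
	uint64_t drop_length;
} WIRE_ATTR;

/* The legacy struct is what older peers expect on the wire:
 * 7 x 4-byte counters + 4 x 8-byte lengths = 60 bytes when packed. */
static_assert(sizeof(struct lnet_counters_legacy) == 60,
	      "legacy counters must keep the pre-health wire size");

/* Hypothetical translation helper: copy only the fields that exist in
 * the old layout; the health-related counters are dropped. */
static inline void
lnet_counters_to_legacy(const struct lnet_counters *src,
			struct lnet_counters_legacy *dst)
{
	memset(dst, 0, sizeof(*dst));
	dst->msgs_alloc   = src->msgs_alloc;
	dst->msgs_max     = src->msgs_max;
	dst->errors       = src->errors;
	dst->send_count   = src->send_count;
	dst->recv_count   = src->recv_count;
	dst->route_count  = src->route_count;
	dst->drop_count   = src->drop_count;
	dst->send_length  = src->send_length;
	dst->recv_length  = src->recv_length;
	dst->route_length = src->route_length;
	dst->drop_length  = src->drop_length;
}
```

Under this sketch, struct srpc_stat_reply would carry the legacy struct in str_lnet, and selftest would call the translation helper when filling in a stat reply, so the reply keeps the size and field offsets that pre-2.12 peers expect.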
              People

              • Assignee:
                sharmaso Sonia Sharma (Inactive)
                Reporter:
                sharmaso Sonia Sharma (Inactive)