Test Plan for the "suppress_pings" ptlrpc Module Parameter 1. Introduction In Lustre 2.4, a new ptlrpc module parameter, "suppress_pings", is introduced to provide an option for reducing excessive OBD_PING messages in large clusters. The parameter is a switch and affects all MDTs and OSTs on a node. (MGS pings can not be suppressed.) By default, it is off (zero), giving a behavior identical to previous implementations. If it is on (non-zero), all clients of the affected targets who understand OBD_CONNECT_PINGLESS will know, at connect time, that pings are not required and will suppress keep-alive pings. In production environments, when suppressing pings, there must be an external mechanism to notify the targets of client deaths, via the targets' "evict_client" procfs entries. In addition, a highly available standalone MGS is also recommended when suppressing pings, so that clients are notified (through Imperative Recovery) of target recoveries. 2. Test Cases The following configurations will be referenced by later sections: Normal. All servers do not have "suppress_pings" set to any value. No other requirements. For example, either a combined or standalone MGS will do. Suppressed. All servers have "suppress_pings" set to "1" or any non-zero value. The MGS is standalone. 2.1. Pings not suppressed by default On a "normal" cluster, stop all workloads and check the numbers of "obd_ping" (or "ping", on the OSSs) samples by reading the following procfs files: MGS: /proc/fs/lustre/mgs/MGS/mgs/stats MDSs: /proc/fs/lustre/mds/MDS/mdt/stats OSSs: /proc/fs/lustre/obdfilter/*/stats Wait a while (e.g., an obd_timeout) and check the numbers again. All the numbers should have grown larger. 2.2. Pings suppressed on request On a "suppressed" cluster, stop all workloads, wait a minute (for all the transactions to be committed and for all the clients to learn that fact), check the same procfs files as in section 2.1, wait a while (e.g., an obd_timeout), and check the numbers again. 
The number on the MGS should have grown, while all the other numbers should
have remained constant.

2.3. Clients notified of target recoveries in the absence of pings

On a "suppressed" cluster, stop all workloads, wait a minute (for all
transactions to be committed and for all clients to learn that fact), and
restart an OST. Monitor the corresponding OSC states on the clients by
reading this procfs file:

  /proc/fs/lustre/osc//state

All OSCs should eventually be notified of the OST restart (i.e., turn into
"DISCONN") and reconnect (i.e., then move through the recovery states into
"FULL"). Do the same to an MDT and check the MDC states on the clients by
reading this procfs file:

  /proc/fs/lustre/mdc//state

All MDCs should behave in the same way as the OSCs.

2.4. Pings unsuppressible when uncommitted requests exist

On a "suppressed" cluster, stop all workloads, wait a minute (for all
transactions to be committed and for all clients to learn that fact), and
check the same procfs files as in section 2.1. Create a file and get the FIDs
and versions of its MDT object and all OST objects with the following
commands:

  Client: lfs setstripe --count=-1
  Client: dd if=/dev/zero of= bs=1M count= oflag=sync
  Client: lfs getstripe
  Client: lfs path2fid
  MDS:    lctl --device getobjversion
  OSSs:   lctl --device getobjversion -i -g (LU-2783)
  Client: echo -n >

Check the "peer_committed" transaction numbers in these procfs files on the
clients:

  /proc/fs/lustre/mdc//import
  /proc/fs/lustre/osc/*/import

The transaction numbers should eventually grow to the corresponding versions,
which are transaction numbers themselves. Check the same procfs files as in
section 2.1 again. The numbers of pings should have grown.

2.5. Normal workloads not affected with pings suppressed

On a "suppressed" cluster, run the usual benchmarks (i.e., the "IO section"
of SWL) with the same parameters as in the release test plan. No errors,
OOMs, or panics should occur.
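The "peer_committed should eventually grow to the object version" check in section 2.4 can be sketched as a polling loop. This is an illustrative sketch only: the `wait_committed` helper name is hypothetical, the field name "peer_committed" is taken from the test plan, and the sample import-file layout below is an assumption about the general shape of the procfs `import` output, not its exact format.

```shell
# Hypothetical helper: poll an import file until its peer_committed
# transaction number reaches the given object version, then print it.
wait_committed() {
    local import_file=$1 version=$2 committed
    while :; do
        # Assumed layout: a "peer_committed:" line with the transno after it.
        committed=$(awk -F: '/peer_committed/ { gsub(/[ ,]/, "", $2); print $2; exit }' "$import_file")
        [ "${committed:-0}" -ge "$version" ] && break
        sleep 1
    done
    echo "$committed"
}

# Illustrative sample of an import file (contents are made up for the demo).
cat > /tmp/import.sample <<'EOF'
import:
    transactions:
       last_replay: 0
       peer_committed: 8589934617
EOF

wait_committed /tmp/import.sample 8589934617
```

In the real test, the loop would run against `/proc/fs/lustre/mdc//import` and `/proc/fs/lustre/osc/*/import` on the clients, with the version argument taken from the `lctl getobjversion` output.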