LU-5409

add OBD_FAULT_CHECK

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.7.0
    • None
    • 15050

    Description

      It would be good to have a variant of the existing OBD_FAIL_CHECK() macro designed specifically for randomized fault injection. What I want may be achievable with what we have now, but I suspect it is not. One issue is that not all OBD_FAIL_XXX locations are suitable for randomized fault injection, so we cannot simply grep for OBD_FAIL_XXX in obd_support.h and inject each one in turn during various workloads.

      Here's what I want:

      1. OBD_FAULT_CHECK() should accept existing fail locs.
      2. If OBD_FAIL_CHECK(loc) returns true then so should OBD_FAULT_CHECK(loc).
      3. If a given location is deemed good for randomized fault injection then we just replace OBD_FAIL_CHECK() with OBD_FAULT_CHECK() and we're good.
      4. OBD_FAULT_CHECK() should also be triggered by setting CFS_FAULT (0x02000000) in cfs_fail_loc. This allows randomly triggering any site that uses OBD_FAULT_CHECK().
      5. We should expect (enforce) that triggered OBD_FAULT_CHECKs be recovered from.
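
      To make requirements 1 through 4 concrete, here is a minimal userspace sketch of the intended semantics. It is a model, not the Lustre implementation: every demo_* and DEMO_* name is hypothetical, the low-16-bit site mask is an assumption, and only the 0x02000000 value of CFS_FAULT comes from this ticket. The randomization itself (how often an armed site actually fires) is left out.

```c
#include <stdbool.h>

/* Assumption: the low 16 bits of the fail_loc word identify the site. */
#define DEMO_FAIL_MASK_LOC 0x0000ffffUL
/* Flag value proposed in this ticket for CFS_FAULT. */
#define DEMO_FAULT         0x02000000UL

/* Stand-in for cfs_fail_loc. */
static unsigned long demo_fail_loc;

/* Models OBD_FAIL_CHECK(): true only when this exact site is armed. */
static bool demo_fail_check(unsigned long id)
{
        unsigned long armed = demo_fail_loc & DEMO_FAIL_MASK_LOC;

        return armed != 0 && armed == (id & DEMO_FAIL_MASK_LOC);
}

/*
 * Models OBD_FAULT_CHECK(): true whenever the plain check is true
 * (requirement 2), and also whenever the global fault flag is set,
 * whichever site is asking (requirement 4).
 */
static bool demo_fault_check(unsigned long id)
{
        return demo_fail_check(id) || (demo_fail_loc & DEMO_FAULT) != 0;
}
```

      With demo_fail_loc set to a single site id, both checks fire only at that site. With only DEMO_FAULT set, demo_fail_check() stays false everywhere while demo_fault_check() fires at every converted site.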

      It may be worthwhile to add a cfs_fail_err for use with OBD_{FAIL,FAULT}_CHECK().

      extern long cfs_fail_err;
      
      int dt_declare_bankruptcy(const struct lu_env *env, ...)
      {
              if (OBD_FAULT_CHECK(OBD_FAIL_DT_DECLARE_BANKRUPTCY))
                      RETURN(cfs_fail_err);
      
              ...
      }
      
      void *lu_alloc_gater(const struct lu_env *env, ...)
      {
              if (OBD_FAULT_CHECK(OBD_FAIL_LU_ALLOC_GATER))
                      RETURN(ERR_PTR(cfs_fail_err));
      
              ...
      }
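
      For point 5, the caller's side of the cfs_fail_err idea can be modelled in userspace as below. This is only a sketch under stated assumptions: the demo_* helpers are hypothetical stand-ins that imitate the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), demo_fault_armed stands in for a triggered OBD_FAULT_CHECK(), and demo_fail_err stands in for the proposed cfs_fail_err.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

static long demo_fail_err = -ENOMEM;  /* stand-in for cfs_fail_err */
static bool demo_fault_armed;         /* stand-in for a triggered check */

/* Userspace imitations of the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers. */
static void *demo_err_ptr(long err)
{
        return (void *)(intptr_t)err;
}

static bool demo_is_err(const void *p)
{
        return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static long demo_ptr_err(const void *p)
{
        return (long)(intptr_t)p;
}

/* Fault site: returns the configured error when the fault is armed. */
static void *demo_alloc_gater(size_t size)
{
        if (demo_fault_armed)
                return demo_err_ptr(demo_fail_err);

        return malloc(size);
}

/* Caller: must recover cleanly from the injected error. */
static int demo_use_gater(void)
{
        void *buf = demo_alloc_gater(64);

        if (demo_is_err(buf))
                return (int)demo_ptr_err(buf);
        if (buf == NULL)
                return -ENOMEM;

        free(buf);
        return 0;
}
```

      The injected error propagates to the caller as an ordinary negative errno, so the caller cannot tell an injected fault from a real failure and has to handle both the same way.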
      

      I welcome any suggestions here.

            People

              jhammond John Hammond