Lustre / LU-5604

Many FAIL_ID checks have been lost


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.7.0, Lustre 2.13.0
    • Lustre 2.6.0, Lustre 2.7.0
    • 3
    • 15676

    Description

      It looks to me that many FAIL_ID checks have been lost over time. Take replay-single.sh as an example:

      • test_73c() tests OBD_FAIL_TGT_LAST_REPLAY, but this FAIL_ID has never been checked in Lustre code since the day it was introduced.
      • test_73b() tests OBD_FAIL_LDLM_REPLY, but this FAIL_ID is now checked only in mdt_reint_open(); I think it should be checked for every lock enqueue as well.
      • test_73a() tests OBD_FAIL_LDLM_ENQUEUE_NET, but this FAIL_ID is no longer checked in Lustre code.
      • test_80c() tests OBD_FAIL_UPDATE_OBJ_NET_REP, but this FAIL_ID has been removed from Lustre code.
      • test_83a() tests OBD_FAIL_MDS_FAIL_LOV_LOG_ADD, but this FAIL_ID isn't checked in Lustre code.
        ...

      To make sure the error injection tests work as expected, I think we should go through all the fail IDs and add back all the missing FAIL_ID checks. If a FAIL_ID is already obsolete, we should remove or improve the corresponding test case.
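
      One way to start such an audit is a small script that lists every fail ID defined in a header and reports the ones that no source file references. The following is only a sketch under stated assumptions: it assumes the fail IDs are declared as plain `#define OBD_FAIL_* <value>` lines, the function names `defined_fail_ids` and `unchecked_fail_ids` are hypothetical, and the numeric values and file name in the usage example are placeholders rather than the real ones.

```python
import re

# Matches object-like macros such as "#define OBD_FAIL_LDLM_REPLY 0x30c".
# Function-like macros (a name followed directly by "(") do not match.
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(OBD_FAIL_\w+)\s+\S+", re.MULTILINE)


def defined_fail_ids(header_text):
    """Return the OBD_FAIL_* names defined in the header text, in order."""
    return DEFINE_RE.findall(header_text)


def unchecked_fail_ids(header_text, sources):
    """Return the fail IDs that none of the given sources reference.

    `sources` maps a file name to its contents. The header that defines
    the IDs should be left out of `sources`, otherwise every ID would
    count as referenced by its own definition.
    """
    unchecked = []
    for name in defined_fail_ids(header_text):
        # Whole-word match, so FOO_REPLY does not match FOO_REPLY_NET.
        word = re.compile(r"\b" + re.escape(name) + r"\b")
        if not any(word.search(text) for text in sources.values()):
            unchecked.append(name)
    return unchecked


if __name__ == "__main__":
    # Placeholder values and file name, for illustration only.
    header = (
        "#define OBD_FAIL_LDLM_REPLY 0x1\n"
        "#define OBD_FAIL_TGT_LAST_REPLAY 0x2\n"
    )
    sources = {
        "example.c": "if (OBD_FAIL_CHECK(OBD_FAIL_LDLM_REPLY))\n\treturn 0;\n",
    }
    for name in unchecked_fail_ids(header, sources):
        print(name)
```

      A plain text search like this can only flag candidates. Mask or helper macros that share the OBD_FAIL_ prefix would be listed too, and an ID that appears only in a comment would be missed, so each reported ID still needs a manual look before the check is restored or the test case is removed.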


            People

              tappro Mikhail Pershin
              niu Niu Yawei (Inactive)