  1. Lustre
  2. LU-3647 HSM _not only_ small fixes and to do list goes here
  3. LU-3685

some paths in ll_ioc_copy_{start,end} set hpk_errval non-zero but don't set HP_FLAG_COMPLETED

Details

    • Type: Technical task
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.5.0
    • Affects Version/s: Lustre 2.5.0
    • 9517

    Description

      Running racer with HSM operations, I see messages of the form:

      LustreError: 4158:0:(ldlm_resource.c:1188:ldlm_resource_get()) lustre-OST0001: lvbo_init failed for resource 0x1936:0x0: rc = -2
      LustreError: 11-0: lustre-OST0001-osc-ffff8801f01ff000: Communicating with 0@lo, operation ost_getattr failed with -12.
      LustreError: 4140:0:(mdt_coordinator.c:1500:mdt_hsm_update_request_state()) lustre-MDT0000: Progress on [0x200000401:0x972f:0x0] for cookie 0x51faf3c6 action=ARCHIVE is not coherent (err=12 and not completed (flags=2))
      

      after which the coordinator just stops sending actions to the copytool.

      The coordinator seems to just drop these incoherent progress kernels. Is there a use case for an HPK with hpk_errval != 0 that is not marked complete?
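A minimal sketch of the coherence rule implied by the log message above: a progress record may only carry a non-zero error if it is also flagged as completed. Everything prefixed with sk_ or SK_ is hypothetical and exists only for illustration. The struct is a trimmed-down stand-in for the real HPK, and the flag values are assumptions that should be checked against the Lustre headers. This is not the actual coordinator or llite code, only one way to keep the sender and the checker in agreement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values for this sketch only; verify against the Lustre headers. */
#define SK_HP_FLAG_COMPLETED 0x01
#define SK_HP_FLAG_RETRY     0x02

/* Hypothetical, trimmed-down stand-in for an HSM progress record (HPK). */
struct sk_progress {
	uint64_t cookie;
	uint32_t errval;	/* positive errno, 0 on success */
	uint32_t flags;
};

/* Coherent: an error is only ever reported on a completed request. */
static bool sk_progress_is_coherent(const struct sk_progress *p)
{
	return p->errval == 0 || (p->flags & SK_HP_FLAG_COMPLETED) != 0;
}

/*
 * Sender-side helper: recording an error also marks the request completed,
 * so no code path can emit an error without the completed flag.
 */
static void sk_progress_set_error(struct sk_progress *p, int rc)
{
	p->errval = (uint32_t)(rc < 0 ? -rc : rc);
	if (p->errval != 0)
		p->flags |= SK_HP_FLAG_COMPLETED;
}

/* Replays the shape of the logged record: err=12, flags=2, not completed. */
static void sk_selfcheck(void)
{
	struct sk_progress p = { .cookie = 0x51faf3c6, .errval = 12, .flags = 0x02 };

	assert(!sk_progress_is_coherent(&p));	/* what the coordinator rejects */
	sk_progress_set_error(&p, -12);
	assert(sk_progress_is_coherent(&p));	/* error plus completed is accepted */
}
```

Routing every error assignment through a single helper like sk_progress_set_error() means the invariant is enforced where the record is built rather than discovered when the receiver drops it.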

      Do not be distracted by the specific errno here. The node is not really OOM; somewhere in the OST code a NULL something is misinterpreted as meaning -ENOMEM, when it really means -ENOENT or something similar.

          People

            jay Jinshan Xiong (Inactive)
            jhammond John Hammond