Lustre / LU-8010

lfs hsm command hangs up after lfs hsm_cancel

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.9.0
    • Lustre 2.8.0
    • 3
    • HSM

    Description

      After an hsm_restore request is canceled, the next lfs hsm command hangs.
      To reproduce, set up HSM and run the following commands.
      (We found two patterns; they appear to hang at different points.)

      (1)

      # lfs hsm_state /lustre/file
      /lustre/file: (0x0000000d) released exists archived, archive_id:1
      # lfs hsm_restore /lustre/file
      # lfs hsm_cancel /lustre/file
      # lfs hsm_restore /lustre/file (immediately after lfs hsm_cancel)
      # lfs hsm_restore /lustre/file  *hangs
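
When replaying pattern (1), it can help to keep the terminal from blocking on the hung command. The following is a minimal sketch, not part of the original report: `run_with_deadline` and `reproduce_pattern1` are hypothetical helper names, the 60-second deadline is an arbitrary assumption, and GNU coreutils `timeout` is assumed to be installed. If the hung task sits in uninterruptible sleep, even SIGKILL may not end it, and `timeout` itself would then block.

```shell
#!/bin/sh
# Sketch only: run_with_deadline and reproduce_pattern1 are hypothetical
# helper names, not part of Lustre.

# Run "$@" under a deadline (in seconds). Return 0 if the command exited
# in time, 1 if timeout(1) had to stop it.
run_with_deadline() {
    deadline=$1
    shift
    # -k 5: escalate to SIGKILL 5 s after SIGTERM if the command ignores it.
    timeout -k 5 "$deadline" "$@"
    rc=$?
    # GNU timeout exits 124 on expiry, or 128+9=137 if SIGKILL was needed.
    if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
        echo "HUNG (no exit within ${deadline}s): $*"
        return 1
    fi
    echo "finished (rc=$rc): $*"
    return 0
}

# Replay the command sequence of pattern (1) against one released file.
reproduce_pattern1() {
    file=$1
    lfs hsm_state "$file"
    lfs hsm_restore "$file"
    lfs hsm_cancel "$file"
    run_with_deadline 60 lfs hsm_restore "$file"   # right after the cancel
    run_with_deadline 60 lfs hsm_restore "$file"   # the call reported to hang
}
```

On a client with HSM configured, `reproduce_pattern1 /lustre/file` would run the sequence; a `HUNG` line on the last step would match the behavior described here.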
      

      The call trace is as follows:

      PID: 9550   TASK: ffff880c33f86040  CPU: 13  COMMAND: "lfs"
       #0 [ffff880c1ca8f568] schedule at ffffffff81539170
       #1 [ffff880c1ca8f640] schedule_timeout at ffffffff8153a042
       #2 [ffff880c1ca8f6f0] ldlm_completion_ast at ffffffffa0847fc9 [ptlrpc]
       #3 [ffff880c1ca8f7a0] ldlm_cli_enqueue_fini at ffffffffa08420e6 [ptlrpc]
       #4 [ffff880c1ca8f840] ldlm_cli_enqueue at ffffffffa08429a1 [ptlrpc]
       #5 [ffff880c1ca8f8f0] mdc_enqueue at ffffffffa0a6e8aa [mdc]
       #6 [ffff880c1ca8fa40] lmv_enqueue at ffffffffa0a25bfb [lmv]
       #7 [ffff880c1ca8fac0] ll_layout_refresh_locked at ffffffffa0b084f6 [lustre]
       #8 [ffff880c1ca8fc00] ll_layout_refresh at ffffffffa0b09159 [lustre]
       #9 [ffff880c1ca8fc50] vvp_io_init at ffffffffa0b5227f [lustre]
      #10 [ffff880c1ca8fcc0] cl_io_init0 at ffffffffa06a8e78 [obdclass]
      #11 [ffff880c1ca8fd00] cl_io_init at ffffffffa06abdf4 [obdclass]
      #12 [ffff880c1ca8fd40] cl_glimpse_size0 at ffffffffa0b4bc05 [lustre]
      #13 [ffff880c1ca8fda0] ll_getattr at ffffffffa0b07cb8 [lustre]
      #14 [ffff880c1ca8fe40] vfs_getattr at ffffffff81197c61
      #15 [ffff880c1ca8fe80] vfs_fstatat at ffffffff81197cf4
      #16 [ffff880c1ca8fee0] vfs_lstat at ffffffff81197d9e
      #17 [ffff880c1ca8fef0] sys_newlstat at ffffffff81197dc4
      #18 [ffff880c1ca8ff80] tracesys at ffffffff8100b2e8 (via system_call)
      

      (2)

      # lfs hsm_state /lustre/file
      /lustre/file: (0x0000000d) released exists archived, archive_id:1
      # lfs hsm_restore /lustre/file
      <MDS Immediate reset>
      <MDS recover and HSM setup>
      # lfs hsm_action /lustre/file
      /lustre/file: RESTORE waiting (from 0 to EOF)
      # lfs hsm_state /lustre/file
      /lustre/file: (0x0000000d) released exists archived, archive_id:1
      # lfs hsm_cancel /lustre/file
      # lfs hsm_restore /lustre/file   *hangs
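
Both patterns start from a file whose `lfs hsm_state` mask is `0x0000000d`. As a small sketch for checking that precondition in a script, the mask can be decoded bit by bit. `hsm_decode_flags` is a hypothetical helper name; the bit values are my understanding of the `hsm_states` flags in `lustre_user.h`, and only the three bits set in `0xd` are confirmed by the output shown in this report. Higher bits are deliberately not decoded.

```shell
#!/bin/sh
# Sketch only: hsm_decode_flags is a hypothetical helper, not an lfs command.
# Assumed bit values: exists=0x1, dirty=0x2, released=0x4, archived=0x8.

# Print the names of the low four HSM state bits set in the given mask.
hsm_decode_flags() {
    mask=$(( $1 ))          # accepts hex such as 0x0000000d
    out=""
    if [ $(( mask & 0x1 )) -ne 0 ]; then out="$out exists"; fi
    if [ $(( mask & 0x2 )) -ne 0 ]; then out="$out dirty"; fi
    if [ $(( mask & 0x4 )) -ne 0 ]; then out="$out released"; fi
    if [ $(( mask & 0x8 )) -ne 0 ]; then out="$out archived"; fi
    echo "${out# }"         # drop the leading space
}

hsm_decode_flags 0x0000000d   # prints "exists released archived"
```

The names are printed in bit order, which differs from the order `lfs hsm_state` uses in its own output.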
      

      The call trace is as follows:

      PID: 3731   TASK: ffff880c35087520  CPU: 13  COMMAND: "lfs"
       #0 [ffff880c327df598] schedule at ffffffff81539170
       #1 [ffff880c327df670] schedule_timeout at ffffffff8153a042
       #2 [ffff880c327df720] ptlrpc_set_wait at ffffffffa0860c41 [ptlrpc]
       #3 [ffff880c327df7e0] ptlrpc_queue_wait at ffffffffa0861301 [ptlrpc]
       #4 [ffff880c327df800] mdc_ioc_hsm_request at ffffffffa0a65138 [mdc]
       #5 [ffff880c327df830] mdc_iocontrol at ffffffffa0a66df9 [mdc]
       #6 [ffff880c327df950] obd_iocontrol at ffffffffa0a16aa5 [lmv]
       #7 [ffff880c327df9a0] lmv_iocontrol at ffffffffa0a2d9dc [lmv]
       #8 [ffff880c327dfb90] obd_iocontrol at ffffffffa0af2d15 [lustre]
       #9 [ffff880c327dfbe0] ll_dir_ioctl at ffffffffa0af9ae8 [lustre]
      #10 [ffff880c327dfe60] vfs_ioctl at ffffffff811a7972
      #11 [ffff880c327dfea0] do_vfs_ioctl at ffffffff811a7b14
      #12 [ffff880c327dff30] sys_ioctl at ffffffff811a8091
      #13 [ffff880c327dff80] tracesys at ffffffff8100b2e8 (via system_call)
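
The two traces above look like they were taken with a crash-dump tool. On a live Linux client, a rougher way to see where a hung `lfs` process is blocked is to read its entries under `/proc`. This is a sketch under assumptions: `proc_state` and `dump_task` are hypothetical helper names, and `/proc/<pid>/stack` is only available on kernels built with stack tracing and is typically readable by root only.

```shell
#!/bin/sh
# Sketch only: proc_state and dump_task are hypothetical helper names.

# Print the scheduler state of a PID, e.g. "S (sleeping)" or "D (disk sleep)".
proc_state() {
    sed -n 's/^State:[[:space:]]*//p' "/proc/$1/status"
}

# Print the state of a PID and, if permitted, its kernel stack.
dump_task() {
    pid=$1
    echo "pid $pid state: $(proc_state "$pid")"
    if ! cat "/proc/$pid/stack" 2>/dev/null; then
        echo "(kernel stack not readable; try as root)"
    fi
}
```

For example, `for p in $(pgrep -x lfs); do dump_task "$p"; done` would print one entry per running `lfs` process.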
      

          People

            bfaccini Bruno Faccini (Inactive)
            takamura Tatsushi Takamura