Lustre / LU-18246

ZFS OSD - fix for colocated client deadlock

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor

    Description

      Some HSM solutions for Lustre run a client on the MDS node to create directories and files. On small instances under memory pressure, an allocation from ZFS can enter direct (inline) memory reclaim, which evicts Lustre client inodes and calls back into the Lustre mdc APIs; the thread then blocks waiting for an RPC that cannot complete, producing a deadlock. To avoid this, use spl_fstrans_mark() and spl_fstrans_unmark() to disable inline filesystem reclaim around these allocations. This is based on a suggestion from the OpenZFS community: https://github.com/openzfs/zfs/issues/15786

      [<0>] __switch_to+0x80/0xb0
      [<0>] obd_get_mod_rpc_slot+0x310/0x578 [obdclass]
      [<0>] ptlrpc_get_mod_rpc_slot+0x3c/0x60 [ptlrpc]
      [<0>] mdc_close+0x220/0xe5c [mdc]
      [<0>] lmv_close+0x1ac/0x480 [lmv]
      [<0>] ll_close_inode_openhandle+0x398/0xc8c [lustre]
      [<0>] ll_md_real_close+0xa8/0x288 [lustre]
      [<0>] ll_clear_inode+0x1a4/0x7e0 [lustre]
      [<0>] ll_delete_inode+0x74/0x260 [lustre]
      [<0>] evict+0xe0/0x23c
      [<0>] dispose_list+0x5c/0x7c
      [<0>] prune_icache_sb+0x68/0xa0
      [<0>] super_cache_scan+0x158/0x1c0
      [<0>] do_shrink_slab+0x19c/0x360
      [<0>] shrink_slab+0xc0/0x144
      [<0>] shrink_node_memcgs+0x1e4/0x240
      [<0>] shrink_node+0x154/0x5e0
      [<0>] shrink_zones+0x9c/0x220
      [<0>] do_try_to_free_pages+0xb0/0x300
      [<0>] try_to_free_pages+0x128/0x260
      [<0>] __alloc_pages_slowpath.constprop.0+0x3e0/0x7fc
      [<0>] __alloc_pages_nodemask+0x2bc/0x310
      [<0>] alloc_pages_current+0x90/0x148
      [<0>] allocate_slab+0x3cc/0x4f0
      [<0>] new_slab_objects+0xa4/0x164
      [<0>] ___slab_alloc+0x1b8/0x304
      [<0>] __slab_alloc+0x28/0x60
      [<0>] __kmalloc_node+0x140/0x3e0
      [<0>] spl_kmem_alloc_impl+0xd4/0x134 [spl]
      [<0>] spl_kmem_zalloc+0x20/0x38 [spl]
      [<0>] sa_modify_attrs+0xfc/0x368 [zfs]
      [<0>] sa_attr_op+0x144/0x3d4 [zfs]
      [<0>] sa_bulk_update_impl+0x6c/0x110 [zfs]
      [<0>] sa_update+0x8c/0x170 [zfs]
      [<0>] __osd_sa_xattr_update+0x12c/0x270 [osd_zfs]
      [<0>] osd_object_sa_dirty_rele+0x1a0/0x1a4 [osd_zfs]
      [<0>] osd_trans_stop+0x370/0x720 [osd_zfs]
      [<0>] top_trans_stop+0xb4/0x1024 [ptlrpc]
      [<0>] lod_trans_stop+0x70/0x108 [lod]
      [<0>] mdd_trans_stop+0x3c/0x2ec [mdd]
      [<0>] mdd_create_data+0x43c/0x73c [mdd]
      [<0>] mdt_create_data+0x224/0x39c [mdt]
      [<0>] mdt_mfd_open+0x2e0/0xdc0 [mdt]
      [<0>] mdt_finish_open+0x558/0x840 [mdt]
      [<0>] mdt_open_by_fid_lock+0x468/0xb24 [mdt]
      [<0>] mdt_reint_open+0x824/0x1fb8 [mdt]
      [<0>] mdt_reint_rec+0x168/0x300 [mdt]
      [<0>] mdt_reint_internal+0x5e8/0x9e0 [mdt]
      [<0>] mdt_intent_open+0x170/0x43c [mdt]
      [<0>] mdt_intent_opc+0x16c/0x65c [mdt]
      [<0>] mdt_intent_policy+0x234/0x3b8 [mdt]
      [<0>] ldlm_lock_enqueue+0x4b0/0x97c [ptlrpc]
      [<0>] ldlm_handle_enqueue0+0xa20/0x1b2c [ptlrpc]
      [<0>] tgt_enqueue+0x88/0x2d4 [ptlrpc]
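
      The mark/unmark fix works by setting a per-task flag for the duration of the transaction so that allocations made inside the window skip filesystem reclaim (GFP_NOFS semantics), breaking the recursion shown in the trace above. Below is a minimal userspace sketch of that pattern; the real spl_fstrans_mark()/spl_fstrans_unmark() are SPL kernel APIs (returning and consuming a fstrans_cookie_t), and all names here are illustrative, not the actual implementation.

      ```c
      #include <stdio.h>

      /* Illustrative stand-in for the per-task flags word and the
       * PF_FSTRANS bit that the SPL sets on current->flags. */
      #define PF_FSTRANS 0x1u
      static unsigned task_flags;

      /* Mark the start of a filesystem-transaction window. The returned
       * cookie records whether the flag was already set, so nested
       * mark/unmark pairs do not clear an outer window's flag early. */
      static unsigned fstrans_mark(void)
      {
          unsigned cookie = task_flags & PF_FSTRANS;
          task_flags |= PF_FSTRANS;
          return cookie;
      }

      /* End the window: only clear the flag if this was the outermost mark. */
      static void fstrans_unmark(unsigned cookie)
      {
          if (!cookie)
              task_flags &= ~PF_FSTRANS;
      }

      /* Allocation path: under memory pressure, recurse into filesystem
       * reclaim (the evict -> mdc_close path in the trace) only when the
       * flag is NOT set. */
      static int alloc_may_enter_fs_reclaim(void)
      {
          return !(task_flags & PF_FSTRANS);
      }

      int main(void)
      {
          printf("outside window: fs reclaim allowed = %d\n",
                 alloc_may_enter_fs_reclaim());

          unsigned cookie = fstrans_mark();
          printf("inside window:  fs reclaim allowed = %d\n",
                 alloc_may_enter_fs_reclaim());
          fstrans_unmark(cookie);

          printf("after unmark:   fs reclaim allowed = %d\n",
                 alloc_may_enter_fs_reclaim());
          return 0;
      }
      ```

      In the real fix, the mark/unmark pair would bracket the ZFS allocation paths reached from the OSD (e.g. the sa_update() call under osd_trans_stop() in the trace), so spl_kmem allocations there can no longer recurse into inode eviction.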
      

          People

            wc-triage WC Triage
            timday Tim Day