Lustre / LU-1248

all mdt_rdpg_* threads busy in osd_ea_fid_get()



    Description

      The load average on the MDS for a classified production 2.1 filesystem jumped to over 400. top showed all of the mdt_rdpg_* threads busy, each using 4-7% CPU time. This may have been due to a pathological workload, but we are wondering whether something like an overly contended lock in ldiskfs is involved.

      Most of the stacks looked like this:

      __cond_resched
      _cond_resched
      ifind_fast
      iget_locked
      ldiskfs_iget
      ? generic_detach_inode
      osd_iget
      osd_ea_fid_get
      osd_it_ea_rec
      mdd_readpage
      cml_readpage
      mdt_readpage
      ? mdt_unpack_req_pack_rep
      mdt_handle_common
      ? lustre_msg_get_transno
      mdt_readpage_handle
      ptlrpc_main
      child_rip
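
      To back up a statement like "most of the stacks looked like this", it can help to count how many threads share an identical stack. The following is a minimal sketch, not a tool used in this ticket. It assumes a hypothetical input format matching the excerpt above: bare function names, one per line, with stacks separated by blank lines. Real kernel dumps also carry addresses and offsets that would need stripping first. The helper names `normalize_stack` and `count_stacks` are invented for illustration. Frames printed with a leading "?" are dropped, since the kernel marks them as unreliable.

```python
from collections import Counter
from typing import Tuple


def normalize_stack(block: str) -> Tuple[str, ...]:
    """Return the reliable frames of one stack dump as a tuple."""
    frames = []
    for line in block.splitlines():
        frame = line.strip()
        # Skip empty lines and frames the kernel flags as unreliable ("? name").
        if not frame or frame.startswith("?"):
            continue
        frames.append(frame)
    return tuple(frames)


def count_stacks(dump: str) -> Counter:
    """Count identical stacks in a dump whose stacks are separated by blank lines."""
    counts: Counter = Counter()
    for block in dump.split("\n\n"):
        stack = normalize_stack(block)
        if stack:
            counts[stack] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample: two threads in the inode lookup path, one elsewhere.
    sample = (
        "ifind_fast\niget_locked\nldiskfs_iget\n? generic_detach_inode\nosd_iget\n"
        "\n"
        "ifind_fast\niget_locked\nldiskfs_iget\nosd_iget\n"
        "\n"
        "schedule\nptlrpc_main\n"
    )
    for stack, n in count_stacks(sample).most_common():
        print(n, " <- ".join(stack))
```

      Sorting the result with `most_common()` puts the dominant call path first, which makes it easy to see whether the threads really are piling up in one place such as the inode lookup under `osd_ea_fid_get`.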


          People

            laisiyao Lai Siyao
            nedbass Ned Bass (Inactive)
