Lustre / LU-12504

Lustre stalls with "slow creates" on disabled OST

Details

    • Question/Request
    • Resolution: Fixed
    • Minor
    • None
    • Lustre 2.5.5
    • None
    • 9223372036854775807

    Description

      Greetings,

      An OST on our Lustre 2.5.5 system was physically damaged recently. We were able to deactivate new file creation on the OST from the MDS (using lctl --device data-OST0036-osc-MDT0000 deactivate) and lfs_migrate the data off, but there were still quota problems when contacting the damaged OST. So we tried to disable the OST from the client side as well.

      That worked, but the MDS now logs stray warnings of “slow creates” to this supposedly disabled OST, and file creates on the filesystem have become very slow:

      Jul 2 08:40:21 mds1 kernel: Lustre: data-OST0036-osc-MDT0000: slow creates, last=[0x100360000:0xe4f61:0x0], next=[0x100360000:0xe4f61:0x0], reserved=0, syn_changes=0, syn_rpc_in_progress=0, status=-19
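
A minimal sketch for pulling the fields out of a warning like the one above. The helper names slow_create_field and status_name are hypothetical, not part of Lustre. It assumes the status value is a negated Linux errno, which is the usual kernel convention, so -19 would correspond to ENODEV ("No such device"). It also compares last and next; reading equal values as "no precreated objects available" is an interpretation, not something the log states.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key=value field from a
# "slow creates" console line. Assumes values contain no commas or spaces.
slow_create_field() {
    # $1 = full log line, $2 = field name (e.g. last, next, status)
    printf '%s\n' "$1" | sed -n "s/.*[ ,]$2=\([^, ]*\).*/\1/p"
}

# Hypothetical helper: name a few negated Linux errno values.
status_name() {
    case "$1" in
        0)    echo "OK" ;;
        -5)   echo "EIO" ;;
        -19)  echo "ENODEV" ;;
        -107) echo "ENOTCONN" ;;
        -110) echo "ETIMEDOUT" ;;
        *)    echo "unknown" ;;
    esac
}

# The warning exactly as it appears in the MDS log.
line='Jul 2 08:40:21 mds1 kernel: Lustre: data-OST0036-osc-MDT0000: slow creates, last=[0x100360000:0xe4f61:0x0], next=[0x100360000:0xe4f61:0x0], reserved=0, syn_changes=0, syn_rpc_in_progress=0, status=-19'

last=$(slow_create_field "$line" last)
next=$(slow_create_field "$line" next)
status=$(slow_create_field "$line" status)

echo "status: $status ($(status_name "$status"))"
if [ "$last" = "$next" ]; then
    echo "last and next are identical: $last"
fi
```

Running the same extraction over successive warnings is one way to confirm that the last and next values never advance.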

      We have tried all of the following on the MDS to fix this:

      lctl --device data-OST0036-osc-MDT0000 deactivate
      lctl conf_param data-OST0036-osc-MDT0000.osc.active=0
      lctl conf_param data-OST0036.osc.active=0
      lctl set_param osp.data-OST0036-osc-MDT0000.active=0
      lctl set_param osp.data-OST0036-*.max_create_count=0

      On clients, the OST is disabled, and the logs show “Lustre: setting import data-OST0036_UUID INACTIVE by administrator request”:

      client$ lctl get_param osc.*-OST0036*.active
      osc.data-OST0036-osc-ffff882023331800.active=0

      The MDS also believes this OST is inactive:

      mds$ cat /proc/fs/lustre/osp/data-OST0036-osc-MDT0000/active
      0
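
The check above can be wrapped in a small function so it is easy to repeat for other devices. This is only a sketch: osp_active_state is a hypothetical name, and the optional second argument (an alternate parameter root) is an assumption added so the function can be tried against a scratch directory. It reads the same active file shown above and does nothing else.

```shell
#!/bin/sh
# Hypothetical helper: report whether the MDS-side OSP device for an OST
# is marked active, based on the "active" file shown above.
# $1 = OSP device name, e.g. data-OST0036-osc-MDT0000
# $2 = parameter root (assumed default: /proc/fs/lustre)
osp_active_state() {
    f="${2:-/proc/fs/lustre}/osp/$1/active"
    if [ ! -r "$f" ]; then
        echo "missing"
        return 1
    fi
    case "$(cat "$f")" in
        0) echo "inactive" ;;
        1) echo "active" ;;
        *) echo "unknown" ;;
    esac
}
```

On the MDS this would be called as osp_active_state data-OST0036-osc-MDT0000. A result of "inactive" only confirms the administrative flag; as described here, the slow creates warnings can continue even when the flag is 0.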

      However, the slow creates message persists on the MDS, about once every 10 minutes, always with the same “last” and “next” IDs. Is there something we have missed, or some other way this should have been handled to permanently remove this OST?

      We have not yet tried standing up a new OST at the same index, or restarting the MDS.

      (Update: Standing up a new OST to replace the defunct blank one, and setting it back to active, cleaned this up. It still would be nice to know the proper way to handle this situation, though.)

      Thanks for any advice you may have,

      Chris

       

          People

            hannac Chris Hanna