  1. Lustre
  2. LU-2585

Some dd threads cannot be stopped after racer

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.4.0
    • None
    • 3
    • 6026

    Description

      After racer finished, I saw that some dd threads could not be stopped:

      17:10:58:Stopping client client-32vm1.lab.whamcloud.com /mnt/lustre2 opts:
      17:10:59:COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
      17:10:59:dd 1756 root 1w REG 1273,181606 49091584 144115205306079415 /mnt/lustre2/racer/19
      17:11:01:dd 9219 root 1w REG 1273,181606 8193024 144115205306056725 /mnt/lustre/racer/15
      17:11:01:dd 9224 root 1w REG 1273,181606 8193024 144115205306056725 /mnt/lustre2/racer/15
      17:11:02:dd 9225 root 1w REG 1273,181606 8193024 144115205306056725 /mnt/lustre2/racer/15
      17:11:02:dd 11097 root 1w REG 1273,181606 245867520 144115205255725671 /mnt/lustre/racer/13
      17:11:02:Stopping client client-32vm2.lab.whamcloud.com /mnt/lustre2 opts:
      17:11:02:/mnt/lustre2 is still busy, wait one second
      17:11:02:COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
      17:11:02:dd 12954 root 1w REG 1273,181606 119452672 144115205255740618 /mnt/lustre/racer/11
      17:11:03:dd 13442 root 1w REG 1273,181606 116897792 144115205289288929 /mnt/lustre2/racer/0
      17:11:03:dd 18335 root 1w REG 1273,181606 245867520 144115205255725671 /mnt/lustre/racer/10
      17:11:03:dd 22305 root 1w REG 1273,181606 65881088 144115205272518234 /mnt/lustre2/racer/15 (deleted)
      17:11:03:/mnt/lustre2 is still busy, wait one second
      17:11:03:/mnt/lustre2 is still busy, wait one second
      17:11:03:/mnt/lustre2 is still busy, wait one second
      17:11:03:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second
      17:11:05:/mnt/lustre2 is still busy, wait one second

      .....

      When I investigated the logs for this problem before, these threads were issuing single-page RPCs to flush dirty data to the server. Unfortunately, I do not have a debug log right now.
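
To make a log like the one above easier to triage, the following is a minimal Python sketch that groups the timestamped lsof rows by mount point, so it is clear which PIDs keep each mount busy. The function name `open_files_by_mount` is hypothetical, and the sketch assumes that the mount point is the first two path components (for example `/mnt/lustre2`) and that each row follows the `HH:MM:SS:` plus lsof column layout shown in the excerpt.

```python
import re
from collections import defaultdict

# One timestamped lsof row as it appears in the excerpt, e.g.
# "17:10:59:dd 1756 root 1w REG 1273,181606 49091584 1441... /mnt/lustre2/racer/19"
# Header rows and "still busy" messages do not match because PID must be numeric.
ROW = re.compile(
    r"^\s*(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?P<command>\S+)\s+(?P<pid>\d+)\s+(?P<user>\S+)\s+(?P<fd>\S+)\s+"
    r"(?P<type>\S+)\s+(?P<device>\S+)\s+(?P<size>\d+)\s+(?P<node>\d+)\s+"
    r"(?P<name>/\S+)(?P<deleted>\s+\(deleted\))?\s*$"
)


def open_files_by_mount(log_text):
    """Return {mount: [(pid, command, path, deleted), ...]} for lsof rows.

    Assumption (not from the ticket): the mount point is the first two
    components of the path, e.g. "/mnt/lustre2/racer/19" -> "/mnt/lustre2".
    """
    holders = defaultdict(list)
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # skip headers, status messages, and blank lines
        path = match.group("name")
        mount = "/" + "/".join(path.split("/")[1:3])
        holders[mount].append(
            (
                int(match.group("pid")),
                match.group("command"),
                path,
                match.group("deleted") is not None,
            )
        )
    return dict(holders)
```

Calling `open_files_by_mount(log_text)` on the captured test output returns one list per mount, which can then be compared against the mount that reports "still busy".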

          People

            wc-triage WC Triage
            di.wang Di Wang