Lustre / LU-4733

All MDT threads stuck in cfs_waitq_wait

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • None
    • Lustre 2.1.6
    • None
    • 3
    • 13008

    Description

      Hi,

      We are seeing all MDT threads on the MDS stuck in "cfs_waitq_wait". At the same time, we are seeing a high rate of RPC requests (15k/s).
      Looking closely at the console and the 'bt' output from crash, we can see that those threads come from qos_statfs_update(), where they block in l_wait_event() and never wake up.

      What is strange is that cfs_time_beforeq_64(max_age, obd->obd_osfs_age) should be true.
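
      For readers unfamiliar with the check mentioned above, here is a minimal sketch of what such a freshness test looks like. It assumes cfs_time_beforeq_64(a, b) has the usual semantics of a wraparound-safe "a is before or equal to b" comparison on 64-bit timestamps. The names time_beforeq_64() and statfs_is_fresh() are hypothetical stand-ins written for illustration, not the Lustre source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for cfs_time_beforeq_64(): returns true when
 * timestamp a is before or equal to timestamp b. The difference is
 * evaluated as a signed value so the comparison still holds if the
 * counter wraps around.
 */
static bool time_beforeq_64(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) >= 0;
}

/*
 * Hypothetical helper: the cached statfs data counts as fresh when it
 * was last updated at or after the oldest acceptable time (max_age).
 * Here osfs_age plays the role of obd->obd_osfs_age.
 */
static bool statfs_is_fresh(uint64_t max_age, uint64_t osfs_age)
{
    return time_beforeq_64(max_age, osfs_age);
}
```

      Under these assumptions, statfs_is_fresh(100, 150) is true (data updated after the cutoff) and statfs_is_fresh(150, 100) is false (data older than the cutoff). If the real check evaluates to true, the cached data is recent enough, which is why a thread waiting forever on it looks surprising.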

      This issue was hit 4 times during February.

      Please find attached the dmesg and 'foreach bt' outputs.

      Sebastien.

      Attachments

        1. backtrace-20140407
          1.65 MB
        2. bt-all.txt
          604 kB
        3. dmesg(1).txt
          173 kB
        4. lctl-dk.tgz
          322 kB
        5. log_2014-03-02-03.tgz
          1.36 MB
        6. log_OSS_2014-03-02_04.log2.gz
          105 kB

          People

            niu Niu Yawei (Inactive)
            sebastien.buisson Sebastien Buisson (Inactive)
            Votes: 0
            Watchers: 11
