
OST unavailable due to possible deadlock

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Lustre 2.7.0, Lustre 2.5.3
    • Lustre 1.8.8

    Description

      One OST became unavailable and kept dumping stack traces until its service was taken over by another OSS. This issue occurred a couple of times on different servers.

      After some investigation, we found that a lot of service threads hang at different places. Here is a list of where they are stuck.

      ll_ost_01:10226, ll_ost_07:10232, ll_ost_09:10234, ll_ost_11:10236, ll_ost_13:10238, ll_ost_15:10240, ll_ost_18:10243
      filter_lvbo_init
      --filter_fid2dentry
      ----filter_parent_lock
      ------filter_lock_dentry
      --------LOCK_INODE_MUTEX(dparent->d_inode);

      ll_ost_06:10231, ll_ost_16:10241, ll_ost_484, ll_ost_io_129, ll_ost_io_123, ll_ost_383
      fsfilt_ext3_start
      --ext3_journal_start
      ----journal_start
      ------start_this_handle
      --------__jbd2_log_wait_for_space
      ----------mutex_lock(&journal->j_checkpoint_mutex);

      ll_ost_17:10242
      filter_lvbo_init
      --filter_fid2dentry
      ----filter_parent_lock
      ----lookup_one_len
      ------__lookup_hash
      --------inode->i_op->lookup = ext4_lookup
      ----------ext4_iget
      ------------iget_locked
      --------------ifind_fast
      ----------------find_inode_fast
      ------------------__wait_on_freeing_inode
      --------------------? ldiskfs_bread ... (child dentry's inode, I_LOCK)

      ll_ost_io_15
      ost_brw_write
      --filter_commitrw_write
      ----fsfilt_ext3_commit_wait
      ------autoremove_wake_function
      --------fsfilt_log_wait_commit = jbd2_log_wait_commit

      We think this is not necessarily a problem in the Lustre code. We also found a recently merged kernel patch which fixes a similar deadlock problem in __jbd2_log_wait_for_space(). Maybe it is the root cause? A simplified sketch of the affected code path follows below.

      https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/fs/jbd2/checkpoint.c?id=0ef54180e0187117062939202b96faf04c8673bc
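
      For context, here is a simplified sketch of the jbd2 path those threads are blocked in (identifiers follow the mainline jbd2 code; this illustrates the pattern, not the exact source of this kernel). Every thread that finds the journal short of space drops j_state_lock and then serializes on j_checkpoint_mutex, so a checkpoint that cannot make progress stalls all of them at once:

      /* Simplified illustration only, not the literal jbd2 source. */
      void __jbd2_log_wait_for_space(journal_t *journal)
      {
              int nblocks = journal->j_max_transaction_buffers;

              while (__jbd2_log_space_left(journal) < nblocks) {
                      spin_unlock(&journal->j_state_lock);

                      /* the mutex_lock() seen in the stack traces above;
                       * every waiter queues up here */
                      mutex_lock(&journal->j_checkpoint_mutex);

                      /* checkpoint old transactions to reclaim log space;
                       * if this cannot make progress, nobody gets past here */
                      jbd2_log_do_checkpoint(journal);

                      mutex_unlock(&journal->j_checkpoint_mutex);
                      spin_lock(&journal->j_state_lock);
              }
      }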

      Attachments

        Activity

          [LU-4090] OST unavailable due to possible deadlock
          bobijam Zhenyu Xu added a comment -

          I deleted the old patch and updated it.

          bobijam Zhenyu Xu added a comment -

          I found the following backtrace in ALPL202.messages_20150518.txt:

          May 17 11:57:05 ALPL202 kernel: INFO: task ll_ost_io_144:20839 blocked for more than 120 seconds.
          May 17 11:57:05 ALPL202 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          May 17 11:57:05 ALPL202 kernel: ll_ost_io_144 D ffff810c3e08f7a0     0 20839      1         20840 20838 (L-TLB)
          May 17 11:57:05 ALPL202 kernel:  ffff810911f63800 0000000000000046 ffff81034d1e1460 ffff8105ee5276c0
          May 17 11:57:05 ALPL202 kernel:  ffff8105ee5276c0 000000000000000a ffff8108d1fee7e0 ffff810c3e08f7a0
          May 17 11:57:05 ALPL202 kernel:  00129c413d2f65fb 000000000071882c ffff8108d1fee9c8 000000035a5a5a5a
          May 17 11:57:05 ALPL202 kernel: Call Trace:
          May 17 11:57:05 ALPL202 kernel:  [<ffffffff8002e4d5>] __wake_up+0x38/0x4f
          May 17 11:57:05 ALPL202 kernel:  [<ffffffff88aa0840>] :jbd2:jbd2_log_wait_commit+0xa3/0xf5
          ...
          

          I checked the Linux kernel patches, and commit 3469a32a1e948c54204b5dd6f7476a7d11349e9e fits the symptom:

              jbd2: don't hold j_state_lock while calling wake_up()
              
              The j_state_lock is one of the hottest locks in the jbd2 layer and
              thus one of its scalability bottlenecks.
              
              We don't need to be holding the j_state_lock while we are calling
              wake_up(&journal->j_wait_commit), so release the lock a little bit
              earlier.
          

          so I included it in this patch; would you mind giving it a try?
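
          For reference, the change the quoted commit message describes amounts to issuing the wake_up() only after j_state_lock has been dropped, roughly as below (an illustrative sketch based on the commit message, not the literal upstream diff):

          /* before: the waker holds the hot j_state_lock across the wake_up() */
          write_lock(&journal->j_state_lock);
          journal->j_commit_request = tid;
          wake_up(&journal->j_wait_commit);
          write_unlock(&journal->j_state_lock);

          /* after: publish the commit request under the lock, wake up outside it */
          write_lock(&journal->j_state_lock);
          journal->j_commit_request = tid;
          write_unlock(&journal->j_state_lock);
          wake_up(&journal->j_wait_commit);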


          wangshilong Wang Shilong (Inactive) added a comment -

          Hi,

          It is RHEL5 and the kernel version is 2.6.18-348.3.1-xxx

          Regards,

          bobijam Zhenyu Xu added a comment -

          What Linux distro does the OST use?


          wangshilong Wang Shilong (Inactive) added a comment -

          Hi,

          After applying the following patch:
          https://jira.hpdd.intel.com/secure/attachment/17458/0001-LU-4090-fsfilt-don-t-wait-forever-for-stale-tid.patch

          We still hit this issue; please check the latest uploaded logs (xxx20150518.txt). I am also wondering whether the following thread is related to this bug?

          https://lkml.org/lkml/2012/7/11/255

          bobijam Zhenyu Xu added a comment -

          I think what you described could be the reason some processes hang waiting for transaction commit. Would you give this patch a try? It changes fsfilt_ext3_commit_async() to avoid calling jbd2_log_start_commit() with a stale tid.
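
          A minimal sketch of the kind of guard meant here, assuming it takes roughly this shape (hypothetical illustration, not the literal contents of the attached patch): the transaction's tid is captured before the handle is stopped, and a commit is only requested if that tid has not already committed, so jbd2 is never handed a stale tid.

          /* Hypothetical sketch only; not the literal attached patch. */
          static int fsfilt_ext3_commit_async(struct inode *inode, void *h,
                                              void **wait_handle)
          {
                  handle_t  *handle  = h;
                  journal_t *journal = handle->h_transaction->t_journal;
                  tid_t      tid     = handle->h_transaction->t_tid;
                  int        rc;

                  /* capture the tid before the handle (and its transaction
                   * pointer) goes away */
                  rc = ext3_journal_stop(handle);
                  if (rc)
                          return rc;

                  /* only ask jbd2 for a commit if this tid has not already
                   * committed; jbd2_log_start_commit() takes j_state_lock
                   * itself, so a racy read of j_commit_sequence is enough
                   * (worst case is one harmless extra commit request) */
                  if (tid_gt(tid, journal->j_commit_sequence))
                          jbd2_log_start_commit(journal, tid);

                  *wait_handle = (void *)(unsigned long)tid;
                  return 0;
          }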


          lixi Li Xi (Inactive) added a comment -

          Zhenyu, yes, it is.

          bobijam Zhenyu Xu added a comment -

          So the system still uses the 1.8 version?


          People

            Assignee: bobijam Zhenyu Xu
            Reporter: lixi Li Xi (Inactive)
            Votes: 0
            Watchers: 10
