Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6527

Journal commit callback opitmization

Details

    • Improvement
    • Resolution: Fixed
    • Major
    • Lustre 2.9.0
    • None
    • 9223372036854775807

    Description

      Faced with following soft lockup's on mds(mds had internal journal with commit interval 5):

      Feb 18 14:03:28 snx11127n003 kernel: BUG: soft lockup - CPU#2 stuck for 67s! [jbd2/md66-8:132029]
      ...
      Feb 18 14:03:28 snx11127n003 kernel: Pid: 132029, comm: jbd2/md66-8 Not tainted 2.6.32-431.17.1.x2.0.47.x86_64 #1 Intel Corporation S2600JF/S2600JF
      Feb 18 14:03:28 snx11127n003 kernel: RIP: 0010:[<ffffffffa08c9e99>]  [<ffffffffa08c9e99>] ptlrpc_commit_replies+0xb9/0x320 [ptlrpc]
      Feb 18 14:03:28 snx11127n003 kernel: RSP: 0018:ffff880791763c20  EFLAGS: 00000206
      Feb 18 14:03:28 snx11127n003 kernel: RAX: ffff880d260c6030 RBX: ffff880791763c80 RCX: 0000000000000000
      Feb 18 14:03:28 snx11127n003 kernel: RDX: ffff880bc7b01030 RSI: ffff880d1974f880 RDI: ffff881014c32928
      Feb 18 14:03:28 snx11127n003 kernel: RBP: ffffffff8100bb8e R08: 0000000000000002 R09: 5a5a5a5a5a5a5a5a
      Feb 18 14:03:28 snx11127n003 kernel: R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: ffff88083fcb02c0
      Feb 18 14:03:28 snx11127n003 kernel: R13: ffff88083febc140 R14: ffff88083febccc0 R15: 000001b100000000
      Feb 18 14:03:28 snx11127n003 kernel: FS:  0000000000000000(0000) GS:ffff880044640000(0000) knlGS:0000000000000000
      Feb 18 14:03:28 snx11127n003 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      Feb 18 14:03:28 snx11127n003 kernel: CR2: 00007fb86046c518 CR3: 0000000001a85000 CR4: 00000000000407e0
      Feb 18 14:03:28 snx11127n003 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      Feb 18 14:03:28 snx11127n003 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Feb 18 14:03:28 snx11127n003 kernel: Process jbd2/md66-8 (pid: 132029, threadinfo ffff880791762000, task ffff8807ef3e4ae0)
      Feb 18 14:03:28 snx11127n003 kernel: Stack:
      Feb 18 14:03:28 snx11127n003 kernel:  0000000000000018 ffff881014c32928 ffff880791763c30 ffff880791763c30
      Feb 18 14:03:28 snx11127n003 kernel: <d> 0000000000000000 0000000000000000 ffff880791763c80 ffff880bcc9fe240
      Feb 18 14:03:28 snx11127n003 kernel: <d> 0000000000000000 ffff8807d7ac0000 ffff880d1974f900 ffff880d1974f900
      Feb 18 14:03:28 snx11127n003 kernel: Call Trace:
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa0914258>] ? tgt_cb_last_committed+0x298/0x410 [ptlrpc]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa0f6dba4>] ? osd_trans_commit_cb+0xb4/0x2b0 [osd_ldiskfs]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa0f1c9ba>] ? ldiskfs_journal_commit_callback+0x8a/0xc0 [ldiskfs]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa03df8ef>] ? jbd2_journal_commit_transaction+0x116f/0x15a0 [jbd2]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff81084a8b>] ? try_to_del_timer_sync+0x7b/0xe0
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa03e4c08>] ? kjournald2+0xb8/0x220 [jbd2]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff8109b010>] ? autoremove_wake_function+0x0/0x40
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffffa03e4b50>] ? kjournald2+0x0/0x220 [jbd2]
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff8109ac66>] ? kthread+0x96/0xa0
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff8109abd0>] ? kthread+0x0/0xa0
      Feb 18 14:03:28 snx11127n003 kernel:  [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      Analysis shows that there where about 28 millions jce:

      crash> transaction_t 0xffff880c0331ccc0 | grep handle_c
        t_handle_count = 28881407,

      When kjournald2 is busy, it can't mark running transaction as T_LOCKED, so ldiskfs writers may open transaction handles , add blocks to the transaction and add transaction commit hooks, making the commit / checkpont of the transaction even more complex and time consuming.

      Attachments

        Activity

          [LU-6527] Journal commit callback opitmization
          ys Yang Sheng added a comment -

          Patch landed. Close ticket.

          ys Yang Sheng added a comment - Patch landed. Close ticket.

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18835/
          Subject: LU-6527 ldiskfs: Journal commit callback optimization
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 16ce6a7ecb1558c5a18e8ba4751fef71e6bb97bc

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18835/ Subject: LU-6527 ldiskfs: Journal commit callback optimization Project: fs/lustre-release Branch: master Current Patch Set: Commit: 16ce6a7ecb1558c5a18e8ba4751fef71e6bb97bc

          Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/18835
          Subject: LU-6527 ldiskfs: Journal commit callback optimization
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 5d92db8dd8e147cc1360e409b15cfa0eb83e5864

          gerrit Gerrit Updater added a comment - Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/18835 Subject: LU-6527 ldiskfs: Journal commit callback optimization Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 5d92db8dd8e147cc1360e409b15cfa0eb83e5864
          pjones Peter Jones added a comment -

          Yang Sheng

          Could you please look into what is required here for SLES/RHEL7 and upstream?

          Thanks

          Peter

          pjones Peter Jones added a comment - Yang Sheng Could you please look into what is required here for SLES/RHEL7 and upstream? Thanks Peter

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14611/
          Subject: LU-6527 ext4: journal_commit_callback optimization
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 55afbf5934e145e48a31193de601abd58e605a42

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14611/ Subject: LU-6527 ext4: journal_commit_callback optimization Project: fs/lustre-release Branch: master Current Patch Set: Commit: 55afbf5934e145e48a31193de601abd58e605a42

          It looks like they will need to be separate patches for those kernels, since they no longer have the journal_conmit_callback patch. Also, if would be good to submit this upstream at the same time so that we don't need to carry these patches forever.

          adilger Andreas Dilger added a comment - It looks like they will need to be separate patches for those kernels, since they no longer have the journal_conmit_callback patch. Also, if would be good to submit this upstream at the same time so that we don't need to carry these patches forever.

          Hi Andreas,

          Do you want me to add fixes for SLES11 and RHEL7 to existed patches ? Or should I prepare separate patches ?

          scherementsev Sergey Cheremencev added a comment - Hi Andreas, Do you want me to add fixes for SLES11 and RHEL7 to existed patches ? Or should I prepare separate patches ?

          Please make versions of these two patches for SLES11 and RHEL7. I suspect the same SLES11 patch will also apply to RHEL7 without problems.

          Please also submit the http://review.whamcloud.com/14610 patch upstream to linux-ext4, so that we don't have to carry it in the ldiskfs patch list forever. I'm not sure whether the http://review.whamcloud.com/14611 patch would be accepted upstream or not, but you could try to submit that separately (so that it doesn't prevent 14610 from landing).

          adilger Andreas Dilger added a comment - Please make versions of these two patches for SLES11 and RHEL7. I suspect the same SLES11 patch will also apply to RHEL7 without problems. Please also submit the http://review.whamcloud.com/14610 patch upstream to linux-ext4, so that we don't have to carry it in the ldiskfs patch list forever. I'm not sure whether the http://review.whamcloud.com/14611 patch would be accepted upstream or not, but you could try to submit that separately (so that it doesn't prevent 14610 from landing).

          I see http://review.whamcloud.com/14610 only changes the el6 version of the journal-callback patch. Does the sles11 version need a similar change?

          I think yes. But I didn't make tests on sles.
          If it is needed I can prepare patch for sles11 also.

          scherementsev Sergey Cheremencev added a comment - I see http://review.whamcloud.com/14610 only changes the el6 version of the journal-callback patch. Does the sles11 version need a similar change? I think yes. But I didn't make tests on sles. If it is needed I can prepare patch for sles11 also.

          I see http://review.whamcloud.com/14610 only changes the el6 version of the journal-callback patch. Does the sles11 version need a similar change?

          bogl Bob Glossman (Inactive) added a comment - I see http://review.whamcloud.com/14610 only changes the el6 version of the journal-callback patch. Does the sles11 version need a similar change?

          People

            ys Yang Sheng
            scherementsev Sergey Cheremencev
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: