Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6471

Unexpected Lustre Client LBUG in llog_write()

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • Lustre 2.5.3
    • None
    • 3
    • 9223372036854775807

    Description

      Some IEEL2.2.x and Lustre 2.5 installations have triggered this unexpected LBUG on Lustre Clients consuming ChangeLogs :

      <3>LustreError: 31536:0:(llog_cat.c:192:llog_cat_id2handle()) store2-MDT0000-mdc-ffff88206515bc00: error opening log id 0x2def:1:0: rc = -2
      <3>LustreError: 31536:0:(llog_cat.c:565:llog_cat_process_cb()) store2-MDT0000-mdc-ffff88206515bc00: cannot find handle for llog 0x2def:1: -2
      <0>LustreError: 31536:0:(llog.c:861:llog_write()) ASSERTION( loghandle->lgh_obj != ((void *)0) ) failed:
      <0>LustreError: 31536:0:(llog.c:861:llog_write()) LBUG
      <4>Pid: 31536, comm: mdc_clg_send_th
      <4>
      <4>Call Trace:
      <4> [<ffffffffa0437895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4> [<ffffffffa0437e97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4> [<ffffffffa0553c52>] llog_write+0x352/0x420 [obdclass]
      <4> [<ffffffffa0553e0c>] llog_cancel_rec+0xbc/0x7c0 [obdclass]
      <4> [<ffffffffa0558bcc>] llog_cat_cleanup+0x9c/0x260 [obdclass]
      <4> [<ffffffffa0559fd8>] llog_cat_process_cb+0x588/0x630 [obdclass]
      <4> [<ffffffffa0554e6b>] llog_process_thread+0x95b/0xd70 [obdclass]
      <4> [<ffffffffa0559a50>] ? llog_cat_process_cb+0x0/0x630 [obdclass]
      <4> [<ffffffffa0556a67>] llog_process_or_fork+0x127/0x550 [obdclass]
      <4> [<ffffffffa0557f09>] llog_cat_process_or_fork+0x89/0x350 [obdclass]
      <4> [<ffffffff8152aede>] ? mutex_lock+0x1e/0x50
      <4> [<ffffffffa094fe10>] ? changelog_kkuc_cb+0x0/0x250 [mdc]
      <4> [<ffffffffa05581e9>] llog_cat_process+0x19/0x20 [obdclass]
      <4> [<ffffffffa09541eb>] mdc_changelog_send_thread+0x4cb/0x810 [mdc]
      <4> [<ffffffffa0953d20>] ? mdc_changelog_send_thread+0x0/0x810 [mdc]
      <4> [<ffffffff8109e66e>] kthread+0x9e/0xc0
      <4> [<ffffffff8100c20a>] child_rip+0xa/0x20
      <4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
      <4>
      <0>Kernel panic - not syncing: LBUG
      <4>Pid: 31536, comm: mdc_clg_send_th Not tainted 2.6.32-504.12.2.el6.x86_64 #1
      <4>Call Trace:
      <4> [<ffffffff8152933c>] ? panic+0xa7/0x16f
      <4> [<ffffffffa0437eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      <4> [<ffffffffa0553c52>] ? llog_write+0x352/0x420 [obdclass]
      <4> [<ffffffffa0553e0c>] ? llog_cancel_rec+0xbc/0x7c0 [obdclass]
      <4> [<ffffffffa0558bcc>] ? llog_cat_cleanup+0x9c/0x260 [obdclass]
      <4> [<ffffffffa0559fd8>] ? llog_cat_process_cb+0x588/0x630 [obdclass]
      <4> [<ffffffffa0554e6b>] ? llog_process_thread+0x95b/0xd70 [obdclass]
      <4> [<ffffffffa0559a50>] ? llog_cat_process_cb+0x0/0x630 [obdclass]
      <4> [<ffffffffa0556a67>] ? llog_process_or_fork+0x127/0x550 [obdclass]
      <4> [<ffffffffa0557f09>] ? llog_cat_process_or_fork+0x89/0x350 [obdclass]
      <4> [<ffffffff8152aede>] ? mutex_lock+0x1e/0x50
      <4> [<ffffffffa094fe10>] ? changelog_kkuc_cb+0x0/0x250 [mdc]
      <4> [<ffffffffa05581e9>] ? llog_cat_process+0x19/0x20 [obdclass]
      <4> [<ffffffffa09541eb>] ? mdc_changelog_send_thread+0x4cb/0x810 [mdc]
      <4> [<ffffffffa0953d20>] ? mdc_changelog_send_thread+0x0/0x810 [mdc]
      <4> [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
      <4> [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      <4> [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
      <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      The problem seems to have been introduced by patch for LU-5195, which does not handle the case where code is running on a Lustre Client (ie, cathandle->lgh_obj == NULL) and is only a ChangeLog/LLOG consumer/replicator.

      This problem had not been seen before since concerned change is only being executed when triggering some catalog inconsistency. Looks like master is also impacted, will push a patch soon.

      Attachments

        Issue Links

          Activity

            People

              bfaccini Bruno Faccini (Inactive)
              bfaccini Bruno Faccini (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: