Lustre / LU-7100

conf-sanity test_84 LBUGs with “(llog_osd.c:811:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index )”


Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: Lustre 2.10.0
    • Affects Version/s: Lustre 2.8.0
    • Labels: None
    • Environment: Tests run in the autotest environment
    • Severity: 3

    Description

      conf-sanity test_84 hangs at mount. We have seen this test hit an LBUG with the stack trace below three times in the past month. Logs for an interop occurrence are at https://testing.hpdd.intel.com/test_sets/9145fb1a-51a8-11e5-9249-5254006e85c2

      From the MDS log:

      00:44:38:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
      00:44:38:LustreError: 18100:0:(llog_osd.c:811:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed: 
      00:44:38:LustreError: 18100:0:(llog_osd.c:811:llog_osd_next_block()) LBUG
      00:44:38:Pid: 18100, comm: llog_process_th
      00:44:38:
      00:44:38:Call Trace:
      00:44:38: [<ffffffffa046c875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      00:44:38: [<ffffffffa046ce77>] lbug_with_loc+0x47/0xb0 [libcfs]
      00:44:38: [<ffffffffa058ed25>] llog_osd_next_block+0xb75/0xbf0 [obdclass]
      00:44:38: [<ffffffffa0580bae>] llog_process_thread+0x2de/0xfc0 [obdclass]
      00:44:38: [<ffffffffa05cc3a5>] ? keys_fill+0xd5/0x1b0 [obdclass]
      00:44:38: [<ffffffffa0581ed5>] llog_process_thread_daemonize+0x45/0x70 [obdclass]
      00:44:38: [<ffffffffa0581e90>] ? llog_process_thread_daemonize+0x0/0x70 [obdclass]
      00:44:38: [<ffffffff8109e78e>] kthread+0x9e/0xc0
      00:44:38: [<ffffffff8100c28a>] child_rip+0xa/0x20
      00:44:38: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
      00:44:38: [<ffffffff8100c280>] ? child_rip+0x0/0x20
      00:44:38:
      00:44:38:Kernel panic - not syncing: LBUG
      00:44:38:Pid: 18100, comm: llog_process_th Not tainted 2.6.32-504.30.3.el6_lustre.g339e9ad.x86_64 #1
      

      In a different occurrence, in a DNE setup (logs at https://testing.hpdd.intel.com/test_sets/2eae8eae-4f7d-11e5-bc53-5254006e85c2), the MDS console shows a few more errors before the LBUG:

      22:49:38:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
      22:49:38:LustreError: 14217:0:(llog_osd.c:788:llog_osd_next_block()) lustre-MDT0000-osd: invalid llog tail at log id 0x4:10/0 offset 16384
      22:49:38:LustreError: 14198:0:(mgs_llog.c:457:mgs_find_or_make_fsdb()) Can't get db from client log -22
      22:49:38:LustreError: 14198:0:(mgs_llog.c:496:mgs_check_index()) Can't get db for lustre
      22:49:38:LustreError: 14219:0:(llog_osd.c:778:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed: 
      22:49:38:LustreError: 14219:0:(llog_osd.c:778:llog_osd_next_block()) LBUG
      22:49:38:Pid: 14219, comm: llog_process_th
      22:49:38:
      22:49:38:Call Trace:
      22:49:38: [<ffffffffa046c875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      22:49:38: [<ffffffffa046ce77>] lbug_with_loc+0x47/0xb0 [libcfs]
      22:49:38: [<ffffffffa058ed15>] llog_osd_next_block+0xb75/0xbf0 [obdclass]
      22:49:38: [<ffffffffa0580b4e>] llog_process_thread+0x2de/0xfc0 [obdclass]
      22:49:38: [<ffffffffa05cc0e5>] ? keys_fill+0xd5/0x1b0 [obdclass]
      22:49:38: [<ffffffffa0581e75>] llog_process_thread_daemonize+0x45/0x70 [obdclass]
      22:49:38: [<ffffffffa0581e30>] ? llog_process_thread_daemonize+0x0/0x70 [obdclass]
      22:49:38: [<ffffffff8109e78e>] kthread+0x9e/0xc0
      22:49:38: [<ffffffff8100c28a>] child_rip+0xa/0x20
      22:49:38: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
      22:49:38: [<ffffffff8100c280>] ? child_rip+0x0/0x20
      22:49:38:
      22:49:38:Kernel panic - not syncing: LBUG
      22:49:38:Pid: 14219, comm: llog_process_th Not tainted 2.6.32-504.30.3.el6_lustre.gc67434c.x86_64 #1
      

      Another set of logs, from review-dne-part-1, is at https://testing.hpdd.intel.com/test_sets/189b85b6-38a5-11e5-9f03-5254006e85c2
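
      For readers unfamiliar with the invariant behind the assertion, here is a minimal Python sketch of the check, not the Lustre code itself. It assumes each llog record is a header (lrh_len, lrh_index, lrh_type, lrh_id as four 32-bit fields), a payload, and a tail (lrt_len, lrt_index as two 32-bit fields), and that the tail of the last record in a chunk must carry the same index as that record's header. The little-endian layout, the record type value, and the helper names make_record and check_chunk_tail are assumptions made for illustration. Unlike the LASSERT in llog_osd_next_block(), the sketch reports a mismatch instead of panicking.

```python
import struct

# Assumed on-disk layouts (little-endian chosen for this illustration):
#   llog_rec_hdr:  lrh_len, lrh_index, lrh_type, lrh_id  (4 x u32)
#   llog_rec_tail: lrt_len, lrt_index                    (2 x u32)
REC_HDR = struct.Struct("<IIII")
REC_TAIL = struct.Struct("<II")
MIN_REC = REC_HDR.size + REC_TAIL.size


def make_record(index, payload=b"", rec_type=0x1234, tail_index=None):
    """Build one record: header, payload, tail.

    rec_type is an arbitrary placeholder. tail_index overrides the index
    stored in the tail so that a corrupted record can be simulated.
    """
    length = MIN_REC + len(payload)
    if tail_index is None:
        tail_index = index
    return (REC_HDR.pack(length, index, rec_type, 0)
            + payload
            + REC_TAIL.pack(length, tail_index))


def check_chunk_tail(chunk):
    """Return (ok, message) for the last record in a chunk of records."""
    if len(chunk) < MIN_REC:
        return False, "chunk too short to hold a record"
    # The tail occupies the final bytes of the chunk.
    lrt_len, lrt_index = REC_TAIL.unpack_from(chunk, len(chunk) - REC_TAIL.size)
    if lrt_len < MIN_REC or lrt_len > len(chunk):
        return False, f"invalid tail length {lrt_len}"
    # The tail's length field locates the header of the last record.
    lrh_len, lrh_index, _type, _id = REC_HDR.unpack_from(chunk, len(chunk) - lrt_len)
    if lrh_len != lrt_len:
        return False, f"length mismatch: header {lrh_len}, tail {lrt_len}"
    if lrh_index != lrt_index:
        return False, f"index mismatch: header {lrh_index}, tail {lrt_index}"
    return True, "ok"
```

      Calling check_chunk_tail() on two well-formed records built with make_record() reports success, while a record built with a tail_index that differs from its index yields an "index mismatch" message, which is the condition the assertion above trips on.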


            People

              tappro Mikhail Pershin
              jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 4
