  1. Lustre
  2. LU-3766

ASSERTION( stripe < lio->lis_stripe_count )

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.5
    • Labels: None
    • Environment: Linux 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP
    • Severity: 3
    • Rank: 9700

    Description

      We have a kernel crash on Lustre Client 2.1.5 with the following assertion:

      LustreError: 31091:0:(lov_io.c:214:lov_sub_get()) ASSERTION( stripe < lio->lis_stripe_count ) failed:
      LustreError: 31091:0:(lov_io.c:214:lov_sub_get()) LBUG

      It is very similar to:

      LU-2652
      LU-3524

      Has this bug been fixed in 2.4? If so, are there any plans to fix it in 2.1? And is there a way to work around the error (perhaps through configuration) without upgrading?

      [root@r03 lustre_2.1.5]# crash /usr/lib/debug/lib/modules/2.6.32-279.19.1.el6_lustre.x86_64/vmlinux /var/crash/127.0.0.1-2013-08-13-10\:15\:56/vmcore

      crash 6.0.4-2.el6
      Copyright (C) 2002-2012 Red Hat, Inc.
      Copyright (C) 2004, 2005, 2006 IBM Corporation
      Copyright (C) 1999-2006 Hewlett-Packard Co
      Copyright (C) 2005, 2006 Fujitsu Limited
      Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
      Copyright (C) 2005 NEC Corporation
      Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
      Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
      This program is free software, covered by the GNU General Public License,
      and you are welcome to change it and/or distribute copies of it under
      certain conditions. Enter "help copying" to see the conditions.
      This program has absolutely no warranty. Enter "help warranty" for details.

      GNU gdb (GDB) 7.3.1
      Copyright (C) 2011 Free Software Foundation, Inc.
      License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law. Type "show copying"
      and "show warranty" for details.
      This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/2.6.32-279.19.1.el6_lustre.x86_64/vmlinux
      DUMPFILE: /var/crash/127.0.0.1-2013-08-13-10:15:56/vmcore [PARTIAL DUMP]
      CPUS: 16
      DATE: Tue Aug 13 10:14:51 2013
      UPTIME: 4 days, 12:04:11
      LOAD AVERAGE: 0.00, 0.11, 0.12
      TASKS: 513
      NODENAME: r03
      RELEASE: 2.6.32-279.19.1.el6_lustre.x86_64
      VERSION: #1 SMP Wed Mar 20 16:37:18 PDT 2013
      MACHINE: x86_64 (2400 Mhz)
      MEMORY: 12 GB
      PANIC: "Kernel panic - not syncing: LBUG"
      PID: 31091
      COMMAND: "lrvfarmd"
      TASK: ffff88013cd3b500 [THREAD_INFO: ffff880149fd4000]
      CPU: 1
      STATE: TASK_RUNNING (PANIC)

      crash> log

      LustreError: 31091:0:(lov_io.c:214:lov_sub_get()) ASSERTION( stripe < lio->lis_stripe_count ) failed:
      LustreError: 31091:0:(lov_io.c:214:lov_sub_get()) LBUG
      Pid: 31091, comm: lrvfarmd

      Call Trace:
      [<ffffffffa034a785>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      [<ffffffffa034ad97>] lbug_with_loc+0x47/0xb0 [libcfs]
      [<ffffffffa099e93f>] lov_sub_get+0x47f/0x6f0 [lov]
      [<ffffffffa0998cfc>] lov_page_init_raid0+0x14c/0x770 [lov]
      [<ffffffff812754b4>] ? call_rwsem_down_read_failed+0x14/0x30
      [<ffffffffa0995a54>] lov_page_init+0x54/0xe0 [lov]
      [<ffffffffa04a415c>] cl_page_find0+0x1cc/0x850 [obdclass]
      [<ffffffffa04a4811>] cl_page_find+0x11/0x20 [obdclass]
      [<ffffffffa0a591d2>] ll_cl_init+0x152/0x560 [lustre]
      [<ffffffff8116b858>] ? mem_cgroup_cache_charge+0x118/0x130
      [<ffffffffa0a5962a>] ll_readpage+0x4a/0x200 [lustre]
      [<ffffffff811117ec>] generic_file_aio_read+0x1fc/0x700
      [<ffffffff8109672f>] ? up+0x2f/0x50
      [<ffffffffa0a80cdb>] vvp_io_read_start+0x13b/0x3e0 [lustre]
      [<ffffffffa04ac23a>] cl_io_start+0x6a/0x140 [obdclass]
      [<ffffffffa04b0a7c>] cl_io_loop+0xcc/0x190 [obdclass]
      [<ffffffffa0a31047>] ll_file_io_generic+0x3a7/0x560 [lustre]
      [<ffffffffa0a31339>] ll_file_aio_read+0x139/0x2c0 [lustre]
      [<ffffffffa0a317f9>] ll_file_read+0x169/0x2a0 [lustre]
      [<ffffffff81176cb5>] vfs_read+0xb5/0x1a0
      [<ffffffff81176df1>] sys_read+0x51/0x90
      [<ffffffff814ed03e>] ? do_device_not_available+0xe/0x10
      [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

      Kernel panic - not syncing: LBUG
      Pid: 31091, comm: lrvfarmd Not tainted 2.6.32-279.19.1.el6_lustre.x86_64 #1
      Call Trace:
      [<ffffffff814e9811>] ? panic+0xa0/0x168
      [<ffffffffa034adeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
      [<ffffffffa099e93f>] ? lov_sub_get+0x47f/0x6f0 [lov]
      [<ffffffffa0998cfc>] ? lov_page_init_raid0+0x14c/0x770 [lov]
      [<ffffffff812754b4>] ? call_rwsem_down_read_failed+0x14/0x30
      [<ffffffffa0995a54>] ? lov_page_init+0x54/0xe0 [lov]
      [<ffffffffa04a415c>] ? cl_page_find0+0x1cc/0x850 [obdclass]
      [<ffffffffa04a4811>] ? cl_page_find+0x11/0x20 [obdclass]
      [<ffffffffa0a591d2>] ? ll_cl_init+0x152/0x560 [lustre]
      [<ffffffff8116b858>] ? mem_cgroup_cache_charge+0x118/0x130
      [<ffffffffa0a5962a>] ? ll_readpage+0x4a/0x200 [lustre]
      [<ffffffff811117ec>] ? generic_file_aio_read+0x1fc/0x700
      [<ffffffff8109672f>] ? up+0x2f/0x50
      [<ffffffffa0a80cdb>] ? vvp_io_read_start+0x13b/0x3e0 [lustre]
      [<ffffffffa04ac23a>] ? cl_io_start+0x6a/0x140 [obdclass]
      [<ffffffffa04b0a7c>] ? cl_io_loop+0xcc/0x190 [obdclass]
      [<ffffffffa0a31047>] ? ll_file_io_generic+0x3a7/0x560 [lustre]
      [<ffffffffa0a31339>] ? ll_file_aio_read+0x139/0x2c0 [lustre]
      [<ffffffffa0a317f9>] ? ll_file_read+0x169/0x2a0 [lustre]
      [<ffffffff81176cb5>] ? vfs_read+0xb5/0x1a0
      [<ffffffff81176df1>] ? sys_read+0x51/0x90
      [<ffffffff814ed03e>] ? do_device_not_available+0xe/0x10
      [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
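
      For anyone comparing this trace against LU-2652 or LU-3524, a small script can reduce the call trace to its function chain. The following is only a rough sketch: the regular expression, the helper name parse_frames, and the abbreviated SAMPLE are assumptions made for illustration and are not part of Lustre or of the crash utility. It relies on the usual kernel convention that a "?" before a symbol marks a frame the unwinder could not confirm.

```python
import re

# Matches frames such as:
#   [<ffffffffa099e93f>] lov_sub_get+0x47f/0x6f0 [lov]
#   [<ffffffff812754b4>] ? call_rwsem_down_read_failed+0x14/0x30
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace):
    """Return one dict per stack frame found in a kernel call trace."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # not a frame line (e.g. "Call Trace:")
        frames.append({
            "func": m.group("func"),
            # Frames without a [module] suffix belong to the core kernel.
            "module": m.group("module") or "kernel",
            # A leading "?" means the unwinder was not sure of the frame.
            "reliable": m.group("unsure") is None,
        })
    return frames


# Abbreviated copy of the first call trace from this report.
SAMPLE = """\
Call Trace:
[<ffffffffa034ad97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa099e93f>] lov_sub_get+0x47f/0x6f0 [lov]
[<ffffffffa0998cfc>] lov_page_init_raid0+0x14c/0x770 [lov]
[<ffffffff812754b4>] ? call_rwsem_down_read_failed+0x14/0x30
[<ffffffffa0995a54>] lov_page_init+0x54/0xe0 [lov]
[<ffffffff81176cb5>] vfs_read+0xb5/0x1a0
"""

if __name__ == "__main__":
    # Print only the confirmed frames, innermost first.
    for frame in parse_frames(SAMPLE):
        if frame["reliable"]:
            print(f"{frame['module']:>8}  {frame['func']}")
```

      Run against the full first trace, this would list the read path from sys_read down to lov_sub_get, where the assertion fired. In the second trace (after the panic) every frame carries a "?", so filtering on "reliable" would drop all of them.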

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Rustem Bikboulatov (rustequal)