Lustre / LU-106

unable to handle kernel paging request in lprocfs_stats_collect()

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.2.0, Lustre 2.1.2
    • Affects Version/s: Lustre 2.1.0
    • Labels: None
    • Environment: RHEL6 x86_64
    • Severity: 3
    • Rank: 4811

    Description

      This kernel oops happened while unmounting an OST. The faulting process, cerebrod, is a monitoring daemon that gathers Lustre data from /proc.

      2011-03-01 13:06:10 Lustre: client ffff8802e6927000 umount complete
      2011-03-01 13:06:13 Lustre: Failing over lustre-MDT0000
      2011-03-01 13:06:13 Lustre: Failing over mdd_obd-lustre-MDT0000-0
      2011-03-01 13:06:13 Lustre: mdd_obd-lustre-MDT0000-0: shutting down for failover; client state will be preserved.
      2011-03-01 13:06:13 Lustre: MGS has stopped.
      2011-03-01 13:06:13 Lustre: server umount lustre-MDT0000 complete
      2011-03-01 13:06:15 Lustre: lustre-OST0000: shutting down for failover; client state will be preserved.
      2011-03-01 13:06:15 BUG: unable to handle kernel paging request at 00000000deadcb17
      2011-03-01 13:06:15 IP: [<ffffffffa0400f52>] lprocfs_stats_collect+0x102/0x160 [obdclass]
      2011-03-01 13:06:15 PGD 630662067 PUD 0
      2011-03-01 13:06:15 Oops: 0000 [#1] SMP
      2011-03-01 13:06:15 last sysfs file: /sys/module/lov/initstate
      2011-03-01 13:06:15 CPU 0
      2011-03-01 13:06:15 Modules linked in: lustre lmv obdfilter ost cmm osd_ldiskfs mdt mdd mds fsfilt_ldiskfs exportfs mgs mgc ldiskfs mbcache jbd2 lov osc mdc fid fld ko2iblnd ptlrpc obdclass lnet lvfs libcfs ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core sg sd_mod crc_t10dif dm_mirror dm_region_hash dm_log dm_mod video output sbs sbshc power_meter hwmon acpi_pad parport serio_raw i2c_i801 i2c_core ata_generic pata_acpi ata_piix iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core mpt2sas scsi_transport_sas raid_class ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_core igb dca [last unloaded: lustre]
      2011-03-01 13:06:15
      2011-03-01 13:06:15 Modules linked in: lustre lmv obdfilter ost cmm osd_ldiskfs mdt mdd mds fsfilt_ldiskfs exportfs mgs mgc ldiskfs mbcache jbd2 lov osc mdc fid fld ko2iblnd ptlrpc obdclass lnet lvfs libcfs ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core sg sd_mod crc_t10dif dm_mirror dm_region_hash dm_log dm_mod video output sbs sbshc power_meter hwmon acpi_pad parport serio_raw i2c_i801 i2c_core ata_generic pata_acpi ata_piix iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core mpt2sas scsi_transport_sas raid_class ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_core igb dca [last unloaded: lustre]
      2011-03-01 13:06:16 Pid: 8198, comm: cerebrod Tainted: G W ---------------- 2.6.32-14chaos #1 X8DTH-i/6/iF/6F
      2011-03-01 13:06:16 RIP: 0010:[<ffffffffa0400f52>] [<ffffffffa0400f52>] lprocfs_stats_collect+0x102/0x160 [obdclass]
      2011-03-01 13:06:16 RSP: 0018:ffff88061dffbd48 EFLAGS: 00010206
      2011-03-01 13:06:16 RAX: 00000000deadcacf RBX: ffff880609677180 RCX: 0000000000000be0
      2011-03-01 13:06:16 RDX: 0000000000000026 RSI: 0000000000000000 RDI: 0000000000000000
      2011-03-01 13:06:16 RBP: ffff88061dffbd88 R08: 0000000000000018 R09: 0000000000000000
      2011-03-01 13:06:16 R10: 7fffffffffffffff R11: 0000000000000bf0 R12: ffff88061dffbd98
      2011-03-01 13:06:16 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      2011-03-01 13:06:16 FS: 00007ffff7ff0700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
      2011-03-01 13:06:16 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      2011-03-01 13:06:16 CR2: 00000000deadcb17 CR3: 000000061d597000 CR4: 00000000000006f0
      2011-03-01 13:06:16 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      2011-03-01 13:06:16 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      2011-03-01 13:06:16 Process cerebrod (pid: 8198, threadinfo ffff88061dffa000, task ffff88063007f560)
      2011-03-01 13:06:16 Stack:
      2011-03-01 13:06:16 ffff880609677180 0000000000000026 0000000000000001 ffff88060970ebe0
      2011-03-01 13:06:16 <0> ffff880609677180 ffff8802e67eaac0 0000000000000000 ffff88061dffbe58
      2011-03-01 13:06:16 <0> ffff88061dffbe18 ffffffffa040100f 0000000000000000 0000000000000000
      2011-03-01 13:06:16 Call Trace:
      2011-03-01 13:06:16 [<ffffffffa040100f>] lprocfs_stats_seq_show+0x5f/0x170 [obdclass]
      2011-03-01 13:06:16 [<ffffffff8118e497>] seq_read+0x267/0x3f0
      2011-03-01 13:06:16 [<ffffffff811ced7e>] proc_reg_read+0x7e/0xc0
      2011-03-01 13:06:16 [<ffffffff8116caf5>] vfs_read+0xb5/0x1a0
      2011-03-01 13:06:16 [<ffffffff8116cc31>] sys_read+0x51/0x90
      2011-03-01 13:06:16 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
      2011-03-01 13:06:16 Code: 28 4d 39 4c 24 30 7d 05 4d 89 4c 24 30 41 83 c0 01 4d 01 6c 24 38 44 39 c0 77 96 48 8b 55 c8 48 8d 04 92 48 c1 e0 04 48 03 43 10 <48> 8b 40 48 49 89 44 24 48 f6 43 04 01 74 0a 48 83 c3 08 66 ff
      2011-03-01 13:06:16 RIP [<ffffffffa0400f52>] lprocfs_stats_collect+0x102/0x160 [obdclass]
      2011-03-01 13:06:16 RSP <ffff88061dffbd48>
      2011-03-01 13:06:16 CR2: 00000000deadcb17
      2011-03-01 13:06:16 ---[ end trace 210f520790780e8d ]---
      2011-03-01 13:06:16 Kernel panic - not syncing: Fatal exception
      2011-03-01 13:06:16 Pid: 8198, comm: cerebrod Tainted: G D W ---------------- 2.6.32-14chaos #1
      2011-03-01 13:06:16 Call Trace:
      2011-03-01 13:06:16 [<ffffffff814c62e3>] panic+0x78/0x137
      2011-03-01 13:06:16 [<ffffffff814ca3b4>] oops_end+0xe4/0x100
      2011-03-01 13:06:16 [<ffffffff8104651b>] no_context+0xfb/0x260
      2011-03-01 13:06:16 [<ffffffff810467a5>] __bad_area_nosemaphore+0x125/0x1e0
      2011-03-01 13:06:16 [<ffffffff810468ce>] bad_area+0x4e/0x60
      2011-03-01 13:06:16 [<ffffffff814cbf00>] do_page_fault+0x390/0x3a0
      2011-03-01 13:06:16 [<ffffffff814c9705>] page_fault+0x25/0x30
      2011-03-01 13:06:16 [<ffffffffa0400f52>] ? lprocfs_stats_collect+0x102/0x160 [obdclass]
      2011-03-01 13:06:16 [<ffffffffa0400eaa>] ? lprocfs_stats_collect+0x5a/0x160 [obdclass]
      2011-03-01 13:06:16 [<ffffffffa040100f>] lprocfs_stats_seq_show+0x5f/0x170 [obdclass]
      2011-03-01 13:06:16 [<ffffffff8118e497>] seq_read+0x267/0x3f0
      2011-03-01 13:06:16 [<ffffffff811ced7e>] proc_reg_read+0x7e/0xc0
      2011-03-01 13:06:16 [<ffffffff8116caf5>] vfs_read+0xb5/0x1a0
      2011-03-01 13:06:16 [<ffffffff8116cc31>] sys_read+0x51/0x90
      2011-03-01 13:06:16 [<ffffffff81013172>] system_call_fastpath+0x16/0x1b
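
The register dump above can be cross-checked with a little arithmetic. The sketch below is one reading of the oops, not part of the original report: it assumes the bytes in the Code: line decode as shown in the comments, and the names ELEMENT_SIZE, FIELD_OFFSET, fault_address and base_pointer are made up for illustration.

```python
# Values copied from the register dump in the oops above.
RAX = 0x00000000DEADCACF
RDX = 0x26
CR2 = 0x00000000DEADCB17

# Assumed decoding of the bytes leading up to the faulting instruction:
#   48 8d 04 92   lea (%rdx,%rdx,4),%rax   ; rax = rdx * 5
#   48 c1 e0 04   shl $0x4,%rax            ; rax = rdx * 80
#   48 03 43 10   add 0x10(%rbx),%rax      ; rax += pointer read from memory
#   48 8b 40 48   mov 0x48(%rax),%rax      ; faulting load (marked <48>)
ELEMENT_SIZE = 5 << 4   # 80 bytes per array element (hypothetical name)
FIELD_OFFSET = 0x48     # displacement used by the faulting load


def fault_address(rax: int) -> int:
    """Address the faulting load tried to read."""
    return rax + FIELD_OFFSET


def base_pointer(rax: int, index: int) -> int:
    """Pointer that was added to index * ELEMENT_SIZE to form rax."""
    return rax - index * ELEMENT_SIZE


print(hex(fault_address(RAX)))      # 0xdeadcb17, matches CR2
print(hex(base_pointer(RAX, RDX)))  # 0xdeadbeef
```

Under these assumptions the pointer read from 0x10(%rbx) was 0xdeadbeef, a value commonly used to mark freed or invalid pointers. That would be consistent with the stats data being torn down by the unmount while cerebrod was still reading it, although the report itself does not state a root cause.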

            People

              Assignee: Lai Siyao (laisiyao)
              Reporter: Ned Bass (nedbass, Inactive)