Lustre / LU-1780

OSS kernel panics after upgrade

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • Lustre 1.8.8
    • None
    • Sun Fire X4540, RHEL5.4
    • 1
    • 6347

    Description

      We've just upgraded our Lustre servers from RHEL5.2 to RHEL5.4 and from Oracle Lustre 1.8.5 to Whamcloud Lustre 1.8.8. We did not upgrade OFED, so the servers are still running the old ofed-1.4.1.
      The clients are running CentOS 5.8 with lustre-1.8.8.
      When we generate I/O to Lustre, the OSS servers panic and the console shows:

      Kernel BUG at fs/bio.c:222
      invalid opcode: 0000 [1] SMP
      last sysfs file: /class/infiniband_mad/umad0/port
      CPU 2
      Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd2(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) raid456(U) xor(U) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) deflate(U) zlib_deflate(U) ccm(U) serpent(U) blowfish(U) twofish(U) ecb(U) xcbc(U) crypto_hash(U) cbc(U) md5(U) sha256(U) sha512(U) des(U) aes_generic(U) testmgr_cipher(U) testmgr(U) crypto_blkcipher(U) aes_x86_64(U) ah6(U) ah4(U) esp6(U) xfrm6_esp(U) esp4(U) xfrm4_esp(U) aead(U) crypto_algapi(U) xfrm4_tunnel(U) tunnel4(U) xfrm4_mode_tunnel(U) xfrm4_mode_transport(U) xfrm6_mode_transport(U) xfrm6_mode_tunnel(U) ipcomp(U) ipcomp6(U) xfrm6_tunnel(U) tunnel6(U) af_key(U) autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) ib_ipoib(U) ipoib_helper(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) cpufreq_ondemand(U) powernow_k8(U) freq_table(U) mperf(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) qlgc_vnic(U) ib_cm(U) ib_sa(U) ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U) ib_ipath(U) mlx4_ib(U) mlx4_core(U) loop(U) dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) joydev(U) ib_mthca(U) tpm_tis(U) tpm(U) k10temp(U) sg(U) hwmon(U) forcedeth(U) amd64_edac_mod(U) ib_mad(U) tpm_bios(U) i2c_nforce2(U) edac_mc(U) i2c_core(U) 8021q(U) pcspkr(U) ib_core(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) shpchp(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) raid1(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
      Pid: 7206, comm: md12_raid5 Tainted: G ---- 2.6.18-308.4.1.el5_lustre #1
      RIP: 0010:[<ffffffff8002dd5b>] [<ffffffff8002dd5b>] bio_put+0xa/0x31
      RSP: 0018:ffff81084e347d08 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff810436ba6440 RCX: ffff81084e31a910
      RDX: ffff810436ba6440 RSI: ffff81045bb32040 RDI: ffff810436ba6440
      RBP: ffff81045bb32040 R08: 0000000000000000 R09: 000000000000003e
      R10: ffff810854affa00 R11: 0000000000000280 R12: ffff81045bb32000
      R13: ffff810854affa00 R14: 00000000ffffffff R15: 0000000000000000
      FS: 00002b5d980be6e0(0000) GS:ffff81010f759240(0000) knlGS:00000000f7d798d0
      CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00002b25b3265000 CR3: 0000000437930000 CR4: 00000000000006e0
      Process md12_raid5 (pid: 7206, threadinfo ffff81084e346000, task ffff810461e450c0)
      Stack: ffffffff80041d34 0000000000000000 ffffffff8892d2ab ffffffffffffffff
      0000000000000000 ffff810475c1bbf0 0000000000000002 0000000000000003
      0000000000000008 0000000000000000 000000000000000a 0000000000000000
      Call Trace:
      [<ffffffff80041d34>] end_bio_bh_io_sync+0x37/0x3b
      [<ffffffff8892d2ab>] :raid456:handle_stripe+0xfd1/0x2549
      [<ffffffff8022720c>] bitmap_daemon_work+0x329/0x33c
      [<ffffffff8002dee8>] __wake_up+0x38/0x4f
      [<ffffffff800a328f>] keventd_create_kthread+0x0/0xc4
      [<ffffffff800a328f>] keventd_create_kthread+0x0/0xc4
      [<ffffffff8892e97b>] :raid456:raid5d+0x158/0x18b
      [<ffffffff8003ab5e>] prepare_to_wait+0x34/0x61
      [<ffffffff80223492>] md_thread+0xf8/0x10e
      [<ffffffff800a34a7>] autoremove_wake_function+0x0/0x2e
      [<ffffffff8022339a>] md_thread+0x0/0x10e
      [<ffffffff80032652>] kthread+0xfe/0x132
      [<ffffffff8005dfb1>] child_rip+0xa/0x11
      [<ffffffff800a328f>] keventd_create_kthread+0x0/0xc4
      [<ffffffff80032554>] kthread+0x0/0x132
      [<ffffffff8005dfa7>] child_rip+0x0/0x11

      Code: 0f 0b 68 41 14 2c 80 c2 de 00 eb fe f0 ff 4f 50 0f 94 c0 84
      RIP [<ffffffff8002dd5b>] bio_put+0xa/0x31
      RSP <ffff81084e347d08>
      <0>Kernel panic - not syncing: Fatal exception
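
For triage, it can help to separate frames that belong to loadable modules from frames in the core kernel. The following is a minimal sketch, not part of the original report: `parse_frames` and `FRAME_RE` are hypothetical names, and the regular expression only assumes the 2.6.18-style frame layout visible in the trace above (`[<addr>] :module:symbol+0xoff/0xsize`). The sample input is a subset of the frames from this console output.

```python
import re

# Matches frames such as "[<ffffffff8892d2ab>] :raid456:handle_stripe+0xfd1/0x2549".
# The ":module:" prefix is optional; core-kernel frames do not carry it.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?::(?P<module>\w+):)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return one dict per call-trace frame found in an oops log (hypothetical helper)."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "module": m.group("module"),  # None for core-kernel symbols
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
        })
    return frames


# A few frames copied from the call trace in this report.
SAMPLE = """\
[<ffffffff80041d34>] end_bio_bh_io_sync+0x37/0x3b
[<ffffffff8892d2ab>] :raid456:handle_stripe+0xfd1/0x2549
[<ffffffff8892e97b>] :raid456:raid5d+0x158/0x18b
[<ffffffff80223492>] md_thread+0xf8/0x10e
"""

frames = parse_frames(SAMPLE)
# Keep only the symbols that live in a loadable module.
module_frames = [f["symbol"] for f in frames if f["module"] is not None]
print(module_frames)
```

On this sample, the only module frames are the two `raid456` entries, which matches the panicking process name `md12_raid5` in the log.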

            People

              green Oleg Drokin
              hellenn Hellen (Inactive)