Lustre / LU-4708

BUG: unable to handle kernel paging request in ldiskfs

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.6.0, Lustre 2.5.2
    • Lustre 2.4.2
    • None
    • RHEL 6.4/ distro IB
      Kernel 2.6.32-358.23.2.el6.atlas.x86_64
      lustre-2.4.2-2.6.32_358.23.2.el6.atlas.x86_64.x86_64
       -- with LU4008
    • 3
    • 12945

    Description

      This filesystem was upgraded from Lustre 1.8 to 2.4.2 about 3 weeks ago. It was rebooted yesterday with the LU-4008 patch applied. Can this crash be attributed to the upgrade, to the LU-4008 patch, or to something else?

      2014-03-04 07:56:40 [66446.201681] BUG: unable to handle kernel paging request at ffff8803b298b000
      2014-03-04 07:56:40 [66446.202606] IP: [<ffffffff81281ac0>] memcpy+0x10/0x120
      2014-03-04 07:56:40 [66446.202606] PGD 1a86063 PUD 3427067 PMD 35bc067 PTE 80000003b298b060
      2014-03-04 07:56:40 [66446.202606] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      2014-03-04 07:56:40 [66446.202606] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:04.0/local_cpus
      2014-03-04 07:56:40 [66446.202606] CPU 0
      2014-03-04 07:56:40 [66446.202606] Modules linked in: osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) ldiskfs(U) mbcache lquota(U) jbd2 mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) autofs4 ipmi_devintf 8021q garp stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_REJECT xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad ib_core dm_mirror dm_region_hash dm_log dm_multipath dm_mod sg ses enclosure sd_mod crc_t10dif sr_mod cdrom aacraid microcode serio_raw k10temp amd64_edac_mod edac_core edac_mce_amd ata_generic pata_acpi pata_atiixp i2c_piix4 i2c_core ahci shpchp ipv6 nfs lockd fscache auth_rpcgss nfs_acl sunrpc mlx4_en mlx4_core igb dca ptp pps_core [last unloaded: scsi_wait_scan]
      2014-03-04 07:56:40 [66446.202606]
      2014-03-04 07:56:40 [66446.202606] Pid: 20122, comm: mdt_rdpg00_003 Not tainted 2.6.32-358.23.2.el6.atlas.x86_64 #1 Penguin Computing Altus 2800/H8DGU
      2014-03-04 07:56:40 [66446.202606] RIP: 0010:[<ffffffff81281ac0>] [<ffffffff81281ac0>] memcpy+0x10/0x120
      2014-03-04 07:56:40 [66446.202606] RSP: 0018:ffff8803b49b3908 EFLAGS: 00010202
      2014-03-04 07:56:40 [66446.202606] RAX: ffff8803e6e8a3ae RBX: 000000006a8ea4d4 RCX: 0000000000000001
      2014-03-04 07:56:40 [66446.202606] RDX: 0000000000000001 RSI: ffff8803b298b000 RDI: ffff8803e6e8a3b6
      2014-03-04 07:56:40 [66446.202606] RBP: ffff8803b49b3950 R08: 0000000000000000 R09: ffff8803e6e8a380
      2014-03-04 07:56:40 [66446.202606] R10: 0000000000000008 R11: 00000000a0cde035 R12: ffff8803e6e8a380
      2014-03-04 07:56:40 [66446.202606] R13: ffff8803b4211700 R14: ffff8803b298aff0 R15: ffff8803b4211700
      2014-03-04 07:56:40 [66446.202606] FS: 00007f928c43e700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
      2014-03-04 07:56:40 [66446.202606] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      2014-03-04 07:56:40 [66446.202606] CR2: ffff8803b298b000 CR3: 0000000418ff2000 CR4: 00000000000007f0
      2014-03-04 07:56:40 [66446.202606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      2014-03-04 07:56:40 [66446.202606] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      2014-03-04 07:56:40 [66446.202606] Process mdt_rdpg00_003 (pid: 20122, threadinfo ffff8803b49b2000, task ffff880417168080)
      2014-03-04 07:56:40 [66446.202606] Stack:
      2014-03-04 07:56:40 [66446.202606] ffffffffa0c354f0 6a8ea4d48cf0fcbe f46fc73300000001 ffff8803b49b3950
      2014-03-04 07:56:40 [66446.202606] <d> ffff8803b298aff0 ffff8803a4a367b0 ffff8803b49b3a30 ffff8807ac4293d0
      2014-03-04 07:56:40 [66446.202606] <d> 000000000000000e ffff8803b49b39c0 ffffffffa0c57718 0000000000000002
      2014-03-04 07:56:40 [66446.202606] Call Trace:
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c354f0>] ? ldiskfs_htree_store_dirent+0xb0/0x1c0 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c57718>] htree_dirblock_to_tree+0x128/0x190 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c5a0ba>] ldiskfs_htree_fill_tree+0x16a/0x280 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c35887>] ldiskfs_readdir+0x127/0x730 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c353a5>] ? call_filldir+0xb5/0x150 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0cc3170>] ? osd_ldiskfs_filldir+0x0/0x480 [osd_ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c35d09>] ? ldiskfs_readdir+0x5a9/0x730 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0cc3170>] ? osd_ldiskfs_filldir+0x0/0x480 [osd_ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0c74190>] ? htree_lock_try+0x40/0x80 [ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0cb597b>] osd_ldiskfs_it_fill+0xab/0x1e0 [osd_ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0cb5c26>] osd_it_ea_next+0x96/0x190 [osd_ldiskfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0e7f4f1>] lod_it_next+0x21/0x90 [lod]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0b3b171>] mdd_dir_page_build+0x91/0x210 [mdd]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0583b12>] dt_index_walk+0x162/0x3d0 [obdclass]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0b3b0e0>] ? mdd_dir_page_build+0x0/0x210 [mdd]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0b3ce2b>] mdd_readpage+0x38b/0x5a0 [mdd]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0dc728f>] mdt_readpage+0x47f/0x960 [mdt]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0db5b37>] mdt_handle_common+0x647/0x16d0 [mdt]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa070bd3c>] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa0df18c5>] mds_readpage_handle+0x15/0x20 [mdt]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa071b558>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa03af5de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa03c0d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa07128b9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      2014-03-04 07:56:40 [66446.202606] [<ffffffff81055c93>] ? __wake_up+0x53/0x70
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa071c8ee>] ptlrpc_main+0xace/0x1700 [ptlrpc]
      2014-03-04 07:56:40 [66446.202606] [<ffffffffa071be20>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
      2014-03-04 07:56:40 [66446.202606] [<ffffffff8100c0ca>] child_rip+0xa/0x20
      2014-03-04 07:56:40 [66446.202606] Code: c0 49 89 70 58 41 c6 40 4c 04 83 e0 fc 83 c0 08 41 88 40 4d c9 c3 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 f3 48 a5 89 d1 <f3> a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b
      2014-03-04 07:56:40 [66446.202606] RIP [<ffffffff81281ac0>] memcpy+0x10/0x120
      2014-03-04 07:56:40 [66446.202606] RSP <ffff8803b49b3908>
      2014-03-04 07:56:40 [66446.202606] CR2: ffff8803b298b000
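
For readers triaging the oops above, here is a minimal Python sketch (not part of the original report) that checks a few facts that can be read directly off the log. It assumes 4 KiB pages and that bit 0 of an x86_64 PTE is the Present bit. The helper names `page_offset` and `reliable_frames` are hypothetical. The sketch does not identify the root cause. It only shows that the faulting address equals the memcpy source pointer (RSI), that it sits exactly on a page boundary, that the PTE is marked not present, and which call-trace frames are not flagged with `?` (the kernel's marker for return addresses found on the stack that the unwinder could not verify).

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86_64 base page size

# Values copied verbatim from the oops above.
CR2 = 0xffff8803b298b000  # faulting address
RSI = 0xffff8803b298b000  # memcpy source pointer at the time of the fault
R14 = 0xffff8803b298aff0  # register value close to the fault; its role is not stated in the log
PTE = 0x80000003b298b060  # page-table entry printed on the "PGD ... PTE" line


def page_offset(addr: int) -> int:
    """Offset of an address inside its page."""
    return addr % PAGE_SIZE


def reliable_frames(trace_lines):
    """Return function names from call-trace lines that are not marked with '?'."""
    pattern = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x")
    frames = []
    for line in trace_lines:
        match = pattern.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


# The first four call-trace lines from the log, without timestamps.
sample_trace = [
    "[<ffffffffa0c354f0>] ? ldiskfs_htree_store_dirent+0xb0/0x1c0 [ldiskfs]",
    "[<ffffffffa0c57718>] htree_dirblock_to_tree+0x128/0x190 [ldiskfs]",
    "[<ffffffffa0c5a0ba>] ldiskfs_htree_fill_tree+0x16a/0x280 [ldiskfs]",
    "[<ffffffffa0c35887>] ldiskfs_readdir+0x127/0x730 [ldiskfs]",
]

if __name__ == "__main__":
    print("fault address == RSI:", CR2 == RSI)
    print("fault address offset in page:", page_offset(CR2))
    print("R14 offset in page:", hex(page_offset(R14)))
    print("bytes from R14 to fault address:", CR2 - R14)
    print("PTE present bit:", PTE & 1)
    print("verified frames:", reliable_frames(sample_trace))
```

These observations are consistent with a read that ran off the end of one page into a page that is not mapped, which is the kind of access that DEBUG_PAGEALLOC is designed to expose. Whether the upgrade or the LU-4008 patch is involved cannot be determined from the log alone.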

          People

            green Oleg Drokin
            blakecaldwell Blake Caldwell