Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3918

kernel NULL pointer dereference at lod_initialize_objects+0x327

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • None
    • Lustre 2.4.0
    • None
    • lustre-2.4.0-16chaos_2.6.32_358.14.1.2chaos.ch5.1.1.ch5.1.1.x86_64
    • 3
    • 10355

    Description

      Related to LU-3917, we were testing out the LU-3161 patch. Starting MDS after upgrading from 2.1 to 2.4 following a writeconf. We no longer hit the assertion from those bugs. Now we hit this following NULL pointer deref. Will try to obtain a full debug log and post here.

      (gdb) l *( lod_initialize_objects+0x327)
      0x8707 is in lod_initialize_objects (/usr/src/debug/lustre-2.4.0/lustre/lod/lod_lov.c:672).
      667             if (unlikely(!cfs_bitmap_check(md->lod_ost_bitmap, idx))) {
      668                     CERROR("%s: bad idx: %d\n", lod2obd(md)->obd_name, idx);
      669                     return -EINVAL;
      670             }
      671
      672             if (unlikely(OST_TGT(md, idx) == NULL)) {
      673                     CERROR("%s: bad lod_tgt_desc for idx: %d\n",
      674                            lod2obd(md)->obd_name, idx);
      675                     return -EINVAL;
      676             }
      
      2013-09-10 12:38:43 Lustre: Lustre: Build Version: 2.4.0-16chaos-16chaos--PRISTINE-2.6.32-358.14.1.2chaos.ch5.1.1.x86_64
      2013-09-10 12:38:43 LDISKFS-fs (sdb): mounted filesystem with ordered data mode. quota=off. Opts: 
      2013-09-10 12:38:44 Lustre: lsc-MDT0000: Not available for connect from 192.168.117.117@o2ib10 (not set up)
      2013-09-10 12:38:44 Lustre: lsc-MDT0000: Not available for connect from 192.168.116.198@o2ib5 (not set up)
      2013-09-10 12:38:45 Lustre: 4192:0:(mdt_handler.c:4947:mdt_process_config()) For interoperability, skip this mdd.quota_type. It is obsolete.
      2013-09-10 12:38:45 LustreError: 11-0: lsc-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
      2013-09-10 12:38:45 LustreError: 3988:0:(lod_lov.c:668:validate_lod_and_idx()) lsc-MDT0000-mdtlov: bad idx: 459
      2013-09-10 12:38:45 LustreError: 3988:0:(mdd_orphans.c:400:orph_key_test_and_del()) lsc-MDD0000: error unlinking orphan [0x7e4cf25127a:0x207:0x0] from PENDING: rc = -22
      2013-09-10 12:38:45 LustreError: 3988:0:(lod_lov.c:668:validate_lod_and_idx()) lsc-MDT0000-mdtlov: bad idx: 425
      2013-09-10 12:38:45 LustreError: 3988:0:(mdd_orphans.c:400:orph_key_test_and_del()) lsc-MDD0000: error unlinking orphan [0x7e4cf35c78c:0xb7b:0x0] from PENDING: rc = -22
      2013-09-10 12:38:45 BUG: unable to handle kernel NULL pointer dereference at 00000000000006c0
      2013-09-10 12:38:45 IP: [<ffffffffa11406d7>] lod_initialize_objects+0x327/0xd00 [lod]
      2013-09-10 12:38:45 PGD 633810067 PUD 63448e067 PMD 0 
      2013-09-10 12:38:45 Oops: 0000 [#1] SMP 
      2013-09-10 12:38:45 last sysfs file: /sys/module/fid/initstate
      2013-09-10 12:38:45 CPU 8 
      2013-09-10 12:38:45 Modules linked in: osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) ldiskfs(U) mbcache lquota(U) jbd2 mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate powernow_k8 freq_table mperf ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipmi_devintf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad ib_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm_amd kvm ppdev parport_pc parport sg sd_mod crc_t10dif radeon ttm drm_kms_helper drm i2c_algo_bit sr_mod cdrom serio_raw shpchp k10temp amd64_edac_mod edac_core edac_mce_amd 3w_9xxx sata_nv ata_generic pata_acpi pata_amd i2c_nforce2 i2c_core ipv6 nfs lockd fscache auth_rpcgss nfs_acl sunrpc e1000 mlx4_en mlx4_core [last unloaded: cpufreq_ondemand]
      2013-09-10 12:38:45 
      2013-09-10 12:38:45 Pid: 3988, comm: mount.lustre Tainted: P           ---------------    2.6.32-358.14.1.2chaos.ch5.1.1.x86_64 #1 Supermicro H8QM8/H8QM8
      2013-09-10 12:38:45 RIP: 0010:[<ffffffffa11406d7>]  [<ffffffffa11406d7>] lod_initialize_objects+0x327/0xd00 [lod]
      2013-09-10 12:38:45 RSP: 0018:ffff8804343198d8  EFLAGS: 00010213
      2013-09-10 12:38:45 RAX: 0000000000000000 RBX: ffff880632a65a00 RCX: 00000000ffffffff
      2013-09-10 12:38:45 RDX: 00000000000000d8 RSI: 000000000000005e RDI: ffff880831194900
      2013-09-10 12:38:45 RBP: ffff880434319958 R08: 00000000000001d8 R09: ffff880633e80ba0
      2013-09-10 12:38:45 R10: ffff880628ffd6c0 R11: 0000000000000000 R12: 0000000000000000
      2013-09-10 12:38:45 R13: 0000000000000000 R14: ffff880632a65a00 R15: ffff880831194000
      2013-09-10 12:38:45 FS:  00007ffff7fdb700(0000) GS:ffff88043f400000(0000) knlGS:0000000000000000
      2013-09-10 12:38:45 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      2013-09-10 12:38:45 CR2: 00000000000006c0 CR3: 0000000633983000 CR4: 00000000000007e0
      2013-09-10 12:38:45 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      2013-09-10 12:38:45 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      2013-09-10 12:38:45 Process mount.lustre (pid: 3988, threadinfo ffff880434318000, task ffff8804339ccaa0)
      2013-09-10 12:38:45 Stack:
      2013-09-10 12:38:45  ffff8806000001d8 0000000000000002 0002880400000002 0000000000000010
      2013-09-10 12:38:45 <d> ffff880632a65a30 ffff880632943f60 ffff880434319b78 ffff880633e80ba0
      2013-09-10 12:38:45 <d> ffff8808311940e8 ffff880632916998 0000a30128ffd6c0 ffff880632916998
      2013-09-10 12:38:45 Call Trace:
      2013-09-10 12:38:45  [<ffffffffa11412b1>] lod_parse_striping+0x201/0x300 [lod]
      2013-09-10 12:38:45  [<ffffffffa1142d00>] lod_load_striping+0x290/0x440 [lod]
      2013-09-10 12:38:45  [<ffffffffa114e73b>] lod_declare_object_destroy+0x16b/0x390 [lod]
      2013-09-10 12:38:45  [<ffffffffa0e180a9>] __mdd_orphan_cleanup+0x7d9/0xca0 [mdd]
      2013-09-10 12:38:45  [<ffffffffa0e2733d>] mdd_recovery_complete+0xed/0x170 [mdd]
      2013-09-10 12:38:45  [<ffffffffa1086ab0>] ? mdt_recovery_handle+0x0/0x120 [mdt]
      2013-09-10 12:38:45  [<ffffffffa1078cb5>] mdt_postrecov+0x35/0xd0 [mdt]
      2013-09-10 12:38:45  [<ffffffffa107d48b>] mdt_prepare+0x3cb/0x490 [mdt]
      2013-09-10 12:38:45  [<ffffffffa0884b86>] server_start_targets+0x1266/0x1960 [obdclass]
      2013-09-10 12:38:45  [<ffffffffa0854c15>] ? lustre_start_mgc+0x4b5/0x1ee0 [obdclass]
      2013-09-10 12:38:45  [<ffffffffa084c950>] ? class_config_llog_handler+0x0/0x1830 [obdclass]
      2013-09-10 12:38:45  [<ffffffffa0886164>] server_fill_super+0xee4/0x1580 [obdclass]
      2013-09-10 12:38:45  [<ffffffffa0856818>] lustre_fill_super+0x1d8/0x530 [obdclass]
      2013-09-10 12:38:45  [<ffffffffa0856640>] ? lustre_fill_super+0x0/0x530 [obdclass]
      2013-09-10 12:38:45  [<ffffffff8118539f>] get_sb_nodev+0x5f/0xa0
      2013-09-10 12:38:45  [<ffffffffa084e285>] lustre_get_sb+0x25/0x30 [obdclass]
      2013-09-10 12:38:45  [<ffffffff811849bb>] vfs_kern_mount+0x7b/0x1b0
      2013-09-10 12:38:45  [<ffffffff81184b62>] do_kern_mount+0x52/0x130
      2013-09-10 12:38:45  [<ffffffff811a4bf2>] do_mount+0x2d2/0x8d0
      2013-09-10 12:38:45  [<ffffffff811a5280>] sys_mount+0x90/0xe0
      2013-09-10 12:38:45  [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
      2013-09-10 12:38:45 Code: f0 08 00 00 44 8b 45 80 44 0f a3 40 08 19 c0 85 c0 0f 84 48 06 00 00 44 89 c0 41 0f b6 d0 c1 f8 08 48 98 49 8b 84 c7 e8 00 00 00 <48> 8b 04 d0 48 85 c0 0f 84 64 07 00 00 4c 8b 30 4d 85 f6 0f 84 
      2013-09-10 12:38:45 RIP  [<ffffffffa11406d7>] lod_initialize_objects+0x327/0xd00 [lod]
      2013-09-10 12:38:45  RSP <ffff8804343198d8>
      2013-09-10 12:38:45 CR2: 00000000000006c0
      2013-09-10 12:38:45 REWRITING MCP55 CFG REG
      2013-09-10 12:38:45 CFG = c1
      

      Attachments

        Issue Links

          Activity

            People

              bobijam Zhenyu Xu
              nedbass Ned Bass (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: