Lustre / LU-7866

BUG: unable to handle kernel NULL pointer dereference at (null)

    Description

      The error occurred during soak testing of build '20160309' (b2_8 RC5) (see also: https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160309). DNE is enabled. The MDTs were formatted using ldiskfs, the OSTs using zfs. The MDS nodes are configured in an active-active HA failover configuration. (For the test set-up configuration see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-Configuration)

      Sequence of events:

      • mds_restart : 2016-03-11 03:41:05,597 - 2016-03-11 03:54:49,109 lola-8
      • 2016-03-11 03:56 Lustre client lola-32 crashed with the following error:
        <1>BUG: unable to handle kernel NULL pointer dereference at (null)
        <1>IP: [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
        <4>PGD 38a867067 PUD 775372067 PMD 0
        <4>Oops: 0000 [#1] SMP
        <4>last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:06:00.0/infiniband_mad/umad0/port
        <4>CPU 1
        <4>Modules linked in: osc(U) mgc(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) nfsi]
        <4>
        <4>Pid: 201682, comm: simul Not tainted 2.6.32-504.30.3.el6.x86_64 #1 Intel Corporation S2600GZ/S2600GZ
        <4>RIP: 0010:[<ffffffffa0a0241f>]  [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
        <4>RSP: 0018:ffff8801d3f298e8  EFLAGS: 00010286
        <4>RAX: 0000000000000000 RBX: ffff8807f7271a00 RCX: ffff88102dd13ca0
        <4>RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88102f395000
        <4>RBP: ffff8801d3f29928 R08: ffff88034f47e9c0 R09: 0000000000000000
        <4>R10: 0000000000000010 R11: 0000000000000000 R12: 0000000000000000
        <4>R13: ffff88082dd02c00 R14: ffff88081a5fc800 R15: ffff8801d3f29988
        <4>FS:  00007f72f89d3700(0000) GS:ffff880045e20000(0000) knlGS:0000000000000000
        <4>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        <4>CR2: 0000000000000000 CR3: 00000003b89f2000 CR4: 00000000001407e0
        <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        <4>Process simul (pid: 201682, threadinfo ffff8801d3f28000, task ffff8804c6d73520)
        <4>Stack:
        <4> ffff8801d3f298f8 0000000000000000 ffff8801d3f29928 0000000000000001
        <4><d> ffff88082dd02c00 ffff88081a5fc800 ffff8809648898c0 ffff8801d3f29988
        <4><d> ffff8801d3f299f8 ffffffffa0a09c4a fffffffffffffffb 00ff880a531788c0
        <4>Call Trace:
        <4> [<ffffffffa0a09c4a>] ll_prep_inode+0x20a/0xc40 [lustre]
        <4> [<ffffffffa07804b2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
        <4> [<ffffffffa0a214f0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
        <4> [<ffffffffa0a21fe1>] ll_lookup_it_finish+0x321/0x12e0 [lustre]
        <4> [<ffffffffa0723dd0>] ? ldlm_lock_decref_internal+0x2e0/0xa80 [ptlrpc]
        <4> [<ffffffffa0569925>] ? class_handle2object+0x95/0x190 [obdclass]
        <4> [<ffffffff81174ab3>] ? kmem_cache_alloc_trace+0x1b3/0x1c0
        <4> [<ffffffffa0a1e439>] ? ll_i2suppgid+0x19/0x30 [lustre]
        <4> [<ffffffffa0a1e47e>] ? ll_i2gids+0x2e/0xd0 [lustre]
        <4> [<ffffffffa0a214f0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
        <4> [<ffffffffa0a23226>] ll_lookup_it+0x286/0xda0 [lustre]
        <4> [<ffffffffa0a23dc9>] ll_lookup_nd+0x89/0x4f0 [lustre]
        <4> [<ffffffff8119e055>] do_lookup+0x1a5/0x230
        <4> [<ffffffff8119ece4>] __link_path_walk+0x7a4/0x1000
        <4> [<ffffffff8119f7fa>] path_walk+0x6a/0xe0
        <4> [<ffffffff8119fa0b>] filename_lookup+0x6b/0xc0
        <4> [<ffffffff8122daa6>] ? security_file_alloc+0x16/0x20
        <4> [<ffffffff811a0ee4>] do_filp_open+0x104/0xd20
        <4> [<ffffffff81063c63>] ? perf_event_task_sched_out+0x33/0x70
        <4> [<ffffffff8129943a>] ? strncpy_from_user+0x4a/0x90
        <4> [<ffffffff811ae392>] ? alloc_fd+0x92/0x160
        <4> [<ffffffff8118b157>] do_sys_open+0x67/0x130
        <4> [<ffffffff8118b260>] sys_open+0x20/0x30
        <4> [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
        <4>Code: ba 38 01 00 00 31 f6 e8 30 62 b6 ff f6 05 fd 6b a9 ff 10 74 0d 80 3d f0 6b a9 ff 00 0f 88 ba 01 00 00 48 85 db 0f 84 16 02 00 00 <49> 8b 04 24 48 89 03 49
        <1>RIP  [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
        <4> RSP <ffff8801d3f298e8>
        <4>CR2: 0000000000000000
        

      Attached are the client (lola-32) messages, console and vmcore-dmesg.txt files.
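
      For triage, the faulting location and the call chain can be pulled out of a saved oops log such as vmcore-dmesg.txt with a few lines of Python. The following is only a minimal sketch: the names FRAME_RE, parse_frame and summarize_oops are hypothetical helpers made up for this illustration, and the pattern assumes the "symbol+0xoffset/0xsize [module]" frame layout seen in the trace above. Frames that the kernel prefixes with "?" are reported separately, since those entries are not guaranteed to be part of the real call chain.

```python
import re

# Hypothetical helper pattern (not part of Lustre or the kernel).
# Matches frames such as:
#   [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
#   [<ffffffffa07804b2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<uncertain>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Return a dict describing one stack frame, or None if the line has none."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),          # None for core kernel symbols
        "uncertain": m.group("uncertain") is not None,
    }


def summarize_oops(text):
    """Split an oops log into the faulting frame and the call-trace frames."""
    fault = None
    frames = []
    for line in text.splitlines():
        frame = parse_frame(line)
        if frame is None:
            continue
        # "IP:" / "RIP" lines name the faulting instruction, not a caller.
        if re.search(r"\bR?IP\b", line):
            fault = fault or frame
        else:
            frames.append(frame)
    return fault, frames


# A few lines copied from the trace in this report.
SAMPLE = """\
<1>IP: [<ffffffffa0a0241f>] ll_open_cleanup+0xaf/0x600 [lustre]
<4>Call Trace:
<4> [<ffffffffa0a09c4a>] ll_prep_inode+0x20a/0xc40 [lustre]
<4> [<ffffffffa07804b2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
<4> [<ffffffff8119e055>] do_lookup+0x1a5/0x230
"""

if __name__ == "__main__":
    fault, frames = summarize_oops(SAMPLE)
    print("fault:", fault["symbol"], hex(fault["offset"]), fault["module"])
    for f in frames:
        marker = "?" if f["uncertain"] else " "
        print(marker, f["symbol"], f["module"] or "(kernel)")
```

      The symbol and offset obtained this way can then be looked up in the matching debuginfo for the lustre module to find the source line; that step is outside this sketch.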

          People

            hongchao.zhang Hongchao Zhang
            heckes Frank Heckes (Inactive)