Lustre / LU-7844

Rolling upgrade: sanity test_61 FAIL: BUG: unable to handle kernel NULL pointer dereference at (null)

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.8.0
    • MDS and OSS 2.8.0-RC4
      client1: 2.8.0-RC4
      client2: 2.7
    • 3

    Description

      The MDS hit "BUG: unable to handle kernel NULL pointer dereference at (null)" and rebooted while running sanity test_61.
      MDS and OSS were upgraded from 2.7 RHEL6.7 to 2.8.0-RC4 RHEL6.7 (ldiskfs).
      client1 was upgraded from 2.7 RHEL6.7 to 2.8.0-RC4 RHEL6.7.
      client2 remained at 2.7 RHEL6.7.

      MDS console

      Lustre: DEBUG MARKER: == sanity test 61: mmap() writes don't make sync hang ================== 16:27:47 (1457051267)
      Lustre: *** cfs_fail_loc=15b, val=0***
      Lustre: Skipped 1 previous similar message
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff8153b933>] down_write+0x23/0x40
      PGD 0 
      Oops: 0002 [#1] SMP 
      last sysfs file: /sys/devices/system/cpu/online
      CPU 1 
      Modules linked in: osp(U) mdd(U) lod(U) mdt(U) lfsck(U) mgs(U) mgc(U) osd_ldiskfs(U) ldiskfs(U) jbd2 lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic crc32c_intel libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs autofs4 cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm sg microcode iTCO_wdt iTCO_vendor_support sb_edac edac_core joydev i2c_i801 lpc_ich mfd_core ioatdma igb dca i2c_algo_bit i2c_core shpchp ext3 jbd mbcache sd_mod crc_t10dif isci libsas scsi_transport_sas ahci mlx4_ib ib_sa ib_mad ib_core ib_addr ipv6 mlx4_en ptp pps_core mlx4_core wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: llog_test]
      
      Pid: 42865, comm: mdt00_000 Not tainted 2.6.32-573.12.1.el6_lustre.x86_64 #1 Intel Corporation S2600GZ/S2600GZ
      RIP: 0010:[<ffffffff8153b933>]  [<ffffffff8153b933>] down_write+0x23/0x40
      RSP: 0018:ffff880815c93700  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: ffffffff00000001 RSI: ffff8808345fa040 RDI: 0000000000000000
      RBP: ffff880815c93710 R08: ffff880815c90000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000064 R12: ffff8804074a2ec0
      R13: 0000000000000000 R14: 0000000000000000 R15: ffff88040e522b80
      FS:  0000000000000000(0000) GS:ffff880038620000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000000 CR3: 0000000001a8d000 CR4: 00000000000407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process mdt00_000 (pid: 42865, threadinfo ffff880815c90000, task ffff8808345fa040)
      Stack:
       ffff88040e522b80 0000000000000000 ffff880815c93770 ffffffffa055bad3
      <d> ffff880815c93770 0000000000000000 ffff880434fe6b40 ffff88042c7b7440
      <d> ffff88040d227024 ffff8804074a2ec0 ffff88042c7b7440 ffff880434fe6b40
      Call Trace:
       [<ffffffffa055bad3>] llog_cat_add_rec+0x403/0x7b0 [obdclass]
       [<ffffffffa0552239>] llog_add+0x89/0x1c0 [obdclass]
       [<ffffffffa0ff2e2e>] ? lod_sub_object_index_insert+0x1fe/0x340 [lod]
       [<ffffffffa104a084>] mdd_changelog_store+0x154/0x320 [mdd]
       [<ffffffffa104a421>] mdd_changelog_ns_store+0x1d1/0x620 [mdd]
       [<ffffffffa1061826>] ? mdd_attr_set_internal+0xd6/0x2c0 [mdd]
       [<ffffffffa1061a8f>] ? mdd_update_time+0x7f/0x1c0 [mdd]
       [<ffffffffa10568c1>] mdd_create+0x1351/0x1770 [mdd]
       [<ffffffffa0f1e4c8>] mdo_create+0x18/0x50 [mdt]
       [<ffffffffa0f26d85>] mdt_reint_open+0x1f55/0x2f50 [mdt]
       [<ffffffffa07f7bdd>] ? null_alloc_rs+0xcd/0x320 [ptlrpc]
       [<ffffffffa05b6cbc>] ? upcall_cache_get_entry+0x29c/0x880 [obdclass]
       [<ffffffffa05bbbf0>] ? lu_ucred+0x20/0x30 [obdclass]
       [<ffffffffa0f0b57f>] ? ucred_set_jobid+0x5f/0x70 [mdt]
       [<ffffffffa0f0f1fd>] mdt_reint_rec+0x5d/0x200 [mdt]
       [<ffffffffa0efae4b>] mdt_reint_internal+0x62b/0x9f0 [mdt]
       [<ffffffffa0efb406>] mdt_intent_reint+0x1f6/0x430 [mdt]
       [<ffffffffa0ef98be>] mdt_intent_policy+0x4be/0xc70 [mdt]
       [<ffffffffa076f6c7>] ldlm_lock_enqueue+0x127/0x990 [ptlrpc]
       [<ffffffffa079a827>] ldlm_handle_enqueue0+0x807/0x14d0 [ptlrpc]
       [<ffffffffa080dfe1>] ? tgt_lookup_reply+0x31/0x190 [ptlrpc]
       [<ffffffffa0820171>] tgt_enqueue+0x61/0x230 [ptlrpc]
       [<ffffffffa0820c2c>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
       [<ffffffffa07cdc61>] ptlrpc_main+0xd21/0x1800 [ptlrpc]
       [<ffffffffa07ccf40>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
       [<ffffffff810a0fce>] kthread+0x9e/0xc0
       [<ffffffff8100c28a>] child_rip+0xa/0x20
       [<ffffffff810a0f30>] ? kthread+0x0/0xc0
       [<ffffffff8100c280>] ? child_rip+0x0/0x20
      Code: c3 e8 a2 ba b3 ff 00 00 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 48 89 fb e8 ba e2 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 48 85 d2 74 05 e8 ce 29 d6 ff 48 83 c4 08 5b c9 
      RIP  [<ffffffff8153b933>] down_write+0x23/0x40
       RSP <ffff880815c93700>
      CR2: 0000000000000000
      Initializing cgroup subsys cpuset
      Initializing cgroup subsys cpu
      Linux version 2.6.32-573.12.1.el6_lustre.x86_64 (jenkins@onyx-7-sdf1-el6-x8664.onyx.hpdd.intel.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) #1 SMP Thu Feb 18 11:08:53 PST 2016
      Command line: ro root=UUID=f50605c1-7b71-4192-8f8b-afcd8aae7478 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD console=tty0 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM console=ttyS0,115200 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off acpi_no_memhotplug disable_cpu_apicid=0 memmap=exactmap memmap=574K@4K memmap=133550K@49726K elfcorehdr=183276K memmap=4K$0K memmap=62K$578K memmap=128K$896K memmap=42200K$3067876K memmap=992K#3110076K memmap=488K#3111068K memmap=568K#3111556K memmap=516K#3112124K memmap=294912K$3112960K memmap=4K$4173824K memmap=4K$4174948K memmap=16K$4174960K memmap=4K$4175872K memmap=6016K$4188288K
      KERNEL supported cpus:
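
      When triaging a console log like the one above, it can help to list only the call-trace frames that are not prefixed with "?", since the kernel uses "?" to mark frames the unwinder is unsure about. The following is a minimal sketch, assuming the console text is available as a string. The names FRAME_RE, reliable_frames, and SAMPLE are hypothetical and are not part of Lustre or its test suite. The sample input is a few lines copied from the trace in this ticket.

```python
import re

# Matches frames such as
#   [<ffffffffa055bad3>] llog_cat_add_rec+0x403/0x7b0 [obdclass]
#   [<ffffffff810a0f30>] ? kthread+0x0/0xc0
# A "?" before the symbol marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(console_text):
    """Return (function, module) for each call-trace frame not marked '?'.

    Frames without a module suffix are reported under the label "kernel"
    (an illustrative choice, not a kernel convention).
    """
    frames = []
    in_trace = False
    for line in console_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Call Trace:"):
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(stripped)
        if match is None:
            break  # the first non-frame line ends the trace
        if match.group("unreliable") is None:
            frames.append((match.group("func"), match.group("module") or "kernel"))
    return frames


# A shortened excerpt of the trace from this ticket.
SAMPLE = """\
Call Trace:
 [<ffffffffa055bad3>] llog_cat_add_rec+0x403/0x7b0 [obdclass]
 [<ffffffffa0552239>] llog_add+0x89/0x1c0 [obdclass]
 [<ffffffffa0ff2e2e>] ? lod_sub_object_index_insert+0x1fe/0x340 [lod]
 [<ffffffffa104a084>] mdd_changelog_store+0x154/0x320 [mdd]
 [<ffffffff810a0fce>] kthread+0x9e/0xc0
Code: c3 e8
"""

frames = reliable_frames(SAMPLE)
for func, module in frames:
    print(f"{func} [{module}]")
```

      On the excerpt, the sketch keeps the llog_cat_add_rec, llog_add, mdd_changelog_store, and kthread frames and drops the lod frame marked "?". It does not resolve offsets to source lines; that would still require the matching debuginfo.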
      

            People

              di.wang Di Wang
              sarah Sarah Liu