Lustre / LU-409

Oops: RIP: _spin_lock_irq+0x15/0x40

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.1.0, Lustre 1.8.6
    • Lustre 2.1.0, Lustre 1.8.6
    • None
    • 3
    • 4271

    Description

      After mounting and unmounting a Lustre filesystem, running lustre_rmmod caused the Lustre client node to crash as follows:

      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff814dcf35>] _spin_lock_irq+0x15/0x40
      PGD 31ae08067 PUD 312eae067 PMD 0
      Oops: 0002 [#1] SMP
      last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
      CPU 2
      Modules linked in: llite_lloop(-)(U) lustre(U) mgc(U) lov(U) osc(U) mdc(U) lquota(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) nfs lockd fscache(T
      ) nfs_acl auth_rpcgss autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_reg
      ion_hash dm_log mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb serio_raw ghes hed i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 
      jbd mbcache sd_mod crc_t10dif ahci dm_mod [last unloaded: microcode]
      
      Modules linked in: llite_lloop(-)(U) lustre(U) mgc(U) lov(U) osc(U) mdc(U) lquota(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) nfs lockd fscache(T
      ) nfs_acl auth_rpcgss autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_reg
      ion_hash dm_log mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb serio_raw ghes hed i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 
      jbd mbcache sd_mod crc_t10dif ahci dm_mod [last unloaded: microcode]
      Pid: 4826, comm: rmmod Tainted: G           ---------------- T 2.6.32-131.2.1.el6.x86_64 #1 X8DTT
      RIP: 0010:[<ffffffff814dcf35>]  [<ffffffff814dcf35>] _spin_lock_irq+0x15/0x40
      RSP: 0018:ffff880318cd9da8  EFLAGS: 00010092 
      RAX: 0000000000010000 RBX: ffff880328bda000 RCX: 000000000000b1a0
      RDX: 0000000000000000 RSI: ffff88031ce09a90 RDI: 0000000000000000
      RBP: ffff880318cd9da8 R08: 0000000000000001 R09: ffffffff817c3f86
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88031ce09800
      R13: ffff880328bda000 R14: ffff88031ce0b560 R15: 0000000000000001
      FS:  00007fb1de18d700(0000) GS:ffff880032e40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000000 CR3: 000000031ae78000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process rmmod (pid: 4826, threadinfo ffff880318cd8000, task ffff88032123ca80)
      Stack:
       ffff880318cd9dd8 ffffffff8125689c ffff880328bda000 ffff880328bda328
      <0> ffff880328bda328 ffff88031ce0b560 ffff880318cd9df8 ffffffff8124ba66
      <0> ffffffff81a8a820 ffff880328bda360 ffff880318cd9e28 ffffffff81264a2d
      Call Trace:
       [<ffffffff8125689c>] blk_throtl_exit+0x3c/0xd0
       [<ffffffff8124ba66>] blk_release_queue+0x26/0x80
       [<ffffffff81264a2d>] kobject_release+0x8d/0x240
       [<ffffffff812649a0>] ? kobject_release+0x0/0x240
       [<ffffffff81265fd7>] kref_put+0x37/0x70
       [<ffffffff812648a7>] kobject_put+0x27/0x60  
       [<ffffffff81247687>] blk_cleanup_queue+0x57/0x70
       [<ffffffffa08070b1>] lloop_exit+0x61/0x300 [llite_lloop]
       [<ffffffff81069012>] ? put_online_cpus+0x52/0x70
       [<ffffffff810a8ef8>] ? module_refcount+0x58/0x70
       [<ffffffff810a9a74>] sys_delete_module+0x194/0x260
       [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
      Code: c1 74 0e f3 90 0f b7 0f eb f5 83 3f 00 75 f4 eb df 48 89 d0 c9 c3 55 48 89 e5 0f 1f 44 00 00 fa 66 0f 1f 44 00 00 b8 00 00 01 00 <f0> 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 eb f5
      RIP  [<ffffffff814dcf35>] _spin_lock_irq+0x15/0x40
       RSP <ffff880318cd9da8>
      CR2: 0000000000000000
      

      This failure could be easily reproduced by running llmount.sh and then llmountcleanup.sh.
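
      As a reading aid for the trace above, here is a minimal sketch that decodes the hex value printed after "Oops:" using the standard x86 page-fault error-code bits. The helper name decode_oops_error_code is made up for this illustration and is not part of Lustre or the kernel. For "0002" it reports a kernel-mode write to a page that is not present. Together with RDI and CR2 both being 0, that is consistent with _spin_lock_irq being handed a NULL lock pointer from blk_throtl_exit, although the log alone does not say why the pointer was NULL.

```python
# Hypothetical helper (not part of Lustre or the kernel): decode the x86
# page-fault error code that the kernel prints in hex after "Oops:".
PF_PROT = 0x01   # 0: page not present, 1: protection violation
PF_WRITE = 0x02  # 0: read access, 1: write access
PF_USER = 0x04   # 0: fault in kernel mode, 1: fault in user mode
PF_RSVD = 0x08   # reserved bit was set in a page-table entry
PF_INSTR = 0x10  # fault was caused by an instruction fetch


def decode_oops_error_code(code_hex):
    """Return a list of human-readable descriptions for the error code."""
    code = int(code_hex, 16)
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write access" if code & PF_WRITE else "read access",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return parts


if __name__ == "__main__":
    # "0002" is the value from the "Oops: 0002 [#1] SMP" line in this report.
    print(", ".join(decode_oops_error_code("0002")))
```

      Running the script on the value from this report gives a quick sanity check before digging into the call trace itself.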

            People

              ys Yang Sheng
              yujian Jian Yu