Crash in idmap_destroy() when unloading module

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version/s: Lustre 2.6.0
    • Fix Version/s: Lustre 2.6.0
    • None
    • 3
    • 12937

    Description

      Easy to reproduce: sh llmount.sh; sh llmountcleanup.sh

      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffffa035545e>] idmap_destroy+0xe/0x1d0 [nodemap]
      PGD 6069a067 PUD ce0e067 PMD 0 
      Oops: 0000 [#1] SMP 
      last sysfs file: /sys/kernel/mm/ksm/run
      CPU 0 
      Modules linked in: nodemap(-) exportfs lquota lfsck jbd obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet sha512_generic sha256_generic crc32c_intel libcfs ebtable_nat ebtables xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 fuse dm_mirror dm_region_hash dm_log dm_mod uinput ppdev parport_pc parport e1000 snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc sg vmware_balloon i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix [last unloaded: mgs]
      
      Pid: 20413, comm: rmmod Not tainted 2.6.32431 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
      RIP: 0010:[<ffffffffa035545e>]  [<ffffffffa035545e>] idmap_destroy+0xe/0x1d0 [nodemap]
      RSP: 0018:ffff88007d3fdd98  EFLAGS: 00010292
      RAX: 0000000000000000 RBX: ffffffffffffffe0 RCX: 0000000000000003
      RDX: 0000000000000001 RSI: ffff880037e4f930 RDI: ffffffffffffffe0
      RBP: ffff88007d3fdda8 R08: ffffffff81c064c0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffe0
      R13: ffff880037e4f8e8 R14: 0000000000000000 R15: ffff880037e4f930
      FS:  00007f3e0c93e700(0000) GS:ffff88000c400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000000 CR3: 0000000060162000 CR4: 00000000000407f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process rmmod (pid: 20413, threadinfo ffff88007d3fc000, task ffff88007a18d500)
      Stack:
       ffff88007d3fde58 ffffffffffffffe0 ffff88007d3fddd8 ffffffffa0355663
      <d> 0000000000000000 ffffffff810d3419 ffff880037e4f8c0 ffff880037e4f8e8
      <d> ffff88007d3fde08 ffffffffa03532c6 ffff88007d3fdde8 ffff88007d443780
      Call Trace:
       [<ffffffffa0355663>] idmap_delete_tree+0x43/0x60 [nodemap]
       [<ffffffff810d3419>] ? __stop_cpus+0x59/0x80
       [<ffffffffa03532c6>] nodemap_destroy+0x56/0x210 [nodemap]
       [<ffffffffa03534ad>] nodemap_putref+0x2d/0xa0 [nodemap]
       [<ffffffffa0353532>] nodemap_hs_put_locked+0x12/0x20 [nodemap]
       [<ffffffffa040ac21>] cfs_hash_bd_del_locked+0x91/0x140 [libcfs]
       [<ffffffffa040c1d1>] cfs_hash_putref+0x191/0x480 [libcfs]
       [<ffffffffa03545ba>] nodemap_cleanup_all+0x2a/0x30 [nodemap]
       [<ffffffffa03545ce>] nodemap_mod_exit+0xe/0x20 [nodemap]
       [<ffffffff810b94d4>] sys_delete_module+0x194/0x260
       [<ffffffff810e16c7>] ? audit_syscall_entry+0x1d7/0x200
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Code: 74 d9 e9 6c ff ff ff 66 0f 1f 44 00 00 48 83 c7 48 e9 4c ff ff ff 0f 1f 80 00 00 00 00 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 <48> 8b 47 20 48 8d 57 20 48 89 fb 48 83 e0 fc 48 39 d0 0f 84 73 
      RIP  [<ffffffffa035545e>] idmap_destroy+0xe/0x1d0 [nodemap]
       RSP <ffff88007d3fdd98>
      CR2: 0000000000000000
      ---[ end trace 1c64b1dd0883bf35 ]---
      Kernel panic - not syncing: Fatal exception
      Pid: 20413, comm: rmmod Tainted: G      D    ---------------    2.6.32431 #1
      Call Trace:
       [<ffffffff81526137>] ? panic+0xa7/0x16f
       [<ffffffff8152a474>] ? oops_end+0xe4/0x100
       [<ffffffff8104a04b>] ? no_context+0xfb/0x260
       [<ffffffff8104a2d5>] ? __bad_area_nosemaphore+0x125/0x1e0
       [<ffffffff8104a3fe>] ? bad_area+0x4e/0x60
       [<ffffffff8104abaf>] ? __do_page_fault+0x3cf/0x480
       [<ffffffff81059b61>] ? update_curr+0xe1/0x1f0
       [<ffffffff81526850>] ? thread_return+0x4e/0x76e
       [<ffffffff81014979>] ? sched_clock+0x9/0x10
       [<ffffffff8152c39e>] ? do_page_fault+0x3e/0xa0
       [<ffffffff81529755>] ? page_fault+0x25/0x30
       [<ffffffffa035545e>] ? idmap_destroy+0xe/0x1d0 [nodemap]
       [<ffffffffa0355663>] ? idmap_delete_tree+0x43/0x60 [nodemap]
       [<ffffffff810d3419>] ? __stop_cpus+0x59/0x80
       [<ffffffffa03532c6>] ? nodemap_destroy+0x56/0x210 [nodemap]
       [<ffffffffa03534ad>] ? nodemap_putref+0x2d/0xa0 [nodemap]
       [<ffffffffa0353532>] ? nodemap_hs_put_locked+0x12/0x20 [nodemap]
       [<ffffffffa040ac21>] ? cfs_hash_bd_del_locked+0x91/0x140 [libcfs]
       [<ffffffffa040c1d1>] ? cfs_hash_putref+0x191/0x480 [libcfs]
       [<ffffffffa03545ba>] ? nodemap_cleanup_all+0x2a/0x30 [nodemap]
       [<ffffffffa03545ce>] ? nodemap_mod_exit+0xe/0x20 [nodemap]
       [<ffffffff810b94d4>] ? sys_delete_module+0x194/0x260
       [<ffffffff810e16c7>] ? audit_syscall_entry+0x1d7/0x200
       [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
      

        Activity

          [LU-4702] crash in idmap_destroy() when unload module
          niu Niu Yawei (Inactive) made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]

          niu Niu Yawei (Inactive) added a comment - patch landed for 2.6
          adilger Andreas Dilger made changes -
          Fix Version/s New: Lustre 2.6.0 [ 10595 ]
          adilger Andreas Dilger made changes -
          Affects Version/s New: Lustre 2.6.0 [ 10595 ]
          adilger Andreas Dilger made changes -
          Priority Original: Minor [ 4 ] New: Blocker [ 1 ]
          pjones Peter Jones made changes -
          Assignee Original: WC Triage [ wc-triage ] New: Niu Yawei [ niu ]

          Looking at the code:

          #define nm_rbtree_postorder_for_each_entry_safe(pos, n,                 \
                                                          root, field)            \
                  for (pos = rb_entry(nm_rb_first_postorder(root), typeof(*pos),  \
                                      field),                                     \
                          n = rb_entry(nm_rb_next_postorder(&pos->field),         \
                          typeof(*pos), field);                                   \
                          &pos->field;                                            \
                          pos = n,                                                \
                          n = rb_entry(nm_rb_next_postorder(&pos->field),         \
                                       typeof(*pos), field))
          

          Shouldn't we check whether nm_rb_first_postorder(root) or nm_rb_next_postorder() returns NULL before passing the result to rb_entry()?

          niu Niu Yawei (Inactive) added a comment (shown in full above)
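A note on the question in that comment: in the oops, RDI, RBX and R12 all hold ffffffffffffffe0, and the faulting instruction (48 8b 47 20, a load from 0x20(%rdi)) touches address 0, which matches CR2. That is consistent with rb_entry() having been applied to a NULL node pointer, with the embedded node at offset 0x20 in the entry, and the result then reaching idmap_destroy(). The following is a minimal userspace sketch of a NULL-checked postorder walk, not the patch that landed. Everything in it is a stand-in invented for illustration: struct node, struct root, struct idmap, entry_safe(), postorder_for_each_entry_safe(), delete_tree() and demo_count() are hypothetical names, and the tree is a plain unbalanced binary tree rather than the kernel rbtree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for an rbtree node (colour and rebalancing omitted). */
struct node { struct node *parent, *left, *right; };
struct root { struct node *top; };

/* Hypothetical entry type; 'link' plays the role of the embedded tree node. */
struct idmap {
	unsigned int id;
	struct node link;
};

/* Map a node pointer back to its entry, but keep NULL as NULL.
 * Note: 'ptr' is evaluated twice, so it must be free of side effects. */
#define entry_safe(ptr, type, member) \
	((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* Walk down to a leaf, preferring left children. */
static struct node *left_deepest(struct node *n)
{
	for (;;) {
		if (n->left)
			n = n->left;
		else if (n->right)
			n = n->right;
		else
			return n;
	}
}

static struct node *first_postorder(const struct root *r)
{
	return r->top ? left_deepest(r->top) : NULL;
}

static struct node *next_postorder(const struct node *n)
{
	const struct node *p;

	if (!n)
		return NULL;
	p = n->parent;
	/* Left child with a right sibling: visit the sibling's subtree next. */
	if (p && n == p->left && p->right)
		return left_deepest(p->right);
	/* Otherwise the parent is next (NULL once the top node is done). */
	return (struct node *)p;
}

/* The loop tests 'pos' itself, never '&pos->member', and fetches 'n'
 * only after 'pos' is known to be non-NULL. */
#define postorder_for_each_entry_safe(pos, n, type, root, member)            \
	for ((pos) = entry_safe(first_postorder(root), type, member);         \
	     (pos) && (((n) = entry_safe(next_postorder(&(pos)->member),      \
					 type, member)), 1);                  \
	     (pos) = (n))

/* Plain binary-search-tree insert that maintains parent pointers. */
static void insert(struct root *r, struct idmap *e)
{
	struct node **slot = &r->top, *parent = NULL;

	while (*slot) {
		struct idmap *cur = entry_safe(*slot, struct idmap, link);

		parent = *slot;
		slot = e->id < cur->id ? &parent->left : &parent->right;
	}
	e->link.parent = parent;
	e->link.left = e->link.right = NULL;
	*slot = &e->link;
}

/* Free every entry in postorder and return how many were freed. */
static int delete_tree(struct root *r)
{
	struct idmap *pos, *n;
	int freed = 0;

	postorder_for_each_entry_safe(pos, n, struct idmap, r, link) {
		free(pos);
		freed++;
	}
	r->top = NULL;
	return freed;
}

/* Build a tree from up to seven fixed keys, then tear it down. */
static int demo_count(int nkeys)
{
	static const unsigned int keys[] = { 50, 30, 70, 20, 40, 60, 80 };
	struct root r = { NULL };

	for (int i = 0; i < nkeys && i < 7; i++) {
		struct idmap *e = malloc(sizeof(*e));

		assert(e != NULL);
		e->id = keys[i];
		insert(&r, e);
	}
	return delete_tree(&r);
}
```

Calling demo_count(0) exercises the empty-tree case, where the first-node lookup returns NULL and the loop body must not run at all; demo_count(7) walks and frees all seven entries. Testing the entry pointer directly matters because a condition written as &pos->field is the address of a member, which a compiler may assume is never NULL.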
          niu Niu Yawei (Inactive) created issue -

          People

            Assignee: niu Niu Yawei (Inactive)
            Reporter: niu Niu Yawei (Inactive)