Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.11.0
-
3
Description
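I was testing the latest master branch (b2c8846, LU-6210 utils: Use C99 struct initializer for long_opt_start). All the Lustre servers and the client run on the same host. When I unmount the client, kobject_put() warns that the kobject is not initialized, and the kernel then crashes with a NULL pointer dereference in __list_add(), reached through wait_for_completion() called from lprocfs_obd_cleanup():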
[ 118.118013] ------------[ cut here ]------------
[ 118.118025] WARNING: at lib/kobject.c:612 kobject_put+0x50/0x60()
[ 118.118028] kobject: '(null)' (ffff88001240eec0): is not initialized, yet kobject_put() is being called.
[ 118.118030] Modules linked in: osc(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_mod ppdev sg pcspkr virtio_balloon i2c_piix4 i2c_core parport_pc parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi virtio_scsi virtio_net virtio_blk ata_piix serio_raw virtio_pci virtio_ring virtio libata floppy
[ 118.118140] CPU: 1 PID: 9487 Comm: umount Tainted: G OE ------------ 3.10.0-514.26.2.el7_lustre.2.10.50_69_g8793c5b.x86_64 #1
[ 118.118144] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 118.118151] ffff88001203f9e0 00000000ac9255be ffff88001203f998 ffffffff81687383
[ 118.118156] ffff88001203f9d0 ffffffff81085cb0 ffff88001240eec0 ffff8800123a1000
[ 118.118160] ffff88001240e410 ffff880012219300 ffff88001212d800 ffff88001203fa38
[ 118.118165] Call Trace:
[ 118.118174] [<ffffffff81687383>] dump_stack+0x19/0x1b
[ 118.118180] [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0
[ 118.118184] [<ffffffff81085d4c>] warn_slowpath_fmt+0x5c/0x80
[ 118.118188] [<ffffffff8131aee0>] kobject_put+0x50/0x60
[ 118.118239] [<ffffffffa04d1596>] lprocfs_obd_cleanup+0x56/0x70 [obdclass]
[ 118.118252] [<ffffffffa0f9dcc7>] osc_precleanup+0xe7/0x2c0 [osc]
[ 118.118295] [<ffffffffa04e4f91>] class_cleanup+0x2a1/0xcf0 [obdclass]
[ 118.118334] [<ffffffffa04e79e2>] class_process_config+0x1992/0x23f0 [obdclass]
[ 118.118352] [<ffffffffa0dfe9c5>] ? lov_putref+0x2f5/0xa80 [lov]
[ 118.118370] [<ffffffffa03a2b97>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 118.118408] [<ffffffffa04e8606>] class_manual_cleanup+0x1c6/0x710 [obdclass]
[ 118.118421] [<ffffffffa0dfe9d2>] lov_putref+0x302/0xa80 [lov]
[ 118.118434] [<ffffffffa0e05d92>] lov_disconnect+0x172/0x420 [lov]
[ 118.118461] [<ffffffffa0ecc853>] obd_disconnect+0xb3/0x330 [lustre]
[ 118.118483] [<ffffffffa0ecfc90>] ll_put_super+0x610/0xaa0 [lustre]
[ 118.118490] [<ffffffff81138fcd>] ? call_rcu_sched+0x1d/0x20
[ 118.118531] [<ffffffffa0efa30c>] ? ll_destroy_inode+0x1c/0x20 [lustre]
[ 118.118538] [<ffffffff8121a8f8>] ? destroy_inode+0x38/0x60
[ 118.118542] [<ffffffff8121aa26>] ? evict+0x106/0x170
[ 118.118546] [<ffffffff8121aace>] ? dispose_list+0x3e/0x50
[ 118.118550] [<ffffffff8121b724>] ? evict_inodes+0x114/0x140
[ 118.118557] [<ffffffff81200f72>] generic_shutdown_super+0x72/0xf0
[ 118.118562] [<ffffffff81201342>] kill_anon_super+0x12/0x20
[ 118.118602] [<ffffffffa04eaf15>] lustre_kill_super+0x45/0x50 [obdclass]
[ 118.118607] [<ffffffff812016f9>] deactivate_locked_super+0x49/0x60
[ 118.118611] [<ffffffff81201cf6>] deactivate_super+0x46/0x60
[ 118.118616] [<ffffffff8121f145>] mntput_no_expire+0xc5/0x120
[ 118.118622] [<ffffffff81220280>] SyS_umount+0xa0/0x3b0
[ 118.118627] [<ffffffff81697a49>] system_call_fastpath+0x16/0x1b
[ 118.118630] ---[ end trace 8308964b9c22e228 ]---
[ 118.118641] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 118.128816] IP: [<ffffffff81333c5b>] __list_add+0x1b/0xc0
[ 118.135215] PGD 0
[ 118.137535] Oops: 0000 [#1] SMP
[ 118.141290] Modules linked in: osc(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_mod ppdev sg pcspkr virtio_balloon i2c_piix4 i2c_core parport_pc parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi virtio_scsi virtio_net virtio_blk ata_piix serio_raw virtio_pci virtio_ring virtio libata floppy
[ 118.191875] CPU: 1 PID: 9487 Comm: umount Tainted: G W OE ------------ 3.10.0-514.26.2.el7_lustre.2.10.50_69_g8793c5b.x86_64 #1
[ 118.201778] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 118.206430] task: ffff88003a2bbec0 ti: ffff88001203c000 task.ti: ffff88001203c000
[ 118.212428] RIP: 0010:[<ffffffff81333c5b>] [<ffffffff81333c5b>] __list_add+0x1b/0xc0
[ 118.218697] RSP: 0018:ffff88001203f9d8 EFLAGS: 00010046
[ 118.222891] RAX: ffff88001203fa00 RBX: ffff88001203fa18 RCX: ffff88001203ffd8
[ 118.228673] RDX: ffff88001240ef10 RSI: 0000000000000000 RDI: ffff88001203fa18
[ 118.234425] RBP: ffff88001203f9f0 R08: 0000000000000000 R09: 0000000000000259
[ 118.240045] R10: 0000000000000000 R11: ffff88001203f696 R12: ffff88001240ef10
[ 118.245753] R13: 0000000000000000 R14: ffff88003a2bbec0 R15: ffff88001212d800
[ 118.251453] FS: 00007f592e03f880(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[ 118.257952] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 118.262534] CR2: 0000000000000000 CR3: 0000000039b90000 CR4: 00000000000006e0
[ 118.268281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 118.274125] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 118.279871] Stack:
[ 118.281497] ffff88001240ef00 ffff88001240ef08 7fffffffffffffff ffff88001203fa50
[ 118.287490] ffffffff8168cdfb 0000000000000001 ffff88003a2bbec0 ffffffff810c54e0
[ 118.293586] 0000000000000000 0000000000000000 00000000ac9255be ffff88001240df40
[ 118.299707] Call Trace:
[ 118.301684] [<ffffffff8168cdfb>] wait_for_completion+0xeb/0x170
[ 118.306465] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
[ 118.311048] [<ffffffffa04d15a2>] lprocfs_obd_cleanup+0x62/0x70 [obdclass]
[ 118.316568] [<ffffffffa0f9dcc7>] osc_precleanup+0xe7/0x2c0 [osc]
[ 118.321477] [<ffffffffa04e4f91>] class_cleanup+0x2a1/0xcf0 [obdclass]
[ 118.326709] [<ffffffffa04e79e2>] class_process_config+0x1992/0x23f0 [obdclass]
[ 118.332633] [<ffffffffa0dfe9c5>] ? lov_putref+0x2f5/0xa80 [lov]
[ 118.337354] [<ffffffffa03a2b97>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 118.342771] [<ffffffffa04e8606>] class_manual_cleanup+0x1c6/0x710 [obdclass]
[ 118.348450] [<ffffffffa0dfe9d2>] lov_putref+0x302/0xa80 [lov]
[ 118.353088] [<ffffffffa0e05d92>] lov_disconnect+0x172/0x420 [lov]
[ 118.357984] [<ffffffffa0ecc853>] obd_disconnect+0xb3/0x330 [lustre]
[ 118.363140] [<ffffffffa0ecfc90>] ll_put_super+0x610/0xaa0 [lustre]
[ 118.368130] [<ffffffff81138fcd>] ? call_rcu_sched+0x1d/0x20
[ 118.372727] [<ffffffffa0efa30c>] ? ll_destroy_inode+0x1c/0x20 [lustre]
[ 118.377968] [<ffffffff8121a8f8>] ? destroy_inode+0x38/0x60
[ 118.382433] [<ffffffff8121aa26>] ? evict+0x106/0x170
[ 118.386409] [<ffffffff8121aace>] ? dispose_list+0x3e/0x50
[ 118.390857] [<ffffffff8121b724>] ? evict_inodes+0x114/0x140
[ 118.395288] [<ffffffff81200f72>] generic_shutdown_super+0x72/0xf0
[ 118.400239] [<ffffffff81201342>] kill_anon_super+0x12/0x20
[ 118.404689] [<ffffffffa04eaf15>] lustre_kill_super+0x45/0x50 [obdclass]
[ 118.409950] [<ffffffff812016f9>] deactivate_locked_super+0x49/0x60
[ 118.414997] [<ffffffff81201cf6>] deactivate_super+0x46/0x60
[ 118.419472] [<ffffffff8121f145>] mntput_no_expire+0xc5/0x120
[ 118.424018] [<ffffffff81220280>] SyS_umount+0xa0/0x3b0
[ 118.428227] [<ffffffff81697a49>] system_call_fastpath+0x16/0x1b
[ 118.433054] Code: ff e9 3b ff ff ff b8 f4 ff ff ff e9 31 ff ff ff 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 4c 8b 42 08 48 89 fb 49 39 f0 75 2a <4d> 8b 45 00 4d 39 c4 75 68 4c 39 e3 74 3e 4c 39 eb 74 39 49 89
[ 118.450312] RIP [<ffffffff81333c5b>] __list_add+0x1b/0xc0
[ 118.454795] RSP <ffff88001203f9d8>
[ 118.457571] CR2: 0000000000000000
If the client runs on a separate host, the problem does not occur.
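For anyone triaging the traces above, here is a small helper sketch that lists the frames of a pasted call trace and separates the reliable ones from the entries marked with "?" (which the kernel prints when it is not sure the address belongs to the actual call chain). The names FRAME_RE, parse_frames, and SAMPLE are made up for this illustration and are not part of Lustre or any kernel tool; the sample lines are copied from the second trace in this ticket.

```python
import re

# Matches call-trace lines such as:
#   [ 118.311048] [<ffffffffa04d15a2>] lprocfs_obd_cleanup+0x62/0x70 [obdclass]
# The optional "? " marks a frame the kernel could not confirm.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in log_text (hypothetical helper)."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack-frame line (registers, headers, ...)
        frames.append({
            "func": match.group("func"),
            # Frames without a trailing [module] belong to the core kernel.
            "module": match.group("module") or "kernel",
            "reliable": match.group("unreliable") is None,
        })
    return frames


# Four lines copied from the second call trace in this report.
SAMPLE = """\
[ 118.301684] [<ffffffff8168cdfb>] wait_for_completion+0xeb/0x170
[ 118.306465] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
[ 118.311048] [<ffffffffa04d15a2>] lprocfs_obd_cleanup+0x62/0x70 [obdclass]
[ 118.316568] [<ffffffffa0f9dcc7>] osc_precleanup+0xe7/0x2c0 [osc]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        if frame["reliable"]:
            print(f'{frame["func"]} ({frame["module"]})')
```

Run on the full log, this should make it easier to see that both traces share the same reliable path through osc_precleanup() and lprocfs_obd_cleanup(); it only reformats what the kernel printed and does not by itself say why the kobject was uninitialized.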