[LU-2555] Null pointer deref in ldiskfs_ext_calc_credits_for_insert Created: 31/Dec/12  Updated: 20/Feb/13  Resolved: 18/Feb/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.5
Fix Version/s: Lustre 2.4.0, Lustre 2.1.5

Type: Bug Priority: Blocker
Reporter: Oleg Drokin Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: MB

Severity: 3
Rank (Obsolete): 5979

 Description   

Hit this oops while running racer in a loop, seems to be a call from quota code into ldiskfs.

[232945.455907] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
[232945.456006] IP: [<ffffffffa07478fe>] ldiskfs_ext_calc_credits_for_insert+0x2e/0x40 [ldiskfs]
[232945.456006] PGD 89330067 PUD 4651c067 PMD 0 
[232945.456006] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[232945.456006] last sysfs file: /sys/devices/system/cpu/possible
[232945.456006] CPU 0 
[232945.456006] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
[232945.456006] 
[232945.456006] Pid: 29361, comm: ll_ost_io00_002 Not tainted 2.6.32-debug #6 Bochs Bochs
[232945.456006] RIP: 0010:[<ffffffffa07478fe>]  [<ffffffffa07478fe>] ldiskfs_ext_calc_credits_for_insert+0x2e/0x40 [ldiskfs]
[232945.456006] RSP: 0018:ffff88005a53b620  EFLAGS: 00010212
[232945.456006] RAX: 0000000000000001 RBX: ffff88005a53b790 RCX: 5a5a5a5a5a5a5a60
[232945.456006] RDX: 0000000000000000 RSI: ffff8800147a7c80 RDI: ffff88002dae5b98
[232945.456006] RBP: ffff88005a53b620 R08: ffff88005a53b790 R09: ffff8800147a7c80
[232945.456006] R10: d84156c5635688c0 R11: 2222222222222222 R12: ffff88005a53b710
[232945.456006] R13: ffff88002dae5b98 R14: ffff8800147a7c80 R15: 0000000000000001
[232945.456006] FS:  00007f4c965c4700(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
[232945.456006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[232945.456006] CR2: 0000000000000002 CR3: 0000000084a56000 CR4: 00000000000006f0
[232945.456006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[232945.456006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[232945.456006] Process ll_ost_io00_002 (pid: 29361, threadinfo ffff88005a53a000, task ffff88005a538380)
[232945.456006] Stack:
[232945.456006]  ffff88005a53b6d0 ffffffffa02c7509 5a5a5a5a5a5a5a60 ffff88002dae5a90
[232945.456006] <d> 0000000000000282 ffff88002dae5b88 ffff88002dae5a90 0000000000000011
[232945.456006] <d> ffff8800147a7c80 ffff88002dae5a90 0000000000000000 ffff88002dae5b98
[232945.456006] Call Trace:
[232945.456006]  [<ffffffffa02c7509>] ldiskfs_ext_new_extent_cb+0x1b9/0x6c8 [fsfilt_ldiskfs]
[232945.456006]  [<ffffffff8127c158>] ? call_rwsem_wake+0x18/0x30
[232945.456006]  [<ffffffffa07496ef>] ldiskfs_ext_walk_space+0x14f/0x340 [ldiskfs]
[232945.456006]  [<ffffffffa02c7350>] ? ldiskfs_ext_new_extent_cb+0x0/0x6c8 [fsfilt_ldiskfs]
[232945.456006]  [<ffffffffa02c7078>] fsfilt_map_nblocks+0xd8/0x100 [fsfilt_ldiskfs]
[232945.456006]  [<ffffffff81276936>] ? rwsem_wake+0x76/0x170
[232945.456006]  [<ffffffffa02c71b3>] fsfilt_ldiskfs_map_ext_inode_pages+0x113/0x220 [fsfilt_ldiskfs]
[232945.456006]  [<ffffffff811db553>] ? dquot_initialize+0xe3/0x1b0
[232945.456006]  [<ffffffffa02c7345>] fsfilt_ldiskfs_map_inode_pages+0x85/0x90 [fsfilt_ldiskfs]
[232945.456006]  [<ffffffffa077d283>] ? ldiskfs_dquot_initialize+0x73/0xc0 [ldiskfs]
[232945.456006]  [<ffffffffa07e6307>] osd_write_commit+0x327/0x640 [osd_ldiskfs]
[232945.456006]  [<ffffffffa09bdc8c>] ofd_commitrw_write+0x6cc/0x12a0 [ofd]
[232945.456006]  [<ffffffffa09bec15>] ofd_commitrw+0x3b5/0x9c0 [ofd]
[232945.456006]  [<ffffffffa08f66b8>] obd_commitrw+0x128/0x3d0 [ost]
[232945.456006]  [<ffffffffa08fd84d>] ost_brw_write+0xe4d/0x1500 [ost]
[232945.456006]  [<ffffffff8127a646>] ? vsnprintf+0x2b6/0x5f0
[232945.456006]  [<ffffffffa104a300>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
[232945.456006]  [<ffffffffa0902a55>] ost_handle+0x3125/0x4550 [ost]
[232945.456006]  [<ffffffffa0d873d4>] ? libcfs_id2str+0x74/0xb0 [libcfs]
[232945.456006]  [<ffffffffa10989c3>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
[232945.456006]  [<ffffffffa0d7b66e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[232945.456006]  [<ffffffffa1091691>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[232945.456006]  [<ffffffff81051f73>] ? __wake_up+0x53/0x70
[232945.456006]  [<ffffffffa109b4fd>] ptlrpc_main+0xb3d/0x18e0 [ptlrpc]
[232945.495125]  [<ffffffffa109a9c0>] ? ptlrpc_main+0x0/0x18e0 [ptlrpc]
[232945.495125]  [<ffffffff8100c14a>] child_rip+0xa/0x20
[232945.495125]  [<ffffffffa109a9c0>] ? ptlrpc_main+0x0/0x18e0 [ptlrpc]
[232945.495125]  [<ffffffffa109a9c0>] ? ptlrpc_main+0x0/0x18e0 [ptlrpc]
[232945.495125]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
[232945.495125] Code: e5 0f 1f 44 00 00 48 85 f6 74 2a 0f b7 87 fe fe ff ff 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 8b 54 30 28 b8 01 00 00 00 <0f> b7 4a 02 66 3b 4a 04 72 05 b8 18 00 00 00 c9 c3 90 55 48 89 

Crashdump is in /exports/crashdumps/192.168.10.220-2012-12-30-10:34:18



 Comments   
Comment by Peter Jones [ 02/Jan/13 ]

Niu

Could you please look into this one?

Thanks

Peter

Comment by Oleg Drokin [ 03/Jan/13 ]

Hm, I just hit an issue that might be related, also in racer, but this one is in b2_1:

[66139.377930] Lustre: 16068:0:(cmm_object.c:689:cml_rename_warn()) cml_rename failed for mdo_rename, should revoke: [mo_po [0x200000400:0x1:0x0]] [mo_pn [0x200000400:0x1:0x0]] [lf [0x200000401:0x67cb:0x0]] [sname 4] [mo_t [0x200000400:0x662e:0x0]] [tname 7] [err -39]
[66139.378939] Lustre: 16068:0:(cmm_object.c:689:cml_rename_warn()) Skipped 7 previous similar messages
[66221.622554] Slab corruption (Not tainted): ldiskfs_prealloc_space start=ffff88008bcc2d58, len=128
[66221.623116] Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
[66221.623353] Last user: [<ffffffffa05ef139>](ldiskfs_mb_pa_callback+0x19/0x20 [ldiskfs])
[66221.623818] 000: 48 69 c0 ff ff e8 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
[66221.624333] Prev obj: start=ffff88008bcc2cc0, len=128
[66221.624641] Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
[66221.624922] Last user: [<ffffffffa05f2757>](ldiskfs_mb_new_group_pa+0x67/0x300 [ldiskfs])
[66221.625960] 000: e0 de 90 b6 00 88 ff ff c8 6a c0 ff ff e8 ff ff
[66221.626767] 010: 08 16 0b b0 00 88 ff ff d0 97 63 20 00 88 ff ff
[66221.627114] Next obj: start=ffff88008bcc2df0, len=128
[66221.627347] Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
[66221.627587] Last user: [<ffffffffa05ef139>](ldiskfs_mb_pa_callback+0x19/0x20 [ldiskfs])
[66221.628032] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[66221.628368] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[66242.372765] Slab corruption (Not tainted): ldiskfs_prealloc_space start=ffff88008bcc2d58, len=128
[66242.374951] Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
[66242.375197] Last user: [<ffffffffa05ef139>](ldiskfs_mb_pa_callback+0x19/0x20 [ldiskfs])
[66242.375635] 000: 48 69 c0 ff ff e8 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
[66242.375966] Prev obj: start=ffff88008bcc2cc0, len=128
[66242.376202] Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
[66242.376442] Last user: [<ffffffffa05f2757>](ldiskfs_mb_new_group_pa+0x67/0x300 [ldiskfs])
[66242.376883] 000: e0 de 90 b6 00 88 ff ff c8 6a c0 ff ff e8 ff ff
[66242.381760] 010: 08 16 0b b0 00 88 ff ff d0 97 63 20 00 88 ff ff
[66242.385331] Next obj: start=ffff88008bcc2df0, len=128
[66242.385565] Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
[66242.385804] Last user: [<ffffffffa05ef139>](ldiskfs_mb_pa_callback+0x19/0x20 [ldiskfs])
[66242.386245] 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[66242.386573] 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
[66242.604727] BUG: spinlock bad magic on CPU#4, ll_ost_io_121/15373 (Not tainted)
[66242.605217] BUG: unable to handle kernel paging request at ffffe8ffffc07028
[66242.605497] IP: [<ffffffff812772c9>] strnlen+0x9/0x40
[66242.605743] PGD b1b87067 PUD b1b86067 PMD b1b85067 PTE 0
[66242.608625] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[66242.609017] last sysfs file: /sys/devices/system/cpu/possible
[66242.609176] CPU 4 
[66242.609176] Modules linked in: lustre obdfilter ost osd_ldiskfs cmm fsfilt_ldiskfs mdt mdd mds mgs ldiskfs mgc lov osc mdc lmv fid fld lquota ptlrpc obdclass lvfs ksocklnd lnet libcfs jbd exportfs ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_net virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
[66242.609176] 
[66242.609176] Pid: 15373, comm: ll_ost_io_121 Not tainted 2.6.32-debug #6 Bochs Bochs
[66242.609176] RIP: 0010:[<ffffffff812772c9>]  [<ffffffff812772c9>] strnlen+0x9/0x40
[66242.609176] RSP: 0018:ffff88009fd1b130  EFLAGS: 00010086
[66242.609176] RAX: ffffffff81797d7c RBX: ffffffff81ddf600 RCX: 0000000000000002
[66242.609176] RDX: ffffe8ffffc07028 RSI: ffffffffffffffff RDI: ffffe8ffffc07028
[66242.609176] RBP: ffff88009fd1b130 R08: 0000000000000073 R09: 0000000000000000
[66242.609176] R10: 0000000000000007 R11: 000000000000000f R12: ffffffff81ddf236
[66242.609176] R13: ffffe8ffffc07028 R14: 00000000ffffffff R15: 0000000000000000
[66242.609176] FS:  00007fe8f9da0700(0000) GS:ffff880006300000(0000) knlGS:0000000000000000
[66242.609176] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[66242.609176] CR2: ffffe8ffffc07028 CR3: 00000000a3849000 CR4: 00000000000006e0
[66242.609176] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[66242.609176] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[66242.609176] Process ll_ost_io_121 (pid: 15373, threadinfo ffff88009fd1a000, task ffff88009fd18180)
[66242.609176] Stack:
[66242.609176]  ffff88009fd1b170 ffffffff81279160 ffff88009fd1b150 ffffffff81ddf236
[66242.609176] <d> ffffffff817c62fe ffffffff817c62fc ffff88009fd1b2e0 ffffffff81ddf600
[66242.609176] <d> ffff88009fd1b210 ffffffff8127a6a0 0000000000000004 0000000affffffff
[66242.609176] Call Trace:
[66242.609176]  [<ffffffff81279160>] string+0x40/0x100
[66242.609176]  [<ffffffff8127a6a0>] vsnprintf+0x310/0x5f0
[66242.609176]  [<ffffffff8127aa41>] vscnprintf+0x11/0x30
[66242.609176]  [<ffffffff81069cd6>] vprintk+0xc6/0x5a0
[66242.609176]  [<ffffffff814f76ed>] printk+0x41/0x44
[66242.609176]  [<ffffffff81280985>] spin_bug+0xa5/0x100
[66242.609176]  [<ffffffff81280b25>] _raw_spin_lock+0xa5/0x180
[66242.609176]  [<ffffffff814fafde>] _spin_lock+0xe/0x10
[66242.609176]  [<ffffffffa05eeb3c>] ldiskfs_mb_use_preallocated+0x13c/0x230 [ldiskfs]
[66242.609176]  [<ffffffffa05f6d3d>] ldiskfs_mb_new_blocks+0x2bd/0x660 [ldiskfs]
[66242.609176]  [<ffffffff811adc29>] ? __find_get_block+0xa9/0x200
[66242.609176]  [<ffffffffa07cc01a>] ldiskfs_ext_new_extent_cb+0x59a/0x6d0 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffffa05dd6df>] ldiskfs_ext_walk_space+0x14f/0x340 [ldiskfs]
[66242.609176]  [<ffffffffa07cba80>] ? ldiskfs_ext_new_extent_cb+0x0/0x6d0 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffffa07cb798>] fsfilt_map_nblocks+0xd8/0x100 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffffa0377263>] ? start_this_handle+0x433/0x550 [jbd2]
[66242.609176]  [<ffffffffa07cb8d3>] fsfilt_ldiskfs_map_ext_inode_pages+0x113/0x220 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffff81163bf1>] ? kmem_cache_alloc+0x141/0x240
[66242.609176]  [<ffffffffa07cba75>] fsfilt_ldiskfs_map_inode_pages+0x95/0xa0 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffffa0888c8b>] filter_alloc_iobuf+0x8eb/0x11e0 [obdfilter]
[66242.609176]  [<ffffffffa07ca5c8>] ? fsfilt_ldiskfs_brw_start+0x278/0x590 [fsfilt_ldiskfs]
[66242.609176]  [<ffffffffa088a84b>] filter_commitrw_write+0x12cb/0x2d40 [obdfilter]
[66242.609176]  [<ffffffffa0b5edb9>] ? lnet_send+0x3b9/0xa50 [lnet]
[66242.609176]  [<ffffffffa0b5f6ed>] ? LNetGet+0x29d/0x830 [lnet]
[66242.609176]  [<ffffffffa087ddca>] filter_commitrw+0x27a/0x2a0 [obdfilter]
[66242.609176]  [<ffffffff814faf3e>] ? _spin_unlock+0xe/0x10
[66242.609176]  [<ffffffffa0830bbd>] obd_commitrw+0x11d/0x3c0 [ost]
[66242.609176]  [<ffffffffa0838794>] ost_brw_write+0xcc4/0x1600 [ost]
[66242.609176]  [<ffffffff8127a646>] ? vsnprintf+0x2b6/0x5f0
[66242.609176]  [<ffffffffa11d4080>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
[66242.609176]  [<ffffffffa083dc86>] ost_handle+0x2b76/0x4280 [ost]
[66242.609176]  [<ffffffffa0abe524>] ? libcfs_id2str+0x74/0xb0 [libcfs]
[66242.609176]  [<ffffffffa1221789>] ptlrpc_main+0xc79/0x1aa0 [ptlrpc]
[66242.609176]  [<ffffffffa1220b10>] ? ptlrpc_main+0x0/0x1aa0 [ptlrpc]
[66242.609176]  [<ffffffff8100c14a>] child_rip+0xa/0x20
[66242.609176]  [<ffffffffa1220b10>] ? ptlrpc_main+0x0/0x1aa0 [ptlrpc]
[66242.609176]  [<ffffffffa1220b10>] ? ptlrpc_main+0x0/0x1aa0 [ptlrpc]
[66242.609176]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
[66242.609176] Code: 66 90 48 83 c2 01 80 3a 00 75 f7 48 89 d0 48 29 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 f6 48 89 e5 74 2e <80> 3f 00 74 29 48 83 ee 01 48 89 f8 eb 12 66 0f 1f 84 00 00 00 
[66242.609176] RIP  [<ffffffff812772c9>] strnlen+0x9/0x40
[66242.609176]  RSP <ffff88009fd1b130>
[66242.609176] CR2: ffffe8ffffc07028

crashdump and modules in /exports/crashdumps/192.168.10.220-2013-01-03-12\:15\:57

Comment by Niu Yawei (Inactive) [ 04/Jan/13 ]

This two crash looks little bit different, the master crashed in ldiskfs_ext_calc_credits_for_insert(), it seems because we didn't protect this function call by i_data_sem. (see ldiskfs_ext_new_extent_cb()). The 2.1 crashed in ldiskfs_mb_use_preallocated(), I'm not sure the reason yet, what's the kernel version did you run 2.1 test for?

BTW: where is the /exports/crashdumps located? Thanks.

Comment by Oleg Drokin [ 05/Jan/13 ]

if you think these are different, I can file another bug for the 2.1 crash.
the dir is on my test cluster

Comment by Niu Yawei (Inactive) [ 05/Jan/13 ]

Yes, please file another ticket for the 2.1 crash. It crashed on 3334 of mballoc.c, looks like some race on the lg_prealloc_list.

Comment by Niu Yawei (Inactive) [ 05/Jan/13 ]

In ldiskfs_mb_add_n_trim():

        /* Add the prealloc space to lg */
        rcu_read_lock();
        list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order],
                                                pa_inode_list) {
                spin_lock(&tmp_pa->pa_lock);
                if (tmp_pa->pa_deleted) {
                        spin_unlock(&tmp_pa->pa_lock);
                        continue;
                }
                if (!added && pa->pa_free < tmp_pa->pa_free) {
                        /* Add to the tail of the previous entry */
                        list_add_tail_rcu(&pa->pa_inode_list,
                                                &tmp_pa->pa_inode_list);
                        added = 1;
                        /*
                         * we want to count the total
                         * number of entries in the list
                         */
                }
                spin_unlock(&tmp_pa->pa_lock);
                lg_prealloc_count++;
        }
        if (!added)
                list_add_tail_rcu(&pa->pa_inode_list,
                                        &lg->lg_prealloc_list[order]);
        rcu_read_unlock();

Looks it doesn't take the lg_prealloc_lock when changing the lg_prealloc_list, not sure if it's correct. Alex, any comments? Thanks.

Comment by Niu Yawei (Inactive) [ 05/Jan/13 ]

patch against master for the first crash: http://review.whamcloud.com/4957

Comment by Alex Zhuravlev [ 05/Jan/13 ]

Niu, for the 4957 I'm not sure.. why do you think the path is invalid ? in which specific place?

as for the second problem I agree the list modification must be protected with the spinlock like in other places.

Comment by Niu Yawei (Inactive) [ 05/Jan/13 ]

Niu, for the 4957 I'm not sure.. why do you think the path is invalid ? in which specific place?

ext3_ext_calc_credits_for_insert() is accessing the tree 'path', however, the extent tree could probably has been changed (depth could be changed), so the path[depth] could be invalid, no?

Comment by Alex Zhuravlev [ 05/Jan/13 ]

ldiskfs_ext_calc_credits_for_insert():
depth = ext_depth(inode);

and depth can change by this moment. instead we should be taking it from path->p_depth like ldiskfs_ext_drop_refs()

though I'm pretty confused this hasn't happened before...

Comment by Niu Yawei (Inactive) [ 06/Jan/13 ]

patch against master for the second crash: http://review.whamcloud.com/4961

Comment by Niu Yawei (Inactive) [ 08/Jan/13 ]

patch for b2_1 (first crash): http://review.whamcloud.com/4977

Comment by Niu Yawei (Inactive) [ 08/Jan/13 ]

patch for b2_1 (second crash): http://review.whamcloud.com/4978

Comment by Andreas Dilger [ 18/Feb/13 ]

The patches for master are landed. The two patches for b2_1 are ready to land.

Comment by Peter Jones [ 18/Feb/13 ]

Landed for 2.4

Generated at Sat Feb 10 01:26:11 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.