[LU-7061] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: osd_scrub_refresh_mapping+0x39d/0x410 Created: 31/Aug/15  Updated: 04/Dec/19  Resolved: 10/Sep/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.8.0, Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: Andriy Skulysh Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3

 Description   

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: [] osd_scrub_refresh_mapping+0x39d/0x410 [osd_ldiskfs]
PGD 424f4067 PUD 379e9067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0b.0/virtio3/block/vdb/queue/scheduler
CPU 0
Modules linked in: osp(U) mdd(U) lfsck(U) lod(U) mdt(U) mgs(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipt_REJECT ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 virtio_balloon virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_defrag_ipv4]
Pid: 7379, comm: mount.lustre Not tainted 2.6.32-431.17.1.el6_lustreb_neo_stable_us_unlabeled #1 Red Hat KVM
RIP: 0010:[] [] osd_scrub_refresh_mapping+0x39d/0x410 [osd_ldiskfs]
RSP: 0018:ffff88005b9af858 EFLAGS: 00010202
RAX: 00000000ffffffff RBX: ffff88005b8b8000 RCX: 0000000000000000
RDX: ffffffffa0b97b49 RSI: ffffffffa0b9c2e8 RDI: ffffffffa0bb4820
RBP: ffff88005b9af8b8 R08: 20737365636f7250 R09: 0000000000000001
R10: 20737365636f7250 R11: 0a64657265746e65 R12: 00000000ffffffe2
R13: 0000000000000001 R14: ffffffffffffffe2 R15: ffffffffa0b97f88
FS: 00007f21e28ec700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000004 CR3: 000000005df01000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount.lustre (pid: 7379, threadinfo ffff88005b9ae000, task ffff88005b9caaa0)
Stack:
ffffffffa0b9ee9c ffff88005b8b2000 ffff88005b9af898 ffffffffffffffe2
ffff88005b9af801 0000000000000000 ffff8800414c43c8 ffff88005b8b8000
ffffffffa0b97fb0 ffff88005b8b2000 ffff8800414c43c0 ffff88005d5c2d80
Call Trace:
[] osd_scrub_setup+0xe25/0xf30 [osd_ldiskfs]
[] ? lfsck_key_init+0xd9/0x190 [lfsck]
[] osd_device_alloc+0x717/0x990 [osd_ldiskfs]
[] obd_setup+0x1bf/0x290 [obdclass]
[] class_setup+0x208/0x870 [obdclass]
[] class_process_config+0xc5c/0x1ac0 [obdclass]
[] ? libcfs_log_return+0x28/0x40 [libcfs]
[] ? lustre_cfg_new+0x40b/0x6f0 [obdclass]
[] do_lcfg+0x158/0x450 [obdclass]
[] lustre_start_simple+0x94/0x200 [obdclass]
[] server_fill_super+0x1061/0x1b92 [obdclass]
[] ? libcfs_log_return+0x28/0x40 [libcfs]
[] lustre_fill_super+0x1d8/0x550 [obdclass]
[] ? lustre_fill_super+0x0/0x550 [obdclass]
[] get_sb_nodev+0x5f/0xa0
[] lustre_get_sb+0x25/0x30 [obdclass]
[] vfs_kern_mount+0x7b/0x1b0
[] do_kern_mount+0x52/0x130
[] do_mount+0x2fb/0x930
[] sys_mount+0x90/0xe0
[] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 20 48 bb a0 c7 05 71 d6 02 00 75 00 00 00 48 c7 05 72 d6 02 00 00 00 00 00 c7 05 60 d6 02 00 00 00 00 10 44 89 74 24 18 <8b> 41 04 48 8b 53 28 45 8b 4f 08 4d 8b 07 89 44 24 10 8b 01 48
RIP [] osd_scrub_refresh_mapping+0x39d/0x410 [osd_ldiskfs]
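
A hedged reading of the dump above, as a small C sketch. The faulting bytes `<8b> 41 04` decode as `mov 0x4(%rcx),%eax`, and RCX is 0, which matches the fault address 0x4 in CR2. R14 holds 0xffffffffffffffe2, which is -30 as a signed value, and 30 is EROFS on Linux. That R14 carries the failed call's return code is an assumption drawn from the register values, not something the ticket states. The helper names are illustrative only.

```c
#include <errno.h>
#include <stdint.h>

/* Reinterpret a 64-bit register value as a signed return code.
 * Relies on two's-complement conversion, as gcc and clang provide. */
static int64_t reg_to_rc(uint64_t reg)
{
	return (int64_t)reg;
}

/* Extract the r/m field (low three bits) of an x86 ModRM byte.
 * With mod = 01, r/m = 1 selects RCX as the base register. */
static unsigned int modrm_rm(uint8_t modrm)
{
	return modrm & 0x7u;
}

/* Effective address of "mov disp8(%base),%eax": opcode 8b, ModRM byte,
 * then a sign-extended 8-bit displacement in insn[2]. */
static uint64_t fault_address(uint64_t base, const uint8_t insn[3])
{
	return base + (uint64_t)(int64_t)(int8_t)insn[2];
}
```

Calling `reg_to_rc(0xffffffffffffffe2ULL)` gives -30, and `fault_address(0, insn)` with `insn = {0x8b, 0x41, 0x04}` gives 0x4, in line with CR2.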



 Comments   
Comment by Gerrit Updater [ 31/Aug/15 ]

Andriy Skulysh (andriy.skulysh@seagate.com) uploaded a new patch: http://review.whamcloud.com/16138
Subject: LU-7061 osd-ldiskfs: NULL pointer dereference in osd_scrub_refresh_mapping
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c57b3a36b97165a02a316289a70169ea9a11cb45

Comment by Andriy Skulysh [ 31/Aug/15 ]

patch: http://review.whamcloud.com/#/c/16138/

Comment by Peter Jones [ 02/Sep/15 ]

Fan Yong

Could you please review this proposed fix?

Thanks

Peter

Comment by nasf (Inactive) [ 03/Sep/15 ]

The patch looks good. The BUG is caused by a NULL pointer access in the osd_journal_start_sb() failure handler. I would also like to know what caused osd_journal_start_sb() to fail before the NULL pointer access; maybe EROFS? Andriy, would you please show the new OI scrub debug logs after applying your patch? Thanks!
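
The failure mode described in this comment can be sketched in user-space C. This is not the Lustre code and not the content of the patch: `journal_start`, `format_failure`, `refresh_mapping`, and `struct inode_id` are hypothetical stand-ins, and the ticket does not say which pointer was NULL. The sketch only shows the general shape, an error path that must not dereference a pointer the caller may have left NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical identifier that the caller may or may not supply. */
struct inode_id {
	unsigned int ino;
	unsigned int gen;
};

/* Hypothetical stand-in for a journal start that fails on a read-only fs. */
static int journal_start(int read_only)
{
	return read_only ? -EROFS : 0;
}

/* Build the failure message without assuming that id is non-NULL. */
static int format_failure(char *buf, size_t len,
			  const struct inode_id *id, int rc)
{
	if (id != NULL)
		return snprintf(buf, len, "start trans failed for %u/%u: rc = %d",
				id->ino, id->gen, rc);
	return snprintf(buf, len, "start trans failed: rc = %d", rc);
}

/* Start the transaction; on failure, log safely and return the error. */
static int refresh_mapping(const struct inode_id *id, int read_only,
			   char *buf, size_t len)
{
	int rc = journal_start(read_only);

	buf[0] = '\0';
	if (rc != 0) {
		/* The error path is where an unchecked id->ino would crash. */
		format_failure(buf, len, id, rc);
		return rc;
	}
	return 0;
}
```

With a NULL `id` and a read-only device, `refresh_mapping` returns `-EROFS` and still produces a message instead of faulting.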

Comment by Gerrit Updater [ 10/Sep/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16138/
Subject: LU-7061 osd-ldiskfs: NULL pointer dereference in osd_scrub_refresh_mapping
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c0dafc483ccc7d0200adeec1a2b187644feed74a

Comment by Joseph Gmitter (Inactive) [ 10/Sep/15 ]

Landed for 2.8.0

Comment by Niu Yawei (Inactive) [ 10/Oct/15 ]

Hit this again: https://testing.hpdd.intel.com/test_sets/5d97088a-6e55-11e5-b983-5254006e85c2

Comment by Gerrit Updater [ 03/Jun/16 ]

Kit Westneat (kit.westneat@gmail.com) uploaded a new patch: http://review.whamcloud.com/20620
Subject: LU-7061 osd-ldiskfs: NULL pointer in osd_scrub_refresh_mapping
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2a19bd52993056fe0bc878074e8ee727cc25c38b

Comment by Gerrit Updater [ 15/Aug/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20620/
Subject: LU-7061 osd-ldiskfs: NULL pointer in osd_scrub_refresh_mapping
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e13923f50d81f8923fbb10df4446666d725261c3
