Details
Bug
Resolution: Fixed
Major
Lustre 2.10.0
Description
This is from DDN-777. oss2c106 hit the following crash:
[17555139.368778] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[17555139.371091] IP: [<ffffffffc08c165d>] class_export_get+0xd/0x90 [obdclass]
[17555139.373230] PGD 0
[17555139.374784] Oops: 0002 [#1] SMP
[17555139.376492] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_ssse3 sha512_generic crypto_null libcfs(OE) nfsv3 nfs fscache bonding rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) ptp pps_core mlx4_ib(OE) ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink sb_edac edac_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper sg i6300esb joydev ppdev parport_pc parport sfablkdriver(OE) cryptd i2c_piix4 pcspkr nfsd auth_rpcgss nfs_acl lockd knem(OE) grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod sr_mod
[17555139.393461] cdrom crc_t10dif crct10dif_generic ata_generic pata_acpi bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix libata e1000 crct10dif_pclmul crct10dif_common crc32c_intel igbvf i2c_core floppy serio_raw dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mlxfw]
[17555139.400726] CPU: 8 PID: 24841 Comm: lfsck Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.2.7.21.3.ddn19.g2ba2cd7.x86_64 #1
[17555139.404837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[17555139.408957] task: ffff88081580dee0 ti: ffff880dee718000 task.ti: ffff880dee718000
[17555139.411193] RIP: 0010:[<ffffffffc08c165d>] [<ffffffffc08c165d>] class_export_get+0xd/0x90 [obdclass]
[17555139.413660] RSP: 0018:ffff880dee71bba8 EFLAGS: 00010286
[17555139.415636] RAX: ffff8814538b800c RBX: 0000000000000000 RCX: 000000000000001a
[17555139.417801] RDX: 000000000000000e RSI: ffff880dee71bbd0 RDI: 0000000000000000
[17555139.419941] RBP: ffff880dee71bbb0 R08: 0000000000000031 R09: 0000000000000003
[17555139.422077] R10: 0000000000000000 R11: 000000000000000f R12: ffff881660650fc0
[17555139.424199] R13: ffff88149a1dadb4 R14: ffff881660650fd0 R15: ffff88145435a000
[17555139.426297] FS: 0000000000000000(0000) GS:ffff881664000000(0000) knlGS:0000000000000000
[17555139.428493] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17555139.430408] CR2: 0000000000000040 CR3: 0000000001a06000 CR4: 00000000003607e0
[17555139.432470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17555139.434498] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[17555139.436498] Call Trace:
[17555139.437987] [<ffffffffc091ee11>] lustre_find_lwp_by_index+0x161/0x190 [obdclass]
[17555139.440034] [<ffffffffc10dad69>] lfsck_layout_slave_notify_master+0x199/0x9a0 [lfsck]
[17555139.442104] [<ffffffffc10e0a97>] lfsck_layout_slave_double_scan+0xf7/0xe40 [lfsck]
[17555139.444121] [<ffffffff810c7c70>] ? wake_up_state+0x20/0x20
[17555139.445889] [<ffffffffc10a312f>] lfsck_double_scan+0x5f/0x210 [lfsck]
[17555139.447751] [<ffffffffc10479ed>] ? osd_otable_it_fini+0x1ad/0x3b0 [osd_ldiskfs]
[17555139.449695] [<ffffffff811e3896>] ? kfree+0x106/0x140
[17555139.451391] [<ffffffffc10a88b6>] lfsck_master_engine+0x446/0x13f0 [lfsck]
[17555139.453276] [<ffffffff810c7c70>] ? wake_up_state+0x20/0x20
[17555139.455000] [<ffffffffc10a8470>] ? lfsck_master_oit_engine+0x1d10/0x1d10 [lfsck]
[17555139.456932] [<ffffffff810b4031>] kthread+0xd1/0xe0
[17555139.458556] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[17555139.460299] [<ffffffff816c1577>] ret_from_fork+0x77/0xb0
[17555139.461949] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
[17555139.463658] Code: 85 3d e4 4b f0 ff 55 48 89 e5 74 0c 8b 05 dc 4b f0 ff c1 e8 05 83 e0 01 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb <f0> ff 47 40 f6 05 b4 4b f0 ff 40 74 09 f6 05 af 4b f0 ff 20 75
[17555139.469009] RIP [<ffffffffc08c165d>] class_export_get+0xd/0x90 [obdclass]
[17555139.470849] RSP <ffff880dee71bba8>
[17555139.472275] CR2: 0000000000000040
This looks like an lfsck bug. I'll first check whether it is already fixed in master.
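One possible reading of the oops, offered as a sketch rather than a confirmed diagnosis: the faulting bytes `<f0> ff 47 40` decode to `lock incl 0x40(%rdi)`, RDI is 0, and CR2 is 0x40. That is consistent with class_export_get() being handed a NULL pointer by lustre_find_lwp_by_index() and atomically incrementing a counter located 0x40 bytes into the object. The C fragment below only illustrates that pattern. `struct demo_export`, `demo_export_get()` and `demo_export_get_checked()` are hypothetical names, the layout is not the real Lustre structure, and the guarded variant is not claimed to be the fix that landed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an export object; NOT the real Lustre layout.
 * The padding only places the reference count at offset 0x40, matching
 * the faulting address (CR2) reported in the oops. */
struct demo_export {
    char       pad[0x40];
    atomic_int refcount;
};

static_assert(offsetof(struct demo_export, refcount) == 0x40,
              "refcount must sit at offset 0x40 for this illustration");

/* Unguarded getter: with exp == NULL, the atomic increment would write to
 * address 0x40, as "lock incl 0x40(%rdi)" does when RDI == 0. */
static struct demo_export *demo_export_get(struct demo_export *exp)
{
    atomic_fetch_add(&exp->refcount, 1);
    return exp;
}

/* Guarded variant: check the pointer before taking a reference. */
static struct demo_export *demo_export_get_checked(struct demo_export *exp)
{
    if (exp == NULL)
        return NULL;
    return demo_export_get(exp);
}
```

In this sketch a caller that may hold a NULL pointer would go through `demo_export_get_checked()` and handle a NULL return. Whether the real code should guard in the caller or avoid reaching the lookup with a NULL export is a question for the actual lfsck fix.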