[LU-12097] general protection fault in osd_idc_find_and_init Created: 21/Mar/19 Updated: 10/Jun/19 Resolved: 18/May/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Olaf Faaland | Assignee: | Sebastien Buisson |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
Crash occurred on MDT0001 (2 MDS system) |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
An lctl issued on the MDS triggered a crash. Probably lctl changelog_deregister. From the console log: [1309661.423362] general protection fault: 0000 [#1] SMP [1309661.429116] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ko2iblnd(OE) lnet(OE) libcfs(OE) ib_ucm iw_cxgb4 iw_cxgb3 rpcrdma rdma_ucm ib_umad ib_uverbs ib_ipoib ib_iser rdma_cm iw_cm ib_cm iTCO_wdt iTCO_vendor_support mlx4_ib ib_core mlx4_en sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass joydev pcspkr dm_round_robin i2c_i801 ioatdma lpc_ich ses enclosure mlx4_core sg devlink ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq sch_fq_codel zfs(POE) binfmt_misc zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) msr_safe(OE) ip_tables nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache overlay(T) ext4 mbcache jbd2 dm_service_time sd_mod crc_t10dif crct10dif_generic [1309661.510303] be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i crct10dif_pclmul cxgb3 crct10dif_common crc32_pclmul crc32c_intel mgag200 ghash_clmulni_intel mdio 8021q drm_kms_helper syscopyarea garp libcxgbi sysfillrect aesni_intel sysimgblt mrp fb_sys_fops ttm lrw stp gf128mul llc libcxgb glue_helper ablk_helper drm qla4xxx cryptd isci ahci iscsi_boot_sysfs igb libsas libahci dca mpt2sas ptp libata drm_panel_orientation_quirks pps_core i2c_algo_bit raid_class scsi_transport_sas wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi CPU: 4 PID: 113823 Comm: lctl Kdump: loaded Tainted: P OE ------------ T 3.10.0-957.5.1.3chaos.ch6.x86_64 #1 Hardware name: appro gb812x-mds-llnl/S2600JF, BIOS SE5C600.86B.02.06.0002.101320150901 10/13/2015 task: ffff95c37d6d1040 ti: ffff95d09fe30000 task.ti: ffff95d09fe30000 RIP: 0010:[<ffffffffc179fa6c>] [<ffffffffc179fa6c>] osd_idc_find_and_init+0x3c/0x80 [osd_zfs] RSP: 0018:ffff95d09fe336d8 EFLAGS: 00010206 RAX: ffff95d937a80820 RBX: ffff95d8ff0184e0 RCX: 0000ffffffffffff RDX: 5a5a5a5a5a5a5a5a RSI: ffffffffc17b18e0 RDI: ffff95d09fe33d58 RBP: ffff95d09fe336f8 R08: 0000000000025c40 R09: ffff95d8ff01b810 R10: 0000000000003e00 R11: ffffcc545f8d4e00 R12: ffff95d09fe33d58 R13: ffff95db04b98000 R14: ffff95d8ff019c08 R15: ffff95d8ff018568 FS: 00002aaaaab111c0(0000) GS:ffff95cb3e100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6b9db3bdb8 CR3: 0000000deb844000 CR4: 00000000000607e0 Call Trace: [<ffffffffc1793f79>] osd_declare_destroy+0x309/0x410 [osd_zfs] [<ffffffffc1303571>] llog_osd_declare_destroy+0x331/0x650 [obdclass] [<ffffffffc12ecd64>] llog_declare_destroy+0x54/0x190 [obdclass] [<ffffffffc12ef9c8>] llog_cancel_rec+0x108/0x880 [obdclass] [<ffffffffc12f6840>] llog_cat_cancel_records+0x1d0/0x3d0 [obdclass] [<ffffffffc19d1784>] llog_changelog_cancel_cb+0xe4/0x1d0 [mdd] [<ffffffffc12f09bb>] llog_process_thread+0x87b/0x1470 [obdclass] [<ffffffffc1314707>] ? lprocfs_oh_tally+0x17/0x50 [obdclass] [<ffffffffc19d16a0>] ? mdd_hsm_actions_llog_fini+0x1a0/0x1a0 [mdd] [<ffffffffc12f166c>] llog_process_or_fork+0xbc/0x450 [obdclass] [<ffffffffc12f6e7d>] llog_cat_process_cb+0x43d/0x4e0 [obdclass] [<ffffffffc12f09bb>] llog_process_thread+0x87b/0x1470 [obdclass] [<ffffffffc12f6a40>] ? llog_cat_cancel_records+0x3d0/0x3d0 [obdclass] [<ffffffffc12f166c>] llog_process_or_fork+0xbc/0x450 [obdclass] [<ffffffffc12f6a40>] ? llog_cat_cancel_records+0x3d0/0x3d0 [obdclass] [<ffffffffc12f5f19>] llog_cat_process_or_fork+0x199/0x2a0 [obdclass] [<ffffffffc19d16a0>] ? mdd_hsm_actions_llog_fini+0x1a0/0x1a0 [mdd] [<ffffffffc12f604e>] llog_cat_process+0x2e/0x30 [obdclass] [<ffffffffc19d0a48>] llog_changelog_cancel+0x58/0x1d0 [mdd] [<ffffffffc19d24e0>] ? mdd_changelog_write_header+0x40/0x4b0 [mdd] [<ffffffffc12f7a07>] llog_cancel+0x57/0x250 [obdclass] [<ffffffffc19d2b59>] mdd_changelog_llog_cancel+0xd9/0x270 [mdd] [<ffffffffc19d5edf>] mdd_iocontrol+0x13af/0x16f0 [mdd] [<ffffffffc18968dc>] mdt_iocontrol+0x5ec/0xb00 [mdt] [<ffffffffc1305789>] class_handle_ioctl+0x19a9/0x1e50 [obdclass] [<ffffffffb15f6c8d>] ? handle_mm_fault+0x39d/0x9b0 [<ffffffffb170d1fe>] ? security_capable+0x1e/0x20 [<ffffffffc12ea5d2>] obd_class_ioctl+0xd2/0x170 [obdclass] [<ffffffffb1664b60>] do_vfs_ioctl+0x3d0/0x600 [<ffffffffb1b936fb>] ? __do_page_fault+0x23b/0x550 [<ffffffffb1664e31>] SyS_ioctl+0xa1/0xc0 [<ffffffffb1b98f5b>] system_call_fastpath+0x22/0x27 [<ffffffffb1b98ea1>] ? system_call_after_swapgs+0xae/0x146 |
| Comments |
| Comment by Olaf Faaland [ 21/Mar/19 ] |
|
For our own tracking: local ticket is TOSS-4474 |
| Comment by Peter Jones [ 22/Mar/19 ] |
|
Sébastien Could you please advise here? Thanks Peter |
| Comment by Sebastien Buisson [ 25/Mar/19 ] |
|
Would you by any chance be able to share a crash dump? My impression is, if this problem relates to other ticket Could you try to read the catalog file with llog_reader? llog_reader /zfs_mount/changelog_catalog It should output something like this: Header size : 8192 Time : Mon Mar 25 15:36:53 2019 Number of records: 4 Target uuid : ----------------------- #01 (064)id=[0x6:0x1:0x0]:0 path=oi.1/0x1:0x6:0x0 #02 (064)id=[0x7:0x1:0x0]:0 path=oi.1/0x1:0x7:0x0 #03 (064)id=[0x8:0x1:0x0]:0 path=oi.1/0x1:0x8:0x0 #04 (064)id=[0x9:0x1:0x0]:0 path=oi.1/0x1:0x9:0x0 Then it would be interesting to see if all the llog files referenced as path in the catalog file do exist, and if they do contain valid changelog records that can be read with llog_reader. for i in 6 7 8 9; do llog_reader /zfs_mount/oi.1/0x1:0x${i}:0x0; done
Thanks. |
| Comment by Olaf Faaland [ 25/Mar/19 ] |
|
Hi Sebastien, |
| Comment by Olaf Faaland [ 28/Mar/19 ] |
|
Hi Sebastien, The MDT that crashed with this stack has 2911 entries in changelog_catalog. The llog with the fewest records has 11590; the llog with the most records has 64101. ---------------- eporter82 ---------------- Changelog_catalog header: Header size : 8192 Number of records: 2911 Missing changelog files: 0 Existing changelog files: 2911 checking oi.1/0x1:0x2e14f:0x0 checking oi.1/0x1:0x2e150:0x0 <... 2909 rows removed ...> all files checked MAX 64101 MIN 11590 TOT 185533101 AVG 63735 |
| Comment by Olaf Faaland [ 10/Apr/19 ] |
|
Hi Sebastien, |
| Comment by Sebastien Buisson [ 11/Apr/19 ] |
|
That is good to hear, thanks. |
| Comment by Sebastien Buisson [ 03/May/19 ] |
|
So, I think that the issue is now fixed on site, and the actual fix in Lustre has been pushed to Gerrit under Please correct me if I am wrong. |
| Comment by Olaf Faaland [ 03/May/19 ] |
I do not see in the ticket that we ever determined why osd_idc_find_and_init() attempted to dereference an invalid pointer. There was speculation that the changelog_catalog was incoherent, but I was able to read the catalog with llog_reader, confirm that it pointed to existing llog files, and read those llog files with llog_reader to confirm they contained changelog type records. So I think that rules out the incoherency theory. But maybe I missed something - if so, let me know. Thanks! |
| Comment by Sebastien Buisson [ 06/May/19 ] |
|
Correct, I have not been able to find the origin of the invalid pointer dereference by osd_idc_find_and_init(). I was not able to reproduce locally, and unfortunately you are not able to share your crash dump. As I read this ticket again, I realize that this is in fact the 'lctl changelog_deregister' command that initially triggered the crash, and then left "orphaned" changelog entries. So I agree it is most likely that the crash is not due to an incoherent changelog. I need to look again at the code to see if I can see something odd. Thanks, |
| Comment by Olaf Faaland [ 06/May/19 ] |
|
Hi Sebastien, I get the following from inspecting the kmods with gdb. From zfs-osd: (gdb) l *(osd_idc_find_and_init+0x3c) 0x14a9c is in osd_idc_find_and_init (/usr/src/debug/lustre-2.10.6_2.chaos/lustre/osd-zfs/osd_oi.c:896). 891 idc = osd_idc_add(env, osd, fid); 892 if (IS_ERR(idc)) 893 return PTR_ERR(idc); 894 895 if (obj->oo_dn) 896 idc->oic_dnode = obj->oo_dn->dn_object; 897 898 return 0; 899 } and (gdb) l *(osd_declare_destroy+0x309) 0x8fa9 is in osd_declare_destroy (/usr/src/debug/lustre-2.10.6_2.chaos/lustre/osd-zfs/osd_object.c:603). 598 599 /* will help to find FID->ino when this object is being 600 * added to PENDING/ */ 601 osd_idc_find_and_init(env, osd, obj); 602 603 RETURN(0); 604 } 605 606 static int osd_destroy(const struct lu_env *env, struct dt_object *dt, 607 struct thandle *th) and below that in obdclass: (gdb) l *(llog_osd_declare_destroy+0x331)
0x195a1 is in llog_osd_declare_destroy (/usr/src/debug/lustre-2.10.6_2.chaos/lustre/obdclass/llog_osd.c:1722).
1717 rc = dt_declare_ref_del(env, o, th);
1718 if (rc < 0)
1719 GOTO(out_put, rc);
1720
1721 rc = dt_declare_destroy(env, o, th);
1722 if (rc < 0)
1723 GOTO(out_put, rc);
1724
1725 if (loghandle->lgh_ctxt->loc_flags & LLOG_CTXT_FLAG_NORMAL_FID) {
1726 rc = llog_osd_regular_fid_del_name_entry(env, o, th, true);
and (gdb) l *(llog_declare_destroy+0x54)
0x2d94 is in llog_declare_destroy (/usr/src/debug/lustre-2.10.6_2.chaos/libcfs/include/libcfs/libcfs_debug.h:219).
214 /**
215 * Filters out logging messages based on mask and subsystem.
216 */
217 static inline int cfs_cdebug_show(unsigned int mask, unsigned int subsystem)
218 {
219 return mask & D_CANTMASK ||
220 ((libcfs_debug & mask) && (libcfs_subsystem_debug & subsystem));
221 }
222
223 # define __CDEBUG(cdls, mask, format, ...) \
and (gdb) l *(llog_cancel_rec+0x108)
0x59f8 is in llog_cancel_rec (/usr/src/debug/lustre-2.10.6_2.chaos/lustre/obdclass/llog.c:231).
226 if (rc < 0)
227 GOTO(out_trans, rc);
228
229 if ((llh->llh_flags & LLOG_F_ZAP_WHEN_EMPTY)) {
230 rc = llog_declare_destroy(env, loghandle, th);
231 if (rc < 0)
232 GOTO(out_trans, rc);
233 }
234
235 th->th_wait_submit = 1;
|
| Comment by Sebastien Buisson [ 10/May/19 ] |
|
Thanks Olaf, this is really helpful. It seems the problem stems from the data structure pointed to by obj in osd_idc_find_and_init, that is being corrupted. In order to find out if the object is already freed, would you be able to dump the content of the obj variable as a struct osd_object? I am interested in the value of the oo_destroyed field in particular. Thanks, |
| Comment by Olaf Faaland [ 10/May/19 ] |
|
Hi Sebastien, I'm unable to get that struct dump for you, sorry. The core dump was not written completely because Pacemaker powered the node off while kdump was writing it out. Sounds like we may have gone as far with this as we can - if so feel free to close the ticket. Thanks
|
| Comment by Peter Jones [ 18/May/19 ] |
|
Seems like we're stuck on this one for the time being |