Details
-
Bug
-
Resolution: Incomplete
-
Minor
-
None
-
Lustre 2.10.6
-
Crash occurred on MDT0001 (2 MDS system)
zfs-0.7.11-5llnl.ch6
lustre-2.10.6_2.chaos
See https://github.com/llnl/lustre and https://github.com/llnl/zfs
Linux version 3.10.0-957.5.1.3chaos.ch6.x86_64
RHEL 7.6 based distro
-
3
Description
An lctl command issued on the MDS triggered a crash. The command was probably lctl changelog_deregister. From the console log:
[1309661.423362] general protection fault: 0000 [#1] SMP
[1309661.429116] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ko2iblnd(OE) lnet(OE) libcfs(OE) ib_ucm iw_cxgb4 iw_cxgb3 rpcrdma rdma_ucm ib_umad ib_uverbs ib_ipoib ib_iser rdma_cm iw_cm ib_cm iTCO_wdt iTCO_vendor_support mlx4_ib ib_core mlx4_en sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass joydev pcspkr dm_round_robin i2c_i801 ioatdma lpc_ich ses enclosure mlx4_core sg devlink ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq sch_fq_codel zfs(POE) binfmt_misc zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) msr_safe(OE) ip_tables nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache overlay(T) ext4 mbcache jbd2 dm_service_time sd_mod crc_t10dif crct10dif_generic
[1309661.510303] be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i crct10dif_pclmul cxgb3 crct10dif_common crc32_pclmul crc32c_intel mgag200 ghash_clmulni_intel mdio 8021q drm_kms_helper syscopyarea garp libcxgbi sysfillrect aesni_intel sysimgblt mrp fb_sys_fops ttm lrw stp gf128mul llc libcxgb glue_helper ablk_helper drm qla4xxx cryptd isci ahci iscsi_boot_sysfs igb libsas libahci dca mpt2sas ptp libata drm_panel_orientation_quirks pps_core i2c_algo_bit raid_class scsi_transport_sas wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
CPU: 4 PID: 113823 Comm: lctl Kdump: loaded Tainted: P OE ------------ T 3.10.0-957.5.1.3chaos.ch6.x86_64 #1
Hardware name: appro gb812x-mds-llnl/S2600JF, BIOS SE5C600.86B.02.06.0002.101320150901 10/13/2015
task: ffff95c37d6d1040 ti: ffff95d09fe30000 task.ti: ffff95d09fe30000
RIP: 0010:[<ffffffffc179fa6c>] [<ffffffffc179fa6c>] osd_idc_find_and_init+0x3c/0x80 [osd_zfs]
RSP: 0018:ffff95d09fe336d8 EFLAGS: 00010206
RAX: ffff95d937a80820 RBX: ffff95d8ff0184e0 RCX: 0000ffffffffffff
RDX: 5a5a5a5a5a5a5a5a RSI: ffffffffc17b18e0 RDI: ffff95d09fe33d58
RBP: ffff95d09fe336f8 R08: 0000000000025c40 R09: ffff95d8ff01b810
R10: 0000000000003e00 R11: ffffcc545f8d4e00 R12: ffff95d09fe33d58
R13: ffff95db04b98000 R14: ffff95d8ff019c08 R15: ffff95d8ff018568
FS: 00002aaaaab111c0(0000) GS:ffff95cb3e100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6b9db3bdb8 CR3: 0000000deb844000 CR4: 00000000000607e0
Call Trace:
[<ffffffffc1793f79>] osd_declare_destroy+0x309/0x410 [osd_zfs]
[<ffffffffc1303571>] llog_osd_declare_destroy+0x331/0x650 [obdclass]
[<ffffffffc12ecd64>] llog_declare_destroy+0x54/0x190 [obdclass]
[<ffffffffc12ef9c8>] llog_cancel_rec+0x108/0x880 [obdclass]
[<ffffffffc12f6840>] llog_cat_cancel_records+0x1d0/0x3d0 [obdclass]
[<ffffffffc19d1784>] llog_changelog_cancel_cb+0xe4/0x1d0 [mdd]
[<ffffffffc12f09bb>] llog_process_thread+0x87b/0x1470 [obdclass]
[<ffffffffc1314707>] ? lprocfs_oh_tally+0x17/0x50 [obdclass]
[<ffffffffc19d16a0>] ? mdd_hsm_actions_llog_fini+0x1a0/0x1a0 [mdd]
[<ffffffffc12f166c>] llog_process_or_fork+0xbc/0x450 [obdclass]
[<ffffffffc12f6e7d>] llog_cat_process_cb+0x43d/0x4e0 [obdclass]
[<ffffffffc12f09bb>] llog_process_thread+0x87b/0x1470 [obdclass]
[<ffffffffc12f6a40>] ? llog_cat_cancel_records+0x3d0/0x3d0 [obdclass]
[<ffffffffc12f166c>] llog_process_or_fork+0xbc/0x450 [obdclass]
[<ffffffffc12f6a40>] ? llog_cat_cancel_records+0x3d0/0x3d0 [obdclass]
[<ffffffffc12f5f19>] llog_cat_process_or_fork+0x199/0x2a0 [obdclass]
[<ffffffffc19d16a0>] ? mdd_hsm_actions_llog_fini+0x1a0/0x1a0 [mdd]
[<ffffffffc12f604e>] llog_cat_process+0x2e/0x30 [obdclass]
[<ffffffffc19d0a48>] llog_changelog_cancel+0x58/0x1d0 [mdd]
[<ffffffffc19d24e0>] ? mdd_changelog_write_header+0x40/0x4b0 [mdd]
[<ffffffffc12f7a07>] llog_cancel+0x57/0x250 [obdclass]
[<ffffffffc19d2b59>] mdd_changelog_llog_cancel+0xd9/0x270 [mdd]
[<ffffffffc19d5edf>] mdd_iocontrol+0x13af/0x16f0 [mdd]
[<ffffffffc18968dc>] mdt_iocontrol+0x5ec/0xb00 [mdt]
[<ffffffffc1305789>] class_handle_ioctl+0x19a9/0x1e50 [obdclass]
[<ffffffffb15f6c8d>] ? handle_mm_fault+0x39d/0x9b0
[<ffffffffb170d1fe>] ? security_capable+0x1e/0x20
[<ffffffffc12ea5d2>] obd_class_ioctl+0xd2/0x170 [obdclass]
[<ffffffffb1664b60>] do_vfs_ioctl+0x3d0/0x600
[<ffffffffb1b936fb>] ? __do_page_fault+0x23b/0x550
[<ffffffffb1664e31>] SyS_ioctl+0xa1/0xc0
[<ffffffffb1b98f5b>] system_call_fastpath+0x22/0x27
[<ffffffffb1b98ea1>] ? system_call_after_swapgs+0xae/0x146