Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.10.5
- None
- 1
- 9223372036854775807
Description
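The server keeps crashing with the following error. The kernel BUG fires in the lfsck thread, in ldiskfs_rec_len_to_disk, reached from htree_inlinedir_to_tree: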
[ 981.957669] Lustre: nbp13-OST0008: trigger OI scrub by RPC for the [0x100080000:0x217edd:0x0] with flags 0x4a, rc = 0
[ 981.989579] Lustre: Skipped 11 previous similar messages
[ 1045.404615] ------------[ cut here ]------------
[ 1045.418484] kernel BUG at /tmp/rpmbuild-lustre-jlan-ItUrr9b3/BUILD/lustre-2.10.5/ldiskfs/ldiskfs.h:1907!
[ 1045.446989] invalid opcode: 0000 [#1] SMP
[ 1045.459302] Modules linked in: ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) dm_service_time ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) lpfc ib_iser(OE) libiscsi scsi_transport_iscsi crct10dif_generic scsi_transport_fc scsi_tgt rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) bonding ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) sunrpc dm_mirror dm_region_hash dm_log mlx5_ib(OE) ib_core(OE) intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul mgag200 ghash_clmulni_intel i2c_algo_bit ttm dm_multipath aesni_intel drm_kms_helper lrw syscopyarea gf128mul sysfillrect sysimgblt glue_helper fb_sys_fops ablk_helper mlx5_core(OE) mlxfw(OE) tg3 ses cryptd mlx_compat(OE) drm ptp ipmi_si enclosure mei_me i2c_core pps_core hpwdt hpilo ipmi_devintf lpc_ich dm_mod mfd_core mei shpchp pcspkr wmi ipmi_msghandler acpi_power_meter binfmt_misc tcp_bic ip_tables virtio_scsi virtio_ring virtio xfs libcrc32c ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_common sg usb_storage smartpqi(E) crc32c_intel scsi_transport_sas [last unloaded: pps_core]
[ 1045.776428] CPU: 5 PID: 11348 Comm: lfsck Tainted: G OE ------------ 3.10.0-693.21.1.el7.20180508.x86_64.lustre2105 #1
[ 1045.811992] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/15/2018
[ 1045.837624] task: ffff882ddca23f40 ti: ffff882bd280c000 task.ti: ffff882bd280c000
[ 1045.860117] RIP: 0010:[<ffffffffa10fbd04>] [<ffffffffa10fbd04>] ldiskfs_rec_len_to_disk.part.9+0x4/0x10 [ldiskfs]
[ 1045.891259] RSP: 0018:ffff882bd280f980 EFLAGS: 00010207
[ 1045.907218] RAX: 0000000000000000 RBX: ffff882bd280fb58 RCX: ffff882bd280f994
[ 1045.928666] RDX: 00000000ffffffac RSI: ffffffffffffff81 RDI: 00000000ffffff81
[ 1045.950113] RBP: ffff882bd280f980 R08: 00000000ffffff81 R09: ffffffffa10fded0
[ 1045.971560] R10: ffff88303f803b00 R11: 0000000000ffffff R12: 000000000000003c
[ 1045.993006] R13: ffff881e2eae7708 R14: ffff881e2eae7690 R15: 0000000000000000
[ 1046.014452] FS: 0000000000000000(0000) GS:ffff882f7ef40000(0000) knlGS:0000000000000000
[ 1046.038775] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1046.056039] CR2: 00007ffff20df034 CR3: 0000002ef4268000 CR4: 00000000003607e0
[ 1046.077485] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1046.098932] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1046.120378] Call Trace:
[ 1046.127717] [<ffffffffa10fe245>] htree_inlinedir_to_tree+0x445/0x450 [ldiskfs]
[ 1046.149690] [<ffffffff8123002e>] ? __generic_file_splice_read+0x4ee/0x5e0
[ 1046.170356] [<ffffffff81234cdd>] ? __getblk+0x2d/0x2e0
[ 1046.186052] [<ffffffff81234c4c>] ? __find_get_block+0xbc/0x120
[ 1046.203841] [<ffffffff81234cdd>] ? __getblk+0x2d/0x2e0
[ 1046.219541] [<ffffffffa10cdfa0>] ? __ldiskfs_get_inode_loc+0x110/0x3e0 [ldiskfs]
[ 1046.242039] [<ffffffffa10c89ef>] ? ldiskfs_xattr_find_entry+0x9f/0x130 [ldiskfs]
[ 1046.264536] [<ffffffffa10c0277>] ldiskfs_htree_fill_tree+0x137/0x2f0 [ldiskfs]
[ 1046.286507] [<ffffffff811df826>] ? kmem_cache_alloc_trace+0x1d6/0x200
[ 1046.306126] [<ffffffffa10ae5ec>] ldiskfs_readdir+0x61c/0x850 [ldiskfs]
[ 1046.326012] [<ffffffffa1147640>] ? osd_declare_ref_del+0x130/0x130 [osd_ldiskfs]
[ 1046.348507] [<ffffffff812256b2>] ? generic_getxattr+0x52/0x70
[ 1046.366036] [<ffffffffa1145cde>] osd_ldiskfs_it_fill+0xbe/0x260 [osd_ldiskfs]
[ 1046.387747] [<ffffffffa1145eb7>] osd_it_ea_load+0x37/0x100 [osd_ldiskfs]
[ 1046.408158] [<ffffffffa122808c>] lfsck_open_dir+0x11c/0x3a0 [lfsck]
[ 1046.427257] [<ffffffffa1228cb2>] lfsck_master_oit_engine+0x9a2/0x1190 [lfsck]
[ 1046.448969] [<ffffffff816946f7>] ? __schedule+0x477/0xa30
[ 1046.465453] [<ffffffffa1229d96>] lfsck_master_engine+0x8f6/0x1360 [lfsck]
[ 1046.486120] [<ffffffff810c4d40>] ? wake_up_state+0x20/0x20
[ 1046.502865] [<ffffffffa12294a0>] ? lfsck_master_oit_engine+0x1190/0x1190 [lfsck]
[ 1046.525360] [<ffffffff810b1131>] kthread+0xd1/0xe0
[ 1046.540011] [<ffffffff810b1060>] ? insert_kthread_work+0x40/0x40
[ 1046.558323] [<ffffffff816a14dd>] ret_from_fork+0x5d/0xb0
[ 1046.574540] [<ffffffff810b1060>] ? insert_kthread_work+0x40/0x40
[ 1046.592852] Code: 44 04 02 48 8d 44 03 c8 48 01 c7 e8 b7 f6 22 e0 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 0f 0b 0f 1f 40 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 85 f6 48
[ 1046.650192] RIP [<ffffffffa10fbd04>] ldiskfs_rec_len_to_disk.part.9+0x4/0x10 [ldiskfs]
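
When reading a trace like the one above, it can help to drop the entries marked with `?`. These are addresses the kernel found on the stack but could not confirm as part of the actual call chain. The following is a minimal sketch, assuming the oops text is available as a plain string in the format shown above. The names `FRAME_RE`, `reliable_frames` and `SAMPLE` are illustrative and not part of any Lustre or kernel tooling, and `SAMPLE` is a shortened excerpt of the log in this ticket.

```python
import re

# Matches one call-trace entry, for example:
# "[ 1046.127717] [<ffffffffa10fe245>] htree_inlinedir_to_tree+0x445/0x450 [ldiskfs]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(log_text):
    """Return (symbol, module) for each call-trace frame not marked with '?'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line (e.g. "Code: ...") ends the trace
        if match.group("unreliable") is None:
            frames.append((match.group("symbol"), match.group("module")))
    return frames


# Shortened excerpt of the trace in this ticket (assumption: same line format).
SAMPLE = """\
[ 1046.120378] Call Trace:
[ 1046.127717] [<ffffffffa10fe245>] htree_inlinedir_to_tree+0x445/0x450 [ldiskfs]
[ 1046.149690] [<ffffffff8123002e>] ? __generic_file_splice_read+0x4ee/0x5e0
[ 1046.264536] [<ffffffffa10c0277>] ldiskfs_htree_fill_tree+0x137/0x2f0 [ldiskfs]
[ 1046.306126] [<ffffffffa10ae5ec>] ldiskfs_readdir+0x61c/0x850 [ldiskfs]
[ 1046.540011] [<ffffffff810b1131>] kthread+0xd1/0xe0
[ 1046.592852] Code: 44 04 02
"""

if __name__ == "__main__":
    for symbol, module in reliable_frames(SAMPLE):
        print(symbol, module or "(core kernel)")
```

Applied to the full trace, this leaves the chain from `kthread` through the `lfsck_master_engine` and `osd_it_ea_load` frames down to `ldiskfs_readdir`, `ldiskfs_htree_fill_tree` and `htree_inlinedir_to_tree`, which is the path that ends in the BUG.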