Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.15.5
Labels: None
Environment: Ubuntu 20.04
Severity: 3
Rank (Obsolete): 9223372036854775807
Description
During boot, we hit this BUG once. It followed a series of errors in LNet:
[ 1197.987795] LustreError: 168-f: lustrefs-OST0024: BAD WRITE CHECKSUM: from 12345-10.2.0.17@tcp inode [0x2c000040b:0x21a:0x0] object 0xa80000401:78 extent [1703936-3014655]: client csum e35e3479, server csum d9353d16
[ 1199.016520] LustreError: 168-f: lustrefs-OST0024: BAD WRITE CHECKSUM: from 12345-10.2.0.17@tcp inode [0x2c000040b:0x21a:0x0] object 0xa80000401:78 extent [1703936-3014655]: client csum e35e3479, server csum f1842a1d
[ 1199.025602] LustreError: Skipped 1 previous similar message
[ 1200.189285] BUG: Bad page state in process socknal_sd01_00 pfn:3fcbb2f
[ 1200.192962] LustreError: 168-f: lustrefs-OST0024: BAD WRITE CHECKSUM: from 12345-10.2.0.19@tcp inode [0x280000407:0x543:0x0] object 0xa80000400:157 extent [4718592-6029311]: client csum 3f5440b, server csum 89924342
[ 1200.202301] LustreError: Skipped 5 previous similar messages
[ 1200.817841] usercopy: Kernel memory exposure attempt detected from SLUB object 'uid_cache' (offset 0, size 3840)!
[ 1200.822937] kernel BUG at mm/usercopy.c:99!
[ 1200.824876] invalid opcode: 0000 [#1] SMP NOPTI
[ 1200.826960] CPU: 6 PID: 3449 Comm: socknal_sd01_01 Tainted: G B OE 5.4.0-1145-azure-fips #152+fips1-Ubuntu
[ 1200.831864] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 1200.836209] RIP: 0010:usercopy_abort+0x7b/0x7d
[ 1200.838263] Code: 4c 0f 45 de 51 4c 89 d1 48 c7 c2 b8 2c 54 99 57 48 c7 c6 31 f5 52 99 48 c7 c7 58 2c 54 99 48 0f 45 f2 4c 89 da e8 69 7f ff ff <0f> 0b 4c 89 e1 49 89 d8 44 89 ea 31 f6 48 29 c1 48 c7 c7 fa 2c 54
[ 1200.846905] RSP: 0018:ffffb7afc269ba60 EFLAGS: 00010246
[ 1200.849310] RAX: 0000000000000065 RBX: 0000000000000f00 RCX: 0000000000000000
[ 1200.852593] RDX: 0000000000000000 RSI: ffff8cb3ff59b588 RDI: ffff8cb3ff59b588
[ 1200.855824] RBP: ffffb7afc269ba78 R08: ffff8cb3ff59b588 R09: 00000000ff010101
[ 1200.859112] R10: ffff8cb3e763c750 R11: 0000000000000001 R12: ffff8cb3cbd60100
[ 1200.862435] R13: 0000000000000001 R14: ffff8cb3cbd61000 R15: 0000000000000000
[ 1200.865696] FS: 0000000000000000(0000) GS:ffff8cb3ff580000(0000) knlGS:0000000000000000
[ 1200.869392] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1200.872181] CR2: 00007f1ad400a000 CR3: 0000003f581e4001 CR4: 0000000000370ee0
[ 1200.875429] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1200.878729] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1200.881995] Call Trace:
[ 1200.883172] ? show_regs.cold+0x1a/0x1f
[ 1200.884987] ? __die+0x90/0xd9
[ 1200.886414] ? die+0x30/0x50
[ 1200.887769] ? do_trap+0x85/0xf0
[ 1200.889281] ? do_error_trap+0x7c/0xb0
[ 1200.891042] ? usercopy_abort+0x7b/0x7d
[ 1200.892975] ? do_invalid_op+0x3c/0x50
[ 1200.894722] ? usercopy_abort+0x7b/0x7d
[ 1200.896536] ? invalid_op+0x28/0x30
[ 1200.898187] ? usercopy_abort+0x7b/0x7d
[ 1200.899968] __check_heap_object+0xe6/0x120
[ 1200.902814] __check_object_size+0x13f/0x150
[ 1200.905739] simple_copy_to_iter+0x2b/0x50
[ 1200.908509] __skb_datagram_iter+0x19d/0x2d0
[ 1200.911541] ? __sk_queue_drop_skb+0xf0/0xf0
[ 1200.914508] skb_copy_datagram_iter+0x40/0x90
[ 1200.917526] tcp_recvmsg+0x6cc/0xb30
[ 1200.920082] ? lnet_parse_local+0x38a/0xe40 [lnet]
[ 1200.923451] inet6_recvmsg+0x5e/0xf0
[ 1200.926101] sock_recvmsg+0x54/0x80
[ 1200.928626] kernel_recvmsg+0x54/0x70
[ 1200.931164] ksocknal_lib_recv_kiov+0x190/0x390 [ksocklnd]
[ 1200.934694] ksocknal_new_packet+0x75e/0x10d0 [ksocklnd]
[ 1200.938126] ksocknal_scheduler+0x1f0/0x18c0 [ksocklnd]
[ 1200.941583] ? __wake_up_pollfree+0x40/0x40
[ 1200.944365] kthread+0x104/0x140
[ 1200.946825] ? ksocknal_recv+0x2c0/0x2c0 [ksocklnd]
[ 1200.950000] ? kthread_park+0x90/0x90
[ 1200.952513] ret_from_fork+0x35/0x40
[ 1200.954996] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) quota_v2 quota_tree ldiskfs(OE) xt_state iptable_filter xt_CT xt_multiport iptable_raw ksocklnd(OE) lnet(OE) libcfs(OE) sunrpc xt_tcpudp xt_owner xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_security bpfilter udf crc_itu_t nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua skx_edac_common kvm_intel mlx5_ib kvm crct10dif_pclmul crc32_pclmul ib_uverbs ib_core ghash_clmulni_intel aesni_intel mlx5_core crypto_simd cryptd glue_helper tls mlxfw joydev hid_generic hv_netvsc hid_hyperv hid serio_raw hyperv_keyboard hv_utils hv_balloon pata_acpi hyperv_fb sch_fq_codel ramoops drm reed_solomon i2c_core efi_pstore ip_tables x_tables autofs4
[last unloaded: lnet_selftest]
The vmcore was not retrievable. The Lustre version in use is very close to commit 21921fb111b28223828d6bed3377831429d0d7de.
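
For triage, it may help to summarise the BAD WRITE CHECKSUM messages per client and extent. The following is a minimal Python sketch, not part of Lustre: the function name `parse_bad_checksums` and the regular expression are assumptions, derived only from the message format seen in the log above, and other Lustre versions may format the message differently.

```python
import re
from collections import Counter

# Pattern derived from the "BAD WRITE CHECKSUM" lines in this report only
# (an assumption, not an official Lustre log grammar).
BAD_CSUM_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LustreError: 168-f: (?P<target>\S+): "
    r"BAD WRITE CHECKSUM: from (?P<peer>\S+) "
    r"inode \[(?P<fid>[^\]]+)\] object (?P<object>\S+) "
    r"extent \[(?P<start>\d+)-(?P<end>\d+)\]: "
    r"client csum (?P<client>[0-9a-f]+), server csum (?P<server>[0-9a-f]+)"
)


def parse_bad_checksums(log_text):
    """Return one dict per BAD WRITE CHECKSUM line found in log_text."""
    records = []
    for match in BAD_CSUM_RE.finditer(log_text):
        rec = match.groupdict()
        rec["ts"] = float(rec["ts"])
        rec["start"] = int(rec["start"])
        rec["end"] = int(rec["end"])
        # The extent bounds in the message are treated as inclusive.
        rec["length"] = rec["end"] - rec["start"] + 1
        records.append(rec)
    return records


# Two lines copied from the console log in this report.
SAMPLE = """\
[ 1197.987795] LustreError: 168-f: lustrefs-OST0024: BAD WRITE CHECKSUM: from 12345-10.2.0.17@tcp inode [0x2c000040b:0x21a:0x0] object 0xa80000401:78 extent [1703936-3014655]: client csum e35e3479, server csum d9353d16
[ 1200.192962] LustreError: 168-f: lustrefs-OST0024: BAD WRITE CHECKSUM: from 12345-10.2.0.19@tcp inode [0x280000407:0x543:0x0] object 0xa80000400:157 extent [4718592-6029311]: client csum 3f5440b, server csum 89924342
"""

if __name__ == "__main__":
    records = parse_bad_checksums(SAMPLE)
    # Count mismatches per client NID to see whether one peer dominates.
    per_peer = Counter(rec["peer"] for rec in records)
    for peer, count in per_peer.items():
        print(peer, count)
    for rec in records:
        print(rec["ts"], rec["object"], rec["length"], rec["client"], rec["server"])
```

Run against the full console log, this shows at a glance that the mismatches come from more than one client (10.2.0.17 and 10.2.0.19) and that the two messages for extent [1703936-3014655] carry the same client checksum but different server checksums.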