Details
Bug
Resolution: Unresolved
Minor
None
Lustre 2.15.6
3
9223372036854775807
Description
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d95891fc-a86e-4ffb-978d-d1500587c8ea
test_77l failed with the following error:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [khugepaged:34]
Modules linked in: obdecho(OE) ptlrpc_gss(OE) osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr virtio_balloon i2c_piix4 sunrpc ext4 mbcache jbd2 ata_generic ata_piix libata crc32c_intel serio_raw virtio_blk virtio_net net_failover failover [last unloaded: lnet_selftest]
CPU: 0 PID: 34 Comm: khugepaged Kdump: loaded Tainted: G OE --------- - - 4.18.0-513.24.1.el8_lustre.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_page+0x7/0x10
Code: 75 11 65 48 89 1e 65 48 89 4e 08 9d b0 01 c3 cc cc cc cc 9d 30 c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 66 90 b9 00 02 00 00 <f3> 48 a5 c3 cc cc cc cc 90 48 83 ec 10 48 89 1c 24 4c 89 64 24 08
RSP: 0000:ffffb3bd80753d18 EFLAGS: 00010286 ORIG_RAX: ffffffffffffff13
RAX: 0000000055ae7865 RBX: ffffeb784156b9c0 RCX: 0000000000000200
RDX: 7fffffffaa51879a RSI: ffff8cda15ae7000 RDI: ffff8cda5c3b6000
RBP: 000055cc40db6000 R08: 00000000000396d0 R09: 0000000000000011
R10: 0000000000000007 R11: 00000000ffffffff R12: ffff8cda4e37adb0
R13: ffff8cd9e0450000 R14: ffffeb784270ed80 R15: ffff8cd9c5bbd488
FS: 0000000000000000(0000) GS:ffff8cda7fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4d0a5fced0 CR3: 000000001de10006 CR4: 00000000001706f0
Call Trace:
<IRQ>
? watchdog_timer_fn.cold.10+0x46/0x9e
? watchdog+0x30/0x30
? __hrtimer_run_queues+0x101/0x280
? hrtimer_interrupt+0x100/0x220
? smp_apic_timer_interrupt+0x6a/0x130
? apic_timer_interrupt+0xf/0x20
</IRQ>
? copy_page+0x7/0x10
collapse_huge_page+0x8d7/0x1000
khugepaged+0xed9/0x11e0
? __schedule+0x2d9/0x870
? finish_wait+0x80/0x80
? collapse_pte_mapped_thp+0x430/0x430
kthread+0x134/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x35/0x40
Kernel panic - not syncing: softlockup: hung tasks
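For triage, it can help to separate the frames the unwinder verified from those prefixed with "?", which the kernel prints for addresses it found on the stack but could not confirm as part of the call chain. The following is a minimal sketch, not part of Maloo or Lustre tooling; the helper name reliable_frames and the regular expression are assumptions made for illustration, and the trace string is the Call Trace portion of the log above.

```python
import re

# Call Trace portion of the soft-lockup log in this report, as one string.
TRACE = (
    "Call Trace: <IRQ> ? watchdog_timer_fn.cold.10+0x46/0x9e ? watchdog+0x30/0x30 "
    "? __hrtimer_run_queues+0x101/0x280 ? hrtimer_interrupt+0x100/0x220 "
    "? smp_apic_timer_interrupt+0x6a/0x130 ? apic_timer_interrupt+0xf/0x20 </IRQ> "
    "? copy_page+0x7/0x10 collapse_huge_page+0x8d7/0x1000 khugepaged+0xed9/0x11e0 "
    "? __schedule+0x2d9/0x870 ? finish_wait+0x80/0x80 "
    "? collapse_pte_mapped_thp+0x430/0x430 kthread+0x134/0x150 "
    "? set_kthread_struct+0x50/0x50 ret_from_fork+0x35/0x40"
)

# A frame looks like "symbol+0xOFFSET/0xSIZE", optionally preceded by "? ".
FRAME = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(trace):
    """Return symbol names of frames that are not marked with '?' (hypothetical helper)."""
    return [m.group(2) for m in FRAME.finditer(trace) if m.group(1) is None]


if __name__ == "__main__":
    print(reliable_frames(TRACE))
    # → ['collapse_huge_page', 'khugepaged', 'kthread', 'ret_from_fork']
```

Together with the RIP line (copy_page), the verified frames show the CPU was inside collapse_huge_page on the khugepaged kernel thread when the watchdog fired.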
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/108974 - 5.14.0-362.24.1.el9_3.aarch64
servers: https://build.whamcloud.com/job/lustre-reviews/108974 - 4.18.0-513.24.1.el8_lustre.x86_64
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_77l - trevis-95vm9 crashed during sanity test_77l