Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.15.5, Lustre 2.15.6
- 3
- 9223372036854775807
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/5c5a7e20-3817-4d4e-bf94-c0c463e0bdfd
test_0d failed with the following error:
test_0d returned 1
Test session details:
clients: https://build.whamcloud.com/job/lustre-b2_15/88 - 5.4.0-110-generic
servers: https://build.whamcloud.com/job/lustre-b2_15/88 - 4.18.0-513.24.1.el8_lustre.x86_64
MDS dmesg
[ 2568.298513] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
[ 2568.300460] Lustre: lustre-MDT0000: Denying connection for new client 7ca0c85b-8289-4bc1-8f66-fba11c1a6d22 (at 10.240.25.238@tcp), waiting for 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:59
[ 2568.304177] Lustre: Skipped 2 previous similar messages
[ 2633.866079] Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
[ 2633.867849] Lustre: lustre-MDT0000: disconnecting 1 stale clients
[ 2635.128884] Lustre: lustre-MDT0000: Denying connection for new client 7ca0c85b-8289-4bc1-8f66-fba11c1a6d22 (at 10.240.25.238@tcp), waiting for 2 known clients (0 recovered, 1 in progress, and 1 evicted) already passed deadline 0:01
[ 2635.133415] Lustre: Skipped 12 previous similar messages
[ 2650.944188] Lustre: lustre-MDT0000: Recovery over after 1:23, of 2 clients 1 recovered and 1 was evicted.
[ 2679.928995] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [tgt_recover_0:220356]
[ 2679.932486] Modules linked in: dm_flakey osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver intel_rapl_msr nfs lockd grace intel_rapl_common fscache crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr i2c_piix4 virtio_balloon sunrpc ext4 mbcache jbd2 ata_generic ata_piix crc32c_intel serio_raw libata virtio_net virtio_blk net_failover failover [last unloaded: dm_flakey]
[ 2679.982751] CPU: 1 PID: 220356 Comm: tgt_recover_0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-513.24.1.el8_lustre.x86_64 #1
[ 2679.985241] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 2679.986403] RIP: 0010:cfs_hash_for_each_relax+0x173/0x460 [libcfs]
[ 2680.060896] Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 14 49 8b 46 38 48 8d 74 24 30 4c 89 f7 48 8b 00 e8 86 a4 41 c1 48 85 c0 0f 84 e6 01 00 00 <48> 8b 18 48 85 db 0f 84 c0 01 00 00 49 8b 46 28 48 89 de 4c 89 f7
[ 2680.064472] RSP: 0018:ffffb70e83d73df8 EFLAGS: 00010282 ORIG_RAX: ffffffffffffff13
[ 2680.065998] RAX: ffffb70e82afe008 RBX: 0000000000000000 RCX: 000000000000000e
[ 2680.067394] RDX: ffffb70e82ae1000 RSI: ffffb70e83d73e28 RDI: ffff892a03c8c600
[ 2680.068809] RBP: 0000000000000000 R08: ffffb70e83d73dc0 R09: ffffb70e83d73dc8
[ 2680.070236] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2680.071657] R13: 0000000000000001 R14: ffff892a03c8c600 R15: 0000000000000004
[ 2680.073055] FS: 0000000000000000(0000) GS:ffff892abfd00000(0000) knlGS:0000000000000000
[ 2680.074619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2680.075772] CR2: 00007ffcac990dc8 CR3: 0000000092e10006 CR4: 00000000001706e0
[ 2680.077199] Call Trace:
[ 2680.159866] <IRQ>
[ 2680.160376] ? watchdog_timer_fn.cold.10+0x46/0x9e
[ 2680.161403] ? watchdog+0x30/0x30
[ 2680.162109] ? __hrtimer_run_queues+0x101/0x280
[ 2680.163067] ? hrtimer_interrupt+0x100/0x220
[ 2680.163974] ? smp_apic_timer_interrupt+0x6a/0x130
[ 2680.164970] ? apic_timer_interrupt+0xf/0x20
[ 2680.165843] </IRQ>
[ 2680.166311] ? cfs_hash_for_each_relax+0x173/0x460 [libcfs]
[ 2680.167449] ? cfs_hash_for_each_relax+0x16a/0x460 [libcfs]
[ 2680.168569] ? ldlm_lock_mode_downgrade+0x300/0x300 [ptlrpc]
[ 2681.900765] ? ldlm_lock_mode_downgrade+0x300/0x300 [ptlrpc]
[ 2681.901977] cfs_hash_for_each_nolock+0x11f/0x1f0 [libcfs]
[ 2681.903101] ldlm_reprocess_recovery_done+0x8b/0x100 [ptlrpc]
[ 2682.220190] target_recovery_thread+0xdd0/0x12c0 [ptlrpc]
[ 2682.224441] ? replay_request_or_update.isra.29+0x9e0/0x9e0 [ptlrpc]
[ 2682.225853] kthread+0x134/0x150
[ 2682.226590] ? set_kthread_struct+0x50/0x50
[ 2682.227460] ret_from_fork+0x35/0x40
[ 2682.228332] Kernel panic - not syncing: softlockup: hung tasks
[ 2682.229509] CPU: 1 PID: 220356 Comm: tgt_recover_0 Kdump: loaded Tainted: G OEL --------- - - 4.18.0-513.24.1.el8_lustre.x86_64 #1
[ 2682.232109] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 2682.233288] Call Trace:
[ 2682.233856] <IRQ>
[ 2682.234330] dump_stack+0x41/0x60
[ 2682.235071] panic+0xe7/0x2ac
[ 2682.235742] ? __switch_to_asm+0x11/0x80
[ 2682.236569] watchdog_timer_fn.cold.10+0x85/0x9e
[ 2682.237628] ? watchdog+0x30/0x30
[ 2682.238339] __hrtimer_run_queues+0x101/0x280
[ 2682.239338] hrtimer_interrupt+0x100/0x220
[ 2682.240190] smp_apic_timer_interrupt+0x6a/0x130
[ 2682.241164] apic_timer_interrupt+0xf/0x20
[ 2682.242044] </IRQ>
[ 2682.242571] RIP: 0010:cfs_hash_for_each_relax+0x173/0x460 [libcfs]
[ 2682.243845] Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 14 49 8b 46 38 48 8d 74 24 30 4c 89 f7 48 8b 00 e8 86 a4 41 c1 48 85 c0 0f 84 e6 01 00 00 <48> 8b 18 48 85 db 0f 84 c0 01 00 00 49 8b 46 28 48 89 de 4c 89 f7
[ 2682.247715] RSP: 0018:ffffb70e83d73df8 EFLAGS: 00010282 ORIG_RAX: ffffffffffffff13
[ 2682.249215] RAX: ffffb70e82afe008 RBX: 0000000000000000 RCX: 000000000000000e
[ 2682.250697] RDX: ffffb70e82ae1000 RSI: ffffb70e83d73e28 RDI: ffff892a03c8c600
[ 2682.252104] RBP: 0000000000000000 R08: ffffb70e83d73dc0 R09: ffffb70e83d73dc8
[ 2682.253560] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2682.255013] R13: 0000000000000001 R14: ffff892a03c8c600 R15: 0000000000000004
[ 2682.256439] ? cfs_hash_for_each_relax+0x16a/0x460 [libcfs]
[ 2682.257583] ? ldlm_lock_mode_downgrade+0x300/0x300 [ptlrpc]
[ 2682.258814] ? ldlm_lock_mode_downgrade+0x300/0x300 [ptlrpc]
[ 2682.260053] cfs_hash_for_each_nolock+0x11f/0x1f0 [libcfs]
[ 2682.261188] ldlm_reprocess_recovery_done+0x8b/0x100 [ptlrpc]
[ 2682.262432] target_recovery_thread+0xdd0/0x12c0 [ptlrpc]
[ 2682.263636] ? replay_request_or_update.isra.29+0x9e0/0x9e0 [ptlrpc]
[ 2682.264969] kthread+0x134/0x150
[ 2682.265656] ? set_kthread_struct+0x50/0x50
[ 2682.266507] ret_from_fork+0x35/0x40
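The timing in the MDS dmesg can be cross-checked with a small script. The following is a minimal sketch, assuming the dmesg text was captured as a single flattened string of timestamped entries; `split_entries` and `sample` are illustrative names, not part of Lustre or Maloo, and `sample` holds only three entries copied from the log above.

```python
import re

# A dmesg timestamp looks like "[ 2568.298513]"; tokens such as
# "[libcfs]" or "[tgt_recover_0:220356]" do not match this pattern.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

def split_entries(flat):
    """Split a flattened dmesg string into (seconds, message) pairs."""
    parts = STAMP.split(flat)
    # parts is [prefix, ts1, msg1, ts2, msg2, ...]
    return [(float(ts), msg.strip()) for ts, msg in zip(parts[1::2], parts[2::2])]

# Three entries copied from the MDS dmesg in this report.
sample = (
    "[ 2568.298513] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect "
    "[ 2650.944188] Lustre: lustre-MDT0000: Recovery over after 1:23, of 2 clients 1 recovered and 1 was evicted. "
    "[ 2679.928995] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [tgt_recover_0:220356]"
)

entries = split_entries(sample)
recovery_start, recovery_over, lockup = (ts for ts, _ in entries)

# Seconds from the start of recovery to "Recovery over".
recovery_elapsed = recovery_over - recovery_start
# Seconds from "Recovery over" to the soft lockup report.
lockup_delay = lockup - recovery_over

print(f"recovery took {recovery_elapsed:.1f}s")
print(f"soft lockup reported {lockup_delay:.1f}s after recovery ended")
```

On these three entries the recovery window comes to roughly 82.6 seconds, in line with the "Recovery over after 1:23" message, and the soft lockup is reported about 29 seconds after recovery ends, while tgt_recover_0 is in ldlm_reprocess_recovery_done.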
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
replay-single test_0d - test_0d returned 1