[LU-11005] obdfilter-survey test 1a, 1c and 2a fail with soft lockup/message “Network not available!” Created: 07/May/18  Updated: 27/Mar/23  Resolved: 27/Mar/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.1, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.4
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Nunez (Inactive)
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

obdfilter-survey test_1a, test_1c and test_2a fail with what looks like network issues. Looking at the client test_log from a recent failure, https://testing.hpdd.intel.com/test_sets/4f1abec0-50f5-11e8-abc3-52540065bddc , we see:

Resetting fail_loc on all nodes...CMD: trevis-22vm1.trevis.hpdd.intel.com,trevis-22vm2,trevis-22vm3,trevis-22vm4,trevis-22vm5 lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
pdsh@trevis-22vm1: trevis-22vm2: mcmd: connect failed: No route to host
done.
02:07:26 (1525572446) waiting for trevis-22vm1.trevis.hpdd.intel.com network 5 secs ...
02:07:26 (1525572446) network interface is UP
CMD: trevis-22vm1.trevis.hpdd.intel.com rc=0;
			val=\$(/usr/sbin/lctl get_param -n catastrophe 2>&1);
			if [[ \$? -eq 0 && \$val -ne 0 ]]; then
				echo \$(hostname -s): \$val;
				rc=\$val;
			fi;
			exit \$rc
02:07:26 (1525572446) waiting for trevis-22vm2 network 5 secs ...
Network not available!

In a test session that passes, we see the same wait for the network interface on each node, and the interface does come up:

01:05:51 (1525309551) waiting for trevis-16vm1.trevis.hpdd.intel.com network 5 secs ...
01:05:51 (1525309551) network interface is UP
CMD: trevis-16vm1.trevis.hpdd.intel.com rc=0;
			val=\$(/usr/sbin/lctl get_param -n catastrophe 2>&1);
			if [[ \$? -eq 0 && \$val -ne 0 ]]; then
				echo \$(hostname -s): \$val;
				rc=\$val;
			fi;
			exit \$rc
01:05:51 (1525309551) waiting for trevis-16vm2 network 5 secs ...
01:05:51 (1525309551) network interface is UP
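To compare passing and failing sessions quickly, the wait/result pairs in the test_log can be pulled out with a small script. This is only a sketch: the function name `network_wait_results` is made up for illustration, and the line formats are assumed from the two excerpts above, not taken from the test framework source.

```python
import re

# Matches lines such as:
#   02:07:26 (1525572446) waiting for trevis-22vm2 network 5 secs ...
# The format is assumed from the test_log excerpts in this ticket.
WAIT_RE = re.compile(r"\((\d+)\) waiting for (\S+) network (\d+) secs")


def network_wait_results(log_text):
    """Map each host to 'UP', 'NOT AVAILABLE', or 'UNKNOWN'.

    A host is 'UNKNOWN' if its wait line is not followed by a result
    line before the next wait line or the end of the log.
    """
    results = {}
    pending = None  # host from the most recent unresolved wait line
    for line in log_text.splitlines():
        match = WAIT_RE.search(line)
        if match:
            pending = match.group(2)
            results[pending] = "UNKNOWN"
        elif pending and "network interface is UP" in line:
            results[pending] = "UP"
            pending = None
        elif pending and "Network not available!" in line:
            results[pending] = "NOT AVAILABLE"
            pending = None
    return results


if __name__ == "__main__":
    sample = (
        "02:07:26 (1525572446) waiting for trevis-22vm1.trevis.hpdd.intel.com network 5 secs ...\n"
        "02:07:26 (1525572446) network interface is UP\n"
        "02:07:26 (1525572446) waiting for trevis-22vm2 network 5 secs ...\n"
        "Network not available!\n"
    )
    for host, state in network_wait_results(sample).items():
        print(host, state)
```

Run against the failing log, this should flag trevis-22vm2 as the only node whose network never came back.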

On the console of trevis-22vm2, we see a soft lockup:

[52813.979019] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == obdfilter-survey test 1a: Object Storage Targets survey =========================================== 01:50:44 \(1525571444\)
[52814.176674] Lustre: DEBUG MARKER: == obdfilter-survey test 1a: Object Storage Targets survey =========================================== 01:50:44 (1525571444)
[53232.624965] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [sssd_be:596]
[53232.624965] Modules linked in: osc(OE) mgc(OE) lustre(OE) lmv(OE) fld(OE) mdc(OE) fid(OE) lov(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd nfsd joydev pcspkr virtio_balloon i2c_piix4 parport_pc parport nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk ata_piix drm 8139too libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_pci virtio_ring virtio 8139cp mii i2c_core floppy [last unloaded: lnet_selftest]
[53232.624965] CPU: 0 PID: 596 Comm: sssd_be Tainted: G           OE  ------------   3.10.0-693.21.1.el7.x86_64 #1
[53232.624965] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[53232.624965] task: ffff880036758fd0 ti: ffff88007ab78000 task.ti: ffff88007ab78000
[53232.624965] RIP: 0010:[<ffffffff812200c8>]  [<ffffffff812200c8>] __d_lookup+0x68/0x160
[53232.624965] RSP: 0018:ffff88007ab7bd38  EFLAGS: 00010286
[53232.624965] RAX: ffffc9000011f818 RBX: 0000000a0000ffff RCX: 0000000000000012
[53232.624965] RDX: ffffc90000000000 RSI: ffff88007ab7bdd0 RDI: ffff88007cc04240
[53232.624965] RBP: ffff88007ab7bd78 R08: 0000000000000004 R09: ffffffff81279cc0
[53232.624965] R10: 0000000000000000 R11: ffff88007ab7bcce R12: ffff88007ab7bd40
[53232.624965] R13: ffff88007ab7be60 R14: ffff88007ab7be53 R15: ffffffff81332562
[53232.624965] FS:  00007f04cc1d8880(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[53232.624965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53232.624965] CR2: 0000560dc7999880 CR3: 000000007abb2000 CR4: 00000000000606f0
[53232.624965] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[53232.624965] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[53232.624965] Call Trace:
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff812201ea>] d_lookup+0x2a/0x50
[53232.624965]  [<ffffffff8127a43f>] proc_fill_cache+0x6f/0x180
[53232.624965]  [<ffffffff81279cc0>] ? proc_pid_make_inode+0xf0/0xf0
[53232.624965]  [<ffffffff8127b26e>] proc_pid_readdir+0x14e/0x1f0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff812761bf>] proc_root_readdir+0x3f/0x50
[53232.624965]  [<ffffffff8121a216>] vfs_readdir+0xb6/0xe0
[53232.624965]  [<ffffffff8121a635>] SyS_getdents+0x95/0x120
[53232.624965]  [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21
[53232.624965]  [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146
[53232.624965] Code: 89 c2 d3 ea 01 d0 23 05 4b 50 91 00 48 8b 15 38 50 91 00 48 8d 04 c2 48 8b 18 48 83 e3 fe 75 0b eb 39 90 48 8b 1b 48 85 db 74 30 <44> 39 73 18 75 f2 4c 8d 7b 50 4c 89 ff e8 56 67 49 00 4c 39 63 
[53232.624965] Kernel panic - not syncing: softlockup: hung tasks
[53232.624965] CPU: 0 PID: 596 Comm: sssd_be Tainted: G           OEL ------------   3.10.0-693.21.1.el7.x86_64 #1
[53232.624965] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[53232.624965] Call Trace:
[53232.624965]  <IRQ>  [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
[53232.624965]  [<ffffffff816a8634>] panic+0xe8/0x21f
[53232.624965]  [<ffffffff8102d7cf>] ? show_regs+0x5f/0x210
[53232.624965]  [<ffffffff811334e1>] watchdog_timer_fn+0x231/0x240
[53232.624965]  [<ffffffff811332b0>] ? watchdog+0x40/0x40
[53232.624965]  [<ffffffff810b8196>] __hrtimer_run_queues+0xd6/0x260
[53232.624965]  [<ffffffff810b872f>] hrtimer_interrupt+0xaf/0x1d0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8105467b>] local_apic_timer_interrupt+0x3b/0x60
[53232.624965]  [<ffffffff816c4e73>] smp_apic_timer_interrupt+0x43/0x60
[53232.624965]  [<ffffffff816c1732>] apic_timer_interrupt+0x162/0x170
[53232.624965]  <EOI>  [<ffffffff81279cc0>] ? proc_pid_make_inode+0xf0/0xf0
[53232.624965]  [<ffffffff812200c8>] ? __d_lookup+0x68/0x160
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff812201ea>] d_lookup+0x2a/0x50
[53232.624965]  [<ffffffff8127a43f>] proc_fill_cache+0x6f/0x180
[53232.624965]  [<ffffffff81279cc0>] ? proc_pid_make_inode+0xf0/0xf0
[53232.624965]  [<ffffffff8127b26e>] proc_pid_readdir+0x14e/0x1f0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff8121a320>] ? fillonedir+0xe0/0xe0
[53232.624965]  [<ffffffff812761bf>] proc_root_readdir+0x3f/0x50
[53232.624965]  [<ffffffff8121a216>] vfs_readdir+0xb6/0xe0
[53232.624965]  [<ffffffff8121a635>] SyS_getdents+0x95/0x120
[53232.624965]  [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21
[53232.624965]  [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146
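When triaging several console logs, it can help to extract just the soft lockup header lines and see which process and CPU were involved. The following is a minimal sketch; `find_soft_lockups` is a hypothetical helper, and the regular expression is based only on the "NMI watchdog: BUG: soft lockup" lines quoted in this ticket.

```python
import re

# Matches console lines such as:
#   [53232.624965] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [sssd_be:596]
# Any prefix before the kernel timestamp (for example a wall-clock stamp) is ignored.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(console_text):
    """Return one dict per soft lockup report found in a console log."""
    lockups = []
    for match in LOCKUP_RE.finditer(console_text):
        lockups.append(
            {
                "kernel_ts": float(match.group("ts")),  # seconds since boot
                "cpu": int(match.group("cpu")),
                "stuck_secs": int(match.group("secs")),
                "comm": match.group("comm"),  # name of the stuck task
                "pid": int(match.group("pid")),
            }
        )
    return lockups


if __name__ == "__main__":
    line = (
        "[53232.624965] NMI watchdog: BUG: soft lockup - "
        "CPU#0 stuck for 21s! [sssd_be:596]"
    )
    print(find_soft_lockups(line))
```

For the trace above, the stuck task is sssd_be (PID 596) on CPU 0, which is not a Lustre thread.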

On the console for the OSS, we see:

[52707.292360] Lustre: DEBUG MARKER: == obdfilter-survey test 1a: Object Storage Targets survey =========================================== 01:50:44 (1525571444)
[52707.499316] Lustre: DEBUG MARKER: lctl dl | grep obdfilter
[52707.854211] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep tcp | cut -f 1 -d @
[52708.863961] Lustre: Echo OBD driver; http://www.lustre.org/
[53152.432160] LNetError: 17668:0:(socklnd.c:1681:ksocknal_destroy_conn()) Completing partial receive from 12345-10.9.5.10@tcp[1], ip 10.9.5.10:1023, with error, wanted: 152, left: 152, last alive is 28 secs ago
[53152.437669] LustreError: 17668:0:(events.c:304:request_in_callback()) event type 2, status -5, service ost
[53152.440488] LustreError: 25177:0:(pack_generic.c:590:__lustre_unpack_msg()) message length 0 too small for magic/version check
[53152.445357] LustreError: 25177:0:(sec.c:2069:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.9.5.10@tcp x1599622583354080
[53168.454267] Lustre: lustre-OST0006: haven't heard from client 0c1d0b38-84c1-ca60-829d-9aaff06c9a4a (at 10.9.5.10@tcp) in 47 seconds. I think it's dead, and I am evicting it. exp ffff88006b639800, cur 1525571906 expire 1525571876 last 1525571859
[53168.460481] Lustre: Skipped 7 previous similar messages
[53702.588305] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
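The eviction message carries its own epoch timestamps, so the timing can be checked against the test 1a start marker (1525571444). The sketch below does that arithmetic; `eviction_timing` is a hypothetical helper and the message format is assumed from the single eviction line above.

```python
import re

# Matches the OSS eviction message quoted above. The trailing
# "cur ... expire ... last ..." fields are epoch seconds.
EVICT_RE = re.compile(
    r"haven't heard from client (?P<uuid>\S+) \(at (?P<nid>[^)]+)\) "
    r"in (?P<secs>\d+) seconds\."
    r".*?cur (?P<cur>\d+) expire (?P<expire>\d+) last (?P<last>\d+)"
)


def eviction_timing(line, test_start_epoch):
    """Return timing details for an eviction line, or None if no match."""
    match = EVICT_RE.search(line)
    if match is None:
        return None
    cur = int(match.group("cur"))
    last = int(match.group("last"))
    return {
        "nid": match.group("nid"),
        "reported_silence": int(match.group("secs")),
        # Should agree with the number printed in the message.
        "computed_silence": cur - last,
        # Seconds from the test marker to the last contact from the client.
        "last_contact_after_start": last - test_start_epoch,
    }


if __name__ == "__main__":
    msg = (
        "Lustre: lustre-OST0006: haven't heard from client "
        "0c1d0b38-84c1-ca60-829d-9aaff06c9a4a (at 10.9.5.10@tcp) in 47 seconds. "
        "I think it's dead, and I am evicting it. exp ffff88006b639800, "
        "cur 1525571906 expire 1525571876 last 1525571859"
    )
    print(eviction_timing(msg, 1525571444))
```

With the values from this log, the OSS last heard from 10.9.5.10@tcp about 415 seconds after the test 1a marker, and evicted it 47 seconds later.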

More failed tests and logs can be found at:
https://testing.hpdd.intel.com/test_sets/7cb76a18-4281-11e8-960d-52540065bddc
https://testing.hpdd.intel.com/test_sets/aa7e5422-425c-11e8-b45c-52540065bddc
https://testing.hpdd.intel.com/test_sets/b0f496da-2cdc-11e8-9e0e-52540065bddc
https://testing.hpdd.intel.com/test_sets/26cfa8ec-20b0-11e8-a4b1-52540065bddc
https://testing.hpdd.intel.com/test_sets/f685a540-0329-11e8-a7cd-52540065bddc



 Comments   
Comment by James Nunez (Inactive) [ 08/May/18 ]

Looking at old obdfilter-survey test 1a failures, the following may be the first time we saw the "Network not available!" failure on any branch. It was attributed to LU-9900, but I don't think this is an unmount issue.

Date: 2017-09-19 02:53:58 UTC
Version: 2.10.53.1
Logs at: https://testing.hpdd.intel.com/test_sets/4bd4b580-9d0a-11e7-b778-5254006e85c2

Client vm2 console log:

04:00:18:[45505.327887] Lustre: DEBUG MARKER: == obdfilter-survey test 1a: Object Storage Targets survey =========================================== 19:54:15 (1505789655)
04:00:18:[49312.327438] sysrq: SysRq : sysrq: Show State
04:00:18:[49312.336172]   task                        PC stack   pid father
04:00:18:[49312.337328] systemd         S 0000000000000000     0     1      0 0x00000000
04:00:18:[49312.337335]  ffff88007c83bdd8 ffff880037915340 ffff88007c834040 ffff88007c83c000
04:00:18:[49312.337337]  0000000000000000 ffff880036d1b5e0 ffff88007c834040 0000000000000000
04:00:18:[49312.337338]  ffff88007c83bdf0 ffffffff815e4c75 0000000000000000 ffff88007c83bf10
04:00:18:[49312.337339] Call Trace:
04:00:18:[49320.629660]  [<ffffffff815e4c75>] schedule+0x35/0x80
04:00:18:[49332.921720]  [<ffffffff815e7979>] schedule_hrtimeout_range_clock+0x119/0x130
04:00:18:[49334.186095]  [<ffffffff8123f899>] ep_poll+0x249/0x310
04:00:18:[49347.389467]  [<ffffffff81240d72>] SyS_epoll_wait+0xb2/0xe0
04:00:18:[49347.389500]  [<ffffffff815e876e>] entry_SYSCALL_64_fastpath+0x12/0x6d
04:00:18:[49374.259921] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [bash:15473]
04:00:18:[49374.263123] Modules linked in: osc(OEN) mgc(OEN) lustre(OEN) lmv(OEN) fld(OEN) mdc(OEN) fid(OEN) lov(OEN) ksocklnd(OEN) ptlrpc(OEN) obdclass(OEN) lnet(OEN) libcfs(OEN) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache af_packet iscsi_boot_sysfs ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm configfs ib_cm iw_cm ib_sa ib_mad ib_core ib_addr crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng drbg ansi_cprng aesni_intel ppdev 8139too aes_x86_64 lrw gf128mul glue_helper ablk_helper 8139cp cryptd joydev mii virtio_balloon i2c_piix4 pcspkr pvpanic parport_pc parport processor button ata_generic ext4 crc16 jbd2 mbcache virtio_blk floppy ata_piix ahci libahci uhci_hcd ehci_hcd cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw drm libata usbcore usb_common virtio_pci virtio_ring virtio sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: lnet_selftest]
04:00:18:[49374.263889] Supported: No, Unsupported modules are loaded
04:00:18:[49374.263889] CPU: 0 PID: 15473 Comm: bash Tainted: G           OE   N  4.4.74-92.35-default #1
04:00:18:[49374.263889] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
04:00:18:[49374.263889] task: ffff880037915340 ti: ffff88007a614000 task.ti: ffff88007a614000
04:00:18:[49374.263889] RIP: 0010:[<ffffffff81108577>]  [<ffffffff81108577>] unwind+0x197/0xe90
04:00:18:[49374.263889] RSP: 0018:ffff88007a6179d8  EFLAGS: 00010216
04:00:18:[49374.263889] RAX: 0000000000000019 RBX: ffffffff815e876d RCX: ffffffffffffff98
04:00:18:[49374.263889] RDX: 0000000000000064 RSI: ffffffff819cdffc RDI: ffff88007a617a00
04:00:18:[49374.263889] RBP: ffffffff8195f988 R08: 0000000000000001 R09: 0000000000000007
04:00:18:[49374.263889] R10: 0000000000000001 R11: 00000000000ffca0 R12: ffffffff819ce058
04:00:18:[49374.263889] R13: ffff88007a617c20 R14: 000000000000000b R15: fffffffffffffffc
04:00:18:[49374.263889] FS:  00007fbffc8bb700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
04:00:18:[49374.263889] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
04:00:18:[49374.263889] CR2: 00007fbffc742000 CR3: 00000000131d0000 CR4: 00000000000406f0
04:00:18:[49374.263889] Stack:
04:00:18:[49374.263889]  ffffffff8119d1a0 00000000000ffd08 ffffffff8119d283 ffffffff8207d900
04:00:18:[49374.263889]  0000000000000002 ffffffff819ce000 ffffffff8195b278 3831356538373665
04:00:18:[49374.263889]  6666666666666666 ffffffff8195f9a0 253461c12e73a3d2 ffffffff8200a9c3
04:00:18:[49374.263889] Call Trace:
04:00:18:[49374.263889]  [<ffffffff8101a942>] dump_trace_unwind+0x92/0xe0
04:00:18:[49374.263889]  [<ffffffff8101aa49>] try_stack_unwind+0x99/0x190
04:00:18:[49374.263889]  [<ffffffff81019a99>] dump_trace+0x59/0x310
04:00:18:[49374.263889]  [<ffffffff81019e3a>] show_stack_log_lvl+0xea/0x170
04:00:18:[49374.263889]  [<ffffffff8101abc1>] show_stack+0x21/0x40
04:00:18:[49374.263889]  [<ffffffff810a8c29>] show_state_filter+0x79/0xb0
04:00:18:[49374.263889]  [<ffffffff8140115c>] sysrq_handle_showstate+0xc/0x20
04:00:18:[49374.263889]  [<ffffffff814017fc>] __handle_sysrq+0xec/0x140
04:00:18:[49374.263889]  [<ffffffff81401c34>] write_sysrq_trigger+0x24/0x40
04:00:18:[49374.263889]  [<ffffffff812615a9>] proc_reg_write+0x39/0x70
04:00:18:[49374.263889]  [<ffffffff811fc003>] __vfs_write+0x23/0x100
04:00:18:[49374.263889]  [<ffffffff811fc68d>] vfs_write+0x9d/0x190
04:00:18:[49374.263889]  [<ffffffff811fd352>] SyS_write+0x42/0xa0
04:00:18:[49374.263889]  [<ffffffff815e876e>] entry_SYSCALL_64_fastpath+0x12/0x6d
04:00:18:[49374.263889] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x6d
04:00:18:[49374.263889] 
04:00:18:[49374.263889] Leftover inexact backtrace:
04:00:18:[49374.263889] 
04:00:18:[49374.263889] Code: b2 07 00 e9 02 ff ff ff 41 8b 04 24 49 8d 53 fc 48 39 c2 0f 82 5e ff ff ff 49 8d 47 20 49 c7 c7 fc ff ff ff 48 89 44 24 18 eb 11 <41> 8b 04 24 49 8d 53 fc 48 39 c2 0f 82 e3 00 00 00 48 8b 74 24 
04:16:09:[49374.263889] Kernel panic - not syncing: softlockup: hung tasks
04:16:09:[49374.263889] CPU: 0 PID: 15473 Comm: bash Tainted: G           OEL  N  4.4.74-92.35-default #1
04:16:09:[49374.263889] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
04:16:09:[49374.263889]  0000000000000000 ffffffff81310100 ffffffff81861a53 ffff88007fc03ed0
04:16:09:[49374.263889]  ffffffff81183061 0000000000000008 ffff88007fc03ee0 ffff88007fc03e80
04:16:09:[49374.263889]  0000000000000000 0000000000000000 0000000000000000 0000000000000006
04:16:09:[49374.263889] Call Trace:
04:16:09:[49374.263889]  [<ffffffff81019a99>] dump_trace+0x59/0x310
04:16:09:[49374.263889]  [<ffffffff81019e3a>] show_stack_log_lvl+0xea/0x170
04:16:09:[49374.263889]  [<ffffffff8101abc1>] show_stack+0x21/0x40
04:16:09:[49374.263889]  [<ffffffff81310100>] dump_stack+0x5c/0x7c
04:16:09:[49374.263889]  [<ffffffff81183061>] panic+0xd2/0x219
04:16:09:[49374.263889]  [<ffffffff81136a79>] watchdog_timer_fn+0x1d9/0x1e0
04:16:09:[49374.263889]  [<ffffffff810eb38a>] __hrtimer_run_queues+0xea/0x260
04:16:09:[49374.263889]  [<ffffffff810eb7c9>] hrtimer_interrupt+0x99/0x190
04:16:09:[49374.263889]  [<ffffffff815eb429>] smp_apic_timer_interrupt+0x39/0x50
04:16:09:[49374.263889]  [<ffffffff815e94fc>] apic_timer_interrupt+0x8c/0xa0
04:16:09:[49374.263889] DWARF2 unwinder stuck at apic_timer_interrupt+0x8c/0xa0
04:16:09:[49374.263889] 
04:16:09:[49374.263889] Leftover inexact backtrace:
04:16:09:[49374.263889] 
04:16:09:[49374.263889]  <IRQ>  <EOI>  [<ffffffff815e876d>] ? entry_SYSCALL_64_fastpath+0x11/0x6d
04:16:09:[49374.263889]  [<ffffffff81108577>] ? unwind+0x197/0xe90
04:16:09:[49374.263889]  [<ffffffff81108633>] ? unwind+0x253/0xe90
04:16:09:[49374.263889]  [<ffffffff8119d1a0>] ? shmem_set_policy+0x30/0x30
04:16:09:[49374.263889]  [<ffffffff8119d283>] ? shmem_file_llseek+0xe3/0xf0
04:16:09:[49374.263889]  [<ffffffff8131b3b2>] ? pointer.isra.22+0x82/0x450
04:16:09:[49374.263889]  [<ffffffff8105dd3a>] ? kvm_clock_read+0x1a/0x20
04:16:09:[49374.263889]  [<ffffffff8101fdc5>] ? sched_clock+0x5/0x10
04:16:09:[49374.263889]  [<ffffffff810ab6e2>] ? sched_clock_local+0x12/0x80
04:16:09:[49374.263889]  [<ffffffff810ab8c2>] ? sched_clock_cpu+0x72/0x90
04:16:09:[49374.263889]  [<ffffffff810d0f09>] ? log_store+0x119/0x200
04:16:09:[49374.263889]  [<ffffffff810a60a9>] ? try_to_wake_up+0x49/0x390
04:16:09:[49374.263889]  [<ffffffff810d32b9>] ? vprintk_emit+0x1a9/0x480
04:16:09:[49374.263889]  [<ffffffff81183838>] ? printk+0x4d/0x4f
04:16:09:[49374.263889]  [<ffffffff8101a942>] ? dump_trace_unwind+0x92/0xe0
04:16:09:[49374.263889]  [<ffffffff8101aa49>] ? try_stack_unwind+0x99/0x190
04:16:09:[49374.263889]  [<ffffffff815e876e>] ? entry_SYSCALL_64_fastpath+0x12/0x6d
04:16:09:[49374.263889]  [<ffffffff81019a99>] ? dump_trace+0x59/0x310
04:16:09:[49374.263889]  [<ffffffff81019e3a>] ? show_stack_log_lvl+0xea/0x170
04:16:09:[49374.263889]  [<ffffffff8101abc1>] ? show_stack+0x21/0x40
04:16:09:[49374.263889]  [<ffffffff810a8c29>] ? show_state_filter+0x79/0xb0
04:16:09:[49374.263889]  [<ffffffff8140115c>] ? sysrq_handle_showstate+0xc/0x20
04:16:09:[49374.263889]  [<ffffffff814017fc>] ? __handle_sysrq+0xec/0x140
04:16:09:[49374.263889]  [<ffffffff81401c34>] ? write_sysrq_trigger+0x24/0x40
04:16:09:[49374.263889]  [<ffffffff812615a9>] ? proc_reg_write+0x39/0x70
04:16:09:[49374.263889]  [<ffffffff811fc003>] ? __vfs_write+0x23/0x100
04:16:09:[49374.263889]  [<ffffffff81295818>] ? security_file_permission+0x38/0xc0
04:16:09:[49374.263889]  [<ffffffff811fc459>] ? rw_verify_area+0x49/0xc0
04:16:09:[49374.263889]  [<ffffffff810c4ef3>] ? percpu_down_read+0x13/0x40
04:16:09:[49374.263889]  [<ffffffff811fc68d>] ? vfs_write+0x9d/0x190
04:16:09:[49374.263889]  [<ffffffff811fd352>] ? SyS_write+0x42/0xa0
04:16:09:[49374.263889]  [<ffffffff815e876e>] ? entry_SYSCALL_64_fastpath+0x12/0x6d
04:16:09:[    0.083851] ioremap error for 0x7ffff000-0x80000000, requested 0x2, got 0x0
Comment by James Nunez (Inactive) [ 09/May/18 ]

We've seen this error impact many tests in a test session. For example, runtests, sanityn and other test suites have several tests fail with this error at https://testing.hpdd.intel.com/test_sessions/f89c4193-bd7d-49e4-b100-78f790e0dc02
