[LU-16140] lnet-selftest test_smoke: OOM crash on aarch64 and arm Created: 07/Sep/22  Updated: 15/Sep/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Maloo Assignee: Alexey Lyashkov
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-16151 lnet-selftest: Crash OOM on aarch64 Resolved
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run:
https://testing.whamcloud.com/test_sets/f550b8a6-bef4-4010-a6ac-a2ca28ee60fe

test_smoke failed with the following error:

onyx-91vm6 crashed during lnet-selftest test_smoke

[  468.793223] Lustre: DEBUG MARKER: onyx-91vm5.onyx.whamcloud.com: executing lst_setup
[  469.145323] Lustre: DEBUG MARKER: onyx-91vm6.onyx.whamcloud.com: executing lst_setup
[  469.232386] Lustre: DEBUG MARKER: onyx-91vm6.onyx.whamcloud.com: executing lst_setup
[  471.585548] obd_memory max: 3765627, obd_memory current: 3440400
[  471.585577] NetworkManager invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[  471.585593] CPU: 0 PID: 676 Comm: NetworkManager Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-372.16.1.el8_6.aarch64 #1
[  471.585601] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  471.585606] Call trace:
[  471.585609]  dump_backtrace+0x0/0x160
[  471.585675]  show_stack+0x28/0x38
[  471.585679]  dump_stack+0x5c/0x74
[  471.585723]  dump_header+0x4c/0x1e0
[  471.585751]  out_of_memory+0x410/0x510
[  471.585755]  __alloc_pages_nodemask+0xd74/0xde8
[  471.585766]  alloc_pages_vma+0x94/0x1f8
[  471.585771]  __read_swap_cache_async+0x100/0x280
[  471.585780]  read_swap_cache_async+0x60/0xa8
[  471.585786]  swap_cluster_readahead+0x294/0x2f8
[  471.585791]  swapin_readahead+0x2a4/0x3c4
[  471.585795]  do_swap_page+0x55c/0x868
[  471.585803]  __handle_mm_fault+0x4c4/0x590
[  471.585808]  handle_mm_fault+0xe0/0x180
[  471.585813]  do_page_fault+0x164/0x488
[  471.585824]  do_translation_fault+0xa0/0xb0
[  471.585830]  do_mem_abort+0x54/0xb0
[  471.585835]  el1_da+0x1c/0x98
[  471.585840]  do_sys_poll+0x3c0/0x560
[  471.585854]  __arm64_sys_ppoll+0xc8/0x120
[  471.585859]  el0_svc_handler+0xb4/0x188
[  471.585871]  el0_svc+0x8/0xc
[  471.585878] Mem-Info:
[  471.585883] active_anon:20 inactive_anon:0 isolated_anon:0
 active_file:21 inactive_file:46 isolated_file:0
 unevictable:0 dirty:0 writeback:0
 slab_reclaimable:425 slab_unreclaimable:926
 mapped:92 shmem:1 pagetables:163 bounce:0
 free:2328 free_pcp:30 free_cma:0
[  471.585896] Node 0 active_anon:1280kB inactive_anon:0kB active_file:1344kB inactive_file:2944kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:5888kB dirty:0kB writeback:0kB shmem:64kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:9920kB pagetables:10432kB all_unreclaimable? no
[  471.585910] Node 0 DMA32 free:148992kB min:660032kB low:693952kB high:727872kB active_anon:2880kB inactive_anon:0kB active_file:0kB inactive_file:7936kB unevictable:0kB writepending:0kB present:3145728kB managed:2758144kB mlocked:0kB bounce:0kB free_pcp:1920kB local_pcp:960kB free_cma:0kB
[  471.585927] lowmem_reserve[]: 0 0 0
[  471.585938] Node 0 DMA32: 232*64kB (UM) 50*128kB (UM) 97*256kB (UM) 46*512kB (M) 26*1024kB (M) 8*2048kB (M) 4*4096kB (M) 3*8192kB (M) 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB = 153600kB
[  471.587258] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled

All lnet-selftest runs with aarch64 clients since 2022-09-01 have crashed due to OOM, while none of the test runs before that date failed.
https://testing.whamcloud.com/search?client_architecture_type_id=a697f55a-d8a4-11e8-975a-52540065bddc&test_set_script_id=c24874b2-4a56-11e0-a7f6-52540025f9af&sub_test_script_id=c252c2c8-4a56-11e0-a7f6-52540025f9af&start_date=2022-08-29&end_date=2022-09-07&source=sub_tests#redirect

This means it is very likely a regression caused by a patch that landed on 2022-09-01. The patches landed at that time are listed below, and since the failure is in lnet-selftest, only the libcfs/lnet patches could be causing the problem:

git log --oneline --after 2022-08-31 --before 2022-09-02
286924f8a0 LU-9859 libcfs: remove Lustre specific bitmap handling
0e48653c27 LU-16085 llite: fix stat attributes_mask
dfc6beade3 LU-16093 kernel: kernel update SLES12 SP5 [4.12.14-122.130.1]
fef1db004c LU-16084 tests: fix lustre-patched filefrag check
162336079d LU-15994 tests: add testing for io_uring via fio
ac6528af8d LU-15548 tests: skip conf-sanity/131 for older servers
d54114d0c5 LU-15873 obd: skip checking read-only fs health
d851381ea6 LU-1904 idl: add checks for OBD_CONNECT flags
155cbc22ba LU-16012 sec: fix detection of SELinux enforcement
807f3a0779 LU-16048 build: Update ZFS version to 2.1.5
afa4c31087 LU-16045 enc: force use of new enc xattr on new servers
2612cf4ad8 LU-16035 kfilnd: Initial kfilnd implementation
62b470a023 LU-16027 tests: sanity:test_66: specify blocksize explicitly
9899144862 LU-15999 tests: format journal with correct block size
52057d85ea LU-15393 tests: check QoS hang with OST failover
26beb8664f LU-16081 lnet: Memory leak on adding existing interface *
d4978678b4 LU-15694 quota: keep grace time while setting default limits
b515c6ec2a LU-15642 obdclass: use consistent stats units
84b1ca8618 LU-15930 lnet: Remove duplicate checks for peer sensitivity *
2431e099b1 LU-15929 lnet: Correct net selection for router ping *
caf6095ade LU-15595 lnet: LNet peer aliveness broken *
8ee85e1541 LU-15595 tests: Add various router tests
ff3322fd0c LU-14955 lnet: Use fatal NI if none other available *
8a7aa8d590 LU-16058 build: proc_ops check fails with SUBARCH undefined
fab404836d LU-12514 target: move server mount code to target layer
f1c8ac1156 LU-15811 llite: Refactor DIO/AIO free code
36c34af607 LU-15811 llite: Unify range unlock
f3fe144b85 LU-15003 sec: use enc pool for bounce pages
9ca348e876 LU-14719 utils: dir migration stop on error
a20b78a81d LU-15357 iokit: fix the obsolete usage of cfg_device
51c1853933 LU-15811 llite: Rework upper/lower DIO/AIO
9e2df7e5cc LU-10391 lnet: change ni_status in lnet_ni to u32* *
c92bdd97d9 LU-16056 libcfs: restore umask handling in kernel threads *
76c3fa96dc LU-16082 ldiskfs: old-style EA inode handling fix
2447564e12 LU-16011 lnet: use preallocate bulk for server ***

It is noteworthy that the three test runs that have passed since 2022-09-01 are based on commit 3f0bee2502, "LU-15959 kernel: new kernel [SLES15 SP4 5.14.21-150400.24.18.1]", which is one patch before the earliest commit in the above list, 2447564e12 "LU-16011 lnet: use preallocate bulk for server".

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
lnet-selftest test_smoke - onyx-91vm6 crashed during lnet-selftest test_smoke



 Comments   
Comment by Andreas Dilger [ 07/Sep/22 ]

Shadow, could you please take a look at this?

It looks like your patch https://review.whamcloud.com/47952 "LU-16011 lnet: use preallocate bulk for server" has been causing 100% OOM failures on aarch64 since it landed. There was no aarch64 testing on this patch due to "Test-Parameters: trivial". That is normally OK for LNet patches, but in this case there is likely some error in the memory allocation for 64KB PAGE_SIZE.
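For a sense of scale, here is a minimal sketch (my own illustration, assuming the bulk buffers are sized by a fixed page count such as LNET_MAX_IOV; it is not code from the patch) of how such a preallocation grows with PAGE_SIZE:

#include <stdio.h>

#define LNET_MAX_IOV	256	/* max fragments per LNet message */

int main(void)
{
	/* same page count, 16x the bytes when PAGE_SIZE is 64KB */
	printf("x86_64 (4KB pages):   %d KiB per bulk buffer\n",
	       LNET_MAX_IOV * 4);
	printf("aarch64 (64KB pages): %d KiB per bulk buffer\n",
	       LNET_MAX_IOV * 64);
	return 0;
}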

Comment by Alexey Lyashkov [ 07/Sep/22 ]

@Andreas,

This patch passed Oleg's testing with 2 GB of RAM per VM. On hosts with much more memory, it is limited by the number of work items.
Can you provide a coredump and modules to inspect?

Comment by Alexey Lyashkov [ 07/Sep/22 ]

Please don't forget that o2iblnd has a leak introduced by "babf023227 LU-13368 lnet: discard the callback"; it caused an OOM on Cray routers after o2ib reconnects. The o2ib connections were on the kib_connd_waits list.

kib_connd_waits = {
    next = 0xffff88826f956a18,
    prev = 0xffff88851866ec18
  },
crash> list 0xffff88826f956a18 | wc -l
15009

I know a revert has already been created, but did you run an arm64 test with that revert?

Comment by Andreas Dilger [ 07/Sep/22 ]

This patch passed Oleg's testing with 2 GB of RAM per VM. On hosts with much more memory, it is limited by the number of work items.

I don't think Oleg tests with aarch64 clients, and the OOM is not being hit on x86_64 clients.

Can you provide a coredump and modules to inspect?

There are coredumps available, but I think they can only be downloaded directly via our VPN. green, would you be able to help Shadow get access to the coredump? The one from the most recent failure https://testing.whamcloud.com/test_sets/bcf91e7c-563a-4c7f-8137-17f785785b8b reports at the top of the vmcore-dmesg file:
https://testing.whamcloud.com/test_logs/10f980f6-5d90-426e-90a4-5e5a92cd7d91/show_text

Core file can be found in /scratch/dumps/trevis-108vm2.trevis.whamcloud.com/10.240.44.205-2022-09-07-14:53:37

Comment by Alexey Lyashkov [ 07/Sep/22 ]

BTW, how much RAM is in our aarch64 VMs?

I will create my own aarch64 guest with 2 GB of RAM (a soft-emulation bug means I am not able to have more with an IB card).

Comment by Andreas Dilger [ 07/Sep/22 ]

I was just looking at the console messages in the original test failure and it shows a more useful stack trace in brw_srpc_init():
https://testing.whamcloud.com/test_sets/f550b8a6-bef4-4010-a6ac-a2ca28ee60fe
https://testing.whamcloud.com/test_logs/bfeee1fb-bd31-4962-9f5b-660f22af465a/show_text

[  474.320712] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
[  474.320712] 
[  474.335774] CPU: 1 PID: 19827 Comm: modprobe Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-372.16.1.el8_6.aarch64 #1
[  474.344303] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  474.349142] Call trace:
[  474.350920]  dump_backtrace+0x0/0x160
[  474.353462]  show_stack+0x28/0x38
[  474.355839]  dump_stack+0x5c/0x74
[  474.358227]  panic+0x140/0x30c
[  474.360369]  out_of_memory+0x438/0x510
[  474.363003]  __alloc_pages_nodemask+0xd74/0xde8
[  474.366230]  srpc_alloc_bulk+0x130/0x5d0 [lnet_selftest]
[  474.370112]  brw_srpc_init+0x2c/0x60 [lnet_selftest]
[  474.373562]  srpc_add_service+0x2a4/0x740 [lnet_selftest]
[  474.377337]  sfw_startup+0x23c/0x898 [lnet_selftest]
[  474.380852]  lnet_selftest_init+0x5f4/0x10000 [lnet_selftest]
[  474.384850]  do_one_initcall+0x58/0x1f0
[  474.387581]  do_init_module+0x64/0x238
[  474.390197]  load_module+0x13ec/0x1690
[  474.392839]  __se_sys_finit_module+0xa4/0x100
[  474.395916]  __arm64_sys_finit_module+0x28/0x38
[  474.399093]  el0_svc_handler+0xb4/0x188
[  474.401796]  el0_svc+0x8/0xc
[  474.405020] SMP: stopping secondary CPUs
[  474.410883] Starting crashdump kernel...
[  474.413584] Bye!

It looks like OOM was hit on both aarch64 nodes (console.onyx-91vm5.log and console.onyx-91vm6.log) at the same time.

BTW, how much RAM is in our aarch64 VMs?

I will create my own aarch64 guest with 2 GB of RAM (a soft-emulation bug means I am not able to have more with an IB card).

There should be 3GB of RAM for the client VMs. I believe all of the testing is with TCP and not IB on these nodes.

Comment by Andreas Dilger [ 07/Sep/22 ]

I pushed (untested) patch https://review.whamcloud.com/48476 with some debugging to run on aarch64 to show how large the allocation is that fails.

Comment by Alexey Lyashkov [ 08/Sep/22 ]

Andreas,

It looks like I found the root cause. LST wants something like "LU-10157 ptlrpc: fill md correctly." It allocates a 64k page for a transfer, but only 4k of it is used.
This increases memory consumption in such a small-memory environment. Let me cook up a patch and submit it.
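To make that concrete, here is a minimal userspace sketch (an assumption based on the description above and the LU-10157 "fill md correctly" approach, not the actual patch) of describing each bulk fragment by the bytes actually transferred, so a 4k transfer on a 64k-page machine consumes 4k of the page rather than the whole page:

#include <stdio.h>

struct frag {
	unsigned int offset;	/* byte offset within the page */
	unsigned int len;	/* bytes actually used in the page */
};

/* Fill fragment descriptors for a bulk_len-byte transfer starting at
 * 'offset' into the first page; returns the number of fragments used. */
static unsigned int fill_frags(struct frag *frags, unsigned int nfrags,
			       unsigned int bulk_len, unsigned int offset,
			       unsigned int page_size)
{
	unsigned int i;

	for (i = 0; i < nfrags && bulk_len > 0; i++) {
		unsigned int space = page_size - offset;

		frags[i].offset = offset;
		frags[i].len = bulk_len < space ? bulk_len : space;
		bulk_len -= frags[i].len;
		offset = 0;	/* only the first page may be unaligned */
	}
	return i;
}

int main(void)
{
	struct frag frags[16];
	unsigned int used = fill_frags(frags, 16, 4096, 0, 65536);

	/* prints: 1 fragment(s), first len=4096 -- not a full 64k page */
	printf("4k transfer on 64k pages: %u fragment(s), first len=%u\n",
	       used, frags[0].len);
	return 0;
}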

Comment by Xinliang Liu [ 08/Sep/22 ]

Hi Alexey and Andreas, thanks for analyzing and addressing this issue. We also found that lnet-selftest test_smoke has recently been crashing on our arm64 CI. Looking forward to the fix patch.

Comment by Alexey Lyashkov [ 08/Sep/22 ]

It looks like I am able to replicate this issue locally. I hope the patch will be ready at the beginning of next week.

PS: the Lustre test-framework works quite poorly with different paths and different arches when not installed from RPMs.

Comment by Andreas Dilger [ 08/Sep/22 ]

I had just found the same thing with my debug patch:
https://testing.whamcloud.com/test_sets/af232f05-d183-4a2f-9d3d-84b368efbc85

[  476.508122] Lustre: 14798:0:(brw_test.c:544:brw_init_test_service()) cache_size = 20578/41157 pages (80 IOVs)
[  476.515256] Lustre: 14798:0:(brw_test.c:549:brw_init_test_service()) want 2048 work items, got 80
[  476.522193] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0
[  476.528390] LNet: 14798:0:(rpc.c:162:srpc_alloc_bulk()) allocate bulk page 1/256 on cpt 0
[  476.534440] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0
**** repeats 70 times ****
[  477.019823] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0
[  477.047753] LNet: 14798:0:(rpc.c:162:srpc_alloc_bulk()) allocate bulk page 12/256 on cpt 0
[  477.053733] LNet: 14798:0:(rpc.c:162:srpc_alloc_bulk()) Skipped 18442 previous similar messages
[  477.061430] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0
**** repeats 68 times ****
[  478.028197] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0
[  478.110577] obd_memory max: 3765625, obd_memory current: 3440400
[  478.114656] systemd-journal invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0

What is also unexpected is that this is preallocating a lot more than 80 bulk descriptors, about 140 in total before OOM is hit. Since 80 desc * 256 pages/desc * 64KiB/page = 1310720 KiB = 1.25 GiB is already half of RAM (20480 of 41157 pages), the OOM is happening when over 2.2 GiB (87% of RAM) has been allocated by these descriptors.

So in addition to the 4KB vs. 64KB PAGE_SIZE issue, there is also a problem with the cache_size calculation in brw_init_test_service(). That is about twice as much RAM allocated as expected; maybe it is per-NID or per-CPT, or two "services" are started, which is bad. This needs to be limited to a more reasonable amount of memory.
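For reference, a quick back-of-envelope check of those numbers (a standalone sketch, not Lustre code; the constants are taken from the log above):

#include <stdio.h>

int main(void)
{
	unsigned long page_kib = 64;		/* aarch64 PAGE_SIZE */
	unsigned long pages_per_desc = 256;	/* LNET_MAX_IOV */
	unsigned long desc_kib = pages_per_desc * page_kib;	/* 16 MiB */
	unsigned long ram_kib = 41157 * page_kib;		/* ~2.5 GiB */

	printf("per descriptor:  %lu MiB\n", desc_kib >> 10);
	printf("80 descriptors:  %lu MiB (%lu%% of RAM)\n",
	       80 * desc_kib >> 10, 100 * 80 * desc_kib / ram_kib);
	printf("140 descriptors: %lu MiB (%lu%% of RAM)\n",
	       140 * desc_kib >> 10, 100 * 140 * desc_kib / ram_kib);
	return 0;
}

This prints 16 MiB per descriptor, 1280 MiB (49%) for the expected 80 descriptors, and 2240 MiB (87%) for the ~140 actually seen, matching the OOM point.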

Comment by Gerrit Updater [ 09/Sep/22 ]

"Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/48507
Subject: LU-16140 lnet: fix lst on 64k page size.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 93938c2e4181c0bd91b69c980537f2dfa34d5a17

Comment by Alexey Lyashkov [ 09/Sep/22 ]

>[ 476.522193] LNet: 14798:0:(rpc.c:148:srpc_alloc_bulk()) allocate 256 bulk pages, 4120 bytes on cpt 0

This is the primary failure (for both the client and server parts).
The bulk page allocation should be LNET_MTU / PAGE_SIZE + 1 (to allow for an unaligned offset), not LNET_MAX_IOV as it was before.
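A sketch of that corrected sizing (my reading of the comment above, not the literal patch 48507):

#include <stdio.h>

#define LNET_MTU	(1 << 20)	/* 1 MiB max LNet transfer */
#define LNET_MAX_IOV	256		/* old fixed page count */

int main(void)
{
	unsigned long page_sizes[] = { 4096, 65536 };
	int i;

	for (i = 0; i < 2; i++) {
		unsigned long ps = page_sizes[i];
		/* +1 page to allow for an unaligned starting offset */
		unsigned long npages = LNET_MTU / ps + 1;

		printf("%2lu KiB pages: %3lu pages (%4lu KiB) vs old %d pages (%5lu KiB)\n",
		       ps >> 10, npages, npages * ps >> 10,
		       LNET_MAX_IOV, LNET_MAX_IOV * ps >> 10);
	}
	return 0;
}

On 64 KiB pages this sizes each bulk at 17 pages (~1.06 MiB) instead of 256 pages (16 MiB), removing the 16x inflation seen above.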

Andreas, can you run the attached patch on our ARM nodes? It passed my verification with 3 GB of RAM.

Loading modules from /work/lustre/wc/LU-16140-arm-mem/lustre/lustre
detected 8 online CPUs by sysfs
libcfs will create CPU partition based on online CPUs
../lnet/lnet/lnet options: 'networks="tcp0(eth0)" lnet_peer_discovery_disabled=0 lnet_health_sensitivity=1 accept=all'
gss/krb5 is not supported
loading modules on: 'devel1'
devel1: devel1: executing load_modules_local
devel1: Loading modules from /work/lustre/wc/lustre-basic/lustre
devel1: /work/lustre/wc/lustre-basic/lustre//tests/test-framework.sh: line 693: echo: write error: Device or resource busy
devel1: /work/lustre/wc/lustre-basic/lustre//tests/test-framework.sh: line 694: echo: write error: Device or resource busy
devel1: detected 8 online CPUs by sysfs
devel1: libcfs will create CPU partition based on online CPUs
devel1: ../lnet/lnet/lnet options: 'networks="tcp0(eth0)" lnet_peer_discovery_disabled=0 lnet_health_sensitivity=1 accept=all'
devel1: ptlrpc/ptlrpc options: 'lbug_on_grant_miscount=1'
devel1: gss/krb5 is not supported
devel1: quota/lquota options: 'hash_lqs_cur_bits=3'
192.168.100.1: LNET configured
LNET configured
12345-0@lo
12345-192.168.100.1@tcp
SESSION: read/write FEATURES: 1 TIMEOUT: 300 FORCE: No
192.168.100.1@tcp are added to session
192.168.100.5@tcp are added to session
Test was added successfully
bulk_rw is running now
[LNet Rates of servers]
[R] Avg: 119      RPC/s Min: 119      RPC/s Max: 119      RPC/s
[W] Avg: 123      RPC/s Min: 123      RPC/s Max: 123      RPC/s
[LNet Bandwidth of servers]
[R] Avg: 61.79    MiB/s Min: 61.79    MiB/s Max: 61.79    MiB/s
[W] Avg: 0.01     MiB/s Min: 0.01     MiB/s Max: 0.01     MiB/s
[LNet Rates of servers]
[R] Avg: 119      RPC/s Min: 119      RPC/s Max: 119      RPC/s
[W] Avg: 120      RPC/s Min: 120      RPC/s Max: 120      RPC/s
[LNet Bandwidth of servers]
[R] Avg: 60.27    MiB/s Min: 60.27    MiB/s Max: 60.27    MiB/s
[W] Avg: 0.01     MiB/s Min: 0.01     MiB/s Max: 0.01     MiB/s
[LNet Rates of servers]
[R] Avg: 125      RPC/s Min: 125      RPC/s Max: 125      RPC/s
[W] Avg: 124      RPC/s Min: 124      RPC/s Max: 124      RPC/s
[LNet Bandwidth of servers]
[R] Avg: 61.92    MiB/s Min: 61.92    MiB/s Max: 61.92    MiB/s
[W] Avg: 0.01     MiB/s Min: 0.01     MiB/s Max: 0.01     MiB/s
[LNet Rates of servers]
[R] Avg: 122      RPC/s Min: 122      RPC/s Max: 122      RPC/s
[W] Avg: 122      RPC/s Min: 122      RPC/s Max: 122      RPC/s
[LNet Bandwidth of servers]
[R] Avg: 61.10    MiB/s Min: 61.10    MiB/s Max: 61.10    MiB/s
[W] Avg: 0.01     MiB/s Min: 0.01     MiB/s Max: 0.01     MiB/s
[LNet Rates of servers]
[R] Avg: 123      RPC/s Min: 123      RPC/s Max: 123      RPC/s
[W] Avg: 123      RPC/s Min: 123      RPC/s Max: 123      RPC/s
[LNet Bandwidth of servers]
[R] Avg: 61.66    MiB/s Min: 61.66    MiB/s Max: 61.66    MiB/s
[W] Avg: 0.01     MiB/s Min: 0.01     MiB/s Max: 0.01     MiB/s
session is ended
test-tcp.sh: line 32:  5280 Terminated      lst stat servers
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
pdsh@devel4: devel1: ssh exited with exit code 126
Stopping clients: devel4 /mnt/lustre (opts:-f)
Stopping clients: devel4 /mnt/lustre2 (opts:-f)
pdsh@devel4: devel1: ssh exited with exit code 1
pdsh@devel4: devel1: ssh exited with exit code 1
pdsh@devel4: devel1: ssh exited with exit code 1
pdsh@devel4: devel1: ssh exited with exit code 1
unloading modules on: 'devel1'
devel1: devel1: executing unload_modules_local
devel1: /work/lustre/wc/lustre-basic/lustre/scripts/lustre_rmmod: line 99: echo: write error: Device or resource busy
devel1: rmmod: ERROR: Module ksocklnd is in use
pdsh@devel4: devel1: ssh exited with exit code 2
~
[root@devel4 ~]# uname -r
4.18.0-348.7.1.el8_5.aarch64
[root@devel4 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:          2,9Gi       411Mi       2,1Gi        18Mi       390Mi       2,1Gi
Swap:         2,0Gi          0B       2,0Gi
[root@devel4 ~]#

Comment by Xinliang Liu [ 14/Sep/22 ]

With patch 48507, our Arm CI no longer crashes. See https://testing.whamcloud.com/test_sessions/0f397625-0376-4391-948c-a3c2e8386d09

Comment by Andreas Dilger [ 14/Sep/22 ]

With patch 48507, our Arm CI no longer crashes. See https://testing.whamcloud.com/test_sessions/0f397625-0376-4391-948c-a3c2e8386d09

It didn't crash with OOM, but it didn't pass either... Also, the v1/v2 48507 patch is crashing on both aarch64 and x86_64, so that isn't a proper fix yet:

Failed optional test review-ldiskfs on CentOS 8.5/x86_64, SLES 15.3/x86_64 uploaded by Trevis Autotest from trevis-40vm2: https://testing.whamcloud.com/test_sessions/601a2acf-fc26-4a52-8075-1cdee24113ac ran 7 tests. 2 tests failed: sanity, lnet-selftest.
%% THIS TEST SESSION CRASHED %%

Failed enforced test review-ldiskfs on CentOS 8.5/x86_64 uploaded by Onyx Autotest from onyx-120vm1: https://testing.whamcloud.com/test_sessions/5e543a27-87e0-4031-963e-3fcd5a5dfc57 ran 7 tests. 2 tests failed: sanity, lnet-selftest.
%% THIS TEST SESSION CRASHED %%

Failed optional test review-ldiskfs on RHEL 8.6/x86_64 uploaded by Onyx Autotest from onyx-77vm1: https://testing.whamcloud.com/test_sessions/4ab09db6-03aa-4ed2-843d-3dcb4f4f52dc ran 7 tests. 2 tests failed: sanity, lnet-selftest.
%% THIS TEST SESSION CRASHED %%

Failed enforced test review-ldiskfs-ubuntu on CentOS 8.5/x86_64, Ubuntu 20.04/x86_64 uploaded by Trevis Autotest from trevis-25vm3: https://testing.whamcloud.com/test_sessions/e4ea3a13-f65d-4a29-8582-c402ee3153e1 ran 8 tests. 2 tests failed: sanity, lnet-selftest.
%% THIS TEST SESSION CRASHED %%

Failed enforced test review-ldiskfs-arm on RHEL 8.6/aarch64, RHEL 8.6/x86_64 uploaded by Onyx Autotest from onyx-91vm1: https://testing.whamcloud.com/test_sessions/deef2746-9d03-47b3-87dc-557adacc7c7e ran 8 tests. 2 tests failed: sanity, lnet-selftest.
%% THIS TEST SESSION CRASHED %%

Comment by Gerrit Updater [ 14/Sep/22 ]

"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/48457/
Subject: LU-16140 lnet: revert "LU-16011 lnet: use preallocate bulk for server"
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1b09fb2296993c52e5d90d99ceb43c8c54c5c7a2

Generated at Sat Feb 10 03:24:21 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.