[LU-17151] sanity: test_411b Error: '(3) failed to write successfully' Created: 27/Sep/23  Updated: 28/Jan/24

Status: Reopened
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Qian Yingjin
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-16713 Writeback and commit pages under memo... Resolved
is related to LU-17183 sanity.sh test_411b: cgroups OOM on ARM Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Serguei Smirnov <ssmirnov@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/2a3fbe0b-f784-4875-bd67-6ab32aa223a3

Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/99011 - 4.18.0-425.10.1.el8_7.aarch64
servers: https://build.whamcloud.com/job/lustre-reviews/99011 - 4.18.0-477.21.1.el8_lustre.x86_64

sanity test_411b: @@@@@@ FAIL: (3) failed to write successfully
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6700:error()
= /usr/lib64/lustre/tests/sanity.sh:27545:test_411b()
= /usr/lib64/lustre/tests/test-framework.sh:7040:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:7096:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6926:run_test()
= /usr/lib64/lustre/tests/sanity.sh:27592:main()
Dumping lctl log to /autotest/autotest-2/2023-09-27/lustre-reviews_review-ldiskfs-dne-arm_99011_29_0b05909e-9d3c-46a3-9f81-125f5c37cc5d//sanity.test_411b.*.1695797605.log
CMD: trevis-108vm17.trevis.whamcloud.com,trevis-108vm18,trevis-72vm4,trevis-72vm5,trevis-72vm6 /usr/sbin/lctl dk > /autotest/autotest-2/2023-09-27/lustre-reviews_review-ldiskfs-dne-arm_99011_29_0b05909e-9d3c-46a3-9f81-125f5c37cc5d//sanity.test_411b.debug_log.\$(hostname -s).1695797605.log;
dmesg > /autotest/autotest-2/2023-09-27/lustre-reviews_review-ldiskfs-dne-arm_99011_29_0b05909e-9d3c-46a3-9f81-125f5c37cc5d//sanity.test_411b.dmesg.\$(hostname -s).1695797605.log
cache 19660800
rss 0
rss_huge 0
shmem 0
mapped_file 0
dirty 2883584
writeback 0
swap 2949120
pgpgin 21206
pgpgout 20906
pgfault 4034
pgmajfault 417
inactive_anon 0
active_anon 0
inactive_file 18022400
active_file 1638400
unevictable 0
hierarchical_memory_limit 268435456
hierarchical_memsw_limit 9223372036854710272
total_cache 19660800
total_rss 0
total_rss_huge 0
total_shmem 0
total_mapped_file 0
total_dirty 2883584
total_writeback 0
total_swap 2949120
total_pgpgin 21206
total_pgpgout 20906
total_pgfault 4034
total_pgmajfault 417
total_inactive_anon 0
total_active_anon 0
total_inactive_file 18022400
total_active_file 1638400
total_unevictable 0



 Comments   
Comment by Alex Zhuravlev [ 03/Oct/23 ]

The cgroup limit causes an OOM, which in turn kills dd:

[16546.136239] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/f411b.sanity,task_memcg=/f411b.sanity,task=dd,pid=1071239,uid=0
[16546.146364] Memory cgroup out of memory: Killed process 1071239 (dd) total-vm:36672kB, anon-rss:0kB, file-rss:2048kB, shmem-rss:0kB, UID:0 pgtables:384kB oom_score_adj:0
[16546.861389] dd invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
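For context, a rough sketch of the cgroup setup that drives this failure mode, assuming cgroup v1 (as on these EL8 clients). The group name matches the log above, but the limit value and dd arguments are illustrative, not the exact values from sanity.sh:

```shell
#!/bin/bash
# Illustrative sketch of a test_411b-style memcg limit (cgroup v1).
# Group name matches the log; sizes/dd arguments are assumptions.
CGROOT=/sys/fs/cgroup/memory
CGDIR=$CGROOT/f411b.sanity
LIMIT=$((256 * 1024 * 1024))    # 256MB, matching "memory: usage 262144kB, limit 262144kB"

if [ -d "$CGROOT" ] && [ -w "$CGROOT" ]; then
    mkdir -p "$CGDIR"
    echo "$LIMIT" > "$CGDIR/memory.limit_in_bytes"
    # The test then adds its shell to the group and runs dd inside it:
    #   echo $$ > $CGDIR/tasks
    #   dd if=/dev/zero of=$DIR/$tfile bs=1M count=512
    # dd dirties more data than the limit allows; if writeback cannot keep
    # up, the memcg OOM killer picks dd, producing the log above.
    rmdir "$CGDIR"    # cleanup for this sketch only
else
    echo "cgroup v1 memory controller not writable; would set limit=$LIMIT"
fi
```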
Comment by Arshad Hussain [ 03/Oct/23 ]

+1 on master: https://testing.whamcloud.com/sub_tests/bfbca56f-69b4-4c4c-94e1-f2b801dbc276

========================================================== 10:28:52 (1696328932)
[19651.231511] Lustre: DEBUG MARKER: == sanity test 411b: confirm Lustre can avoid OOM with reasonable cgroups limits ========================================================== 10:28:52 (1696328932)
[19669.199217] dd invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[19669.208834] CPU: 1 PID: 1254796 Comm: dd Kdump: loaded Tainted: G OE --------- - - 4.18.0-425.10.1.el8_7.aarch64 #1
[19669.216852] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[19669.221748] Call trace:
[19669.223597] dump_backtrace+0x0/0x178
[19669.226811] show_stack+0x28/0x38
[19669.229191] dump_stack+0x5c/0x74
[19669.231641] dump_header+0x4c/0x1e0
[19669.234205] oom_kill_process+0x18c/0x190
[19669.237056] out_of_memory+0x188/0x518
[19669.239728] mem_cgroup_out_of_memory+0xf8/0x108
[19669.243134] try_charge+0x6a8/0x6f0
[19669.245659] __mem_cgroup_charge+0x48/0xe8
[19669.248531] mem_cgroup_swapin_charge_page+0x68/0x170
[19669.252071] __read_swap_cache_async+0x1d8/0x280
[19669.255316] swap_cluster_readahead+0x184/0x300
[19669.258578] swapin_readahead+0x2a8/0x3cc
[19669.261400] do_swap_page+0x53c/0x828
[19669.264147] __handle_mm_fault+0x260/0x698
[19669.267037] handle_mm_fault+0xe4/0x188
[19669.269721] do_page_fault+0x164/0x488
[19669.272493] do_translation_fault+0xa0/0xb0
[19669.275595] do_mem_abort+0x54/0xb0
[19669.278156] el1_abort+0x28/0x30
[19669.280633] el1_sync_handler+0x70/0x120
[19669.283516] el1_sync+0x7c/0x100
[19669.285854] fault_in_readable+0x9c/0x188
[19669.288769] fault_in_iov_iter_readable+0x80/0xd0
[19669.292460] generic_perform_write+0x7c/0x1b8
[19669.295657] __generic_file_write_iter+0x104/0x1e8
[19669.299209] vvp_io_write_start+0x4d8/0x1058 [lustre]
[19669.303501] cl_io_start+0x98/0x1d8 [obdclass]
[19669.307477] cl_io_loop+0xcc/0x2a0 [obdclass]
[19669.310836] ll_file_io_generic+0x3dc/0xfe0 [lustre]
[19669.314709] ll_file_write_iter+0x7d4/0xa58 [lustre]
[19669.318471] new_sync_write+0x104/0x158
[19669.321468] __vfs_write+0x78/0x90
[19669.324000] vfs_write+0xb0/0x1b8
[19669.326565] ksys_write+0x70/0xd8
[19669.328989] __arm64_sys_write+0x28/0x38
[19669.331899] do_el0_svc+0xb4/0x188
[19669.334521] el0_sync_handler+0x88/0xac
[19669.337306] el0_sync+0x140/0x180
[19669.340071] memory: usage 262144kB, limit 262144kB, failcnt 15083
[19669.344527] memory+swap: usage 328128kB, limit 9007199254740928kB, failcnt 0
[19669.349159] kmem: usage 192kB, limit 9007199254740928kB, failcnt 0
[19669.353303] Memory cgroup stats for /f411b.sanity:
[19669.353714] anon 0
file 163250176
kernel_stack 0
pagetables 0
percpu 0
sock 0
shmem 0
file_mapped 0
file_dirty 3866624
file_writeback 0
swapcached 606339072
anon_thp 0
file_thp 0
shmem_thp 0
inactive_anon 131072
active_anon 0
inactive_file 265682944
active_file 2424832
unevictable 0
slab_reclaimable 70840
slab_unreclaimable 13632
slab 84472
workingset_refault_anon 8219
workingset_refault_file 23
workingset_activate_anon 8177
workingset_activate_file 3
workingset_restore_anon 7464
workingset_restore_file 2
workingset_nodereclaim 0
pgfault 13248
pgmajfault 1427
pgrefill 32282
pgscan 2242586
pgsteal 26364
pgactivate 31809
pgdeactivate 32273
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 0
thp_collapse_alloc 0
[19669.422320] Tasks state (memory values in pages):
[19669.425648] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[19669.431758] [1254796] 0 1254796 1085 32 393216 1068 0 dd
[19669.437512] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/f411b.sanity,task_memcg=/f411b.sanity,task=dd,pid=1254796,uid=0
[19669.447727] Memory cgroup out of memory: Killed process 1254796 (dd) total-vm:69440kB, anon-rss:0kB, file-rss:2048kB, shmem-rss:0kB, UID:0 pgtables:384kB oom_score_adj:0
[19670.119273] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_411b: @@@@@@ FAIL: (3) failed to write successfully 
[19670.662747] Lustre: DEBUG MARKER: sanity test_411b: @@@@@@ FAIL: (3) failed to write successfully
[19671.323379] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /autotest/autotest-2/2023-10-03/lustre-reviews_review-ldiskfs-dne-arm_99080_29_acee458f-e73b-455d-b9a7-38fbc1e13ed6//sanity.test_411b.debug_log.$(hostname -s).1696328953.log;
dmesg > /autotest/autotest-2/2023-10-03/lustre-reviews_rev
Comment by Arshad Hussain [ 04/Oct/23 ]

+1 on Master https://testing.whamcloud.com/test_sets/a2c9650c-efdc-49bf-80bb-7770e72cd53a

Comment by Andreas Dilger [ 05/Oct/23 ]

Hi Yingjin, can you please take a look at this? The sanity.sh test_411b subtest, added by patch https://review.whamcloud.com/50544 "LU-16713 llite: writeback/commit pages under memory pressure", has been failing regularly since it landed (and once before landing, on master-next). There is also a regular error "error writing to file from 1249987".

Comment by Andreas Dilger [ 05/Oct/23 ]

It looks like almost all of the failures are on aarch64 clients (48/58 failures).

Comment by Gerrit Updater [ 10/Oct/23 ]

"Timothy Day <timday@amazon.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52610
Subject: LU-17151 tests: increase 411b mem limit for arm
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8444e3f8d5d467ec7653c713529806d6123a449c

Comment by James A Simmons [ 16/Oct/23 ]

Can we land this? This issue is very common.

Comment by Gerrit Updater [ 25/Oct/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52610/
Subject: LU-17151 tests: increase sanity/411b memory limit
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0e878390e1c8c5883bccd01758392eaa16a67f31

Comment by Peter Jones [ 25/Oct/23 ]

Landed for 2.16

Comment by Nikitas Angelinas [ 31/Oct/23 ]

There is another failure on x86_64, on the first dd invocation in the test, on latest master (which includes this fix): https://testing.whamcloud.com/test_sets/36fdecc1-7e58-410d-b1eb-7345e901b7e5

Comment by Andreas Dilger [ 20/Nov/23 ]

It looks like this is still failing fairly regularly in the past week:

https://testing.whamcloud.com/search?status%5B%5D=FAIL&test_set_script_id=f9516376-32bc-11e0-aaee-52540025f9ae&sub_test_script_id=e877b2b8-1ca3-4655-836b-81fb1693f948&start_date=2023-11-13&end_date=2023-11-20&source=sub_tests#redirect

Comment by Andreas Dilger [ 20/Nov/23 ]

It looks like those failures are despite patch https://review.whamcloud.com/52610 "LU-17151 tests: increase sanity/411b memory limit" having landed.

Comment by Patrick Farrell [ 12/Dec/23 ]

This is failing quite a bit on master right now.

Comment by Qian Yingjin [ 09/Jan/24 ]

Should we increase the memory limit value in the test script, or restrict the test to run only in real environments, not VMs?

Comment by Alex Zhuravlev [ 09/Jan/24 ]

Do I understand correctly that all of the recent runs were tmpfs-based?

Comment by Andreas Dilger [ 09/Jan/24 ]

Yes, the MDT has been using tmpfs on the VM host since August or so. It is exported to the VM guest as a block device. The OSTs are still on HDD, I believe.

Comment by Alex Zhuravlev [ 09/Jan/24 ]

My observation with tmpfs-based ZFS is that it tends not to reuse blocks but instead allocates new ones ("inner"), which leads to memory overuse. This can be fixed by running "zfs trim .." once every few minutes/subtests.

Comment by Andreas Dilger [ 09/Jan/24 ]

Does trim get passed from the VM down to the host to free the memory? You could try adding a patch to this subtest to run fstrim or zfs trim on all of the targets before or during the test. However, if only the MDT is on tmpfs then I don't think these tests are using much memory there.

It would be useful to add some debugging to see where all of the memory is used (slabinfo/meminfo) so that we can reduce the size. Some of the internal data structures are allocated/limited at mount time based on the total RAM size and not limited by the cgroup size (e.g. lu cache, max_dirty_mb, etc), and the client needs to do a better job to free this memory under pressure (e.g. registered shrinker). Should we do anything to flush the cache at the start of the test, so that the process in the cgroup is not penalized by previous allocations outside its control?
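The debugging suggested above could look roughly like the following: snapshot /proc/meminfo and the largest slab caches around the subtest so the memory consumer becomes visible. The function and log names here are illustrative, not existing test-framework.sh helpers:

```shell
#!/bin/bash
# Hedged sketch: record client memory state before/after the subtest.
LOG=$(mktemp)

dump_mem_stats() {
    local tag=$1
    {
        echo "== $tag $(date +%s) =="
        grep -E '^(MemTotal|MemFree|MemAvailable|Dirty|Writeback|Slab):' \
            /proc/meminfo
        # /proc/slabinfo usually needs root; produces nothing unprivileged
        sort -rn -k3 /proc/slabinfo 2>/dev/null | head -10
    } >> "$LOG"
}

dump_mem_stats before_411b
# ... the dd loop of test_411b would run here ...
dump_mem_stats after_411b
```

Diffing the two snapshots would show whether the pressure comes from file pages, slab (e.g. lu cache), or dirty data that writeback is not draining.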

It would also be good to improve the test scripts slightly to match proper test script style:

  • no need for "trap 0" in cleanup_test411_cgroup() since this would clobber any other registered stack_trap calls
  • in test_411a use "stack_trap cleanup_test411_cgroup" to do the cleanup instead of calling it explicitly
  • in test_411b add "stack_trap 'rm -f $DIR/$tfile.*'" to clean up the files even if the test fails
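The style points above can be sketched with a minimal model of test-framework.sh's stack_trap: a bare "trap 0" inside cleanup_test411_cgroup() would discard every other registered cleanup, whereas stack_trap keeps a single EXIT handler that runs all registered commands in LIFO order. This is a simplified re-implementation for illustration, not the real framework code:

```shell
#!/bin/bash
# Simplified stand-in for test-framework.sh's stack_trap (illustrative).
declare -a _TRAP_STACK=()

stack_trap() {
    _TRAP_STACK+=("$1")
    trap run_trap_stack EXIT    # one shared handler; never "trap 0" elsewhere
}

run_trap_stack() {
    local i
    # run cleanups in reverse registration order, like nested scopes
    for ((i = ${#_TRAP_STACK[@]} - 1; i >= 0; i--)); do
        eval "${_TRAP_STACK[i]}"
    done
    _TRAP_STACK=()
}

# test_411b-style usage: the file is removed even if the test fails midway
tfile=$(mktemp /tmp/f411b.XXXXXX)
stack_trap "rm -f $tfile"
stack_trap "echo 'cgroup cleanup runs first (registered last)'"

run_trap_stack    # normally fired by the EXIT trap at the end of the test
```

Because each cleanup only appends to the stack, a failure anywhere in the subtest still removes both the cgroup and the test files.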
Comment by Alex Zhuravlev [ 10/Jan/24 ]

Does trim get passed from the VM down to the host to free the memory?

yes, here are my local runs:

  • sanity on ZFS w/o trim: 9785 MB allocated from the host by sanity's completion
  • sanity on ZFS w/ "zfs trim" in run_one_logged: 8159 MB
Comment by Andreas Dilger [ 28/Jan/24 ]

22 failures in the past week

Generated at Sat Feb 10 03:33:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.