[LU-13248] sanity test 807 fails with '/mnt/lustre/d807.sanity/single_dd expected blocks: 1, got: 0'
Created: 12/Feb/20
Updated: 17/Nov/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.12.6
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Nunez (Inactive)
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: ppc
Environment: PPC Clients
Severity: 3

Description

sanity test_807 fails with '/mnt/lustre/d807.sanity/single_dd expected blocks: 1, got: 0'. The failures started on 27 Sep 2019, and the test fails 100% of the time in PPC client testing.

For a recent failure at https://testing.whamcloud.com/test_sets/6b8903ee-4d49-11ea-b58e-52540065bddc, the test output in the suite_log is:

== sanity test 807: verify LSOM syncing tool ========================================================= 03:15:25 (1581477325)
CMD: trevis-55vm7 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n
CMD: trevis-55vm7 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm
mdd.lustre-MDT0000.changelog_mask=+hsm
CMD: trevis-55vm7 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
Registered 1 changelog users: 'cl1'
CMD: trevis-55vm7 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.changelog_users
CMD: trevis-77vm1.trevis.whamcloud.com params=\$(/usr/sbin/lctl get_param llite.*.xattr_cache);
			 [[ -z \"\" ]] && param= ||
			 param=\$(grep  <<< \"\$params\");
			 [[ -z \$param ]] && param=\"\$params\";
			 while read s; do echo client \$s;
			 done <<< \"\$param\"
llite.lustre-c0000000051b7000.xattr_cache=0
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00218108 s, 481 MB/s
Test SOM for multi-client (2) writes
CMD: trevis-77vm1.trevis.whamcloud.com multiop /mnt/lustre/f807.sanity Oz0w1048576c
CMD: trevis-77vm2 multiop /mnt/lustre/f807.sanity Oz1048576w1048576c
llsom_sync: failed to get changelog record: Invalid argument (22)
Start to sync 0 records.
 sanity test_807: @@@@@@ FAIL: /mnt/lustre/d807.sanity/single_dd expected blocks: 1, got: 0 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:6121:error()
  = /usr/lib64/lustre/tests/sanity.sh:22358:check_lsom_data()
  = /usr/lib64/lustre/tests/sanity.sh:22513:test_807()
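
When triaging several of these runs, it can help to pull the mismatch out of the suite_log mechanically. The following is only a minimal sketch, assuming the FAIL line and the llsom_sync output keep the format shown above; the names `FAIL_RE`, `SYNC_RE`, and `summarize` are hypothetical helpers and not part of the Lustre test framework.

```python
import re

# Sample lines copied from the suite_log in this ticket.
SUITE_LOG = """\
llsom_sync: failed to get changelog record: Invalid argument (22)
Start to sync 0 records.
 sanity test_807: @@@@@@ FAIL: /mnt/lustre/d807.sanity/single_dd expected blocks: 1, got: 0
"""

# Assumed format: "FAIL: <path> expected <field>: <N>, got: <M>".
FAIL_RE = re.compile(
    r"FAIL: (?P<path>\S+) expected (?P<field>\w+): (?P<expected>\d+), got: (?P<got>\d+)"
)
# Assumed format of the llsom_sync progress line.
SYNC_RE = re.compile(r"Start to sync (?P<count>\d+) records")


def summarize(log_text):
    """Return (path, field, expected, got, synced), or None if no FAIL line exists."""
    fail = FAIL_RE.search(log_text)
    if fail is None:
        return None
    sync = SYNC_RE.search(log_text)
    synced = int(sync.group("count")) if sync else None
    return (
        fail.group("path"),
        fail.group("field"),
        int(fail.group("expected")),
        int(fail.group("got")),
        synced,
    )


summary = summarize(SUITE_LOG)
print(summary)
```

In this log, llsom_sync reports 0 synced records right before the blocks check fails. That is consistent with the changelog read error being the cause of the stale LSOM data, though the log alone does not prove it.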

The client1 (vm1) console log shows llog processing errors with rc = -22 during the test:

[ 6385.255266] Lustre: DEBUG MARKER: == sanity test 807: verify LSOM syncing tool ========================================================= 03:15:25 (1581477325)
[ 6386.739071] Lustre: DEBUG MARKER: params=$(/usr/sbin/lctl get_param llite.*.xattr_cache); [[ -z "" ]] && param= || param=$(grep <<< "$params"); [[ -z $param ]] && param="$params"; while read s; do echo client $s; done <<< "$param"
[ 6386.852920] Lustre: DEBUG MARKER: multiop /mnt/lustre/f807.sanity Oz0w1048576c
[ 6392.107985] Lustre: 3288:0:(llog_cat.c:808:llog_cat_process_common()) lustre-MDT0000-mdc-c0000000051b7000: invalid record in catalog [0x5:0x0:0xa]:0: rc = -22
[ 6392.108146] Lustre: 3288:0:(llog_cat.c:808:llog_cat_process_common()) Skipped 1 previous similar message
[ 6392.108218] LustreError: 3288:0:(mdc_changelog.c:295:chlg_load()) lustre-MDT0000-mdc-c0000000051b7000: fail to process llog: rc = -22
[ 6392.108306] LustreError: 3288:0:(mdc_changelog.c:295:chlg_load()) Skipped 1 previous similar message
[ 6392.312350] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_807: @@@@@@ FAIL: \/mnt\/lustre\/d807.sanity\/single_dd expected blocks: 1, got: 0 
[ 6392.516181] Lustre: DEBUG MARKER: sanity test_807: @@@@@@ FAIL: /mnt/lustre/d807.sanity/single_dd expected blocks: 1, got: 0
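
To compare console logs across the failing runs, the reporting function and return code can be extracted from each Lustre message. This is a rough sketch, assuming the `(file:line:function())` prefix and the trailing `rc = N` seen above, and assuming rc values are negated Linux errno numbers; `LINE_RE` and `decode` are hypothetical names used only for illustration.

```python
import errno
import re

# Two error lines copied from the client1 console log in this ticket.
CONSOLE_LOG = """\
[ 6392.107985] Lustre: 3288:0:(llog_cat.c:808:llog_cat_process_common()) lustre-MDT0000-mdc-c0000000051b7000: invalid record in catalog [0x5:0x0:0xa]:0: rc = -22
[ 6392.108218] LustreError: 3288:0:(mdc_changelog.c:295:chlg_load()) lustre-MDT0000-mdc-c0000000051b7000: fail to process llog: rc = -22
"""

# Assumed layout: "(<file>:<line>:<function>()) ... rc = <number>" on one line.
LINE_RE = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\).*rc = (?P<rc>-?\d+)"
)


def decode(log_text):
    """Return a list of (function, rc, errno_name) for each matching log line."""
    results = []
    for match in LINE_RE.finditer(log_text):
        rc = int(match.group("rc"))
        # Map the absolute value to its symbolic errno name, if one is known.
        name = errno.errorcode.get(abs(rc), "UNKNOWN")
        results.append((match.group("func"), rc, name))
    return results


decoded = decode(CONSOLE_LOG)
for func, rc, name in decoded:
    print(f"{func}: rc = {rc} ({name})")
```

Both messages decode to EINVAL, which matches the "Invalid argument (22)" reported by llsom_sync in the suite_log.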

Logs for other sanity test 807 failures are at:
https://testing.whamcloud.com/test_sets/6a18091c-233a-11ea-bb75-52540065bddc
https://testing.whamcloud.com/test_sets/5e7bd63a-f7af-11e9-b62b-52540065bddc

