[LU-15812] clean downgrade: sanity test_398a: FAIL: lock should be cancelled by direct IO Created: 03/May/22  Updated: 04/May/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: zfs

Severity: 3

 Description   

After a clean upgrade from 2.14.0 (EL8.3, ZFS) to 2.15.0 (EL8.5, ZFS), followed by a clean downgrade back to 2.14.0 (EL8.3), sanity test_398a failed with this error:

== sanity test 398a: direct IO should cancel lock otherwise lockless ================================= 01:11:49 (1651540309)
ldlm.namespaces.MGC10.240.43.80@tcp.lru_size=clear
ldlm.namespaces.lustre-MDT0000-mdc-ffff98813cf68000.lru_size=clear
ldlm.namespaces.lustre-OST0000-osc-ffff98813cf68000.lru_size=clear
ldlm.namespaces.lustre-OST0001-osc-ffff98813cf68000.lru_size=clear
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00432488 s, 242 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.111508 s, 9.4 MB/s
 sanity test_398a: @@@@@@ FAIL: lock should be cancelled by direct IO 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:6273:error()
  = /usr/lib64/lustre/tests/sanity.sh:22338:test_398a()
  = /usr/lib64/lustre/tests/test-framework.sh:6576:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:6623:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:6465:run_test()
  = /usr/lib64/lustre/tests/sanity.sh:22348:main()
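
For anyone trying to reproduce this by hand outside the test framework, the following is a rough sketch of what the log above suggests the check does: clear the lock LRU, do a buffered 1 MiB write, do a direct-IO 1 MiB write to the same file, then expect no remaining locks in the OSC namespace. The helper names (`total_locks`, `repro_398a`), the dd flags, the stripe layout, and the OST0000 namespace pattern are assumptions, not copied from sanity.sh.

```shell
# Hypothetical helper: sum integer lock counts read from stdin, one per line.
total_locks() {
    awk '{ s += $1 } END { print s + 0 }'
}

# Hypothetical manual reproduction. The argument must be a file path on a
# mounted Lustre client; nothing here is taken verbatim from sanity.sh.
repro_398a() (
    file="${1:?usage: repro_398a /path/on/lustre/file}"

    # Assumption: single stripe on OST index 0 so only one OSC namespace matters.
    lfs setstripe -c 1 -i 0 "$file" || exit 1

    # Drop cached locks, as seen in the log (lru_size=clear).
    lctl set_param 'ldlm.namespaces.*.lru_size=clear'

    # Buffered write: takes a new extent lock on the client.
    dd if=/dev/zero of="$file" bs=1M count=1

    # Direct-IO write to the same range: expected to cancel that lock.
    dd if=/dev/zero of="$file" bs=1M count=1 oflag=direct conv=notrunc

    # Assumption: the OSC namespace for OST0000 matches this pattern.
    n=$(lctl get_param -n 'ldlm.namespaces.*OST0000-osc-*.lock_count' | total_locks)

    if [ "$n" -eq 0 ]; then
        echo "lock cancelled by direct IO"
    else
        echo "lock NOT cancelled by direct IO (lock_count=$n)"
        exit 1
    fi
)
```

Running `repro_398a /mnt/lustre/f398a` on the downgraded 2.14.0 client (the mount point is an example) should show whether the lock count stays non-zero after the direct write.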


 Comments   
Comment by Alena Nikitenko [ 04/May/22 ]

I've encountered the same problem on the same upgrade/downgrade path, but for ldiskfs:

== sanity test 398a: direct IO should cancel lock otherwise lockless ================================= 18:13:00 (1651687980)
ldlm.namespaces.MGC10.240.30.112@tcp.lru_size=clear
ldlm.namespaces.lustre-MDT0000-mdc-ffff8cc20aa4c000.lru_size=clear
ldlm.namespaces.lustre-OST0000-osc-ffff8cc20aa4c000.lru_size=clear
ldlm.namespaces.lustre-OST0001-osc-ffff8cc20aa4c000.lru_size=clear
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00364705 s, 288 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00968883 s, 108 MB/s
 sanity test_398a: @@@@@@ FAIL: lock should be cancelled by direct IO
  Trace dump:
  = /lib64/lustre/tests/test-framework.sh:6273:error()
  = /lib64/lustre/tests/sanity.sh:22338:test_398a()
  = /lib64/lustre/tests/test-framework.sh:6576:run_one()
  = /lib64/lustre/tests/test-framework.sh:6623:run_one_logged()
  = /lib64/lustre/tests/test-framework.sh:6465:run_test()
  = /lib64/lustre/tests/sanity.sh:22348:main()
Dumping lctl log to /tmp/test_logs/2022-05-04/163901/sanity.test_398a.*.1651687980.log