[LU-13144] sanity-hsm test 604 fails with 'No matching NOPEN entry' Created: 15/Jan/20  Updated: 16/Jan/20

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.12.4
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: rhel8
Environment:

RHEL 8 client


Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

sanity-hsm test_604 fails with 'No matching NOPEN entry'. The full output from test_604 in the suite_log is:

== sanity-hsm test 604: NOPEN Changelog entry ======================================================== 18:43:31 (1578681811)
CMD: trevis-40vm12 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.changelog_deniednext
CMD: trevis-40vm12 lctl set_param mdd.lustre-MDT0000.changelog_deniednext=20
mdd.lustre-MDT0000.changelog_deniednext=20
CMD: trevis-40vm12 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n
CMD: trevis-40vm12 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm
mdd.lustre-MDT0000.changelog_mask=+hsm
CMD: trevis-40vm12 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
Registered 1 changelog users: 'cl5'
CMD: trevis-40vm12 /usr/sbin/lctl set_param mdd.*.changelog_mask=ALL
mdd.lustre-MDT0000.changelog_mask=ALL
lustre-MDT0000: clear the changelog for cl5 of all records
running as uid/gid/euid/egid 500/500/500/500, groups:
 [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
lustre-MDT0000.39 24NOPEN 18:46:23.657537448 2020.01.10 0x1 t=[0x20000040b:0xc6:0x0] j=cat.500 ef=0xf u=500:500 nid=10.9.3.149@tcp m=r--
Got NID '10.9.3.149@tcp'
lustre-MDT0000: clear the changelog for cl5 of all records
running as uid/gid/euid/egid 500/500/500/500, groups:
 [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
lustre-MDT0000: clear the changelog for cl5 of all records
running as uid/gid/euid/egid 500/500/500/500, groups:
 [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
 sanity-hsm test_604: @@@@@@ FAIL: No matching NOPEN entry 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5900:error()
  = /usr/lib64/lustre/tests/sanity-hsm.sh:5571:test_604()

The ‘Permission denied’ errors from cat also appear in runs where the test passes. In passing runs, however, a second NOPEN entry appears in the changelog after the third cat of the file; in this failing run, no such entry is logged.
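
To compare a failing log against a passing one, it may help to count the NOPEN records mechanically. Below is a minimal Python sketch, not part of the test suite. The record layout is inferred only from the single NOPEN line in the output above, and the helper names (parse_record, count_nopen) are hypothetical.

```python
import re

# Layout inferred from the one NOPEN record in the log above (an assumption,
# not the official changelog format): "<mdt>.<index> <NN><TYPE> <time> <date>
# <flags> key=value ..."
RECORD_RE = re.compile(
    r"^(?P<mdt>\S+)\.(?P<index>\d+)\s+"
    r"(?P<type_num>\d+)(?P<type_name>[A-Z_]+)\s+"
    r"(?P<time>\S+)\s+(?P<date>\S+)\s+(?P<flags>0x[0-9a-fA-F]+)\s+"
    r"(?P<fields>.*)$"
)


def parse_record(line):
    """Return a dict for a changelog record line, or None for any other line."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Trailing tokens such as "nid=10.9.3.149@tcp" become dict entries.
    for token in record.pop("fields").split():
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


def count_nopen(lines, nid=None):
    """Count NOPEN records, optionally only those from the given client NID."""
    count = 0
    for line in lines:
        record = parse_record(line)
        if record is None or record["type_name"] != "NOPEN":
            continue
        if nid is not None and record.get("nid") != nid:
            continue
        count += 1
    return count


if __name__ == "__main__":
    sample = [
        "cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied",
        "lustre-MDT0000.39 24NOPEN 18:46:23.657537448 2020.01.10 0x1 "
        "t=[0x20000040b:0xc6:0x0] j=cat.500 ef=0xf u=500:500 "
        "nid=10.9.3.149@tcp m=r--",
        "Got NID '10.9.3.149@tcp'",
    ]
    print(count_nopen(sample, nid="10.9.3.149@tcp"))
```

Fed the test_604 output above, this should count a single NOPEN record, whereas a passing run would be expected to show a second one after the third cat.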

There is nothing of interest in the console logs.

We’ve seen this error twice for RHEL 8 clients, starting 10 JAN 2020:
2.12.4 https://testing.whamcloud.com/test_sets/fd46e654-349c-11ea-adca-52540065bddc
2.14.0 (ARM client) https://testing.whamcloud.com/test_sets/27e8c5e0-35a0-11ea-b1e8-52540065bddc

Note: PPC clients have failed this test for at least the past year with the same error message, but those failures also show the error “lfs changelog: cannot access changelog: Invalid argument”, and many sanity-hsm tests fail before test 604. I therefore think the PPC client failures for this test are a different issue from the RHEL 8 failures.

