Lustre / LU-13144

sanity-hsm test 604 fails with 'No matching NOPEN entry'

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.14.0, Lustre 2.12.4
    • Fix Version/s: None
    • Labels:
    • Environment:
      RHEL 8 client
    • Severity:
      3

      Description

      sanity-hsm test_604 fails with 'No matching NOPEN entry'. The full output from test_604 in the suite_log is:

      == sanity-hsm test 604: NOPEN Changelog entry ======================================================== 18:43:31 (1578681811)
      CMD: trevis-40vm12 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.changelog_deniednext
      CMD: trevis-40vm12 lctl set_param mdd.lustre-MDT0000.changelog_deniednext=20
      mdd.lustre-MDT0000.changelog_deniednext=20
      CMD: trevis-40vm12 /usr/sbin/lctl get_param mdd.lustre-MDT0000.changelog_mask -n
      CMD: trevis-40vm12 /usr/sbin/lctl set_param mdd.lustre-MDT0000.changelog_mask=+hsm
      mdd.lustre-MDT0000.changelog_mask=+hsm
      CMD: trevis-40vm12 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
      Registered 1 changelog users: 'cl5'
      CMD: trevis-40vm12 /usr/sbin/lctl set_param mdd.*.changelog_mask=ALL
      mdd.lustre-MDT0000.changelog_mask=ALL
      lustre-MDT0000: clear the changelog for cl5 of all records
      running as uid/gid/euid/egid 500/500/500/500, groups:
       [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
      cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
      lustre-MDT0000.39 24NOPEN 18:46:23.657537448 2020.01.10 0x1 t=[0x20000040b:0xc6:0x0] j=cat.500 ef=0xf u=500:500 nid=10.9.3.149@tcp m=r--
      Got NID '10.9.3.149@tcp'
      lustre-MDT0000: clear the changelog for cl5 of all records
      running as uid/gid/euid/egid 500/500/500/500, groups:
       [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
      cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
      lustre-MDT0000: clear the changelog for cl5 of all records
      running as uid/gid/euid/egid 500/500/500/500, groups:
       [cat] [/mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm]
      cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied
       sanity-hsm test_604: @@@@@@ FAIL: No matching NOPEN entry 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:5900:error()
        = /usr/lib64/lustre/tests/sanity-hsm.sh:5571:test_604()
      

      The ‘Permission denied’ errors from cat also appear in runs where the test passes. In those passing runs, however, we do see another NOPEN entry in the changelog after the third cat of the file; in this failing run, no such entry appears.
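
      To make the comparison with passing runs concrete, here is a minimal Python sketch that counts NOPEN changelog records after each denied cat in a suite_log excerpt. The record pattern is an assumption modelled on the single NOPEN line in the log above, not on a format specification, and the names parse_record, nopen_counts_per_attempt and SAMPLE_LOG are hypothetical helpers, not part of sanity-hsm.sh.

```python
import re

# Assumed layout, inferred from the one NOPEN record in the log above:
# <mdt>.<index> <type number><TYPE NAME> <time> <date> <flags> key=value ...
RECORD_RE = re.compile(
    r"^(?P<mdt>\S+)\.(?P<index>\d+)\s+"
    r"(?P<type_num>\d+)(?P<type_name>[A-Z_]+)\s+"
    r"(?P<time>\S+)\s+(?P<date>\S+)\s+(?P<flags>0x[0-9a-fA-F]+)\s*"
    r"(?P<rest>.*)$"
)


def parse_record(line):
    """Return a dict for a changelog record line, or None for any other line."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Remaining tokens look like key=value (t=, j=, ef=, u=, nid=, m=).
    for token in record.pop("rest").split():
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


def nopen_counts_per_attempt(log_lines):
    """Count NOPEN records seen after each 'Permission denied' cat."""
    counts = []
    for line in log_lines:
        if "Permission denied" in line:
            counts.append(0)  # a denied cat starts a new attempt
            continue
        record = parse_record(line)
        if record and record["type_name"] == "NOPEN" and counts:
            counts[-1] += 1
    return counts


# Excerpt copied from the failing run above.
SAMPLE_LOG = [
    "cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied",
    "lustre-MDT0000.39 24NOPEN 18:46:23.657537448 2020.01.10 0x1 "
    "t=[0x20000040b:0xc6:0x0] j=cat.500 ef=0xf u=500:500 nid=10.9.3.149@tcp m=r--",
    "Got NID '10.9.3.149@tcp'",
    "lustre-MDT0000: clear the changelog for cl5 of all records",
    "cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied",
    "lustre-MDT0000: clear the changelog for cl5 of all records",
    "cat: /mnt/lustre2/d604.sanity-hsm/f604.sanity-hsm: Permission denied",
]

if __name__ == "__main__":
    print(nopen_counts_per_attempt(SAMPLE_LOG))
```

      On the excerpt from this failing run the last count is 0. Based on the observation above, a passing run would be expected to show a nonzero count after the third cat.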

      There is nothing of interest in the console logs.

      We’ve seen this error twice for RHEL 8 clients, starting 10 JAN 2020:
      2.12.4 https://testing.whamcloud.com/test_sets/fd46e654-349c-11ea-adca-52540065bddc
      2.14.0 (ARM client) https://testing.whamcloud.com/test_sets/27e8c5e0-35a0-11ea-b1e8-52540065bddc

      Note: PPC clients have failed this test for at least the past year with the same error message, but those failures also report “lfs changelog: cannot access changelog: Invalid argument”, and many sanity-hsm tests fail before test 604. Thus, I think the PPC client failures for this test are a different problem from the RHEL 8 failures.
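
      The distinction drawn in the note can be expressed as a simple triage rule over the suite_log text. This is only a sketch: the two marker strings are taken from this report, while the function name classify_failure and the returned labels are hypothetical.

```python
# Marker strings quoted in this report.
NOPEN_FAILURE = "No matching NOPEN entry"
CHANGELOG_ERROR = "lfs changelog: cannot access changelog: Invalid argument"


def classify_failure(suite_log):
    """Label a test_604 suite_log by the failure signature it contains."""
    if NOPEN_FAILURE not in suite_log:
        return "other"
    if CHANGELOG_ERROR in suite_log:
        # Signature described for the PPC client failures.
        return "changelog-unreadable"
    # Signature of the failures reported here: changelog readable, entry missing.
    return "missing-nopen"
```

      A log containing only the 'No matching NOPEN entry' failure is labelled "missing-nopen", whereas one that also contains the lfs changelog error is labelled "changelog-unreadable".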

              People

              • Assignee:
                wc-triage WC Triage
                Reporter:
                jamesanunez James Nunez