Lustre / LU-13232

sanity test 160j fails with 'read changelog failed'


Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0
    • PPC clients

    Description

      sanity test_160j fails with 'read changelog failed' in 100% of PPC client test runs.

      In a recent failure at https://testing.whamcloud.com/test_sets/d3720002-4a27-11ea-b69a-52540065bddc, the actual error is that cat fails while reading its input:

      Registered 1 changelog users: 'cl3'
      total: 2 create in 0.00 seconds: 1052.66 ops/second
      cat: -: Invalid argument
       sanity test_160j: @@@@@@ FAIL: read changelog failed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6121:error()
        = /usr/lib64/lustre/tests/sanity.sh:14350:test_160j()
      

      The code that fails in sanity test 160j is:

      14341         # read changelog
      14342         cat <&4 >/dev/null || error "read changelog failed"
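
The read pattern above can be tried in isolation. This is a minimal sketch, not the test itself: the excerpt does not show how descriptor 4 is opened, so a temporary file (hypothetical stand-in, `tmpfile`) is assumed in place of the changelog. It also shows why the error reads `cat: -: ...`: cat is reading its standard input, which it reports as `-`.

```shell
# Hypothetical stand-in for the changelog source; the ticket does not show
# what descriptor 4 actually points at.
tmpfile=$(mktemp)
printf 'record 1\nrecord 2\n' > "$tmpfile"

# Open the stand-in file for reading on descriptor 4.
exec 4< "$tmpfile"

# "<&4" makes descriptor 4 the standard input of cat. If the underlying
# read fails, cat prints "cat: -: <error text>" and exits non-zero, so the
# else branch (the "|| error ..." path in the test) is taken.
if cat <&4 >/dev/null; then
    read_status="ok"
else
    read_status="read changelog failed"
fi

# Close descriptor 4 and remove the stand-in file.
exec 4<&-
rm -f "$tmpfile"
echo "$read_status"
```

With a regular file the read succeeds; in the failing runs the read on descriptor 4 returns an error instead, so cat exits non-zero.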
      

      The client1 (vm12) console log shows:

      [ 5314.374481] Lustre: DEBUG MARKER: == sanity test 160j: client can be umounted while its chanangelog is being used ===================== 01:24:59 (1581125099)
      [ 5314.494530] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
      [ 5314.506580] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-10vm12@tcp:/lustre /mnt/lustre2
      [ 5314.555637] Lustre: Mounted lustre-client
      [ 5315.555507] Lustre: 10940:0:(llog_cat.c:808:llog_cat_process_common()) lustre-MDT0000-mdc-c0000000b5687800: invalid record in catalog [0x5:0x0:0xa]:0: rc = -22
      [ 5315.555690] LustreError: 10940:0:(mdc_changelog.c:295:chlg_load()) lustre-MDT0000-mdc-c0000000b5687800: fail to process llog: rc = -22
      [ 5315.600825] Lustre: Unmounted lustre-client
      [ 5315.777197] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_160j: @@@@@@ FAIL: read changelog failed 
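
The return codes in the console log can be pulled out with a short filter. The sketch below assumes the log lines follow the `(file.c:line:function()) ... rc = N` layout seen in the two messages above; the variable names are illustrative. On Linux, errno 22 is EINVAL, whose message text is "Invalid argument", which matches the error cat reported.

```shell
# Console log excerpt copied from this report.
log='[ 5315.555507] Lustre: 10940:0:(llog_cat.c:808:llog_cat_process_common()) lustre-MDT0000-mdc-c0000000b5687800: invalid record in catalog [0x5:0x0:0xa]:0: rc = -22
[ 5315.555690] LustreError: 10940:0:(mdc_changelog.c:295:chlg_load()) lustre-MDT0000-mdc-c0000000b5687800: fail to process llog: rc = -22'

# For each line ending in "rc = <n>", print "file.c:line function rc".
rc_summary=$(printf '%s\n' "$log" |
    sed -n 's/.*(\([a-z_]*\.c\):\([0-9]*\):\([a-z_]*\)()).*rc = \(-\{0,1\}[0-9]*\)$/\1:\2 \3 \4/p')
printf '%s\n' "$rc_summary"
```

Both messages carry rc = -22: first from llog_cat_process_common() and then from chlg_load().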
      

      sanity test 160j has failed on PPC clients ever since it landed on 27 September 2019.

      Logs for more PPC client sanity test 160j failures are at:
      https://testing.whamcloud.com/test_sets/717d4832-1dba-11ea-80b4-52540065bddc
      https://testing.whamcloud.com/test_sets/5e7bd63a-f7af-11e9-b62b-52540065bddc

          People

            jamesanunez James Nunez (Inactive)