
sanity test_160j: FAIL: read changelog failed

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.14.0
    • Lustre 2.14.0
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/737a1583-79d3-4fbc-a7a8-97e2c2a459e2

      test_160j failed with the following error:

      cat: -: Cannot send after transport endpoint shutdown
       sanity test_160j: @@@@@@ FAIL: read changelog failed
      

      Console log on client:

      Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-22vm4@tcp:/lustre /mnt/lustre2
      Lustre: Mounted lustre-client
      Lustre: 656:0:(llog_cat.c:834:llog_cat_process_common()) lustre-MDT0000-mdc-ffff8b40363b1000: can't find llog handle [0x51f:0x1:0x0]:0: rc = -108
      LustreError: 656:0:(mdc_changelog.c:335:chlg_load()) lustre-MDT0000-mdc-ffff8b40363b1000: fail to process llog: rc = -108
      Lustre: Unmounted lustre-client
      Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_160j: @@@@@@ FAIL: read changelog failed 
      

      More failure instances on master branch:
      https://testing.whamcloud.com/test_sets/91dbe253-024d-439c-8d6b-e025071d97a7
      https://testing.whamcloud.com/test_sets/d9e11ade-9442-4759-b61e-18635b197bea
      https://testing.whamcloud.com/test_sets/b21e9658-39e1-4e46-b916-84ba852b553c

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_160j - read changelog failed

          Activity

            [LU-13751] sanity test_160j: FAIL: read changelog failed
            pjones Peter Jones added a comment -

            Thanks Mike. James will look into making that change.


            tappro Mikhail Pershin added a comment -

            I tend to think this is a test script issue. The llog read from the client sends an RPC to the server, and ptlrpc_wait_queue() may end up with -ESHUTDOWN on umount, so that is OK, I'd say. Test 160j was originally added in the context of LU-11626 to check that there is no LBUG due to a missing obd device, so the correctness of this test is not about 'changelog must be read after umount' but about the server not hitting an LBUG during that. Therefore I think we should just treat that error as a valid case during the test.
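
            One way the test script could treat that error as a valid case is sketched below. This is only an illustration, not the change that was landed for this ticket: the helper name read_tolerating_shutdown and the variable ESHUTDOWN_MSG are hypothetical and do not exist in sanity.sh. It assumes the reader reports ESHUTDOWN with the same message text that cat printed in the failure above.

```shell
#!/bin/bash
# Hypothetical helper, not part of sanity.sh: accept a read that fails only
# because the transport endpoint was shut down (ESHUTDOWN, rc = -108 in the
# console log), and report every other failure as a real error.
ESHUTDOWN_MSG="Cannot send after transport endpoint shutdown"

# read_tolerating_shutdown CMD [ARGS...]
# Returns 0 if CMD succeeds, or if it fails and its stderr contains the
# ESHUTDOWN message. Otherwise prints the captured stderr and returns CMD's
# exit status.
read_tolerating_shutdown() {
	local err rc

	# Capture stderr only; the data read on stdout is discarded.
	err=$("$@" 2>&1 >/dev/null)
	rc=$?

	[ $rc -eq 0 ] && return 0

	case "$err" in
	*"$ESHUTDOWN_MSG"*) return 0 ;;
	esac

	echo "$err" >&2
	return $rc
}
```

            A test could then wrap its changelog read in this helper and call its failure handler only when the helper returns non-zero, so that an unmount racing with the read no longer fails the test while other read errors still do.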
            jhammond John Hammond added a comment -

            The first failure I could find with this error message was https://testing.whamcloud.com/sub_tests/57f6774c-e2bb-11e9-9874-52540065bddc

            pjones Peter Jones added a comment -

            Mike

            Could you please investigate?

            Thanks

            Peter

            yujian Jian Yu added a comment -

            The failure occurred 9 times in the past week.

            yujian Jian Yu added a comment -

            This failure occurred 8 times in the past two weeks. It is affecting patch testing on the master branch.


            People

              jamesanunez James Nunez (Inactive)
              maloo Maloo