[LU-2003] conf-sanity 21d @@@@@@ FAIL: import is not in FULL state

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.3.0, Lustre 2.4.0
    • Environment: MDS and MGS are the same server, but two different disks are used for the MDT and MGT.
    • Severity: 3
    • 10090

    Description

      Test conf-sanity 21d failed with

      conf-sanity test_21d: @@@@@@ FAIL: import is not in FULL state
      Trace dump:
      = ./../tests/test-framework.sh:3709:error_noexit()
      = ./../tests/test-framework.sh:3731:error()
      = ./../tests/test-framework.sh:4764:wait_osc_import_state()
      = ./conf-sanity.sh:721:test_21d()
      = ./../tests/test-framework.sh:3967:run_one()
      = ./../tests/test-framework.sh:3997:run_one_logged()
      = ./../tests/test-framework.sh:3819:run_test()
      = ./conf-sanity.sh:731:main()
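
      The check that fails here is wait_osc_import_state() in test-framework.sh, which waits for the MDT's OSC import to an OST to reach the FULL connection state. As a rough sketch only (this is not the actual test-framework code; the parameter path and timings are illustrative), the wait amounts to polling the import state until it reports FULL or a deadline expires:

      # Simplified sketch of the poll performed by wait_osc_import_state();
      # parameter path and timeout are illustrative, not the real implementation.
      wait_import_full() {
          local param=$1        # e.g. osc.lustre-OST0001-osc-MDT0000.ost_server_uuid
          local timeout=${2:-140}
          local elapsed=0
          while [ "$elapsed" -lt "$timeout" ]; do
              # The ost_server_uuid parameter reports the server UUID and the
              # current import state (FULL, DISCONN, ...).
              if lctl get_param -n "$param" 2>/dev/null | grep -qw FULL; then
                  return 0
              fi
              sleep 1
              elapsed=$((elapsed + 1))
          done
          echo "import $param is not in FULL state after ${timeout}s" >&2
          return 1
      }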

      Attachments

        Activity


          simmonsja James A Simmons added a comment -

          Finished testing 2.5 and the problem is gone. This ticket can be closed.

          simmonsja James A Simmons added a comment -

          Just confirmed that on master this now passes. I need to now test b2_5.

          yujian Jian Yu added a comment -

          Hi James,

          I just ran conf-sanity test 21 ten times on the latest Lustre b2_5 and master builds separately. All of the test runs passed. Here are the reports:
          https://testing.hpdd.intel.com/test_sessions/90dfb9a2-9562-11e4-8cd1-5254006e85c2 (master)
          https://testing.hpdd.intel.com/test_sessions/e383a284-9578-11e4-8cd1-5254006e85c2 (b2_5)

          MGS and MDS are the same node, but MGT and MDT use different disk partitions.
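
          For reference, a sketch of how a single subtest can be repeated this way; the tests directory path is illustrative and the exact invocation depends on the local setup, but ONLY= is the usual test-framework filter for selecting subtests:

          # Run only conf-sanity test 21d ten times, stopping at the first failure.
          cd /usr/lib64/lustre/tests    # illustrative install path
          for i in $(seq 1 10); do
              ONLY=21d sh conf-sanity.sh || break
          done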

          simmonsja James A Simmons added a comment -

          Let me test this for b2_5 and master first.

          jfc John Fuchs-Chesney (Inactive) added a comment -

          Hello James,

          Would you like us to continue to keep this ticket open?

          Thanks,
          ~ jfc.

          simmonsja James A Simmons added a comment -

          Actually, I haven't gotten around to in-depth testing of 2.6 in the last few months, given the 2.5 testing I have been doing. Please keep this open, since I do know that on 2.4 it was failing consistently for me.

          jamesanunez James Nunez (Inactive) added a comment -

          James (Simmons),

          Have you seen this error recently in your testing? I've tried over the past two days to hit this error on the latest master, running both the full conf-sanity suite and conf-sanity test 21d alone, and can't trigger it. As you can see, Jian Yu can't trigger the problem either.

          If you haven't seen this error, please let me know if you are comfortable closing this ticket.

          Thanks,
          James (Nunez)

          yujian Jian Yu added a comment -

          Still cannot reproduce the failure on the master branch (build #2052) after running conf-sanity from test 0 to 22 more than 10 times. I also failed to reproduce the failure on Lustre 2.4.3.

          yujian Jian Yu added a comment -

          I'm sorry for the late reply. I've manually run conf-sanity test 21d alone with separate MGS and MDT on the latest master branch more than 10 times but did not reproduce the failure. I'll do more experiments.

          jamesanunez James Nunez (Inactive) added a comment -

          After several runs of conf-sanity and running conf-sanity test 21d alone, I'm able to reproduce this error. I have a separate but co-located MGS and MDS, as in the original setup.

          Results for this failure are at https://maloo.whamcloud.com/test_sessions/cc509aae-6806-11e3-a01f-52540035b04c

          jamesanunez James Nunez (Inactive) added a comment -

          From dmesg on the OSS:

          [ 1207.629766] Lustre: DEBUG MARKER: == conf-sanity test 21d: start mgs then ost and then mds == 10:53:04 (1348239184)
          [ 1467.507155] Lustre: DEBUG MARKER: rpc : @@@@@@ FAIL: can't put import for osc.lustre-OST0001-osc-MDT0000.ost_server_uuid into FULL state after 140 sec, have DISCONN
          [ 1469.733721] Lustre: DEBUG MARKER: conf-sanity test_21d: @@@@@@ FAIL: import is not in FULL state
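
          With the import stuck in DISCONN like this, the MDT-side OSC import can also be inspected directly on the MDS. A hedged example, reusing the device name from the log above (the exact name depends on the filesystem name and target indexes):

          # On the MDS: server UUID plus current import state for the OST0001 OSC device.
          lctl get_param osc.lustre-OST0001-osc-MDT0000.ost_server_uuid
          # The import file gives more detail about the connection state and target NIDs.
          lctl get_param osc.lustre-OST0001-osc-MDT0000.import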

          People

            Assignee: yujian Jian Yu
            Reporter: simmonsja James A Simmons
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved: