
conf-sanity test_53b: FAIL: Assertion 25 failed: (($tstarted >= $tmin2)) (expanded: ((7 >= 8)))


    Description

      conf-sanity test_53b failed as follows under a DNE configuration:

      CMD: shadow-21vm8 /usr/sbin/lctl set_param mds.MDS.mdt.threads_min=8
      mds.MDS.mdt.threads_min=8
      CMD: shadow-21vm8 /usr/sbin/lctl set_param mds.MDS.mdt.threads_max=142
      mds.MDS.mdt.threads_max=142
      CMD: shadow-21vm8 lctl get_param -n mds.MDS.mdt.threads_min
      CMD: shadow-21vm8 lctl get_param -n mds.MDS.mdt.threads_max
      checking (($tmin2 == ($tmin + $nthrs))) (((8 == (6 + 2))))...
      checking (($tmax2 == ($tmax - $nthrs))) (((142 == (144 - 2))))...
      CMD: shadow-21vm8 lctl get_param -n mds.MDS.mdt.threads_started
      checking (($tstarted >= $tmin2)) (((7 >= 8)))...
       conf-sanity test_53b: @@@@@@ FAIL: Assertion 25 failed: (($tstarted >= $tmin2)) (expanded: ((7 >= 8)))
      

      Maloo report: https://testing.hpdd.intel.com/test_sets/2a054c68-f25c-11e4-9f61-5254006e85c2

          Activity

            adilger Andreas Dilger added a comment - Have landed patch to revert this.

            adilger Andreas Dilger added a comment - I think the problem is that the test increases threads_min but doesn't necessarily do anything to trigger the threads to start. The test probably needs to do something like "touch" before sleeping, to ensure the service thread is triggered and checks the ptlrpc_threads_enough() condition. The service thread probably handles some RPCs naturally, via ping, a DLM lock callback, or similar, some of the time but not consistently, which is why the test fails intermittently.
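
            A minimal sketch of the idea in the comment above, not the fix that was actually applied to conf-sanity.sh: generate some activity before each check so the service has a request to handle, then poll threads_started instead of reading it once. The function name wait_threads_started, its arguments, and the retry count are hypothetical; the caller supplies the command that reads the counter and the command that triggers an RPC.

```shell
# Hypothetical helper, not part of the Lustre test framework.
# $1: command that prints the current threads_started value
# $2: command that generates a request for the service (e.g. a touch)
# $3: required minimum number of started threads
# $4: maximum number of attempts (assumed default: 10)
wait_threads_started() {
	get_started=$1
	trigger=$2
	tmin=$3
	max_tries=${4:-10}
	tstarted=0
	i=0

	while [ "$i" -lt "$max_tries" ]; do
		# Give the service thread something to process so that it
		# re-evaluates whether enough threads are running.
		eval "$trigger" || return 2
		tstarted=$(eval "$get_started") || return 2
		if [ "$tstarted" -ge "$tmin" ]; then
			return 0
		fi
		sleep 1
		i=$((i + 1))
	done

	echo "threads_started=$tstarted < threads_min=$tmin" >&2
	return 1
}

# Possible use on the MDS node; the file path is an assumption:
#   wait_threads_started \
#       "lctl get_param -n mds.MDS.mdt.threads_started" \
#       "touch /mnt/lustre/f53b" 8
```

            Polling with a bounded retry count keeps the assertion deterministic when thread startup lags behind the parameter change, while still failing if the threads never start.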

            simmonsja James A Simmons added a comment - Ugh, conf-sanity 53X failing again. Such touchy code.

            adilger Andreas Dilger added a comment - Patch to revert the change: http://review.whamcloud.com/14682

            adilger Andreas Dilger added a comment - Looks like the patch http://review.whamcloud.com/13823 is causing test failures. I will submit a reversion patch.

            adilger Andreas Dilger added a comment - Hi Yu Jian, please don't use URL shortening services like tinyurl.com. That link may not stick around forever, and it isn't possible to see what it is actually linking to without following the link. Please just use the full URL.
            yujian Jian Yu added a comment (edited) - The failure has been occurring consistently on the master branch since 2015-05-02: https://testing.hpdd.intel.com/sub_tests/query?utf8=%E2%9C%93&test_set[test_set_script_id]=7f66aa20-3db2-11e0-80c0-52540025f9af&sub_test[sub_test_script_id]=286c0182-40a7-11e0-8bad-52540025f9af&sub_test[status]=FAIL&sub_test[query_bugs]=&test_session[test_host]=&test_session[test_group]=&test_session[user_id]=&test_session[query_date]=&test_session[query_recent_period]=&test_node[os_type_id]=&test_node[distribution_type_id]=&test_node[architecture_type_id]=&test_node[file_system_type_id]=&test_node[lustre_branch_id]=24a6947e-04a9-11e1-bb5f-52540025f9af&test_node_network[network_type_id]=&commit=Update+results

            People

              wc-triage WC Triage
              yujian Jian Yu