Lustre / LU-12247

sanityn test 34 fails with 'test_34 returned 1'

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.13.0
    • Lustre 2.13.0, Lustre 2.12.1
    • ppc clients
    • Severity: 3

    Description

      sanityn test_34 fails with 'test_34 returned 1'. We see this test fail for PPC only.

      The suite_log for a recent failure (logs at https://testing.whamcloud.com/test_sets/7d68fa34-668f-11e9-8bb1-52540065bddc) shows that wait_osc_import_ready() does not return in time: every check of an OSC import gives up after 40 seconds.

      CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0001-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0002-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0003-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0004-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0005-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0006-osc-ffff*.ost_server_uuid in 40 secs
      CMD: trevis-55vm11 lctl set_param -n fail_loc=0 2>/dev/null || true
      CMD: trevis-77vm1.trevis.whamcloud.com lctl get_param -n at_min
      can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
      test_34 returned 1
      FAIL 34 (353s)
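
The "can't get ... in 40 secs" messages above are what a bounded polling loop prints when an import never reaches the expected state. The following is a minimal sketch of such a loop, not the test framework's wait_osc_import_ready(). The function name wait_import_state and its arguments are hypothetical, and it assumes that "lctl get_param -n" on an ost_server_uuid parameter prints the target UUID followed by the import state.

```sh
#!/bin/sh
# Hypothetical helper (not part of the Lustre test framework):
# poll an OSC import until it reports an acceptable state or the wait expires.
# Assumption: "lctl get_param -n <param>" prints "<target_uuid> <STATE>".
wait_import_state() {
    param=$1              # e.g. 'osc.lustre-OST0000-osc-ffff*.ost_server_uuid'
    expected=$2           # space-separated acceptable states, e.g. 'FULL IDLE'
    max_wait=${3:-40}     # seconds before giving up
    interval=${4:-1}      # seconds between polls (must be > 0)
    waited=0

    while :; do
        # The second field is the import state; empty if the parameter is missing.
        state=$(lctl get_param -n "$param" 2>/dev/null | awk '{print $2; exit}')

        if [ -n "$state" ]; then
            case " $expected " in
                *" $state "*)
                    echo "$param in $state state after $waited sec"
                    return 0
                    ;;
            esac
        fi

        if [ "$waited" -ge "$max_wait" ]; then
            echo "can't get $param in $max_wait secs"
            return 1
        fi

        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

For example, wait_import_state 'osc.lustre-OST0000-osc-ffff*.ost_server_uuid' 'FULL IDLE' 40 would return non-zero with a message like the ones in the log if the import stayed in any other state for the whole 40 seconds. The parameter pattern is quoted so that lctl, not the shell, expands the wildcard.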
      

      When sanityn test 34 succeeds, we should see client 2 checking that each OST is in the IDLE state. When this test fails, we don’t see any activity on the console of client 2 (vm2) after the test's start markers:

       [10987.532565] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20 \(1555993460\)
      [10987.714805] Lustre: DEBUG MARKER: == sanityn test 34: no lock timeout under IO ========================================================= 04:24:20 
      

      The OSS console log does not show any errors beyond those also seen when sanityn test 34 succeeds.

      Logs for other failures are at:
      https://testing.whamcloud.com/test_sets/07ef1416-2970-11e9-b901-52540065bddc
      https://testing.whamcloud.com/test_sets/052e8140-2555-11e9-830a-52540065bddc

            People

              Assignee: WC Triage (wc-triage)
              Reporter: James Nunez (jamesanunez) (Inactive)