Lustre / LU-11909

sanity-lfsck test 18a fails with '(6.1) Expect 3 fixed on mds1, but got: 0'

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.7
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      sanity-lfsck test_18a fails with '(6.1) Expect 3 fixed on mds1, but got: 0', test_18b fails with '(4.1) Expect 3 fixed on mds1, but got: 0', and test_18c fails with '(4) Expect 3 fixed on mds1, but got: 0'. We are only seeing this on b2_10, where these tests started failing on January 19, 2019. On master, we haven't seen these tests fail since at least June 2018.

      Looking at the logs for the failures at https://testing.whamcloud.com/test_sets/3ea3a8f2-2522-11e9-830a-52540065bddc , the last thing seen in the client suite_log for test 18a is:

      Inject failure, to make the MDT-object lost its layout EA
      CMD: trevis-24vm4 /usr/sbin/lctl set_param fail_loc=0x1615
      fail_loc=0x1615
      CMD: trevis-24vm4 /usr/sbin/lctl set_param fail_loc=0
      fail_loc=0
      The file size should be incorrect since layout EA is lost
      Trigger layout LFSCK on all devices to find out orphan OST-object
      CMD: trevis-24vm4 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t layout -r -o
      Started LFSCK on the device lustre-MDT0000: scrub layout
      CMD: trevis-24vm4 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0000.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-24vm4 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0000.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-24vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0000.lfsck_layout
      CMD: trevis-24vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0001.lfsck_layout
      CMD: trevis-24vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0002.lfsck_layout
      CMD: trevis-24vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0003.lfsck_layout
      CMD: trevis-24vm4 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout
       sanity-lfsck test_18a: @@@@@@ FAIL: (6.1) Expect 3 fixed on mds1, but got: 0
      

      There is no further information in the dmesg or console logs.
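
      For anyone re-checking the counters by hand, the following is a minimal sketch of a helper that pulls a single field out of lfsck_layout output. It reuses the awk idiom visible in the status polling above. The helper name lfsck_field is hypothetical, and the log does not show which counter the test compares against 3, so the field name repaired_orphan in the usage comment is an assumption to verify against real output.

```shell
#!/bin/sh
# Sketch only: extract one "key: value" field from lfsck_layout output.
# Reads the get_param output on stdin; $1 is the field name.
# Matching is exact on the first column, so "status" does not also
# match other keys that merely start with "status".
lfsck_field() {
    awk -v key="$1" '$1 == key || $1 == (key ":") { print $2; exit }'
}

# Hypothetical usage on the MDS (assumed field name, not taken from the log):
#   lctl get_param -n mdd.lustre-MDT0000.lfsck_layout | lfsck_field status
#   lctl get_param -n mdd.lustre-MDT0000.lfsck_layout | lfsck_field repaired_orphan
```

      Running this against the saved lfsck_layout output from a failing session would show directly whether the counter is 0 or whether the expected repairs were recorded under a different field.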

      More test 18* failures can be found at:
      https://testing.whamcloud.com/test_sets/b954607e-253f-11e9-b54c-52540065bddc
      https://testing.whamcloud.com/test_sets/413e02a2-252b-11e9-9e22-52540065bddc

            People

              Assignee: WC Triage (wc-triage)
              Reporter: James Nunez (jamesanunez, inactive)