LU-19732

sanity-lfsck test_18a: FAIL: (6.1) Expect 3 fixed on mds1, but got: 248

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • Lustre 2.18.0
    • Lustre 2.18.0
    • None
    • RHEL 10.0 client and server
    • 3

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/fe52c8a0-6c56-4985-88f3-7cca6609b99e

      test_18a failed with the following error:

      == sanity-lfsck test 18a: Find out orphan OST-object and repair it (1) ========================================================== 22:28:12 (1766183292)
      #####
      The target MDT-object is there, but related stripe information
      is lost or partly lost. The LFSCK should regenerate the missing
      layout EA entries.
      #####
      2+0 records in
      2+0 records out
      2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.0126913 s, 165 MB/s
      [0x200000404:0x73:0x0]
      /mnt/lustre/d18a.sanity-lfsck/a1/f1
      lmm_stripe_count:  1
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 0
      	obdidx		 objid		 objid		 group
      	     0	           108	         0x6c	   0x300000401
      
      2+0 records in
      2+0 records out
      2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.013098 s, 160 MB/s
      [0x240000404:0x2:0x0]
      /mnt/lustre/d18a.sanity-lfsck/a2/f2
      lmm_stripe_count:  2
      lmm_stripe_size:   1048576
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 1
      	obdidx		 objid		 objid		 group
      	     1	             2	          0x2	   0x340000401
      	     0	             2	          0x2	   0x300000403
      
      2+0 records in
      2+0 records out
      2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.0167845 s, 125 MB/s
      [0x200000404:0x75:0x0]
      /mnt/lustre/d18a.sanity-lfsck/f3
        lcm_layout_gen:    3
        lcm_mirror_count:  1
        lcm_entry_count:   2
          lcme_id:             1
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 0
          lcme_extent.e_end:   1048576
            lmm_stripe_count:  1
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 0
            lmm_objects:
            -   0: { l_ost_idx:   0, l_fid: [0x300000401:0x6d:0x0] }
      
          lcme_id:             2
          lcme_mirror_id:      0
          lcme_flags:          init
          lcme_extent.e_start: 1048576
          lcme_extent.e_end:   EOF
            lmm_stripe_count:  1
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 1
            lmm_objects:
            -   0: { l_ost_idx:   1, l_fid: [0x340000403:0x2:0x0] }
      
      Inject failure, to make the MDT-object lost its layout EA
      CMD: trevis-152vm88 /usr/sbin/lctl set_param fail_loc=0x1615
      fail_loc=0x1615
      chown: warning: '.' should be ':': '1.1'
      CMD: trevis-152vm89 /usr/sbin/lctl set_param fail_loc=0x1615
      fail_loc=0x1615
      chown: warning: '.' should be ':': '1.1'
      chown: warning: '.' should be ':': '1.1'
      CMD: trevis-152vm88 /usr/sbin/lctl set_param fail_loc=0
      fail_loc=0
      CMD: trevis-152vm89 /usr/sbin/lctl set_param fail_loc=0
      fail_loc=0
      The file size should be incorrect since layout EA is lost
      Trigger layout LFSCK on all devices to find out orphan OST-object
      CMD: trevis-152vm88 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t layout -r -o
      Started LFSCK on the device lustre-MDT0000: scrub layout
      CMD: trevis-152vm88 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0000.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-152vm89 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0001.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-152vm88 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0002.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-152vm89 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0003.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: trevis-152vm87 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0000.lfsck_layout
      CMD: trevis-152vm87 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0001.lfsck_layout
      CMD: trevis-152vm88 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout
       sanity-lfsck test_18a: @@@@@@ FAIL: (6.1) Expect 3 fixed on mds1, but got: 248 
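
For triage, the comparison that failed at step (6.1) can be approximated outside the test framework. The following is a minimal sketch, not the test suite's own code. It assumes the counter being compared is the `repaired_orphan` field of the `lfsck_layout` output (the field name does not appear in this log), and the helper names `lfsck_field` and `check_repaired` are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and "repaired_orphan" is an
# assumed field name that this log does not show.

# Print the value of one "name: value" field read from stdin.
lfsck_field() {
    awk -v key="$1" '$1 == (key ":") { print $2; exit }'
}

# Read lfsck_layout-style text on stdin and compare the repaired count
# with the expected value passed as the first argument.
check_repaired() {
    expected=$1
    actual=$(lfsck_field repaired_orphan)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual fixed"
    else
        echo "FAIL: expect $expected fixed, but got: $actual"
        return 1
    fi
}
```

On the MDS node this could be fed from the same parameter the test reads, for example `lctl get_param -n mdd.lustre-MDT0000.lfsck_layout | check_repaired 3`. Reading from stdin keeps the helpers usable on a saved copy of the output as well.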
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/119916 - 6.12.0-55.43.1.el10_0.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/119916 - 6.12.0-55.43.1_lustre.el10.x86_64

      <<Please provide additional information about the failure here>>

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-lfsck test_18a - (6.1) Expect 3 fixed on mds1, but got: 248

People

    • Assignee: WC Triage (wc-triage)
    • Reporter: Maloo (maloo)