Lustre / LU-18091

sanity-lfsck test_18h: FAIL: (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.16.0
    • Labels: None
    • Environment: Ubuntu 24.04 client
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/76b849f1-717c-4798-a705-f16312ffa149

      test_18h failed with the following error:

      == sanity-lfsck test 18h: LFSCK can repair crashed PFL extent range ========================================================== 06:35:47 (1721975747)
      #####
      The PFL extent crashed. During the first cycle LFSCK scanning,
      the layout LFSCK will keep the bad PFL file(s) there without
      scanning its OST-object(s). Then in the second stage scanning,
      the OST will return related OST-object(s) to the MDT as orphan.
      And then the LFSCK on the MDT can rebuild the PFL extent with
      the 'orphan(s)' stripe information.
      #####
      0+1 records in
      0+1 records out
      295280 bytes (295 kB, 288 KiB) copied, 0.00202335 s, 146 MB/s
      cp: error copying '/mnt/lustre/d18h.sanity-lfsck/f0' to '/mnt/lustre/d18h.sanity-lfsck/guard': No data available
      Inject failure stub to simulate bad PFL extent range
      CMD: onyx-35vm6 /usr/sbin/lctl set_param fail_loc=0x162f
      fail_loc=0x162f
      chown: warning: '.' should be ':': '1.1'
      CMD: onyx-35vm6 /usr/sbin/lctl set_param fail_loc=0
      fail_loc=0
      dd: error writing '/mnt/lustre/d18h.sanity-lfsck/f0': No data available
      1+0 records in
      0+0 records out
      0 bytes copied, 0.00104513 s, 0.0 kB/s
      Trigger layout LFSCK to find out the bad lmm_oi and fix them
      CMD: onyx-35vm6 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t layout -r -o
      Started LFSCK on the device lustre-MDT0000: scrub layout
      CMD: onyx-35vm6 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout |
          awk '/^status/ { print \$2 }'
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0000.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0001.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0002.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0003.lfsck_layout
      CMD: onyx-35vm6 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout
      Data in /mnt/lustre/d18h.sanity-lfsck/f0 should not be broken
      Binary files /mnt/lustre/d18h.sanity-lfsck/f0 and /mnt/lustre/d18h.sanity-lfsck/guard differ
       sanity-lfsck test_18h: @@@@@@ FAIL: (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken
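
The two checks visible in the log above (reading the `status` field of `lfsck_layout`, and comparing `f0` against its `guard` copy) can be sketched as small shell helpers. This is a minimal sketch, not the code used by sanity-lfsck: the function names `lfsck_status` and `files_match` are hypothetical, and only the `awk` pattern and the `lctl get_param` invocation are taken from the log. Since `cp` already reported "No data available" while creating `guard`, a helper that refuses to treat a missing copy as a match may help separate a failed copy from a failed repair.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of the Lustre test framework.

# Print the value of the "status:" line from lfsck_layout output read on stdin.
# Uses the same awk pattern as the log (there "$2" appears as "\$2" because the
# command was escaped for remote execution).
lfsck_status() {
    awk '/^status/ { print $2 }'
}

# Return 0 only if both files exist and are byte-identical.
# A missing file counts as a mismatch rather than an error.
files_match() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# Example use on a Lustre MDS/client (requires lctl and a mounted filesystem):
#   lctl get_param -n mdd.lustre-MDT0000.lfsck_layout | lfsck_status
#   files_match /mnt/lustre/d18h.sanity-lfsck/f0 /mnt/lustre/d18h.sanity-lfsck/guard
```

Keeping the status parser as a stdin filter means it can be exercised against saved `lfsck_layout` output without a running MDT.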
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4558 - 6.8.0-35-generic
      servers: https://build.whamcloud.com/job/lustre-master/4558 - 5.14.0-427.24.1_lustre.el9.x86_64

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-lfsck test_18h - (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken

      People

        Assignee: Hongchao Zhang (hongchao.zhang)
        Reporter: Maloo (maloo)