Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6109

LFSCK gets "inconsistent" flag and won't stop

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.7.0, Lustre 2.8.0
    • Lustre 2.7.0
    • OpenSFS cluster with 2 MDSs with 1MDT each, 3 OSSs with two OSTs each and three clients running Lustre (master) build # 2812
    • 3
    • 17016

    Description

      While running the performance test (3.3.2) from the LFSCK Phase 3 test plan, I’ve run into a problem. I installed Lustre on all nodes and created a new file system. I don’t know if this matters, but the file system is empty when I run this test. I then run test 3.3.2 from the test plan; set fail_loc on the MDSs, create a small number of objects and run LFSCK. For some reason, LFSCk has flag “inconsistent” on both MDSs.

      On mds01:

      # cat /proc/fs/lustre/mdd/scratch-MDT0000/lfsck_namespace 
      name: lfsck_namespace
      magic: 0xa0629d03
      version: 2
      status: scanning-phase1
      flags: inconsistent
      param: all_targets,create_ostobj,create_mdtobj
      time_since_last_completed: 1155 seconds
      time_since_latest_start: 27 seconds
      time_since_last_checkpoint: N/A
      latest_start_position: 77, N/A, N/A
      last_checkpoint_position: N/A, N/A, N/A
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 461
      checked_phase2: 0
      updated_phase1: 208
      updated_phase2: 0
      failed_phase1: 0
      failed_phase2: 0
      directories: 27
      dirent_repaired: 0
      linkea_repaired: 208
      nlinks_repaired: 0
      multiple_linked_checked: 187
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 0
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 2
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 0
      name_hash_repaired: 0
      success_count: 1
      run_time_phase1: 28 seconds
      run_time_phase2: 0 seconds
      average_speed_phase1: 16 items/sec
      average_speed_phase2: N/A
      real_time_speed_phase1: 16 items/sec
      real_time_speed_phase2: N/A
      current_position: 238551041, N/A, N/A
      

      There is nothing in dmesg except confirmation that fail_loc was set on the MDTs:

      # dmesg
      Lustre: ctl-scratch-MDT0000: super-sequence allocation rc = 0 [0x0000000380000400-0x00000003c0000400):0:mdt
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: ctl-scratch-MDT0000: super-sequence allocation rc = 0 [0x00000003c0000400-0x0000000400000400):1:mdt
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: Skipped 101 previous similar messages
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: Skipped 3 previous similar messages
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: Skipped 100 previous similar messages
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: Skipped 101 previous similar messages
      Lustre: *** cfs_fail_loc=1603, val=0***
      Lustre: Skipped 3 previous similar messages
      

      If I wait a few minutes, the following is in dmesg:

      INFO: task lfsck_namespace:469 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.gefdacb7.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      lfsck_namespa D 0000000000000006     0   469      2 0x00000080
       ffff880478e87ba0 0000000000000046 0000000000000000 ffff880a9849b7e0
       ffff880a9849b7e0 ffff880d7d69f000 ffff880478e87ba0 ffffffffa06d57e9
       ffff880478e66638 ffff880478e87fd8 000000000000fbc8 ffff880478e66638
      Call Trace:
       [<ffffffffa06d57e9>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa06d5a3d>] lu_object_find_at+0x3d/0xe0 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa06d5b1f>] lu_object_find_slice+0x1f/0x80 [obdclass]
       [<ffffffffa0f33619>] lfsck_namespace_assistant_handler_p1+0x2d9/0x1f70 [lfsck]
       [<ffffffffa057d472>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
       [<ffffffffa0f23347>] lfsck_assistant_engine+0x497/0x1e00 [lfsck]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa0f22eb0>] ? lfsck_assistant_engine+0x0/0x1e00 [lfsck]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      INFO: task lfsck_namespace:469 blocked for more than 120 seconds.
            Not tainted 2.6.32-431.29.2.el6_lustre.gefdacb7.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      lfsck_namespa D 0000000000000006     0   469      2 0x00000080
       ffff880478e87ba0 0000000000000046 0000000000000000 ffff880a9849b7e0
       ffff880a9849b7e0 ffff880d7d69f000 ffff880478e87ba0 ffffffffa06d57e9
       ffff880478e66638 ffff880478e87fd8 000000000000fbc8 ffff880478e66638
      Call Trace:
       [<ffffffffa06d57e9>] ? lu_object_find_try+0x99/0x2b0 [obdclass]
       [<ffffffffa06d5a3d>] lu_object_find_at+0x3d/0xe0 [obdclass]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa06d5b1f>] lu_object_find_slice+0x1f/0x80 [obdclass]
       [<ffffffffa0f33619>] lfsck_namespace_assistant_handler_p1+0x2d9/0x1f70 [lfsck]
       [<ffffffffa057d472>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
       [<ffffffffa0f23347>] lfsck_assistant_engine+0x497/0x1e00 [lfsck]
       [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
       [<ffffffffa0f22eb0>] ? lfsck_assistant_engine+0x0/0x1e00 [lfsck]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      On mds02, the following is the state of LFSCK on mds02:

      # cat /proc/fs/lustre/mdd/scratch-MDT0001/lfsck_namespace 
      name: lfsck_namespace
      magic: 0xa0629d03
      version: 2
      status: scanning-phase2
      flags: scanned-once,inconsistent
      param: all_targets,create_ostobj,create_mdtobj
      time_since_last_completed: 2533 seconds
      time_since_latest_start: 1405 seconds
      time_since_last_checkpoint: 1405 seconds
      latest_start_position: 77, N/A, N/A
      last_checkpoint_position: 260571137, [0x3c0000401:0x4:0x0], 0x5d2c710b01284458
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 16
      checked_phase2: 0
      updated_phase1: 0
      updated_phase2: 0
      failed_phase1: 0
      failed_phase2: 0
      directories: 6
      dirent_repaired: 0
      linkea_repaired: 4
      nlinks_repaired: 0
      multiple_linked_checked: 0
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 0
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 2
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 0
      name_hash_repaired: 0
      success_count: 1
      run_time_phase1: 0 seconds
      run_time_phase2: 1405 seconds
      average_speed_phase1: 16 items/sec
      average_speed_phase2: 0 objs/sec
      real_time_speed_phase1: N/A
      real_time_speed_phase2: 0 objs/sec
      current_position: [0x0:0x0:0x0]
      

      I cannot stop nor start LFSCK again. So, I’m not sure what state it is in.

      Trying to start LFSCK again fails:

      # lctl lfsck_start -A -M scratch-MDT0000 -c -C --type namespace 
      Fail to start LFSCK: Operation already in progress
      

      Stopping LFSCK works on mds02:

      # cat /proc/fs/lustre/mdd/scratch-MDT0001/lfsck_namespace 
      name: lfsck_namespace
      magic: 0xa0629d03
      version: 2
      status: stopped
      flags: scanned-once,inconsistent
      param: all_targets,create_ostobj,create_mdtobj
      time_since_last_completed: 2890 seconds
      time_since_latest_start: 1762 seconds
      time_since_last_checkpoint: 10 seconds
      latest_start_position: 77, N/A, N/A
      last_checkpoint_position: 260571137, [0x3c0000401:0x4:0x0], 0x5d2c710b01284458
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 16
      checked_phase2: 0
      updated_phase1: 0
      updated_phase2: 0
      failed_phase1: 0
      failed_phase2: 0
      directories: 6
      dirent_repaired: 0
      linkea_repaired: 4
      nlinks_repaired: 0
      multiple_linked_checked: 0
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 0
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 2
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 0
      name_hash_repaired: 0
      success_count: 1
      run_time_phase1: 0 seconds
      run_time_phase2: 1753 seconds
      average_speed_phase1: 16 items/sec
      average_speed_phase2: 0 objs/sec
      real_time_speed_phase1: N/A
      real_time_speed_phase2: N/A
      current_position: N/A
      

      but does not stop LFSCK on mds01

      # cat /proc/fs/lustre/mdd/scratch-MDT0000/lfsck_namespace 
      name: lfsck_namespace
      magic: 0xa0629d03
      version: 2
      status: scanning-phase1
      flags: inconsistent
      param: all_targets,create_ostobj,create_mdtobj
      time_since_last_completed: 2953 seconds
      time_since_latest_start: 1825 seconds
      time_since_last_checkpoint: N/A
      latest_start_position: 77, N/A, N/A
      last_checkpoint_position: N/A, N/A, N/A
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 461
      checked_phase2: 0
      updated_phase1: 208
      updated_phase2: 0
      failed_phase1: 0
      failed_phase2: 0
      directories: 27
      dirent_repaired: 0
      linkea_repaired: 208
      nlinks_repaired: 0
      multiple_linked_checked: 187
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 0
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 2
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 0
      name_hash_repaired: 0
      success_count: 1
      run_time_phase1: 1826 seconds
      run_time_phase2: 0 seconds
      average_speed_phase1: 0 items/sec
      average_speed_phase2: N/A
      real_time_speed_phase1: 0 items/sec
      real_time_speed_phase2: N/A
      current_position: 238551041, N/A, N/A
      

      Attachments

        1. lfsck_log_mds01_13511.txt
          402 kB
        2. lfsck_log_mds01.txt
          403 kB
        3. lfsck_log_mds02_13511.txt
          11 kB
        4. lfsck_log_mds02.txt
          10 kB
        5. lfsck_logs.tgz
          0.2 kB

        Activity

          [LU-6109] LFSCK gets "inconsistent" flag and won't stop

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13987/
          Subject: LU-6109 osd-ldiskfs: handle no fid-in-dirent correctly
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: a27201d13828d4b9bae9261dcc552c1f767aff21

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13987/ Subject: LU-6109 osd-ldiskfs: handle no fid-in-dirent correctly Project: fs/lustre-release Branch: master Current Patch Set: Commit: a27201d13828d4b9bae9261dcc552c1f767aff21

          Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/13987
          Subject: LU-6109 osd-ldiskfs: fix previous patch issue
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: defae4053b7bbe1e013af1a9ec99090d6e7abe1c

          gerrit Gerrit Updater added a comment - Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/13987 Subject: LU-6109 osd-ldiskfs: fix previous patch issue Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: defae4053b7bbe1e013af1a9ec99090d6e7abe1c
          pjones Peter Jones added a comment -

          Landed for 2.7

          pjones Peter Jones added a comment - Landed for 2.7

          Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13511/
          Subject: LU-6109 lfsck: check FID validity before locating object
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 4d408c9aed9adaf1f4e2ea87851728a1cf662594

          gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13511/ Subject: LU-6109 lfsck: check FID validity before locating object Project: fs/lustre-release Branch: master Current Patch Set: Commit: 4d408c9aed9adaf1f4e2ea87851728a1cf662594
          yong.fan nasf (Inactive) added a comment - - edited

          Thanks James. I think the original issue should have been fixed via above two patches (13392, 13511).

          yong.fan nasf (Inactive) added a comment - - edited Thanks James. I think the original issue should have been fixed via above two patches (13392, 13511).

          I've run with the latest version of the 13511 patch and cannot reproduce this error any more. I've run the test several times in a row and increased the number of objects written and LFSCK completes each time.

          jamesanunez James Nunez (Inactive) added a comment - I've run with the latest version of the 13511 patch and cannot reproduce this error any more. I've run the test several times in a row and increased the number of objects written and LFSCK completes each time.

          There is memory leak in former LFSCK run, as to the subsequent LFSCK was blocked by some stale data. The root reason for the memory leak is because the LFSCK used OSD device to handle OSP (remote MDT) inconsistency by wrong. Such issues have been fixed by LU-5791 patch: http://review.whamcloud.com/13392.

          So I enhanced the 13511 patch (http://review.whamcloud.com/#/c/13511/) against the 13392 patch.
          James, would you please to retry the latest patch? Thanks!

          BTW, I have verified with your method, but cannot reproduce the failures any longer.

          yong.fan nasf (Inactive) added a comment - There is memory leak in former LFSCK run, as to the subsequent LFSCK was blocked by some stale data. The root reason for the memory leak is because the LFSCK used OSD device to handle OSP (remote MDT) inconsistency by wrong. Such issues have been fixed by LU-5791 patch: http://review.whamcloud.com/13392 . So I enhanced the 13511 patch ( http://review.whamcloud.com/#/c/13511/ ) against the 13392 patch. James, would you please to retry the latest patch? Thanks! BTW, I have verified with your method, but cannot reproduce the failures any longer.
          jamesanunez James Nunez (Inactive) added a comment - - edited

          I ran with your latest patch and uploaded a tar gzipped file at lfsck_logs.tgz.

          The directories/files that are create during this test are, in order:
          create a top level directory called test_dir
          create another directory test_dir/sdir-0
          in sdir_0, create 87 files test_dir/sdir-0/z.N, where N=0.{0..28}, 1.{29..57}, 2.{58..86}
          create one remote directory at test_dir/sdir-0/rdir-1 (left empty)
          create 10 local directory test_dir/sdir-0/ldir-{1..10} (left empty)
          create one striped directory at test_dir/sdir-0/rdir-1 (left empty)
          create 1 hard links at test_dir/sdir-0/ln.1 to z.0.0

          Then the directory test_dir/sdir-1 is created with the same files and directories with the same names as above.

          jamesanunez James Nunez (Inactive) added a comment - - edited I ran with your latest patch and uploaded a tar gzipped file at lfsck_logs.tgz. The directories/files that are create during this test are, in order: create a top level directory called test_dir create another directory test_dir/sdir-0 in sdir_0, create 87 files test_dir/sdir-0/z.N, where N=0.{0..28}, 1.{29..57}, 2.{58..86} create one remote directory at test_dir/sdir-0/rdir-1 (left empty) create 10 local directory test_dir/sdir-0/ldir-{1..10} (left empty) create one striped directory at test_dir/sdir-0/rdir-1 (left empty) create 1 hard links at test_dir/sdir-0/ln.1 to z.0.0 Then the directory test_dir/sdir-1 is created with the same files and directories with the same names as above.

          James,

          Thanks for the verification. I found more clew and updated the patch (http://review.whamcloud.com/#/c/13511/). Would you please to retry the latest patch with -1 debug_log enabled? BTW, please tell me what files/subdirs you created under the striped directory for test? Thanks!

          yong.fan nasf (Inactive) added a comment - James, Thanks for the verification. I found more clew and updated the patch ( http://review.whamcloud.com/#/c/13511/ ). Would you please to retry the latest patch with -1 debug_log enabled? BTW, please tell me what files/subdirs you created under the striped directory for test? Thanks!

          I've attached logs from the MDSs with the build for patch 13511; lfsck_log_mds01_13511.txt and lfsck_log_mds02_13511.txt.

          The first time I ran test 3.3.2, it completed with no problems. I ran it a second time and LFSCK namespace on MDS01 is stuck in 'scanning-phase1' as described in this ticket.

          jamesanunez James Nunez (Inactive) added a comment - I've attached logs from the MDSs with the build for patch 13511; lfsck_log_mds01_13511.txt and lfsck_log_mds02_13511.txt. The first time I ran test 3.3.2, it completed with no problems. I ran it a second time and LFSCK namespace on MDS01 is stuck in 'scanning-phase1' as described in this ticket.

          People

            yong.fan nasf (Inactive)
            jamesanunez James Nunez (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: