Lustre / LU-11111

crash doing LFSCK: orph_index_insert()) ASSERTION( !(obj->mod_flags & ORPHAN_OBJ)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.1
    • Affects Version/s: Lustre 2.10.4
    • Environment: x86_64, zfs 0.7.9, centos7.4, 2.10.4 plus the "lfsck: load object attr when prepare LFSCK request" patch from LU-10988
    • Severity: 3

    Description

      Hi,

      I hit this just now when doing an lfsck for LU-10677.

      lctl lfsck_start -M dagg-MDT0000 -t namespace -A -r -C
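      For reference, the flags used here (per lctl lfsck_start usage; verify with "lctl lfsck_start --help" on your build):
      -M dagg-MDT0000: the MDT device to start on
      -t namespace: run the namespace LFSCK
      -A: start LFSCK on all targets
      -r: reset scanning to the start
      -C: create missing MDT-objects for dangling name entries (the "create_mdtobj" seen in the state dump below)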

      2018-07-02 19:00:24 [699080.215320] Lustre: debug daemon will attempt to start writing to /mnt/root/root/lfsck.debug_daemon.dagg-MDTs.20180702.1 (20480000kB max)
      2018-07-02 19:51:27 [702136.419050] LustreError: 199045:0:(mdd_orphans.c:205:orph_index_insert()) ASSERTION( !(obj->mod_flags & ORPHAN_OBJ) ) failed: 
      2018-07-02 19:51:27 [702136.432255] LustreError: 199045:0:(mdd_orphans.c:205:orph_index_insert()) LBUG
      2018-07-02 19:51:27 [702136.440349] Pid: 199045, comm: mdt01_013
      2018-07-02 19:51:27 [702136.445132] 
      2018-07-02 19:51:27 [702136.445132] Call Trace:
      2018-07-02 19:51:27 [702136.450734]  [<ffffffffc05267ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      2018-07-02 19:51:27 [702136.458083]  [<ffffffffc052683c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      2018-07-02 19:51:27 [702136.465078]  [<ffffffffc11e3827>] __mdd_orphan_add+0x8e7/0xa20 [mdd]
      2018-07-02 19:51:27 [702136.472242]  [<ffffffffc09f720c>] ? osd_write_locked+0x3c/0x60 [osd_zfs]
      2018-07-02 19:51:27 [702136.479747]  [<ffffffffc11d33ce>] mdd_finish_unlink+0x17e/0x410 [mdd]
      2018-07-02 19:51:27 [702136.486988]  [<ffffffffc11d56e4>] mdd_unlink+0xae4/0xbf0 [mdd]
      2018-07-02 19:51:27 [702136.493616]  [<ffffffffc10bc728>] mdt_reint_unlink+0xc28/0x11d0 [mdt]
      2018-07-02 19:51:27 [702136.500846]  [<ffffffffc10bfb33>] mdt_reint_rec+0x83/0x210 [mdt]
      2018-07-02 19:51:27 [702136.507628]  [<ffffffffc10a137b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      2018-07-02 19:51:27 [702136.514928]  [<ffffffffc10acf07>] mdt_reint+0x67/0x140 [mdt]
      2018-07-02 19:51:28 [702136.521402]  [<ffffffffc0e752ba>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      2018-07-02 19:51:28 [702136.529074]  [<ffffffffc0e1de2b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      2018-07-02 19:51:28 [702136.537493]  [<ffffffffc0e1a458>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      2018-07-02 19:51:28 [702136.545015]  [<ffffffff810c7c82>] ? default_wake_function+0x12/0x20
      2018-07-02 19:51:28 [702136.552008]  [<ffffffff810bdc4b>] ? __wake_up_common+0x5b/0x90
      2018-07-02 19:51:28 [702136.558588]  [<ffffffffc0e21572>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      2018-07-02 19:51:28 [702136.565605]  [<ffffffffc0e20ae0>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
      2018-07-02 19:51:28 [702136.572580]  [<ffffffff810b4031>] kthread+0xd1/0xe0
      2018-07-02 19:51:28 [702136.578163]  [<ffffffff810b3f60>] ? kthread+0x0/0xe0
      2018-07-02 19:51:28 [702136.583819]  [<ffffffff816c055d>] ret_from_fork+0x5d/0xb0
      2018-07-02 19:51:28 [702136.589900]  [<ffffffff810b3f60>] ? kthread+0x0/0xe0
      2018-07-02 19:51:28 [702136.595542] 
      2018-07-02 19:51:28 [702136.597694] Kernel panic - not syncing: LBUG
      2018-07-02 19:51:28 [702136.602600] CPU: 9 PID: 199045 Comm: mdt01_013 Tainted: P           OE  ------------   3.10.0-693.21.1.el7.x86_64 #1
      2018-07-02 19:51:28 [702136.613744] Hardware name: Dell Inc. PowerEdge R740/0JM3W2, BIOS 1.3.7 02/08/2018
      2018-07-02 19:51:28 [702136.621857] Call Trace:
      2018-07-02 19:51:28 [702136.624946]  [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
      2018-07-02 19:51:28 [702136.630704]  [<ffffffff816a8634>] panic+0xe8/0x21f
      2018-07-02 19:51:28 [702136.636117]  [<ffffffffc0526854>] lbug_with_loc+0x64/0xb0 [libcfs]
      2018-07-02 19:51:28 [702136.642927]  [<ffffffffc11e3827>] __mdd_orphan_add+0x8e7/0xa20 [mdd]
      2018-07-02 19:51:28 [702136.649895]  [<ffffffffc09f720c>] ? osd_write_locked+0x3c/0x60 [osd_zfs]
      2018-07-02 19:51:28 [702136.657201]  [<ffffffffc11d33ce>] mdd_finish_unlink+0x17e/0x410 [mdd]
      2018-07-02 19:51:28 [702136.664232]  [<ffffffffc11d56e4>] mdd_unlink+0xae4/0xbf0 [mdd]
      2018-07-02 19:51:28 [702136.670661]  [<ffffffffc10bc728>] mdt_reint_unlink+0xc28/0x11d0 [mdt]
      2018-07-02 19:51:28 [702136.677681]  [<ffffffffc10bfb33>] mdt_reint_rec+0x83/0x210 [mdt]
      2018-07-02 19:51:28 [702136.684247]  [<ffffffffc10a137b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      2018-07-02 19:51:28 [702136.691331]  [<ffffffffc10acf07>] mdt_reint+0x67/0x140 [mdt]
      2018-07-02 19:51:28 [702136.697583]  [<ffffffffc0e752ba>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      2018-07-02 19:51:28 [702136.705025]  [<ffffffffc0e1de2b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      2018-07-02 19:51:28 [702136.713235]  [<ffffffffc0e1a458>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      2018-07-02 19:51:28 [702136.720538]  [<ffffffff810c7c82>] ? default_wake_function+0x12/0x20
      2018-07-02 19:51:28 [702136.727314]  [<ffffffff810bdc4b>] ? __wake_up_common+0x5b/0x90
      2018-07-02 19:51:28 [702136.733666]  [<ffffffffc0e21572>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      2018-07-02 19:51:28 [702136.740438]  [<ffffffffc0e20ae0>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
      2018-07-02 19:51:28 [702136.748310]  [<ffffffff810b4031>] kthread+0xd1/0xe0
      2018-07-02 19:51:28 [702136.753661]  [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
      2018-07-02 19:51:28 [702136.760227]  [<ffffffff816c055d>] ret_from_fork+0x5d/0xb0
      2018-07-02 19:51:28 [702136.766108]  [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
      2018-07-02 19:51:28 [702136.772696] Kernel Offset: disabled
      

      the debug_daemon was running at the time, but the file is 6.1G and probably truncated because of the crash. let me know if you want it anyway.
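      The debug daemon in the console log above would have been started with something like the following (a sketch only; the size argument to lctl debug_daemon is in MB, and the 20000 value here is merely inferred from the "20480000kB max" message):

      lctl debug_daemon start /mnt/root/root/lfsck.debug_daemon.dagg-MDTs.20180702.1 20000
      lctl debug_daemon stop    # stop and flush the daemon before collecting the file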

      I've mounted the MDTs back up again with -o skip_lfsck.
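      A sketch of that mount for a ZFS-backed MDT (the pool/dataset and mount point names here are made up):

      mount -t lustre -o skip_lfsck dagg-mdt0/mdt0 /mnt/dagg-MDT0000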

      I think the crash happened somewhere in phase2 of the namespace lfsck. the lfsck state after the reboot is below.

      [warble2]root: lctl get_param -n mdd.dagg-MDT000*.lfsck_namespace
      name: lfsck_namespace
      magic: 0xa0621a0b
      version: 2
      status: crashed
      flags: scanned-once,inconsistent
      param: all_targets,create_mdtobj
      last_completed_time: 1528222041
      time_since_last_completed: 2305149 seconds
      latest_start_time: 1530521735
      time_since_latest_start: 5455 seconds
      last_checkpoint_time: 1530525083
      time_since_last_checkpoint: 2107 seconds
      latest_start_position: 266, N/A, N/A
      last_checkpoint_position: 35184372088832, N/A, N/A
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 14945199
      checked_phase2: 4683016
      updated_phase1: 0
      updated_phase2: 1
      failed_phase1: 0
      failed_phase2: 0
      directories: 2205524
      dirent_repaired: 0
      linkea_repaired: 0
      nlinks_repaired: 0
      multiple_linked_checked: 324817
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 1
      dangling_repaired: 92
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 1639566
      striped_shards_repaired: 1
      striped_shards_failed: 0
      striped_shards_skipped: 3
      name_hash_repaired: 0
      linkea_overflow_cleared: 0
      success_count: 9
      run_time_phase1: 2357 seconds
      run_time_phase2: 960 seconds
      average_speed_phase1: 6340 items/sec
      average_speed_phase2: 4878 objs/sec
      average_speed_total: 5917 items/sec
      real_time_speed_phase1: N/A
      real_time_speed_phase2: N/A
      current_position: N/A
      name: lfsck_namespace
      magic: 0xa0621a0b
      version: 2
      status: crashed
      flags: scanned-once,inconsistent
      param: all_targets,create_mdtobj
      last_completed_time: 1528222004
      time_since_last_completed: 2305186 seconds
      latest_start_time: 1530521735
      time_since_latest_start: 5455 seconds
      last_checkpoint_time: 1530525083
      time_since_last_checkpoint: 2107 seconds
      latest_start_position: 266, N/A, N/A
      last_checkpoint_position: 35184372088832, N/A, N/A
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 14356291
      checked_phase2: 4705808
      updated_phase1: 0
      updated_phase2: 0
      failed_phase1: 0
      failed_phase2: 0
      directories: 2185653
      dirent_repaired: 0
      linkea_repaired: 0
      nlinks_repaired: 0
      multiple_linked_checked: 323999
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 11
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 0
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 1639415
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 4
      name_hash_repaired: 0
      linkea_overflow_cleared: 0
      success_count: 10
      run_time_phase1: 2374 seconds
      run_time_phase2: 960 seconds
      average_speed_phase1: 6047 items/sec
      average_speed_phase2: 4901 objs/sec
      average_speed_total: 5717 items/sec
      real_time_speed_phase1: N/A
      real_time_speed_phase2: N/A
      current_position: N/A
      name: lfsck_namespace
      magic: 0xa0621a0b
      version: 2
      status: crashed
      flags: scanned-once,inconsistent
      param: all_targets,create_mdtobj
      last_completed_time: 1528221995
      time_since_last_completed: 2305195 seconds
      latest_start_time: 1530521735
      time_since_latest_start: 5455 seconds
      last_checkpoint_time: 1530525083
      time_since_last_checkpoint: 2107 seconds
      latest_start_position: 266, N/A, N/A
      last_checkpoint_position: 35184372088832, N/A, N/A
      first_failure_position: N/A, N/A, N/A
      checked_phase1: 14368285
      checked_phase2: 4739156
      updated_phase1: 2
      updated_phase2: 1
      failed_phase1: 0
      failed_phase2: 0
      directories: 2187046
      dirent_repaired: 0
      linkea_repaired: 1
      nlinks_repaired: 0
      multiple_linked_checked: 324797
      multiple_linked_repaired: 0
      unknown_inconsistency: 0
      unmatched_pairs_repaired: 0
      dangling_repaired: 88
      multiple_referenced_repaired: 0
      bad_file_type_repaired: 0
      lost_dirent_repaired: 1
      local_lost_found_scanned: 0
      local_lost_found_moved: 0
      local_lost_found_skipped: 0
      local_lost_found_failed: 0
      striped_dirs_scanned: 0
      striped_dirs_repaired: 0
      striped_dirs_failed: 0
      striped_dirs_disabled: 0
      striped_dirs_skipped: 0
      striped_shards_scanned: 1639666
      striped_shards_repaired: 0
      striped_shards_failed: 0
      striped_shards_skipped: 4
      name_hash_repaired: 0
      linkea_overflow_cleared: 0
      success_count: 10
      run_time_phase1: 2387 seconds
      run_time_phase2: 960 seconds
      average_speed_phase1: 6019 items/sec
      average_speed_phase2: 4936 objs/sec
      average_speed_total: 5708 items/sec
      real_time_speed_phase1: N/A
      real_time_speed_phase2: N/A
      current_position: N/A
      

      cheers,
      robin
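      Since the state above shows "status: crashed" with a saved last_checkpoint_position, a later lfsck_start without -r should resume from that checkpoint instead of rescanning; a hedged sketch with the same device name as above:

      lctl lfsck_stop -M dagg-MDT0000 -A                    # stop LFSCK on all targets
      lctl lfsck_start -M dagg-MDT0000 -t namespace -A      # no -r: continue from the last checkpoint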

Activity
            spitzcor Cory Spitz added a comment -

            Since https://review.whamcloud.com/#/c/32959/ was landed after Andreas recommended, "to land this patch as-is, with the knowledge that there is a rare corner case that is missed," should we keep this ticket open or just track the remaining work somewhere else? Either way it seems that https://review.whamcloud.com/#/c/34174/ should be abandoned now.


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32959/
            Subject: LU-11111 lfsck: skip orphan processing
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 077570483e75e0610fd45149b926097547c434b8


            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34174
            Subject: LU-11111 lfsck: skip orphan processing
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 5a7be2a5cd89b67b55daa42725efee62b8e3fb57


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32935/
            Subject: LU-11111 lfsck: print correct parent FID
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 3583e1fd998c56977adde9abe9c9a9dc9bd88d58

            laisiyao Lai Siyao added a comment - - edited

            See lustre manual: https://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#dbdoclet.deactivating_mdt_ost

            After you copy all dirs/files to MDT0, remove the old dirs/files (or you can keep them, but they will not be accessible after deactivation and will consume space on MDT0 and the OSTs), then permanently deactivate MDT1-2, after which you can stop MDT1-2.
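            A sketch of the deactivation step, assuming the conf_param syntax from the linked manual section (verify there before running; dagg is the fsname from this ticket):

            lctl conf_param dagg-MDT0001.mdc.active=0    # on the MGS: permanently deactivate MDT0001
            lctl conf_param dagg-MDT0002.mdc.active=0    # likewise for MDT0002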

            I'm working on LU-11301, will update you when I make progress.

            scadmin SC Admin added a comment -

            Hi Lai,

            LU-11301 has by far the highest priority for us.

            it hangs the entire cluster on average about once a week I guess. ie. cluster is down. also it's taken to evicting most clients recently too, which means a total reboot of clients, plus an unknown number of jobs lost.
            these incidents tend to happen in batches, so we can sometimes go 2 or 3 weeks without incident and then hit 3 crashes in a row.
            LU-11082 and LU-11301 have both had much the same symptoms.

            we have inherited striping on all dirs across all 3 MDTs. this is the only DNE config that makes sense to us. how do we know where users are going to put hundreds of millions of files across tens of thousands of dirs? we cannot possibly know this. if we set inherited dir striping then the MDTs work like OSTs, and on average we will get good balance across them. multiple MDTs had some file stat() performance benefits, and we assume performance for the rest of the ops will improve over time as DNE matures. AFAIK nobody does more than the usual amount of mkdir/rmdir, which yes, is ultra-slow.

            so if this isn't a supported config or is something that's never going to work well, then a) please make that clear in the docs, and b) there is a window of time while the filesystem isn't over half full yet, and we can take a few weeks or a month to rsync all files (probably <2 PiB) onto an unstriped top level dir on MDT0 (which we left there for this purpose), take a couple of days of downtime to tidy up, and then destroy MDT1/MDT2 and make the MDT0 pool bigger. ie. remove DNE entirely. that's the only other config that makes sense to us. will this approach work?
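            For the unstriped top-level dir on MDT0, a minimal lfs sketch (the /dagg paths are hypothetical):

            lfs mkdir -i 0 /dagg/home-mdt0              # create the new top-level dir on MDT0, not striped
            lfs getdirstripe /dagg/home-mdt0            # confirm it is a plain dir on MDT index 0
            rsync -aHAX /dagg/home/ /dagg/home-mdt0/    # then copy the data under it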

            on your points ->

            1. cool. thanks.
            2. we applied patch from LU-11062 so stack traces are back. will re-try at some stage.
            3. also being able to corrupt the filesystem with just some mkdir's
            4. LU-11301

            cheers,
            robin


            shadow Alexey Lyashkov added a comment -

            Lai,

            you are wrong. This bug caused unremovable files (ESTALE error) after MDT remount. The MDT killed the PENDING object, but the normal namespace still has the name and FID of the deleted file. That was my starting point.
            The MDT hang during recovery is unfixed for now, and it causes a situation where OSP starts to fetch attributes from an MDT/OSP target that is not accessible. dt_sync is fixed, but osp_getattr is not (I have hit it many times).

            laisiyao Lai Siyao added a comment -

            Robin, so there are four issues:
            1. MDT hung during recovery, which will be fixed by LU-10888.
            2. lfsck hung, I'd suggest you create a new ticket, and post all process backtraces.
            3. unremovable dirs/files after lfsck, LU-10677.
            4. MDT hung and won't umount, LU-10301.

            Which do you think has the top priority?

            BTW I'm curious how you use DNE. Normally the operations that involve multiple MDTs should be rare, such as create/unlink/setattr on a striped directory, because you only need to create striped directories for certain directories that may contain lots of files, and normal directories are just plain. Your comment in LU-11301 mentioned inherited dir striping across all MDTs; do you mean a default dir stripe is set, so all dirs are created as striped directories? And do you mkdir/rmdir frequently?
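            For reference, "inherited dir striping across all MDTs" as described corresponds to a default directory layout along these lines (a sketch; /dagg is a hypothetical client mount point, and the filesystem here has 3 MDTs):

            lfs setdirstripe -D -c 3 /dagg    # default layout: new subdirs are striped over 3 MDTs
            lfs getdirstripe -D /dagg         # show the inherited default layout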

            laisiyao Lai Siyao added a comment -

            Hi Robin, it'll take a couple of weeks to finish the patch for LU-10888.

            What do you mean by remove DNE? Do you mean migrate all dirs and files to MDT0 and remove the other MDTs from the system? 2.10 doesn't support striped dir migration yet, and patches to support it will land on master (check LU-4684) later.

            scadmin SC Admin added a comment -

            we would like to lfsck our filesystem. we have a dozen or so new unremovable directories from the MDS hangs. unfortunately a bunch of users have noticed these.

            last time I ran an lfsck it hung and I had to reset the MDS, which risks causing more corruption.

            is there an ETA for a fix so that lfsck won't hang?

            alternatively, suggestions for ways that we can remove DNE from the filesystem would be appreciated.
            thanks.

            cheers,
            robin

            fan.yong Fan Yong added a comment -

            The current namespace LFSCK logic does not check the ORPHAN flag, so it may add the orphan object back to its original parent directory because of a stale, misleading linkEA. That should be fixed. The fix should be done inside the LFSCK engine, instead of the OSD, because:

            1. The LFSCK is not only for namespace consistency but also for layout consistency, and maybe more in the future. We can skip an orphan object for the namespace LFSCK, but we still need to verify layout consistency for orphan objects to avoid accessing or destroying the wrong OST-object(s).
            2. The trouble may happen with both the ldiskfs and ZFS backends.

People

  Assignee: laisiyao Lai Siyao
  Reporter: scadmin SC Admin
  Votes: 0
  Watchers: 9
