[LU-14511] oak-MDT0000: deleted inode referenced and aborted journal

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version: Lustre 2.12.6
    • Environment: CentOS 7.9

    Description

      Hello, I think an old problem of ours has come back to haunt us, one that I originally described in LU-11578. It was caused by running lfs migrate -m on a directory with an early 2.10.x release (we then stopped, and lfs migrate -m was later disabled). Since then, we have upgraded Oak to Lustre 2.12, and just yesterday we tried to upgrade from 2.12.5 to 2.12.6 by failing over the MDT, but we cannot mount it anymore.

      It's also very likely that, since the last umount, I had deleted some files in the problematic directory /oak/stanford/groups/ruthm/sthiell/anaconda2.off as part of some cleanup.
      The MDT worked fine before unmounting.

      So at the moment, we can't mount oak-MDT0000 anymore as we hit the following problem:

      [2021-03-10T15:31:29-08:00] [29905.173864] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [2021-03-10T15:31:29-08:00] [29905.322318] LustreError: 166-1: MGC10.0.2.51@o2ib5: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
      [2021-03-10T15:31:29-08:00] [29905.336575] Lustre: Evicted from MGS (at MGC10.0.2.51@o2ib5_0) after server handle changed from 0xd5a7458f859bc1ba to 0xd5a7458fad8ca5e1
      [2021-03-10T15:31:38-08:00] [29914.525782] LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
      [2021-03-10T15:31:38-08:00] [29914.709510] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 459882777
      [2021-03-10T15:31:38-08:00] [29914.722944] Aborting journal on device dm-4-8.
      [2021-03-10T15:31:38-08:00] [29914.728143] LDISKFS-fs (dm-4): Remounting filesystem read-only
      [2021-03-10T15:31:38-08:00] [29914.734658] LustreError: 8949:0:(osd_scrub.c:1758:osd_ios_lookup_one_len()) Fail to find #459882777 in lost+found (11/0): rc = -5
      [2021-03-10T15:31:38-08:00] [29914.752101] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060535
      [2021-03-10T15:31:38-08:00] [29914.766588] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060536
      [2021-03-10T15:31:38-08:00] [29914.779958] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060537
      [2021-03-10T15:31:39-08:00] [29914.793611] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060539
      [2021-03-10T15:31:39-08:00] [29914.806965] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060540
      [2021-03-10T15:31:39-08:00] [29914.820521] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060542
      [2021-03-10T15:31:39-08:00] [29914.834339] LDISKFS-fs error (device dm-4): ldiskfs_lookup:1817: inode #11: comm mount.lustre: deleted inode referenced: 627060544
      [2021-03-10T15:31:39-08:00] [29914.978593] Lustre: 8962:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
      [2021-03-10T15:31:39-08:00] [29914.993510] LustreError: 8962:0:(genops.c:556:class_register_device()) oak-OST0133-osc-MDT0001: already exists, won't add
      [2021-03-10T15:31:39-08:00] [29915.005744] LustreError: 8962:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
      [2021-03-10T15:31:39-08:00] [29915.018654] Lustre:    cmd=cf001 0:oak-OST0133-osc-MDT0001  1:osp  2:oak-MDT0001-mdtlov_UUID  
      [2021-03-10T15:31:39-08:00] [29915.018654] 
      [2021-03-10T15:31:39-08:00] [29915.029929] LustreError: 3800:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
      [2021-03-10T15:31:39-08:00] [29915.036492] LustreError: 8949:0:(llog.c:1398:llog_backup()) MGC10.0.2.51@o2ib5: failed to open backup logfile oak-MDT0000T: rc = -30
      [2021-03-10T15:31:39-08:00] [29915.036495] LustreError: 8949:0:(mgc_request.c:1879:mgc_llog_local_copy()) MGC10.0.2.51@o2ib5: failed to copy remote log oak-MDT0000: rc = -30
      [2021-03-10T15:31:39-08:00] [29915.046574] Lustre: oak-MDT0000: Not available for connect from 10.51.3.5@o2ib3 (not set up)
      [2021-03-10T15:31:39-08:00] [29915.076247] LustreError: 8963:0:(tgt_lastrcvd.c:1133:tgt_client_del()) oak-MDT0000: failed to update server data, skip client 9925e6e6-5de6-4 zeroing, rc -30
      [2021-03-10T15:31:39-08:00] [29915.127660] LustreError: 8963:0:(obd_config.c:559:class_setup()) setup oak-MDT0000 failed (-30)
      [2021-03-10T15:31:39-08:00] [29915.137373] Lustre:    cmd=cf003 0:oak-MDT0000  1:oak-MDT0000_UUID  2:0  3:oak-MDT0000-mdtlov  4:f  
      [2021-03-10T15:31:39-08:00] [29915.137373] 
      [2021-03-10T15:31:39-08:00] [29915.149242] LustreError: 15c-8: MGC10.0.2.51@o2ib5: The configuration from log 'oak-MDT0000' failed (-30). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      [2021-03-10T15:31:39-08:00] [29915.174853] LustreError: 8949:0:(obd_mount_server.c:1397:server_start_targets()) failed to start server oak-MDT0000: -30
      [2021-03-10T15:31:39-08:00] [29915.187010] LustreError: 8949:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -30
      [2021-03-10T15:31:39-08:00] [29915.197797] LustreError: 8949:0:(obd_config.c:610:class_cleanup()) Device 316 not setup
      [2021-03-10T15:31:39-08:00] [29915.315800] Lustre: server umount oak-MDT0000 complete
      [2021-03-10T15:31:39-08:00] [29915.321537] LustreError: 8949:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-30)
      

      We have been running e2fsck -m 16 -f /dev/mapper/md1-rbod1-ssd-mdt0 for ~7 hours and Pass 1 has still not finished at this point (it is still running, though). This MDT has 857,751,831 used inodes.

      [root@oak-md1-s2 ~]# e2fsck -m 16 -y  /dev/mapper/md1-rbod1-ssd-mdt0
      e2fsck 1.45.6.wc5 (09-Feb-2021)
      oak-MDT0000 contains a file system with errors, check forced.
      Pass 1: Checking inodes, blocks, and sizes
      [Thread 0] Scan group range [0, 3552)
      [Thread 1] Scan group range [3552, 7104)
      [Thread 2] Scan group range [7104, 10656)
      [Thread 3] Scan group range [10656, 14208)
      [Thread 4] Scan group range [14208, 17760)
      [Thread 5] Scan group range [17760, 21312)
      [Thread 6] Scan group range [21312, 24864)
      [Thread 7] Scan group range [24864, 28416)
      [Thread 8] Scan group range [28416, 31968)
      [Thread 9] Scan group range [31968, 35520)
      [Thread 10] Scan group range [35520, 39072)
      [Thread 11] Scan group range [39072, 42624)
      [Thread 12] Scan group range [42624, 46176)
      [Thread 13] Scan group range [46176, 49728)
      [Thread 14] Scan group range [49728, 53280)
      [Thread 15] Scan group range [53280, 57056)
      [Thread 15] Scanned group range [53280, 57056), inodes 9005176
      [Thread 3] Inode 459888758, i_size is 13547843328140, should be 0.  [Thread 3] Fix? yes
      
      [Thread 14] Scanned group range [49728, 53280), inodes 59487037
      [Thread 12] Scanned group range [42624, 46176), inodes 53210338
      [Thread 2] Scanned group range [7104, 10656), inodes 61985386
      [Thread 1] Scanned group range [3552, 7104), inodes 61737338
      [Thread 11] Scanned group range [39072, 42624), inodes 62681960
      

      Here are the ldiskfs filesystem parameters for oak-MDT0000:

      [root@oak-md1-s2 ~]# dumpe2fs -h /dev/mapper/md1-rbod1-ssd-mdt0 
      dumpe2fs 1.45.6.wc5 (09-Feb-2021)
      Filesystem volume name:   oak-MDT0000
      Last mounted on:          /
      Filesystem UUID:          0ed1cfdd-8e25-4b6b-9cb9-7be1e89d70ad
      Filesystem magic number:  0xEF53
      Filesystem revision #:    1 (dynamic)
      Filesystem features:      has_journal ext_attr resize_inode dir_index filetype mmp flex_bg dirdata sparse_super large_file huge_file uninit_bg dir_nlink quota project
      Filesystem flags:         signed_directory_hash 
      Default mount options:    user_xattr acl
      Filesystem state:         clean with errors
      Errors behavior:          Continue
      Filesystem OS type:       Linux
      Inode count:              1869611008
      Block count:              934803456
      Reserved block count:     46740172
      Free blocks:              354807380
      Free inodes:              1011859177
      First block:              0
      Block size:               4096
      Fragment size:            4096
      Reserved GDT blocks:      787
      Blocks per group:         16384
      Fragments per group:      16384
      Inodes per group:         32768
      Inode blocks per group:   4096
      Flex block group size:    16
      Filesystem created:       Mon Feb 13 12:36:07 2017
      Last mount time:          Wed Mar 10 15:33:05 2021
      Last write time:          Wed Mar 10 17:13:04 2021
      Mount count:              17
      Maximum mount count:      -1
      Last checked:             Tue Sep 10 06:37:13 2019
      Check interval:           0 (<none>)
      Lifetime writes:          249 TB
      Reserved blocks uid:      0 (user root)
      Reserved blocks gid:      0 (group root)
      First inode:              11
      Inode size:	          512
      Required extra isize:     28
      Desired extra isize:      28
      Journal inode:            8
      Default directory hash:   half_md4
      Directory Hash Seed:      be3bd996-8da4-4d22-80e4-e7a4c8ce22a0
      Journal backup:           inode blocks
      FS Error count:           16
      First error time:         Wed Mar 10 15:31:38 2021
      First error function:     ldiskfs_lookup
      First error line #:       1817
      First error inode #:      11
      First error block #:      0
      Last error time:          Wed Mar 10 15:32:03 2021
      Last error function:      ldiskfs_lookup
      Last error line #:        1817
      Last error inode #:       11
      Last error block #:       0
      MMP block number:         13560
      MMP update interval:      5
      User quota inode:         3
      Group quota inode:        4
      Project quota inode:      325
      Journal features:         journal_incompat_revoke
      Journal size:             4096M
      Journal length:           1048576
      Journal sequence:         0x4343e81b
      Journal start:            0
      MMP_block:
          mmp_magic: 0x4d4d50
          mmp_check_interval: 5
          mmp_sequence: 0xe24d4d50
          mmp_update_date: Thu Mar 11 02:02:50 2021
          mmp_update_time: 1615456970
          mmp_node_name: oak-md1-s2
          mmp_device_name: /dev/mapper/md1-rbod1-ssd-mdt0
      

          Activity

            pjones Peter Jones added a comment -

            Good news - thanks!


            sthiell Stephane Thiell added a comment -

            OI Scrub has completed without error. Thanks for your help, Alex!

            [root@oak-md1-s2 ~]# lctl get_param -n osd-ldiskfs.oak-MDT0000.oi_scrub
            name: OI_scrub
            magic: 0x4c5fd252
            oi_files: 64
            status: completed
            flags:
            param:
            time_since_last_completed: 772 seconds
            time_since_latest_start: 36738 seconds
            time_since_last_checkpoint: 772 seconds
            latest_start_position: 12
            last_checkpoint_position: 1869611009
            first_failure_position: N/A
            checked: 857721498
            updated: 4
            failed: 0
            prior_updated: 0
            noscrub: 23216
            igif: 3
            success_count: 1
            run_time: 35966 seconds
            average_speed: 23848 objects/sec
            real-time_speed: N/A
            current_position: N/A
            lf_scanned: 0
            lf_repaired: 0
            lf_failed: 0 

            sthiell Stephane Thiell added a comment -

            Thanks Alex!

            I was able to mount MDT0000 after renaming lost+found.

            Mar 11 12:23:07 oak-md1-s2 kernel: Lustre: Lustre: Build Version: 2.12.6_1_g68bbfcf
            Mar 11 12:23:08 oak-md1-s2 kernel: LDISKFS-fs (dm-2): warning: mounting fs with errors, running e2fsck is recommended
            Mar 11 12:23:08 oak-md1-s2 kernel: LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,acl,no_mbcache,nodelalloc
            Mar 11 12:23:08 oak-md1-s2 kernel: LustreError: 137-5: oak-MDT0001_UUID: not available for connect from 10.12.4.77@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
            ...
            Mar 11 12:24:12 oak-md1-s2 kernel: LustreError: Skipped 1411 previous similar messages
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDT0000: Imperative Recovery not enabled, recovery window 300-900
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDD0000: changelog on
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDT0000: in recovery but waiting for the first client to connect
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDT0000: Will be in recovery for at least 5:00, or until 1930 clients reconnect
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDT0000: nosquash_nids set to 10.49.0.[11-12]@o2ib1 10.50.0.1@o2ib2 10.50.0.[11-12]@o2ib2 10.51.0.[1-2]@o2ib3 10.51.0.[11-18]@o2ib3 10.0.2.[1-3]@o2ib5 10.0.2.[51-58]@o2ib5 10.0.2.[66-77]@o2ib
            Mar 11 12:24:38 oak-md1-s2 kernel: Lustre: oak-MDT0000: root_squash is set to 99:99
            

            OI scrub is currently running, I'm waiting for it to complete to really confirm that we're good to go.

            [root@oak-md1-s2 ~]# lctl get_param -n osd-ldiskfs.oak-MDT0000.oi_scrub
            name: OI_scrub
            magic: 0x4c5fd252
            oi_files: 64
            status: scanning
            flags: recreated,inconsistent,auto,upgrade
            param:
            time_since_last_completed: N/A
            time_since_latest_start: 1233 seconds
            time_since_last_checkpoint: 33 seconds
            latest_start_position: 12
            last_checkpoint_position: 37267791
            first_failure_position: N/A
            checked: 25501392
            updated: 4
            failed: 0
            prior_updated: 0
            noscrub: 0
            igif: 2
            success_count: 0
            run_time: 1233 seconds
            average_speed: 20682 objects/sec
            real-time_speed: 54703 objects/sec
            current_position: 39489872
            scrub_in_prior: no
            scrub_full_speed: yes
            partial_scan: no
            lf_scanned: 0
            lf_repaired: 0
            lf_failed: 0
            
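            A simple way to keep an eye on the scrub until it reports "completed" is to poll the same parameter; this is just a convenience sketch (the 60-second interval is arbitrary):

            watch -n 60 "lctl get_param -n osd-ldiskfs.oak-MDT0000.oi_scrub | egrep 'status|checked|run_time'"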

            bzzz Alex Zhuravlev added a comment -

            what do you think?

            Sounds like a good idea. Unfortunately, I don't remember all the details of OI scrub and can't guarantee it doesn't touch that, but it should be fine to try, IMO.


            bzzz Alex Zhuravlev added a comment -

            This way the inodes mentioned by ldiskfs_lookup:1817 should become "normal" orphans, and hopefully ldiskfs will stop remounting read-only.


            sthiell Stephane Thiell added a comment -

            If there is an EIO on #459882777, I've been thinking of trying this:

            mount as ldiskfs
            mv lost+found lost+found.old
            mkdir lost+found
            umount

            mount as lustre

            what do you think?
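            As a rough shell sketch of the plan above (the mountpoints /mnt/mdt0 and /mnt/lustre/oak-MDT0000 are illustrative; the device path is the one used elsewhere in this ticket):

            mount -t ldiskfs /dev/mapper/md1-rbod1-ssd-mdt0 /mnt/mdt0
            mv /mnt/mdt0/lost+found /mnt/mdt0/lost+found.old   # move the problematic directory aside
            mkdir /mnt/mdt0/lost+found                         # recreate an empty lost+found
            umount /mnt/mdt0
            mount -t lustre /dev/mapper/md1-rbod1-ssd-mdt0 /mnt/lustre/oak-MDT0000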


            bzzz Alex Zhuravlev added a comment -

            One (relatively fast, but a bit risky) option is to modify those inodes to "undelete" them.
            If dtime is set to 0 and nlink is set to a non-zero value, then ldiskfs shouldn't complain about deleted inodes, AFAICS.
            debugfs has a command to do that interactively:

            # debugfs -w <your device>
            debugfs:  mi <11>
                                      Mode    [040700] 
                                   User ID    [0] 
                                  Group ID    [0] 
                                      Size    [12288] 
                             Creation time    [1615493043] 
                         Modification time    [1615493043] 
                               Access time    [1615493043] 
                             Deletion time    [0] 0
                                Link count    [2] 2
                          Block count high    [0] 
                               Block count    [24] 
                                File flags    [0x80000] 
                                Generation    [0x0] 
                                  File acl    [0] 
                       High 32bits of size    [0] 
                          Fragment address    [0] 
                           Direct Block #0    [127754] 
                           Direct Block #1    [4] 
                           Direct Block #2    [0] 
                           Direct Block #3    [0] 
                           Direct Block #4    [12] 
                           Direct Block #5    [4388] 
                           Direct Block #6    [0] 
                           Direct Block #7    [0] 
                           Direct Block #8    [0] 
                           Direct Block #9    [0] 
                          Direct Block #10    [0] 
                          Direct Block #11    [0] 
                            Indirect Block    [0] 
                     Double Indirect Block    [0] 
                     Triple Indirect Block    [0] 
            debugfs:  
            

            notice lines with "Link count" and "Deletion time"
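            A possible non-interactive equivalent uses debugfs's set_inode_field (sif) request instead of the interactive mi prompt; the inode number and the link count of 2 are illustrative (taken from the example above), and this should only be attempted on the unmounted device:

            debugfs -w -R "set_inode_field <459882777> dtime 0" /dev/mapper/md1-rbod1-ssd-mdt0
            debugfs -w -R "set_inode_field <459882777> links_count 2" /dev/mapper/md1-rbod1-ssd-mdt0
            debugfs -R "stat <459882777>" /dev/mapper/md1-rbod1-ssd-mdt0   # verify dtime/links_count afterwards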


            sthiell Stephane Thiell added a comment -

            Lustre 2.12.6 + only 1 patch (LU-13609 mgs: fix config_log buffer handling)


            bzzz Alex Zhuravlev added a comment -

            What exact version of Lustre are you using now?


            sthiell Stephane Thiell added a comment -

            Current e2fsck status is as follows:

            [root@oak-md1-s2 ~]# e2fsck -m 16 -y  /dev/mapper/md1-rbod1-ssd-mdt0
            e2fsck 1.45.6.wc5 (09-Feb-2021)
            oak-MDT0000 contains a file system with errors, check forced.
            Pass 1: Checking inodes, blocks, and sizes
            [Thread 0] Scan group range [0, 3552)
            [Thread 1] Scan group range [3552, 7104)
            [Thread 2] Scan group range [7104, 10656)
            [Thread 3] Scan group range [10656, 14208)
            [Thread 4] Scan group range [14208, 17760)
            [Thread 5] Scan group range [17760, 21312)
            [Thread 6] Scan group range [21312, 24864)
            [Thread 7] Scan group range [24864, 28416)
            [Thread 8] Scan group range [28416, 31968)
            [Thread 9] Scan group range [31968, 35520)
            [Thread 10] Scan group range [35520, 39072)
            [Thread 11] Scan group range [39072, 42624)
            [Thread 12] Scan group range [42624, 46176)
            [Thread 13] Scan group range [46176, 49728)
            [Thread 14] Scan group range [49728, 53280)
            [Thread 15] Scan group range [53280, 57056)
            [Thread 15] Scanned group range [53280, 57056), inodes 9005176
            [Thread 3] Inode 459888758, i_size is 13547843328140, should be 0.  [Thread 3] Fix? yes
            
            
            [Thread 14] Scanned group range [49728, 53280), inodes 59487037
            [Thread 12] Scanned group range [42624, 46176), inodes 53210338
            [Thread 2] Scanned group range [7104, 10656), inodes 61985386
            [Thread 1] Scanned group range [3552, 7104), inodes 61737338
            [Thread 11] Scanned group range [39072, 42624), inodes 62681960
            [Thread 13] Scanned group range [46176, 49728), inodes 58501633
            [Thread 10] Scanned group range [35520, 39072), inodes 66242954
            [Thread 9] Scanned group range [31968, 35520), inodes 77570424 

            There is no progress anymore.
            I can see that a thread is blocked in a lock:

            Thread 7 (Thread 0x7f90c6ffd700 (LWP 4352)):
            #0  0x00007f90d37ef39e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
            #1  0x000000000042d23b in e2fsck_write_inode ()
            #2  0x000000000041915f in check_blocks ()
            #3  0x000000000041c565 in e2fsck_pass1_run ()
            #4  0x000000000041fa0f in e2fsck_pass1_thread ()
            #5  0x00007f90d37ebea5 in start_thread () from /lib64/libpthread.so.0
            #6  0x00007f90d351496d in clone () from /lib64/libc.so.6
            

            the other threads seem to be doing something but it's unclear how long it will take:

            Thread 6 (Thread 0x7f90c67fc700 (LWP 4353)):
            #0  0x00007f90d356aa18 in __memmove_ssse3 () from /lib64/libc.so.6
            #1  0x00000000004328d0 in insert_refcount_el ()
            #2  0x0000000000432cf9 in ea_refcount_store ()
            #3  0x0000000000419f92 in check_blocks ()
            #4  0x000000000041a80b in process_inodes.part.45 ()
            #5  0x000000000041c290 in e2fsck_pass1_run ()
            #6  0x000000000041fa0f in e2fsck_pass1_thread ()
            #7  0x00007f90d37ebea5 in start_thread () from /lib64/libpthread.so.0
            #8  0x00007f90d351496d in clone () from /lib64/libc.so.6
            
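            For reference, thread backtraces like the ones above can be grabbed by briefly attaching gdb to the running e2fsck, for example:

            gdb -p $(pidof e2fsck) -batch -ex "thread apply all bt"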

            I would like to try my idea of mounting the filesystem as ldiskfs and getting rid of the old inodes in lost+found. The filesystem state is "clean with errors", so I guess I can do it. Then I would start it as Lustre again. It would be nice to have your advice on this plan. Thanks!


            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: sthiell Stephane Thiell
              Votes: 0
              Watchers: 4
