LFSCK phase II technical debts
(LU-4701)
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.2 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.11.0, Lustre 2.10.6 |
| Type: | Technical task | Priority: | Critical |
| Reporter: | Andreas Dilger | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LFSCK |
| Rank (Obsolete): | 13609 |
| Description |
|
An existing filesystem on a 2.4.2 MDT that was previously upgraded from Lustre 2.1, 1.8, and 1.6 has about 1.8M inodes in use:

lfs df -i
UUID                      Inodes       IUsed       IFree IUse% Mounted on
myth-MDT0000_UUID        2621440     1837764      783676  70% /myth[MDT:0]
myth-OST0000_UUID         921856      315326      606530  34% /myth[OST:0]
myth-OST0001_UUID         475168      168933      306235  36% /myth[OST:1]
myth-OST0002_UUID         715264      585400      129864  82% /myth[OST:2]
myth-OST0003_UUID         688128      600027       88101  87% /myth[OST:3]
myth-OST0004_UUID         921856      118677      803179  13% /myth[OST:4]

filesystem summary:      2621440     1837764      783676  70% /myth

Running "lctl lfsck_start -t namespace -M myth-MDT0000" shows the following statistics on completion:

lctl get_param mdd.*.lfsck_namespace
mdd.myth-MDT0000.lfsck_namespace=
name: lfsck_namespace
magic: 0xa0629d03
version: 2
status: completed
flags: scanned-once,inconsistent
param: (null)
time_since_last_completed: 3 seconds
time_since_latest_start: 126 seconds
time_since_last_checkpoint: 3 seconds
latest_start_position: 13, N/A, N/A
last_checkpoint_position: 2621440, N/A, N/A
first_failure_position: N/A, N/A, N/A
checked_phase1: 3688305
checked_phase2: 222
updated_phase1: 1762893
updated_phase2: 147
failed_phase1: 0
failed_phase2: 0
dirs: 92332
M-linked: 833
nlinks_repaired: 0
lost_found: 0
success_count: 3
run_time_phase1: 122 seconds
run_time_phase2: 0 seconds
average_speed_phase1: 30232 items/sec
average_speed_phase2: 222 objs/sec
real-time_speed_phase1: N/A
real-time_speed_phase2: N/A
current_position: N/A

The number of inodes reported in "checked_phase1" is about 2x the number of inodes actually in the filesystem, which looks like a bug if LFSCK is checking each inode twice. Also, the "updated_phase1" value always shows that almost all of the inodes were "updated", even when LFSCK is run multiple times. This seems like a possibly separate bug, whether it is caused by IGIF inodes or by other problems with older filesystems.
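As a rough sanity check on the 2x observation (my arithmetic, assuming phase 1 visits each in-use MDT object about once per iteration method): 2 × 1,837,764 = 3,675,528, which is within about 0.4% of the reported checked_phase1 = 3,688,305; the small surplus would be consistent with extra directory entries for hard-linked files.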
| Comments |
| Comment by Andreas Dilger [ 17/Apr/14 ] |
|
Note that I haven't tested this on master or b2_5, but it definitely seems like a bug in 2.4. |
| Comment by Andreas Dilger [ 17/Apr/14 ] |
|
Fan Yong, could you please comment on whether this is just an LFSCK accounting problem (e.g. double counting of checked inodes), or if it is actually checking each inode twice (slowing everything down)? Also, is it correct that LFSCK is actually trying to repair every inode in my filesystem every time it is run, or am I misunderstanding what "updated_phase1: 1762893" means? |
| Comment by nasf (Inactive) [ 17/Apr/14 ] |
|
The "checked_phase1" counts: So it is almost equal to double of the real inodes count. The "M-linked" is the sum of the nlink of every multiple-linked object. S if a file has been scanned N times, then it will counted as N times, it seems not well understood for others. What is your expected result? As for the everything has been repaired every time, it should be a bug. I will investigate and fix it. |
| Comment by nasf (Inactive) [ 18/Apr/14 ] |
|
Andreas, would you please verify whether you have enabled DIRDATA on your 2.4.2 system? If not, then it is normal that "updated_phase1" shows something being repaired every time; otherwise, please send me the -1 debug log from your MDS when running the namespace LFSCK. Thanks! As for "checked_phase1", which value is preferred: the real object count, or the number of times objects have been scanned (which is what is currently shown)? Because we use otable-based iteration plus directory traversal, almost every object will be scanned twice.
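A hedged sketch of the DIRDATA interaction as I understand it (an assumption on my part, not the Lustre source): without the ldiskfs dirdata feature there is no room to persist the FID in each directory entry, so every namespace LFSCK run finds the FID-in-dirent missing again and re-counts the same entries in "updated_phase1".

{code}
#include <stdbool.h>
#include <stdio.h>

#define NENTRIES 1000000   /* stand-in for ~1.8M dirents; invented */

/* Simulate one namespace LFSCK pass over the directory entries. */
static long run_lfsck_pass(bool has_dirdata, bool *fid_in_dirent)
{
        long updated_phase1 = 0;

        for (long i = 0; i < NENTRIES; i++) {
                if (!fid_in_dirent[i]) {
                        updated_phase1++;             /* repair attempted */
                        if (has_dirdata)
                                fid_in_dirent[i] = true;  /* fix persists */
                        /* Without dirdata the FID cannot be stored, so
                         * the same entry is flagged again on the next
                         * run. */
                }
        }
        return updated_phase1;
}

int main(void)
{
        static bool fid_in_dirent[NENTRIES];  /* all false initially */
        bool has_dirdata = false;             /* as on the reported MDT */

        /* Every run reports nearly every entry as "updated" again. */
        for (int run = 1; run <= 3; run++)
                printf("run %d: updated_phase1 = %ld\n",
                       run, run_lfsck_pass(has_dirdata, fid_in_dirent));
        return 0;
}
{code}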
| Comment by Andreas Dilger [ 19/Apr/14 ] |
|
I think if there are two different checks being done on the inode then they should be counted in two different counters. Similarly, for the "updated_phase1" number, it would be better to have something like "missing_lma:" or "missing_fid_in_dirent:" or similar, so there is an easier way to track what is wrong with the filesystem. I don't mind keeping a summary counter for the whole pass, but it would be good to keep a counter for each type of corruption found and fixed.
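A minimal sketch of the per-type counters being proposed (the counter names come from this comment; the layout is hypothetical and not the landed patch): each repair type gets its own counter, and the summary counter is the sum.

{code}
#include <stdio.h>

/* Hypothetical per-type repair counters; names are illustrative,
 * not the actual lfsck_namespace fields. */
enum lfsck_repair_type {
        LRT_MISSING_LMA,
        LRT_MISSING_FID_IN_DIRENT,
        LRT_NLINK_REPAIRED,
        LRT_MAX,
};

static const char *lrt_names[LRT_MAX] = {
        "missing_lma",
        "missing_fid_in_dirent",
        "nlinks_repaired",
};

int main(void)
{
        long repaired[LRT_MAX] = { 0 };
        long updated_phase1 = 0;   /* summary counter kept as well */

        /* Invented repair counts, standing in for one scan's findings. */
        repaired[LRT_MISSING_FID_IN_DIRENT] = 1762893;
        repaired[LRT_MISSING_LMA] = 42;

        for (int t = 0; t < LRT_MAX; t++) {
                updated_phase1 += repaired[t];
                printf("%s: %ld\n", lrt_names[t], repaired[t]);
        }
        printf("updated_phase1: %ld\n", updated_phase1);
        return 0;
}
{code}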
| Comment by Andreas Dilger [ 19/Apr/14 ] |
|
I checked, and you are correct that I did not have DIRDATA enabled on my system. I think Lustre 2.6 and later could print a single warning at mount time that this feature should be enabled on MDTs that do not have it.
| Comment by nasf (Inactive) [ 20/Apr/14 ] |
|
Here is the patch: |
| Comment by nasf (Inactive) [ 09/May/14 ] |
|
The patch has been landed to master. |
| Comment by Gerrit Updater [ 29/Sep/17 ] |
|
Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/29274 |
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29274/ |
| Comment by Gerrit Updater [ 08/Aug/18 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/32961 |
| Comment by Gerrit Updater [ 11/Sep/18 ] |
|
John L. Hammond (jhammond@whamcloud.com) merged in patch https://review.whamcloud.com/32961/ |