LU-2142: "lctl lfsck_start" should start a scrub

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Affects Version/s: Lustre 2.3.0, Lustre 2.4.0
    • Fix Version/s: Lustre 2.3.0, Lustre 2.4.0
    • Labels: None
    • 3
    • 5150

    Description

      Running "lctl lfsck_start -M

      {fsname}

      -MDT0000" should start a scrub, unless one is already running. However, if the scrub was previously run and completed (leaving last_checkpoint_position == inode_count, it appears a new scrub will not be run because the start position is not reset at the end of the previous lfsck run or the start of the new run:

      latest_start_position: 143392770
      last_checkpoint_position: 143392769
      

      It makes sense to restart the scrub at the last checkpoint position if it didn't complete for some reason, but if latest_start_position >= inode_count then the start position should be reset to start again. Both Cliff and I were confused by the current behaviour, and it took us a while to determine that "-r" was needed, and I expect that most users will have the same problem. The "-r" option should only be needed in case the admin has to handle some unusual condition where a previous scrub was interrupted, but a new full scrub is desired.
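
      Below is a minimal sketch, in C, of the start-position logic being requested here. The names (scrub_state, scrub_pick_start, SCRUB_FIRST_INODE) are hypothetical and do not correspond to the real osd-ldiskfs code; it only illustrates "resume if interrupted, otherwise reset":

      #include <stdio.h>

      /* Hypothetical, simplified scrub bookkeeping; these are not the real
       * osd-ldiskfs structures. */
      struct scrub_state {
              unsigned long long latest_start_position;
              unsigned long long last_checkpoint_position;
              unsigned long long inode_count;  /* total inodes on the device */
      };

      /* First non-reserved ldiskfs/ext4 inode; the full scans in the
       * transcripts later in this ticket start at position 11. */
      #define SCRUB_FIRST_INODE 11ULL

      /* Requested behaviour: resume from the last checkpoint only if the
       * previous scrub was interrupted; if it completed (or never ran),
       * start a fresh scan without requiring "-r". */
      static unsigned long long scrub_pick_start(const struct scrub_state *s)
      {
              if (s->last_checkpoint_position == 0 ||
                  s->latest_start_position >= s->inode_count)
                      return SCRUB_FIRST_INODE;           /* reset to the start */

              return s->last_checkpoint_position;         /* resume */
      }

      int main(void)
      {
              /* Values from the completed run shown above. */
              struct scrub_state done = {
                      .latest_start_position = 143392770ULL,
                      .last_checkpoint_position = 143392769ULL,
                      .inode_count = 143392769ULL,
              };

              printf("next start position: %llu\n", scrub_pick_start(&done));
              return 0;
      }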

        Activity

          [LU-2142] "lctl lfsck_start" should start a scrub

          adilger Andreas Dilger added a comment -

          Fan Yong, you are correct. This patch fixes the problem. I must have been testing on my system after rebuilding, but not reloading the modules.

          yong.fan nasf (Inactive) added a comment -

          This is the output from my own test against the latest master branch (top change ID I2ff03a611267292d0cd6a465c1eb14023516234b), which contains the patch for LU-2142 (change ID I5b8e9ee51ccbf95ed131b963389c4ecfb92b9035):

          [root@RHEL6-nasf-CSW tests]# cat /proc/fs/lustre/osd-ldiskfs/lustre-MDT0000/oi_scrub 
          name: OI scrub
          magic: 0x4c5fd252
          oi_files: 64
          status: init
          flags:
          param:
          time_since_last_completed: N/A
          time_since_latest_start: N/A
          time_since_last_checkpoint: N/A
          latest_start_position: N/A
          last_checkpoint_position: N/A
          first_failure_position: N/A
          checked: 0
          updated: 0
          failed: 0
          prior_updated: 0
          noscrub: 0
          igif: 0
          success_count: 0
          run_time: 0 seconds
          average_speed: 0 objects/sec
          real-time_speed: N/A
          current_position: N/A
          [root@RHEL6-nasf-CSW tests]# ../utils/lctl lfsck_start -M lustre-MDT0000
          Started LFSCK on the MDT device lustre-MDT0000.
          [root@RHEL6-nasf-CSW tests]# cat /proc/fs/lustre/osd-ldiskfs/lustre-MDT0000/oi_scrub 
          name: OI scrub
          magic: 0x4c5fd252
          oi_files: 64
          status: completed
          flags:
          param:
          time_since_last_completed: 3 seconds
          time_since_latest_start: 3 seconds
          time_since_last_checkpoint: 3 seconds
          latest_start_position: 11
          last_checkpoint_position: 100001
          first_failure_position: N/A
          checked: 206
          updated: 0
          failed: 0
          prior_updated: 0
          noscrub: 38
          igif: 168
          success_count: 1
          run_time: 0 seconds
          average_speed: 206 objects/sec
          real-time_speed: N/A
          current_position: N/A
          [root@RHEL6-nasf-CSW tests]# 
          [root@RHEL6-nasf-CSW tests]# 
          [root@RHEL6-nasf-CSW tests]# 
          [root@RHEL6-nasf-CSW tests]# ../utils/lctl lfsck_start -M lustre-MDT0000
          Started LFSCK on the MDT device lustre-MDT0000.
          [root@RHEL6-nasf-CSW tests]# cat /proc/fs/lustre/osd-ldiskfs/lustre-MDT0000/oi_scrub 
          name: OI scrub
          magic: 0x4c5fd252
          oi_files: 64
          status: completed
          flags:
          param:
          time_since_last_completed: 1 seconds
          time_since_latest_start: 1 seconds
          time_since_last_checkpoint: 1 seconds
          latest_start_position: 11
          last_checkpoint_position: 100001
          first_failure_position: N/A
          checked: 206
          updated: 0
          failed: 0
          prior_updated: 0
          noscrub: 0
          igif: 206
          success_count: 2
          run_time: 0 seconds
          average_speed: 206 objects/sec
          real-time_speed: N/A
          current_position: N/A
          [root@RHEL6-nasf-CSW tests]# 
          

          As you can see, repeatedly running "lctl lfsck_start" re-triggers the OI scrub each time, as expected. The condition for re-triggering the OI scrub is that the previous OI scrub has completed ("status: completed").

          You can judge whether the OI scrub was re-triggered by checking the "checked:" item: if it is "0", the scrub was not re-triggered; otherwise, it was.

          On the other hand, the OI scrub may skip inodes created since the last OI scrub run (it skips them only once), so the "checked:" item may not match the real count of allocated inodes.
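
          For illustration only, here is a minimal C sketch of that check: it reads the oi_scrub proc file shown in the transcript above and looks at the "checked:" item. The proc path is the one from this test setup (lustre-MDT0000); the program is just an illustrative helper, not part of Lustre:

          #include <stdio.h>

          int main(void)
          {
                  /* Path taken from the transcript above; adjust the fsname
                   * and MDT index for your own configuration. */
                  const char *path =
                          "/proc/fs/lustre/osd-ldiskfs/lustre-MDT0000/oi_scrub";
                  char line[256];
                  long checked = -1;
                  FILE *f = fopen(path, "r");

                  if (f == NULL) {
                          perror(path);
                          return 1;
                  }

                  /* Find the "checked:" item: 0 means the scrub was not
                   * re-triggered, anything larger means it ran again. */
                  while (fgets(line, sizeof(line), f) != NULL)
                          if (sscanf(line, "checked: %ld", &checked) == 1)
                                  break;
                  fclose(f);

                  if (checked < 0)
                          printf("no \"checked:\" item found in %s\n", path);
                  else if (checked > 0)
                          printf("OI scrub re-triggered (checked: %ld)\n", checked);
                  else
                          printf("OI scrub not re-triggered (checked: 0)\n");
                  return 0;
          }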

          So Andreas, would you please describe in detail which operations you performed to reproduce the issue? Then I can analyze what happened. Thanks!

          adilger Andreas Dilger added a comment -

          Fan Yong, was there another patch landed? It seemed in my testing that this didn't actually fix the problem. As previously stated, it appears that LFSCK is started, but since the starting inode is not reset, LFSCK immediately exits without doing anything...

          Running "lfsck -r" appears to actually runs check, but not "lfsck" by itself does not appear to start a new scrub.

          yong.fan nasf (Inactive) added a comment - edited

          The issue has been fixed as Andreas suggested.

          adilger Andreas Dilger added a comment -

          I tested this patch by hand (on master, where it landed after b2_3, where I assumed it had been tested), but it doesn't appear to have fixed lctl lfsck_start to actually run a scrub when asked. It now reports "Started LFSCK" every time:

          # lctl lfsck_start -M testfs-MDT0000 -s 4
          Started LFSCK on the MDT device testfs-MDT0000.
          

          But it doesn't actually seem to run a scrub (-s 4 is used to make the scrub slow enough to watch):

          time_since_last_completed: 5 seconds
          time_since_latest_start: 5 seconds
          time_since_last_checkpoint: 5 seconds
          latest_start_position: 50002
          last_checkpoint_position: 50001
          success_count: 17
          run_time: 32 seconds
          

          It resets the start time, but not latest_start_position or the run time, so the scrub takes zero seconds to "finish" but doesn't actually do anything. Running with the "-r" option does seem to start a full scrub:

          time_since_last_completed: 88 seconds
          time_since_latest_start: 10 seconds
          time_since_last_checkpoint: 10 seconds
          latest_start_position: 11
          last_checkpoint_position: N/A
          run_time: 10 seconds
          

          But I would think that lctl lfsck_start should actually start a scrub, as the command name implies, instead of only doing so if -r is given. If there is already a scrub running, it should continue to run, but if one is not running, a new full scrub should be started...
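
          As a sketch only, the proposed semantics could look like the following C fragment; the names (scrub_ctl, lfsck_start_request) are hypothetical and not the real Lustre code:

          #include <stdbool.h>
          #include <stdio.h>

          /* Hypothetical control state for one MDT's OI scrub. */
          struct scrub_ctl {
                  bool running;
                  unsigned long long start_position;
          };

          /* Proposed lfsck_start behaviour: if a scrub is already running,
           * leave it alone; otherwise reset the position and start a new
           * full scan.  "-r" would then only be needed to force a restart
           * of an interrupted scrub. */
          static void lfsck_start_request(struct scrub_ctl *c)
          {
                  if (c->running) {
                          printf("scrub already running, letting it continue\n");
                          return;
                  }

                  c->start_position = 11;  /* full scan from the beginning */
                  c->running = true;
                  printf("started new full scrub from position %llu\n",
                         c->start_position);
          }

          int main(void)
          {
                  struct scrub_ctl c = { .running = false, .start_position = 50001 };

                  lfsck_start_request(&c);  /* not running: starts a fresh full scan */
                  lfsck_start_request(&c);  /* already running: leaves it alone */
                  return 0;
          }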

          Seems the patch isn't quite working yet.

          yong.fan nasf (Inactive) added a comment - Patch for master: http://review.whamcloud.com/#change,4250
          adilger Andreas Dilger added a comment - Patch for b2_3 is http://review.whamcloud.com/4252

          yong.fan nasf (Inactive) added a comment -

          If the OI scrub scanning policy is adjusted as described above, we need to consider more cases. For example:

          Suppose the last OI scrub scan finished at ino# 100'000, and then a new file is created. Its ino# may be larger than the position where the last OI scrub finished, such as 100'001, or it may reuse a deleted inode, so its ino# may be smaller than that position, such as 50'001. In such a case, if the system administrator runs the OI scrub again, the behaviour differs: in the former case it continues scanning from 100'000 and finishes at ino# 100'001; in the latter case it resets the scan to the beginning of the device and re-scans the whole MDT. So from the sysadmin's point of view, the OI scrub behaviour becomes unpredictable. I do not think that is what is expected.

          So I suggest to use "-r" explicitly to reset the scanning position. If someone wants to re-run OI scrub before former instance finished, he/she can stop current OI scrub explicitly by "lctl lfsck_stop" firstly, then runs OI scrub again by "lctl lfsck_start -r". I do not think it is so trouble.

          People

            Assignee: yong.fan nasf (Inactive)
            Reporter: adilger Andreas Dilger
            Votes: 0
            Watchers: 2
