[LU-7429] sanity-lfsck test_23c: @@@@@@ FAIL: (8) unexpected size
Created: 16/Nov/15 | Updated: 14/Nov/19 | Resolved: 24/Jan/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0 |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | parinay v kondekar (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
Configuration : 4 node setup 1 MDS, 1 OSS, 2 patchless Clients
Server 2.7.62 git hash 049252c

stdout.log

== sanity-lfsck test 23c: LFSCK can repair dangling name entry (3) == 02:09:55 (1446948595)
#####
The objectA has multiple hard links, one of them corresponding to the name entry_B. But there is something wrong for the name entry_B and cause entry_B to references non-exist object_C. In the first-stage scanning, the LFSCK will think the entry_B as dangling, and re-create the lost object_C. And then others modified the re-created object_C. When the LFSCK comes to the second-stage scanning, it will find that the former re-creating object_C maybe wrong and try to replace the object_C with the real object_A. But because object_C has been modified, so the LFSCK cannot replace it.
#####
Inject failure stub on MDT0 to simulate dangling name entry
fail_loc=0x1621
fail_loc=0
'ls' should fail because of dangling name entry
fail_val=10
fail_loc=0x1602
Trigger namespace LFSCK to find out dangling name entry
Started LFSCK on the device lustre-MDT0000: scrub namespace
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
Waiting 32 secs for update
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
Waiting 22 secs for update
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
Waiting 12 secs for update
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
Waiting 2 secs for update
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/d0/foo': No such file or directory
Update not seen after 32s: wanted '0' got ''
stat: cannot stat `/mnt/lustre/d23c.sanity-lfsck/guard': No such file or directory
name: lfsck_namespace
magic: 0xa0621a0b
version: 2
status: scanning-phase2
flags: scanned-once,inconsistent
param: create_mdtobj
last_completed_time: N/A
time_since_last_completed: N/A
latest_start_time: 1446948595
time_since_latest_start: 33 seconds
last_checkpoint_time: 1446948595
time_since_last_checkpoint: 33 seconds
latest_start_position: 12, N/A, N/A
last_checkpoint_position: 25037, N/A, N/A
first_failure_position: N/A, N/A, N/A
checked_phase1: 8
checked_phase2: 0
updated_phase1: 1
updated_phase2: 0
failed_phase1: 0
failed_phase2: 0
directories: 4
dirent_repaired: 2
linkea_repaired: 0
nlinks_repaired: 0
multiple_linked_checked: 1
multiple_linked_repaired: 0
unknown_inconsistency: 0
unmatched_pairs_repaired: 0
dangling_repaired: 0
multiple_referenced_repaired: 0
bad_file_type_repaired: 0
lost_dirent_repaired: 0
local_lost_found_scanned: 0
local_lost_found_moved: 0
local_lost_found_skipped: 0
local_lost_found_failed: 0
striped_dirs_scanned: 0
striped_dirs_repaired: 0
striped_dirs_failed: 0
striped_dirs_disabled: 0
striped_dirs_skipped: 0
striped_shards_scanned: 0
striped_shards_repaired: 0
striped_shards_failed: 0
striped_shards_skipped: 0
name_hash_repaired: 0
success_count: 0
run_time_phase1: 0 seconds
run_time_phase2: 9 seconds
average_speed_phase1: 8 items/sec
average_speed_phase2: 0 objs/sec
average_speed_total: 0 items/sec
real_time_speed_phase1: N/A
real_time_speed_phase2: 0 objs/sec
current_position: [0x0:0x0:0x0]
sanity-lfsck test_23c: @@@@@@ FAIL: (8) unexpected size
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4812:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4843:error()
= /usr/lib64/lustre/tests/sanity-lfsck.sh:3114:test_23c()
= /usr/lib64/lustre/tests/test-framework.sh:5090:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:5127:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4944:run_test()
= /usr/lib64/lustre/tests/sanity-lfsck.sh:3137:main()
Dumping lctl log to /tmp/test_logs/1446948549/sanity-lfsck.test_23c.*.1446948628.log
FAIL 23c (34s)

stderr.log

pdsh@fre0203: fre0201: ssh exited with exit code 1
pdsh@fre0203: fre0202: ssh exited with exit code 1
Using TIMEOUT=20
excepting tests:

Seagate-Bug-Id: MRP-3134 |
| Comments |
| Comment by parinay v kondekar (Inactive) [ 07/Jan/16 ] |
|
PTLDEBUG=-1 logs attached. |
| Comment by Gerrit Updater [ 15/Feb/16 ] |
|
kirtan.shetty (kirtan.shetty@seagate.com) uploaded a new patch: http://review.whamcloud.com/18452 |
| Comment by nasf (Inactive) [ 17/Feb/16 ] |
wait_update_facet client "stat $DIR/$tdir/d0/foo |
	awk '/Size/ { print \\\$2 }'" "0" 32 || {
	stat $DIR/$tdir/guard
	$SHOW_NAMESPACE
	error "(8) unexpected size"
}
The logic above does not expect the LFSCK to complete. Instead, it expects the LFSCK to find the dangling name entry and re-create the lost MDT-object. That repair should happen during "scanning-phase1". According to the logs, however, the LFSCK had already moved to "scanning-phase2", so the expected repair should already have happened, yet the test results show that it did NOT. Simply waiting longer for the LFSCK may therefore not fix the root issue, unless this is a very rare corner case in which the LFSCK was still in "phase1" just before "$SHOW_NAMESPACE" ran and had moved to "phase2" by the time "$SHOW_NAMESPACE" ran. If that is the case, then we need to work out why the namespace LFSCK runs so slowly. |
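One way to tell the two cases apart (repair skipped in phase1 versus a phase change between the last poll and the status dump) is to record the LFSCK status at every poll rather than only after the timeout. The following is a minimal sketch of that idea, not the fix that landed for this ticket. The helper names lfsck_field and poll_size_with_phase and the POLL_INTERVAL variable are hypothetical, and the sketch assumes the status dump is plain "key: value" lines with no leading whitespace, as in the output quoted in the description.

```shell
#!/bin/bash
# Hypothetical helper (not part of test-framework.sh): print the value of
# one "key: value" field from an LFSCK status dump read on stdin.
lfsck_field() {
	local key="$1"

	awk -v k="$key" -F': ' '$1 == k { print $2; exit }'
}

# Hypothetical helper: poll until the size command prints "0", logging the
# LFSCK status seen at each poll so that a timeout shows when the phase
# changed relative to the size checks.
#   $1 = command that prints the file size
#   $2 = command that prints the LFSCK status dump
#   $3 = maximum number of polls
poll_size_with_phase() {
	local size_cmd="$1" dump_cmd="$2" max="$3"
	local i size phase

	for ((i = 0; i < max; i++)); do
		size=$(eval "$size_cmd" 2>/dev/null)
		phase=$(eval "$dump_cmd" | lfsck_field status)
		echo "poll $i: size='$size' status='$phase'"
		[ "$size" = "0" ] && return 0
		sleep "${POLL_INTERVAL:-1}"
	done
	return 1
}
```

In the test this could be called as poll_size_with_phase "stat -c %s $DIR/$tdir/d0/foo" "$SHOW_NAMESPACE" 32, assuming GNU stat and that $SHOW_NAMESPACE is runnable from where the size is checked. If every poll reports scanning-phase2 with an empty size, the corner case described above can be ruled out.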
| Comment by Gerrit Updater [ 22/Jun/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/20916 |
| Comment by Gerrit Updater [ 11/Jul/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/20916/ |
| Comment by nasf (Inactive) [ 12/Jul/16 ] |
|
The patch has been landed to master. |
| Comment by Gerrit Updater [ 19/Jul/16 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/21412 |
| Comment by Gerrit Updater [ 27/Jul/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/21412/ |
| Comment by Gerrit Updater [ 09/Jan/17 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/24763 |
| Comment by nasf (Inactive) [ 09/Jan/17 ] |
|
Reopen the ticket for the pending patch https://review.whamcloud.com/24763 |
| Comment by Gerrit Updater [ 24/Jan/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/24763/ |