[LU-2694] lfsck in e2fsprogs is out of date Created: 28/Jan/13 Updated: 22/Mar/13 Resolved: 12/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0, Lustre 2.1.5 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Niu Yawei (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 6283 |
| Description |
|
It looks like the old lfsck in e2fsprogs hasn't been actively maintained for quite some time, and it no longer works with the latest Lustre. One obvious defect: Lustre changed the objects directory on the OST after FID-on-OST landed, but lfsck still searches for objects under the old O/0. |
| Comments |
| Comment by Niu Yawei (Inactive) [ 04/Feb/13 ] |
|
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/baa5681c-6c5c-11e2-91d6-52540035b04c. |
| Comment by Jian Yu [ 25/Feb/13 ] |
|
Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/176
lfsck also failed with the same issue: https://maloo.whamcloud.com/test_sets/6a4f7e3e-7d78-11e2-85d0-52540035b04c |
| Comment by Peter Jones [ 27/Feb/13 ] |
|
Niu is going to look into this |
| Comment by Niu Yawei (Inactive) [ 04/Mar/13 ] |
|
First, we should fix is_empty_fs in the test framework (t-f), which can cause unexpected errors for lfsck.sh: http://review.whamcloud.com/5576 |
| Comment by Niu Yawei (Inactive) [ 05/Mar/13 ] |
|
The However, I still occasionally hit a failure when cleaning up Lustre after running lfsck:
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) header@ffff880078391ec0[0x0, 3, [0x100000000:0x26:0x0] hash]{
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....lovsub@ffff880078391f58[0]
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....osc@ffff880078390e28id: 38 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) } header@ffff880078391ec0
LustreError: 5539:0:(lov_object.c:184:lov_init_sub())
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) } header@ffff880075173070
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) owned.
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) header@ffff8800773987a8[0x0, 1, [0x6c:0xc68285fb:0x0]]
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) try to own.
LustreError: 5539:0:(lcommon_cl.c:1211:cl_file_inode_init()) Failure to initialize cl object [0x6c:0xc68285fb:0x0]: -5
LustreError: 5539:0:(llite_lib.c:2161:ll_prep_inode()) new_inode -fatal: rc -5
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) header@ffff880078391920[0x0, 3, [0x100010000:0x28:0x0] hash]{
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) ....lovsub@ffff8800783919b8[0]
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) ....osc@ffff880078390768id: 40 gr: 0 idx: 1 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) } header@ffff880078391920
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) stripe 0 is already owned.
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) header@ffff8800773987a8[0x0, 1, [0xc6:0xc6828600:0x0] hash]{
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....vvp@ffff880077398840(- 0 0) inode: ffff8800675d0ab8 198/3330442752 100644 1 1 ffff880077398840 [0xc6:0xc6828600:0x0]
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....lov@ffff880074430830stripes: 1, valid, lsm{ffff880065b44240 0x0BD10BD0 1 1 0}:
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) header@ffff880078391920[0x0, 3, [0x100010000:0x28:0x0] hash]{
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....lovsub@ffff8800783919b8[0]
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....osc@ffff880078390768id: 40 gr: 0 idx: 1 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) } header@ffff880078391920
LustreError: 5539:0:(lov_object.c:184:lov_init_sub())
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) } header@ffff8800773987a8
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) owned.
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) header@ffff88004f4f2dd8[0x0, 1, [0x200000400:0x66:0x0]]
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) try to own.
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) header@ffff88004f4a0470[0x0, 3, [0x100000000:0x2a:0x0] hash]{
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) ....lovsub@ffff88004f4a0508[0]
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) ....osc@ffff880049da2d08id: 42 gr: 0 idx: 0 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) } header@ffff88004f4a0470
LustreError: 5539:0:(lov_object.c:183:lov_init_sub()) stripe 0 is already owned.
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) header@ffff88004f4f2cd0[0x0, 1, [0xc9:0xc6828603:0x0] hash]{
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....vvp@ffff88004f4f2d68(- 0 0) inode: ffff88007286cb78 201/3330442755 100644 1 1 ffff88004f4f2d68 [0xc9:0xc6828603:0x0]
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) ....lov@ffff88007286de80stripes: 1, valid, lsm{ffff88006f83cbc0 0x0BD10BD0 1 1 0}:
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) header@ffff88004f4a0470[0x0, 3, [0x100000000:0x2a:0x0] hash]{
LustreError: 5538:0:(lov_object.c:183:lov_init_sub()) header@ffff8800751719d0[0x0, 3, [0x100010000:0x29:0x0] hash]{
LustreError: 5538:0:(lov_object.c:183:lov_init_sub()) ....lovsub@ffff880075171a68[0]
LustreError: 5538:0:(lov_object.c:183:lov_init_sub()) ....osc@ffff8800751708c8id: 41 gr: 0 idx: 1 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5538:0:(lov_object.c:183:lov_init_sub()) } header@ffff8800751719d0
LustreError: 5538:0:(lov_object.c:183:lov_init_sub()) stripe 0 is already owned.
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) header@ffff880075173388[0x0, 1, [0x200000400:0x68:0x0] hash]{
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) ....vvp@ffff880075173420(- 0 0) inode: ffff88007ad1f678 144115205255725160/33554436 100644 1 1 ffff880075173420 [0x200000400:0x68:0x0]
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) ....lov@ffff880075172430stripes: 1, valid, lsm{ffff88006e9419c0 0x0BD10BD0 1 1 0}:
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) header@ffff8800751719d0[0x0, 3, [0x100010000:0x29:0x0] hash]{
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) ....lovsub@ffff880075171a68[0]
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) ....osc@ffff8800751708c8id: 41 gr: 0 idx: 1 gen: 0 kms_valid: 1 kms 1048576 rc: 0 force_sync: 0 min_xid: 0 size: 1048576 mtime: 1362472487 atime: 0 ctime: 1362472487 blocks: 2048
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) } header@ffff8800751719d0
LustreError: 5538:0:(lov_object.c:184:lov_init_sub())
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) } header@ffff880075173388
LustreError: 5538:0:(lov_object.c:184:lov_init_sub()) owned.
LustreError: 5538:0:(lov_object.c:185:lov_init_sub()) header@ffff88007a5b7ee0[0x0, 1, [0xc8:0xc6828602:0x0]]
LustreError: 5538:0:(lov_object.c:185:lov_init_sub()) try to own.
LustreError: 5539:0:(lov_object.c:184:lov_init_sub()) owned.
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) header@ffff88004f4f2ac0[0x0, 1, [0x200000400:0x69:0x0]]
LustreError: 5539:0:(lov_object.c:185:lov_init_sub()) try to own.
Lustre: DEBUG MARKER: lfsck : @@@@@@ FAIL: remove sub-test dirs failed
Which seems related to |
| Comment by Niu Yawei (Inactive) [ 06/Mar/13 ] |
|
The error messages above seem to be caused by duplicate files referencing the same object in the test directory; they are not related to |
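For anyone reading the log above, it can help to translate the sub-object FIDs back into OST index and object id, since the osc lines print those values separately (for example "id: 40 ... idx: 1" next to [0x100010000:0x28:0x0]). The following is a minimal sketch, assuming those FIDs use the IDIF layout (sequence = 0x100000000 | ost_idx << 16 | upper 16 bits of the object id, oid = lower 32 bits of the object id). The helper names decode_idif and idif_objects are hypothetical and not part of any Lustre tool.

```python
import re

# Assumed IDIF sequence range (object ids mapped into FIDs).
FID_SEQ_IDIF = 0x100000000
FID_SEQ_IDIF_MAX = 0x1FFFFFFFF

# Matches a FID printed as [seq:oid:ver] with hex fields.
FID_RE = re.compile(r"\[(0x[0-9a-f]+):(0x[0-9a-f]+):(0x[0-9a-f]+)\]")


def decode_idif(seq, oid):
    """Return (ost_index, object_id) for an IDIF FID, or None otherwise."""
    if not FID_SEQ_IDIF <= seq <= FID_SEQ_IDIF_MAX:
        return None
    ost_idx = (seq >> 16) & 0xFFFF            # bits 16..31 of the sequence
    obj_id = ((seq & 0xFFFF) << 32) | oid     # low seq bits extend the oid
    return ost_idx, obj_id


def idif_objects(log_text):
    """Collect the distinct (ost_index, object_id) pairs seen in a log."""
    found = set()
    for m in FID_RE.finditer(log_text):
        decoded = decode_idif(int(m.group(1), 16), int(m.group(2), 16))
        if decoded is not None:
            found.add(decoded)
    return sorted(found)


# Three header lines shortened from the log in this ticket.
sample = (
    "header@ffff880078391920[0x0, 3, [0x100010000:0x28:0x0] hash]{\n"
    "header@ffff8800773987a8[0x0, 1, [0xc6:0xc6828600:0x0] hash]{\n"
    "header@ffff880078391ec0[0x0, 3, [0x100000000:0x26:0x0] hash]{\n"
)
print(idif_objects(sample))  # [(0, 38), (1, 40)]
```

The decoded pairs agree with the "id: 38 ... idx: 0" and "id: 40 ... idx: 1" values printed on the osc lines, which makes it easier to see which OST object each "stripe 0 is already owned" message refers to.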
| Comment by Niu Yawei (Inactive) [ 06/Mar/13 ] |
|
Remove the test directory before cleanup on master; otherwise, the duplicate files (referencing the same object) could cause trouble when the directory is removed: http://review.whamcloud.com/5606 |
| Comment by Peter Jones [ 12/Mar/13 ] |
|
Landed for 2.4 |