[LU-367] lfsck 1.41.90.wc2: illegal flag specified to DB->open Created: 27/May/11 Updated: 26/Oct/11 Resolved: 02/Jun/11 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Jian Yu | Assignee: | Andreas Dilger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre Branch: b1_8 MGS/MDS Node: client-6-ib (1 combo MGS/MDT) $ pdsh |
||
| Attachments: |
|
| Story Points: | 1 |
| Severity: | 3 |
| Rank (Obsolete): | 10145 |
| Description |
|
lfsck test failed as follows:

lfsck -c -l --mdsdb /home/yujian/test_logs/mdsdb --ostdb /home/yujian/test_logs/ostdb-0 /home/yujian/test_logs/ostdb-1 /home/yujian/test_logs/ostdb-2 /home/yujian/test_logs/ostdb-3 /home/yujian/test_logs/ostdb-4 /home/yujian/test_logs/ostdb-5 /mnt/lustre
lfsck 1.41.90.wc2 (14-May-2011)
illegal flag specified to DB->open
/home/yujian/test_logs/mdsdb:mdshdr : Invalid argument
/usr/lib64/lustre/tests/test-framework.sh: line 2269: 12801 Segmentation fault (core dumped) lfsck -c -l --mdsdb /home/yujian/test_logs/mdsdb --ostdb /home/yujian/test_logs/ostdb-0 /home/yujian/test_logs/ostdb-1 /home/yujian/test_logs/ostdb-2 /home/yujian/test_logs/ostdb-3 /home/yujian/test_logs/ostdb-4 /home/yujian/test_logs/ostdb-5 /mnt/lustre
lfsck : @@@@@@ FAIL: lfsck -c -l --mdsdb /home/yujian/test_logs/mdsdb --ostdb /home/yujian/test_logs/ostdb-0 /home/yujian/test_logs/ostdb-1 /home/yujian/test_logs/ostdb-2 /home/yujian/test_logs/ostdb-3 /home/yujian/test_logs/ostdb-4 /home/yujian/test_logs/ostdb-5 /mnt/lustre returned 139, should be <= 1
Dumping lctl log to /home/yujian/test_logs/2011-05-27/014304/lfsck..*.1306485882.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-05-27/014304/lfsck-1306485882.tar.bz2

Dmesg on client-1-ib showed:

lfsck[12801]: segfault at 5 ip 00007fd0ed3b4ff7 sp 00007fff411b5e50 error 4 in libc-2.12.so[7fd0ed36d000+175000]

Maloo report: https://maloo.whamcloud.com/test_sets/9648ba14-883d-11e0-b4df-52540025f9af

The logs and db files are attached. |
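For readers unfamiliar with the "returned 139" value in the failure output: shells conventionally report a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11 (SIGSEGV), which matches the segmentation fault in the log. The following is a minimal illustrative sketch of that arithmetic; the helper name `describe_exit_status` is hypothetical and not part of the Lustre test framework.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper).

    Shells report "killed by signal N" as 128 + N, so values above 128
    are decoded back into a signal name where one is known.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "signal %d" % signum
        return "killed by %s" % name
    return "exited with code %d" % status


if __name__ == "__main__":
    # 139 is the status reported for the failing lfsck run.
    print(describe_exit_status(139))
```

This is why the test framework treats the result as a failure rather than an ordinary lfsck return code: the status does not fall in the accepted range (<= 1) and instead encodes a crash.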
| Comments |
| Comment by Andreas Dilger [ 27/May/11 ] |
|
Judging from the e2fsprogs versions, two different distros are being tested: RHEL5 on the MDS/OSS and RHEL6 on the clients. This may well result in different versions of db4 being used, which has caused compatibility issues in the past. I need to confirm whether the lfsck run was done on the client or on the MDS. If lfsck was run on the MDS then the version difference is a non-issue, but if it was run on the client then the difference may be the reason for this problem. A short-term solution is probably to record the version of e2fsck/lfsck in the MDSDB, and verify it on the OSTs and client, so that there are no surprises. |
| Comment by Peter Jones [ 27/May/11 ] |
|
Andreas,
Please reassign if someone else should work on this, but it sounds like you are already on it.
Peter |
| Comment by Andreas Dilger [ 27/May/11 ] |
|
I was able to reproduce this error on my local filesystem by building the MDSDB and OSTDBs on a system with db4-4.2 and then running lfsck on a client system with db4-4.7. Both systems were running e2fsprogs-1.41.90.wc2 RPMs on x86_64 built from the same source, but for their respective distros. I don't think this issue is a blocker, though it makes sense to avoid this kind of confusion in the future, for example by recording the db4 version that was used to build the MDSDB. That said, storing the version inside the database itself doesn't help, because opening the database is what fails in the first place. |
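One way to picture the idea of recording the db4 version without relying on the database being openable is a small sidecar file written next to the database. The sketch below is purely illustrative and is not the approach taken in the submitted patch, whose contents are not described in this ticket. The `.dbversion` suffix and the function names are assumptions made for the example; the versions 4.2 and 4.7 are the ones from the reproduction above.

```python
import os
import tempfile


def stamp_path(db_path):
    """Path of the hypothetical sidecar file holding the db4 version."""
    return db_path + ".dbversion"


def write_stamp(db_path, version):
    """Record the (major, minor) db4 version used to build db_path."""
    with open(stamp_path(db_path), "w") as f:
        f.write("%d.%d\n" % tuple(version))


def check_stamp(db_path, local_version):
    """Return None if the versions match, otherwise an error message.

    This runs before any attempt to open the database, so a mismatch
    can be reported cleanly instead of failing inside the open call.
    """
    try:
        with open(stamp_path(db_path)) as f:
            major, minor = (int(p) for p in f.read().strip().split("."))
    except FileNotFoundError:
        return "no version stamp found for %s" % db_path
    if (major, minor) != tuple(local_version):
        return "%s was built with db4 %d.%d, local db4 is %d.%d" % (
            (db_path, major, minor) + tuple(local_version))
    return None


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    mdsdb = os.path.join(workdir, "mdsdb")
    write_stamp(mdsdb, (4, 2))          # database built with db4-4.2
    print(check_stamp(mdsdb, (4, 7)))   # checked on a db4-4.7 client
```

The design point is only that the check lives outside the database file, which sidesteps the problem noted above that the database cannot be opened to read its own version.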
| Comment by Andreas Dilger [ 28/May/11 ] |
|
Patch has been submitted to http://review.whamcloud.com/867. |
| Comment by Andreas Dilger [ 02/Jun/11 ] |
|
Patch was verified by Yu Jian in https://maloo.whamcloud.com/test_sets/747400f4-8c27-11e0-aab9-52540025f9af, and has been landed to the e2fsprogs master-lustre branch. |
| Comment by Build Master (Inactive) [ 02/Jun/11 ] |
|
Integrated in Andreas Dilger : 4ef693a23fda00eec24840a3d072f8fe466b845f
|
| Comment by Andreas Dilger [ 02/Jun/11 ] |
|
Patch appears to fix the problem (tested manually). |
| Comment by Build Master (Inactive) [ 14/Jun/11 ] |
|
Integrated in Andreas Dilger : df8f009b6cd67d8a2b5750c1143480b9b644446d
|