[LU-999] Test failure on test suite lfsck Created: 16/Jan/12 Updated: 06/Feb/12 Resolved: 06/Feb/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.2.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Zhenyu Xu |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 6485 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/cbcbbd2e-3f50-11e1-990e-5254004bbbd3.
Error output:
fat-intel-1vm4: error getting mds_hdr (3685469441:8) in /tmp/mdsdb: DB_NOTFOUND: No matching key/data pair found |
| Comments |
| Comment by Peter Jones [ 16/Jan/12 ] |
|
Bobi, could you please look into this one? Thanks, Peter |
| Comment by Andreas Dilger [ 16/Jan/12 ] |
|
The first thing to check is whether the same version of e2fsprogs is installed on the MDS and OSS. Next, check whether the version of db4 used by e2fsck is the same on both. This bug was hit in the past, and was due to db4 version mismatches. Please search bugzilla for this error message. |
| Comment by Zhenyu Xu [ 31/Jan/12 ] |
|
The error message is:
fat-intel-1vm4: error getting mds_hdr (3685469441:8) in /tmp/mdsdb: DB_NOTFOUND: No matching key/data pair found
The corresponding e2fsck code is:
    memset(&mds_hdr, 0, sizeof(mds_hdr));
    mds_hdr.mds_magic = MDS_MAGIC; // ====> 0xDBABCD01 == 3685469441
    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data = &mds_hdr.mds_magic;
    key.size = sizeof(mds_hdr.mds_magic);
    data.data = &mds_hdr;
    data.size = sizeof(mds_hdr);
    data.ulen = sizeof(mds_hdr);
    data.flags = DB_DBT_USERMEM;
    rc = mds_hdrdb->get(mds_hdrdb, NULL, &key, &data, 0);
    if (rc) {
        fprintf(stderr,"error getting mds_hdr ("LPU64":%u) in %s: %s\n",
                mds_hdr.mds_magic, (int)sizeof(mds_hdr.mds_magic),
                ctx->lustre_mdsdb, db_strerror(rc));
        ctx->flags |= E2F_FLAG_ABORT;
        goto out;
    }
e2fsck cannot find an entry for the MDS header magic value in /tmp/mdsdb when generating the OST database. This could be caused by a db4 version mismatch, but according to the test session info (https://maloo.whamcloud.com/test_sessions/99aea334-3f4f-11e1-990e-5254004bbbd3), the MDS (fat-intel-1vm3) and the OSTs (fat-intel-1vm4) use the same build image (Kernel Version: 2.6.32-131.17.1.el6_lustre.ge126ace.x86_64; Lustre Version: jenkins-arch=x86_64,build_type=server,distro=el6,ib_stack=inkern). |
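To make the "(3685469441:8)" part of the error message easier to read, here is a minimal sketch that reproduces it. It assumes mds_magic is a 64-bit field, which is inferred from the LPU64 format and the ":8" key size in the message rather than from the e2fsprogs headers. The helper name format_hdr_key is hypothetical and is not part of e2fsck.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Value taken from the comment in the quoted e2fsck code. */
#define MDS_MAGIC 0xDBABCD01ULL

/*
 * Hypothetical helper (not part of e2fsck): formats the "(key:size)"
 * fragment of the error message for a 64-bit magic value.
 * Assumption: mds_magic is 64 bits wide, so the key size is 8 bytes.
 */
static int format_hdr_key(char *buf, size_t len, uint64_t magic)
{
    return snprintf(buf, len, "(%" PRIu64 ":%u)",
                    magic, (unsigned)sizeof(magic));
}
```

Calling format_hdr_key(buf, sizeof(buf), MDS_MAGIC) yields the same "(3685469441:8)" fragment seen in the log. In other words, the lookup key in the failing run was the expected magic value, and the database simply had no record stored under it.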
| Comment by Andreas Dilger [ 06/Feb/12 ] |
|
Is the /tmp/mdsdb file available on the OSS node, and is it definitely the right one (i.e. not left over from some previous run)? I haven't checked this code to verify if it will fail with a "file not found" if the mdsdb file is missing entirely. Since the OSS and MDS are running in different VM images, it may be that the file is not being copied to the OSS correctly. |
| Comment by Zhenyu Xu [ 06/Feb/12 ] |
|
Sarah, if we get another hit, would you please check whether /tmp is a directory shared between the MDS and OSS, and whether /tmp/mdsdb on the OSS node is exactly the same file as on the MDS node? Thanks. |
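One way to check that /tmp/mdsdb is the same file on both nodes is to compute a checksum of it on each node and compare the two values. The sketch below is only an illustration: file_fnv1a64 is a hypothetical helper, not part of e2fsprogs or the Lustre test framework, and it uses the FNV-1a 64-bit hash as a simple, non-cryptographic stand-in for a standard checksum tool.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of e2fsprogs or Lustre): computes the
 * FNV-1a 64-bit hash of a file's contents.
 * Returns 0 and stores the hash in *sum_out on success, or -1 if the
 * file cannot be opened or read (for example, if it is missing).
 */
static int file_fnv1a64(const char *path, uint64_t *sum_out)
{
    FILE *fp = fopen(path, "rb");
    unsigned char buf[4096];
    size_t n;
    uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */

    if (fp == NULL)
        return -1;

    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0) {
        for (size_t i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 1099511628211ULL; /* FNV 64-bit prime */
        }
    }

    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    *sum_out = h;
    return 0;
}
```

Running this against /tmp/mdsdb on the MDS and on the OSS and comparing the results would show whether the OSS has a stale or partial copy. A -1 return on the OSS would indicate that the file is missing there entirely.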
| Comment by Jian Yu [ 06/Feb/12 ] |
|
It seems this is the same issue as |
| Comment by Zhenyu Xu [ 06/Feb/12 ] |
|
yes, I think it's dup of |