[LU-1743] failure on recovery-small test_18a: Created: 13/Aug/12, Updated: 29/May/17, Resolved: 29/May/17
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 4135 |
| Description |
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/0e82b422-e499-11e1-9681-52540035b04c. The sub-test test_18a failed with the following error:

Info required for matching: recovery-small 18a |
| Comments |
| Comment by Andreas Dilger [ 13/Aug/12 ] |
This has hit 8 times on the orion-head-sync and master branches since it first appeared on 2012-08-11 21:24:33: https://maloo.whamcloud.com/test_sets/245cb502-e4a5-11e1-af05-52540035b04c

This coincides with the change to a 1GB MDT filesystem size, though I'm not quite sure how that would have caused the problem. We'll see whether this test starts passing now that the MDT size has been increased to 2GB. |
| Comment by Di Wang [ 14/Aug/12 ] |
Not sure whether it is related to this problem, but there is a clear mistake here:

CMD: client-22vm3 /usr/sbin/mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=1073741824 --mkfsoptions=\"-E lazy_itable_init\" --mountfsoptions=errors=remount-ro,iopen_nopriv,user_xattr,acl --reformat /dev/lvm-MDS/P1

Permanent disk data:
    device size = 1024MB

With --device-size=1073741824, I guess the intent was to set MDSSIZE to 1GB, not 1TB, though the maximum device size of /dev/lvm-MDS/P1 might be just 1GB. But I am afraid some test script might be based on this MDSSIZE. Chris, could you please fix this? |
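For reference, a minimal sketch of the unit arithmetic behind the 1GB-vs-1TB confusion above, assuming --device-size is interpreted in KB (its documented unit, though not stated in this ticket):

	#include <stdio.h>

	int main(void)
	{
		/* Assumption: mkfs.lustre interprets --device-size in KB. */
		unsigned long long requested_kb = 1073741824ULL; /* value from the command above */
		unsigned long long intended_kb  = 1048576ULL;    /* 1GB expressed in KB          */

		/* KB -> GB is a divide by 2^20 */
		printf("requested: %llu KB = %llu GB\n",
		       requested_kb, requested_kb >> 20);        /* 1024 GB, i.e. 1TB */
		printf("intended:  %llu KB = %llu GB\n",
		       intended_kb, intended_kb >> 20);          /* 1 GB              */
		return 0;
	}

Since the underlying LV appears to be only 1GB, the oversized request would also explain why the mkfs output above reports "device size = 1024MB" regardless of what was asked for.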
| Comment by Sarah Liu [ 09/Jan/13 ] |
Another instance: https://maloo.whamcloud.com/test_sets/48c94224-5792-11e2-8772-52540035b04c |
| Comment by Jinshan Xiong (Inactive) [ 31/Jul/13 ] |
The following instance: https://maloo.whamcloud.com/test_sets/b532ff50-f953-11e2-8917-52540035b04c clearly shows that one OST was not ready while setstripe was running, so this check in lod_qos_parse_config():

	if ((v1->lmm_stripe_offset >= d->lod_desc.ld_tgt_count) &&
	    (v1->lmm_stripe_offset != (typeof(v1->lmm_stripe_offset))(-1))) {
		CERROR("invalid offset: %x\n", v1->lmm_stripe_offset);
		RETURN(-EINVAL);
	}

returned -EINVAL because stripe_offset equaled ld_tgt_count, which was 1. I didn't look further to figure out why ost2 was not up yet, but probably it was due to a previous recovery. |
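A self-contained sketch of this failure mode; the structs here are simplified stand-ins for the real Lustre types, with the values (offset 1, one registered target) taken from the analysis above:

	#include <stdio.h>
	#include <stdint.h>

	/* Simplified stand-ins for the Lustre structures involved. */
	struct lov_desc    { uint32_t ld_tgt_count; };      /* OSTs currently registered */
	struct lov_user_md { uint16_t lmm_stripe_offset; }; /* requested starting OST    */

	/* Mirrors the check quoted above: an explicit starting offset must
	 * name a registered target; (uint16_t)-1 means "let the server pick". */
	static int parse_stripe_offset(struct lov_user_md *v1, struct lov_desc *d)
	{
		if (v1->lmm_stripe_offset >= d->ld_tgt_count &&
		    v1->lmm_stripe_offset != (uint16_t)-1) {
			fprintf(stderr, "invalid offset: %x\n", v1->lmm_stripe_offset);
			return -22; /* -EINVAL */
		}
		return 0;
	}

	int main(void)
	{
		/* In the failed run only one OST had registered yet, so
		 * ld_tgt_count was 1, while setstripe asked for offset 1
		 * (the second OST, which was still recovering). */
		struct lov_desc d = { .ld_tgt_count = 1 };
		struct lov_user_md v1 = { .lmm_stripe_offset = 1 };

		printf("rc = %d\n", parse_stripe_offset(&v1, &d)); /* prints rc = -22 */
		return 0;
	}

Once ost2 finishes recovery and registers, ld_tgt_count becomes 2 and the same request passes, which is consistent with the test being flaky rather than deterministically broken.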
| Comment by Andreas Dilger [ 29/May/17 ] |
Close old ticket. |