[LU-1743] failure on recovery-small test_18a: Created: 13/Aug/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-6036 Interop recovery-small test_18a: test... Closed
Related
is related to LU-1741 Test failure on test suite conf-sanit... Resolved
Severity: 3
Rank (Obsolete): 4135

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/0e82b422-e499-11e1-9681-52540035b04c.

The sub-test test_18a failed with the following error:

error on ioctl 0x4008669a for '/mnt/lustre/d0.recovery-small/d18/f.recovery-small.18a' (3): Invalid argument
error: setstripe: create stripe file '/mnt/lustre/d0.recovery-small/d18/f.recovery-small.18a' failed
recovery-small test_18a: @@@@@@ FAIL: test_18a failed with 3
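
The ioctl number in the log above can be broken into its fields to see which request failed. The following is a minimal sketch, assuming the client uses the generic Linux ioctl encoding (8-bit number, 8-bit type, 14-bit size, 2-bit direction); the struct, enum, and function names are hypothetical and not taken from Lustre.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths of the generic Linux ioctl encoding.
 * Assumption: the failing client used this layout (as x86 does). */
enum { NR_BITS = 8, TYPE_BITS = 8, SIZE_BITS = 14, DIR_BITS = 2 };
enum { IOC_DIR_WRITE = 1, IOC_DIR_READ = 2 };

/* Hypothetical container for the decoded fields. */
struct ioc_fields {
        unsigned dir;   /* transfer direction bits */
        unsigned size;  /* argument size in bytes */
        char     type;  /* subsystem "magic" character */
        unsigned nr;    /* command number within the subsystem */
};

/* Pack the four fields back into a 32-bit ioctl number. */
uint32_t ioc_encode(struct ioc_fields f)
{
        return ((uint32_t)f.dir << (NR_BITS + TYPE_BITS + SIZE_BITS)) |
               ((uint32_t)f.size << (NR_BITS + TYPE_BITS)) |
               ((uint32_t)(unsigned char)f.type << NR_BITS) |
               (uint32_t)f.nr;
}

/* Split a 32-bit ioctl number into its four fields. */
struct ioc_fields ioc_decode(uint32_t cmd)
{
        struct ioc_fields f;

        f.nr   = cmd & ((1u << NR_BITS) - 1);
        f.type = (char)((cmd >> NR_BITS) & ((1u << TYPE_BITS) - 1));
        f.size = (cmd >> (NR_BITS + TYPE_BITS)) & ((1u << SIZE_BITS) - 1);
        f.dir  = (cmd >> (NR_BITS + TYPE_BITS + SIZE_BITS)) &
                 ((1u << DIR_BITS) - 1);

        /* Round-trip sanity check: no bits were lost while decoding. */
        assert(ioc_encode(f) == cmd);
        return f;
}
```

Under that assumption, ioc_decode(0x4008669a) yields a write-direction request of type 'f', number 154, with an 8-byte argument, which looks consistent with the setstripe failure reported in the same log.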

Info required for matching: recovery-small 18a



 Comments   
Comment by Andreas Dilger [ 13/Aug/12 ]

This has hit 8 times on orion-head-sync and master branches since it first appeared on 2012-08-11 21:24:33:

https://maloo.whamcloud.com/test_sets/245cb502-e4a5-11e1-af05-52540035b04c
https://maloo.whamcloud.com/test_sets/0e82b422-e499-11e1-9681-52540035b04c
https://maloo.whamcloud.com/test_sets/9ebe4ef4-e48d-11e1-9681-52540035b04c
https://maloo.whamcloud.com/test_sets/2f2343e8-e46e-11e1-af05-52540035b04c
https://maloo.whamcloud.com/test_sets/9527d7f6-e452-11e1-9681-52540035b04c
https://maloo.whamcloud.com/test_sets/8f584f40-e452-11e1-af05-52540035b04c
https://maloo.whamcloud.com/test_sets/b31a44c6-e44b-11e1-9681-52540035b04c
https://maloo.whamcloud.com/test_sets/d5e5b064-e426-11e1-93f8-52540035b04c

This coincides with the change to 1GB MDT filesystem size, though I'm not quite sure how that would have caused the problem. We'll see whether this test starts passing now that the MDT size has been increased to 2GB.

Comment by Di Wang [ 14/Aug/12 ]

Not sure whether this is related to the problem, but there is a clear mistake here:

CMD: client-22vm3 /usr/sbin/mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=1073741824 --mkfsoptions=\"-E lazy_itable_init\" --mountfsoptions=errors=remount-ro,iopen_nopriv,user_xattr,acl --reformat /dev/lvm-MDS/P1

Permanent disk data:
Target: lustre-MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x65
(MDT MGS first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr,acl
Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/usr/sbin/l_getidentity

device size = 1024MB
formatting backing filesystem ldiskfs on /dev/lvm-MDS/P1

--device-size=1073741824: I guess this was intended to set MDSSIZE to 1GB rather than 1TB, though the maximum device size of /dev/lvm-MDS/P1 might be just 1GB anyway. But I am afraid some test scripts might depend on this MDSSIZE?

Chris, could you please fix this?
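
The 1GB versus 1TB discrepancy above can be checked with a little arithmetic. This is a minimal sketch, assuming (as the comment implies) that --device-size is interpreted in KB; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024ULL

/* Convert a size given in KB to bytes.
 * Assumption: this is the unit in which --device-size is read. */
uint64_t kb_to_bytes(uint64_t kb)
{
        /* Guard against 64-bit overflow in the multiplication. */
        assert(kb <= UINT64_MAX / KIB);
        return kb * KIB;
}

/* Convert a byte count to the KB value that would describe it. */
uint64_t bytes_to_kb(uint64_t bytes)
{
        return bytes / KIB;
}
```

With these helpers, kb_to_bytes(1073741824) is 2^40 bytes (1TB), while a 1GB target (2^30 bytes) corresponds to bytes_to_kb(1ULL << 30), which is 1048576.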

Comment by Sarah Liu [ 09/Jan/13 ]

Another instance: https://maloo.whamcloud.com/test_sets/48c94224-5792-11e2-8772-52540035b04c

Comment by Jinshan Xiong (Inactive) [ 31/Jul/13 ]

The following instance:

https://maloo.whamcloud.com/test_sets/b532ff50-f953-11e2-8917-52540035b04c

clearly shows that one OST was not ready while setstripe was running, so this check in lod_qos_parse_config():

        if ((v1->lmm_stripe_offset >= d->lod_desc.ld_tgt_count) &&
            (v1->lmm_stripe_offset != (typeof(v1->lmm_stripe_offset))(-1))) {
                CERROR("invalid offset: %x\n", v1->lmm_stripe_offset);
                RETURN(-EINVAL);
        }

returned -EINVAL because lmm_stripe_offset was equal to ld_tgt_count, which was 1. I didn't look further to figure out why ost2 was not up yet, but it was probably due to the previous recovery.
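
The bounds check quoted above can be exercised in isolation to see why a one-target configuration rejects offset 1. This is a minimal standalone sketch: the structs and function are hypothetical stand-ins for the Lustre ones, and the 16-bit width of the offset field is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields the check reads;
 * these are not the real Lustre structures. */
struct layout_req { uint16_t stripe_offset; };  /* models lmm_stripe_offset */
struct lod_state  { uint32_t tgt_count; };      /* models ld_tgt_count */

/* All-ones offset, i.e. (type)(-1): no specific starting OST requested. */
#define STRIPE_OFFSET_ANY ((uint16_t)-1)

/* Return 0 if the requested starting OST index is usable, -EINVAL otherwise.
 * Valid indices are 0 .. tgt_count - 1, so an offset equal to tgt_count
 * is already out of range. */
int check_stripe_offset(const struct layout_req *req,
                        const struct lod_state *lod)
{
        assert(req != NULL && lod != NULL);

        if (req->stripe_offset >= lod->tgt_count &&
            req->stripe_offset != STRIPE_OFFSET_ANY)
                return -EINVAL;
        return 0;
}
```

With tgt_count equal to 1, an offset of 1 returns -EINVAL; the same offset is accepted once tgt_count is 2, and the all-ones offset passes in either case.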

Comment by Andreas Dilger [ 29/May/17 ]

Close old ticket.

Generated at Sat Feb 10 01:19:18 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.