[LU-5118] Failure on test suite sanity test_300g: expect 2 get 1 for /mnt/lustre/d300g.sanity/striped_dir/test2 Created: 29/May/14  Updated: 13/Oct/21  Resolved: 13/Oct/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Low Priority
Votes: 0
Labels: dne
Environment:

server and client: lustre-master build #2052, DNE mode


Severity: 3
Rank (Obsolete): 14115

Description

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/36242834-ddf9-11e3-9f85-52540035b04c.

The sub-test test_300g failed with the following error:

expect 2 get 1 for /mnt/lustre/d300g.sanity/striped_dir/test2

== sanity test 300g: check default striped directory for striped directory == 12:49:57 (1400269797)
 sanity test_300g: @@@@@@ FAIL: expect 2 get 1 for /mnt/lustre/d300g.sanity/striped_dir/test2 


Comments
Comment by James Nunez (Inactive) [ 17/Dec/14 ]

I experienced the same issue with two MDSs (one MDT each) on lustre-master build #2771. Results are at https://testing.hpdd.intel.com/test_sets/f326bd9e-8618-11e4-ac52-5254006e85c2

Comment by James Nunez (Inactive) [ 25/Jan/15 ]

I experienced this same error with lustre-master 2.6.92. Results are at https://testing.hpdd.intel.com/test_sets/37e63f92-9f0d-11e4-91b3-5254006e85c2

Comment by James Nunez (Inactive) [ 10/Feb/15 ]

I've hit this error again with lustre-master tag 2.6.93. Results are at https://testing.hpdd.intel.com/test_sessions/fff27cc4-addd-11e4-a0b6-5254006e85c2

Comment by Li Xi (Inactive) [ 24/May/16 ]

I can reproduce this problem very easily. Something is broken when the default stripe offset is -1.

Here is how to reproduce it:

[root@atest-vm134 tests]# MDSCOUNT=2 OSTCOUNT=2 OSTSIZE=2097152 \
        mgs_HOST=atest-vm134 mds1_HOST=atest-vm134 mds2_HOST=atest-vm135 \
        ost1_HOST=atest-vm136 ost2_HOST=atest-vm137 \
        MGSDEV=/dev/sdb MDSDEV1=/dev/sdb MDSDEV2=/dev/sdb \
        OSTDEV1=/dev/sdb OSTDEV2=/dev/sdb \
        CLIENT1=atest-vm138 RCLIENTS=atest-vm139 \
        PDSH="pdsh -R ssh -S -w" NAME=ncli SHARED_DIRECTORY=/mnt/shared \
        /usr/lib64/lustre/tests/auster -r -v -D /tmp/test_logs/log sanity.sh --only 300h
Started at Tue May 24 17:56:43 JST 2016
Lustre is not mounted, trying to do setup ... 
Stopping clients: atest-vm134,atest-vm138,atest-vm139 /mnt/lustre (opts:)
Stopping clients: atest-vm134,atest-vm138,atest-vm139 /mnt/lustre2 (opts:)
Loading modules from /usr/lib64/lustre
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
debug=vfstrace rpctrace dlmtrace neterror ha config                   ioctl super lfsck
subsystem_debug=all -lnet -lnd -pinger
quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /dev/sdb
Format mds2: /dev/sdb
Format ost1: /dev/sdb
Format ost2: /dev/sdb
Checking servers environments
Checking clients atest-vm134,atest-vm138,atest-vm139 environments
Loading modules from /usr/lib64/lustre
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=vfstrace rpctrace dlmtrace neterror ha config                   ioctl super lfsck
subsystem_debug=all -lnet -lnd -pinger
Setup mgs, mdt, osts
Starting mds1:   /dev/sdb /mnt/mds1
Started lustre-MDT0000
Starting mds2:   /dev/sdb /mnt/mds2
Started lustre-MDT0001
Starting ost1:   /dev/sdb /mnt/ost1
Started lustre-OST0000
Starting ost2:   /dev/sdb /mnt/ost2
Started lustre-OST0001
Starting client: atest-vm134:  -o user_xattr,flock atest-vm134@tcp:/lustre /mnt/lustre
Starting client atest-vm134,atest-vm138,atest-vm139:  -o user_xattr,flock atest-vm134@tcp:/lustre /mnt/lustre
Started clients atest-vm134,atest-vm138,atest-vm139: 
atest-vm134@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
atest-vm134@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
atest-vm134@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 2s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
running: sanity.sh ONLY=300h 
run_suite sanity /usr/lib64/lustre/tests/sanity.sh
-----============= acceptance-small: sanity ============----- Tue May 24 17:57:21 JST 2016
Running: bash /usr/lib64/lustre/tests/sanity.sh
atest-vm139: Checking config lustre mounted on /mnt/lustre
atest-vm138: Checking config lustre mounted on /mnt/lustre
atest-vm134: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients atest-vm134,atest-vm138,atest-vm139 environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
osd-ldiskfs.track_declares_assert=1
osd-ldiskfs.track_declares_assert=1
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
 [touch] [/mnt/lustre/d0_runas_test/f25253]
excepting tests: 132 76 42a 42b 42c 42d 45 51d 68b
skipping tests SLOW=no: 24o 24D 27m 64b 68 71 77f 78 115 124b 230d 401
preparing for tests involving mounts
mke2fs 1.42.13.wc3 (28-Aug-2015)

debug=-1
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4
resend_count is set to 4 4


== sanity test 300h: check default striped directory for striped directory == 17:57:25 (1464080245)
checking striped_dir 2 1
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test1
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x1:0x0]
     0           [0x280000402:0x1:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test2
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x2:0x0]
     0           [0x280000402:0x2:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test3
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x3:0x0]
     0           [0x280000402:0x3:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test4
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x4:0x0]
     0           [0x280000402:0x4:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
checking striped_dir 1 0
SSSSSSSSSSSSSSSSSSSSSS
1
/mnt/lustre/d300h.sanity/striped_dir/test1
lmv_stripe_count: 1 lmv_stripe_offset: 0
mdtidx           FID[seq:oid:ver]
     0           [0x280000401:0x2:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
1
/mnt/lustre/d300h.sanity/striped_dir/test2
lmv_stripe_count: 1 lmv_stripe_offset: 0
mdtidx           FID[seq:oid:ver]
     0           [0x280000401:0x3:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
1
/mnt/lustre/d300h.sanity/striped_dir/test3
lmv_stripe_count: 1 lmv_stripe_offset: 0
mdtidx           FID[seq:oid:ver]
     0           [0x280000401:0x4:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
1
/mnt/lustre/d300h.sanity/striped_dir/test4
lmv_stripe_count: 1 lmv_stripe_offset: 0
mdtidx           FID[seq:oid:ver]
     0           [0x280000401:0x5:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
checking striped_dir 2 1
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test1
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x5:0x0]
     0           [0x280000402:0x5:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test2
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x6:0x0]
     0           [0x280000402:0x6:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test3
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x7:0x0]
     0           [0x280000402:0x7:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test4
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x8:0x0]
     0           [0x280000402:0x8:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
checking striped_dir 2 -1
SSSSSSSSSSSSSSSSSSSSSS
2
/mnt/lustre/d300h.sanity/striped_dir/test1
lmv_stripe_count: 2 lmv_stripe_offset: 1
mdtidx           FID[seq:oid:ver]
     1           [0x2c0000402:0x9:0x0]
     0           [0x280000402:0x9:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
SSSSSSSSSSSSSSSSSSSSSS
1
/mnt/lustre/d300h.sanity/striped_dir/test2
lmv_stripe_count: 1 lmv_stripe_offset: 0
mdtidx           FID[seq:oid:ver]
     0           [0x280000401:0x6:0x0]
EEEEEEEEEEEEEEEEEEEEEEEEEE
 sanity test_300h: @@@@@@ FAIL: stripe count 2 != 1 for /mnt/lustre/d300h.sanity/striped_dir/test2 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4706:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:4737:error()
  = /usr/lib64/lustre/tests/sanity.sh:13499:test_300_check_default_striped_dir()
  = /usr/lib64/lustre/tests/sanity.sh:13561:test_300h()
  = /usr/lib64/lustre/tests/test-framework.sh:4984:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:5021:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4838:run_test()
  = /usr/lib64/lustre/tests/sanity.sh:13574:main()
Dumping lctl log to /tmp/test_logs/log/sanity.test_300h.*.1464080247.log
FAIL 300h (3s)
debug=+snapshot
debug=+snapshot
debug=-snapshot
debug=-snapshot
== sanity test complete, duration 8 sec == 17:57:29 (1464080249)

I added the following lines to sanity.sh:

        mkdir $DIR/$tdir/$dirname/{test1,test2,test3,test4} ||
                                               error "create dirs failed"
        for dir in $(find $DIR/$tdir/$dirname/*); do
                echo SSSSSSSSSSSSSSSSSSSSSS
                $LFS getdirstripe -c $dir
                $LFS getdirstripe $dir
                echo EEEEEEEEEEEEEEEEEEEEEEEEEE
...