[LU-1209] sanity.sh subtest test_133d failed with "samedir_rename_size count error" Created: 13/Mar/12  Updated: 16/Apr/14  Resolved: 16/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1, Lustre 2.6.0
Fix Version/s: Lustre 2.3.0

Type: Bug Priority: Major
Reporter: Maloo Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: yuc2

Issue Links:
Duplicate
is duplicated by LU-1266 2.1.1<->2.2 Test failure on test suit... Resolved
Related
is related to LU-1193 test script incompatibility when runn... Resolved
Severity: 3
Rank (Obsolete): 4079

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/025d7aac-6ce3-11e1-9174-5254004bbbd3.

This subtest is reporting 14% failure over the past 100 runs, so there must be some race or unexpected result causing the failure.

The sub-test test_133d failed with the following error:

samedir_rename_size count error

Info required for matching: sanity 133d



 Comments   
Comment by Andreas Dilger [ 13/Mar/12 ]

Di, can you please take a look into why this is failing. It would also be good to search in Maloo (Results->Search->Subtests) for other cases of this test 133d failure, mark them with this bug, and add the test URLs here for future reference.

Comment by Andreas Dilger [ 13/Mar/12 ]

It seems this may relate to a test script interop issue. I noticed that all of the test failures are taking about twice as long as the passes, but this may also relate to improvements from the 2.2 pdirops.

Comment by Di Wang [ 14/Mar/12 ]

I checked the Maloo results; the other failures seem to be related to LU-1193. But for this one, the client and server are running the same version. Unfortunately, the log is not enough for me to figure out the reason. I will add more debug info to the test.

I also made a patch that checks the Lustre version to decide whether the server is capable of running some tests, as you suggested in LU-1193. Please check: http://review.whamcloud.com/#change,2309

Comment by Oleg Drokin [ 27/Apr/12 ]

Another occurrence, with logs available:
https://maloo.whamcloud.com/test_sets/c00eebb6-9039-11e1-98a1-525400d2bfa6

It seems that another test also failed there and that was the root error, but I cannot tell whether the original report was about the same issue because it has no logs.

In any case it might be related

Comment by Di Wang [ 28/Apr/12 ]

Oleg, could you please land http://review.whamcloud.com/#change,2309 ? So I can have more info here. Thanks.

Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,client,el5,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,client,el6,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,server,el5,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,server,el5,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,server,el6,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,server,el5,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,client,el6,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,client,el5,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,client,el5,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,server,el6,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,client,el6,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Build Master (Inactive) [ 29/Apr/12 ]

Integrated in lustre-master » i686,server,el6,ofa #497
LU-1209 tests: add debug info to 133d (Revision 2ad3935e89aac78ce73f3bcdbecd8286cfa52970)

Result = SUCCESS
Oleg Drokin : 2ad3935e89aac78ce73f3bcdbecd8286cfa52970
Files :

  • lustre/tests/sanity.sh
Comment by Ian Colle (Inactive) [ 03/Jul/12 ]

Happened again: https://maloo.whamcloud.com/test_sets/63204c18-c4d4-11e1-af06-52540035b04c

Comment by Andreas Dilger [ 07/Jul/12 ]

Again: https://maloo.whamcloud.com/test_sets/2d51efd8-c7fe-11e1-ba35-52540035b04c

It is currently reporting a 29% failure rate in Maloo.

Comment by Andreas Dilger [ 07/Jul/12 ]
== sanity test 133d: Verifying rename_stats ====== 19:02:48 (1341626568)
mdt.lustre-MDT0000.rename_stats
mdt.lustre-MDT0000.rename_stats=clear
total: 512 creates in 0.91 seconds: 565.37 creates/second
source rename dir size: 32K
target rename dir size: 4K
mdt.lustre-MDT0000.rename_stats=
rename_stats:
- snapshot_time:  1341626570.213120
- same_dir       
      64KB: { sample:   1, pct: 100, cum_pct: 100 }
/usr/lib64/lustre/tests/sanity.sh: line 7552: [: : integer expression expected
 sanity test_133d: @@@@@@ FAIL: samedir_rename_size error
Comment by Di Wang [ 07/Jul/12 ]

Hmm, the dir size is 32K, but somehow the rename was recorded under the 64KB bucket. It seems the dir size is not consistent during the whole process.
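
For reference, the "[: : integer expression expected" line in the log above is what bash prints when one operand of a numeric test is an empty string. That would fit a lookup for a bucket that is not present in rename_stats. Below is a minimal sketch of such a lookup with a numeric default. The helper name bucket_sample_count is hypothetical and this is not the code in sanity.sh; the sample text follows the layout of the failing run.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not taken from sanity.sh): read
# rename_stats text on stdin and print the sample count recorded for one
# size bucket such as "64KB". Prints 0 when the bucket is absent.
bucket_sample_count() {
    local label=$1
    local count
    count=$(awk -v want="${label}:" \
        '$1 == want { sub(/,/, "", $4); print $4; exit }')
    echo "${count:-0}"
}

# Sample text in the layout shown in the failing run above.
stats='rename_stats:
- snapshot_time:  1341626570.213120
- same_dir
      64KB: { sample:   1, pct: 100, cum_pct: 100 }'

looked_for=$(printf '%s\n' "$stats" | bucket_sample_count 32KB)
recorded=$(printf '%s\n' "$stats" | bucket_sample_count 64KB)

# Without the ":-0" default, looked_for would be empty and the numeric
# test below would fail with "integer expression expected".
[ "$looked_for" -eq 1 ] || echo "32KB bucket has $looked_for samples"
[ "$recorded" -eq 1 ] && echo "64KB bucket has $recorded samples"
```

With a default of 0 the comparison stays numeric, so a missing bucket shows up as a plain count mismatch rather than a shell error.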

Comment by Di Wang [ 07/Jul/12 ]

Ah, we should get the dir size after the rename, since the rename will change the dir size. Here is the fix: http://review.whamcloud.com/#change,3298
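
A rough sketch of the ordering described here, run against a local temporary directory rather than a Lustre mount. The helper dir_size_bytes and the file names are hypothetical, and the actual change is the one in the review linked above. It assumes GNU stat.

```shell
#!/bin/bash
# Hypothetical sketch of the ordering only; the real test runs on a Lustre
# client and reads rename_stats from the MDT.
dir_size_bytes() {
    stat -c %s "$1"    # GNU stat: size of the directory itself, in bytes
}

workdir=$(mktemp -d)
mkdir "$workdir/src"
touch "$workdir/src/file0"

# Do the same-directory rename first ...
mv "$workdir/src/file0" "$workdir/src/file1"

# ... and only then sample the directory size, so the value used to pick
# the expected bucket reflects any growth caused by the rename itself.
size_after=$(dir_size_bytes "$workdir/src")
echo "directory size after rename: $size_after bytes"

rm -rf "$workdir"
```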

Comment by Jodi Levi (Inactive) [ 27/Sep/12 ]

Please reopen ticket if additional work is needed.

Comment by Sarah Liu [ 08/Apr/13 ]

Hit this issue again when upgrading from 1.8.9 to 2.4 and then adding one new MDT:

https://maloo.whamcloud.com/test_sets/a02cc9b2-9ec5-11e2-975f-52540035b04c

Comment by Li Wei (Inactive) [ 10/Apr/13 ]

https://maloo.whamcloud.com/test_sets/ecc68362-a14f-11e2-b1c3-52540035b04c

Comment by Jian Yu [ 13/Dec/13 ]

Here is the fix http://review.whamcloud.com/#change,3298

The above patch exists on Lustre b2_4 branch build #67. However, the test still failed:
https://maloo.whamcloud.com/test_sets/596be6ba-6351-11e3-8c76-52540035b04c

Comment by Nathaniel Clark [ 03/Mar/14 ]

Hit issue on review-zfs on master (pre 2.6):
https://maloo.whamcloud.com/test_sets/d1c63508-a095-11e3-aa36-52540035b04c

Comment by James Nunez (Inactive) [ 15/Apr/14 ]

Hit this on review-ldiskfs https://maloo.whamcloud.com/test_sets/ecfe391a-c41c-11e3-a793-52540035b04c

Comment by Bob Glossman (Inactive) [ 15/Apr/14 ]

another, in review_dne_part-1
https://maloo.whamcloud.com/test_sets/5c1fa7e2-c46e-11e3-823c-52540035b04c

Comment by Andreas Dilger [ 16/Apr/14 ]

The failures recently reported against this bug are actually caused by http://review.whamcloud.com/7803 landing (incorrectly allowed by TEI-1508). Oleg has submitted http://review.whamcloud.com/9978 to fix the regression. I've been marking all related failures with LU-3963.

Generated at Sat Feb 10 01:14:33 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.