[LU-233] "Russian doll" test log tarballs Created: 22/Apr/11  Updated: 16/Aug/16  Resolved: 16/Aug/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: Lustre 2.2.0

Type: Bug Priority: Minor
Reporter: Li Wei (Inactive) Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 4763

 Description   

Test Framework uses the following code to collect logs when tests fail:

gather_logs () {
    # ...

    local archive=$LOGDIR/${TESTSUITE}-$ts.tar.bz2
    tar -jcf $archive $LOGDIR/*$ts* $LOGDIR/*${TESTSUITE}*

    # ...
}

Note that the archive names begin with $TESTSUITE, so the second glob, $LOGDIR/*${TESTSUITE}*, also matches every archive created by earlier failures. If the same suite fails more than once, each new archive therefore contains all of the prior archives.

In practice, I have encountered a test log directory of more than 6 GB, thanks to these Russian doll tarballs. The test results could not be uploaded to Maloo, probably because of the sheer size.
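
The nesting can be reproduced in a scratch directory. The following is only a minimal sketch: the function name demo_nesting, the suite name sanity-demo, the timestamps and the log file names are all invented for illustration, and plain tar is used instead of tar -j so that bzip2 is not required. The two globs are the same shape as the ones in gather_logs().

```shell
# Hypothetical reproduction: suite name, timestamps and log names are
# invented, and plain tar (no -j) is used so bzip2 is not required.
demo_nesting() {
    local logdir suite=sanity-demo ts archive nested
    logdir=$(mktemp -d) || return 1
    for ts in 1001 1002 1003; do
        # One fresh log per simulated failure.
        echo "failure at $ts" > "$logdir/$suite.$ts.log"
        archive=$suite-$ts.tar
        # Same two globs as gather_logs(); the second one also matches
        # every archive left behind by earlier failures.
        (cd "$logdir" && tar -cf "$archive" *"$ts"* *"$suite"*)
        # Count top-level members that are themselves archives.
        nested=$(tar -tf "$logdir/$archive" | grep -c '\.tar$')
        echo "$archive: $nested nested archive(s)"
    done
    rm -rf "$logdir"
}
```

Calling demo_nesting reports 0, 1 and 2 nested archives for the three simulated failures, and each nested archive in turn carries its own predecessors.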



 Comments   
Comment by Jian Yu [ 25/Apr/11 ]

I also hit the above issue while running sanity-quota.sh on b1_8 and encountering multiple failures in test 1:
https://maloo.whamcloud.com/test_sets/929da4aa-6c91-11e0-b32b-52540025f9af

Test 1 runs 30 cycles when SLOW=yes. The report above shows that the test failed in cycle 1 but did not break out of the loop, which caused the following cycles to fail as well. For each failed cycle, gather_logs() was run and made a tarball of all the previous logs, which caused tar to hang after cycle 16.

Here is a list of the tarballs in the log dir:

[root@client-13 070850]# ls -hlt *.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   64G Apr 21  2011 sanity-quota-1303425009.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   32G Apr 21 15:30 sanity-quota-1303416508.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   16G Apr 21 13:08 sanity-quota-1303412214.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  8.0G Apr 21 11:56 sanity-quota-1303410060.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  4.0G Apr 21 11:20 sanity-quota-1303408971.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  2.0G Apr 21 11:02 sanity-quota-1303408402.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 1013M Apr 21 10:52 sanity-quota-1303408082.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  506M Apr 21 10:47 sanity-quota-1303407908.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  252M Apr 21 10:44 sanity-quota-1303407784.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  126M Apr 21 10:42 sanity-quota-1303407719.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   63M Apr 21 10:41 sanity-quota-1303407626.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   32M Apr 21 10:40 sanity-quota-1303407595.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody   16M Apr 21 10:39 sanity-quota-1303407568.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  7.8M Apr 21 10:39 sanity-quota-1303407546.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody  5.1M Apr 21 10:38 sanity-quota-1303407511.tar.bz2
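
The near doubling from one archive to the next is what one would expect if each archive holds all of its predecessors and already-compressed data barely shrinks again. As a rough model only (it assumes a constant amount of fresh logs per failure and ignores the raw logs that are also re-included), archive n has size new + (sum of all earlier archives), which works out to twice the size of archive n-1. The function name archive_sizes and its parameters are made up for this sketch:

```shell
# Simplified model, not a measurement: each failure adds new_mb of fresh
# logs, and every archive also swallows all earlier archives unshrunk.
archive_sizes() {
    local new_mb="$1" count="$2" total=0 size i=1
    while [ "$i" -le "$count" ]; do
        size=$((new_mb + total))   # fresh logs plus all earlier archives
        total=$((total + size))    # running sum of archive sizes on disk
        echo "archive $i: $size MB"
        i=$((i + 1))
    done
}
```

For example, archive_sizes 4 5 prints sizes of 4, 8, 16, 32 and 64 MB, the same pattern as the listing above.
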
Comment by Chris Gearing (Inactive) [ 25/Apr/11 ]

If we are just dealing with the case of yaml and Maloo, then I do not think we need these files at all. Does anybody know the purpose of tar'ing the log files like this?

Comment by Jian Yu [ 25/Apr/11 ]

Here is the result of searching the CVS history:

  grev        2009/09/15 07:46:24 GMT
  
  Modified:    tests    test-framework.sh recovery-mds-scale.sh
               tests/cfg local.sh
  Log:
  b=20237
  i=Manoj.Joseph
  i=Robert.Read
  gather and archive the logs

The original purpose of gather_logs() was to collect and archive the logs for the recovery-*-scale tests, and it was used only in the recovery-*-scale.sh scripts. Bug 20237 contains the details.

The landing of the autovetting test-framework changes modified error_noexit() to call gather_logs(), which made it widely used among the acc-sm tests.

On master branch:

commit e4cf956f93a4384d19ea73e601a6651710703492
Author: Manoj Joseph <manoj.joseph@sun.com>
Date:   Wed Jan 20 02:06:26 2010 -0700

    b=20057 Autovetting and test-framework enhancements
    
    Test-framework and script changes to support autovetting and buffalo V2
    
    i=rread
    i=grev

On b1_8 branch:

commit 30d9df1a69d325c416ed7027ddd34464f097396f
Author: root <root@murdoch.sodor>
Date:   Tue Dec 21 14:00:06 2010 +0000

    LU-123 Port yaml and auster to b1_8
    
    Changes to add the yaml data logging from the 2.0 branch in the 1.8 branch, this
    patch was created by applying the 2.0 yml patch to 1.8 and then resolving the issues.
    <~snip~>
    Change-Id: I602a3534f17544d857aa0a9f9f82d2873fb73a39
    Signed-off-by: Chris Gearing <chris@whamcloud.com>
    Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
    Reviewed-on: http://review.whamcloud.com/421
Comment by Jian Yu [ 25/Apr/11 ]

IMHO, for non-recovery-*-scale tests, the code for archiving the logs is not needed. For recovery-*-scale tests, since they produce specific logs (*run_{dd,tar,dbench,iozone}.sh*), we first need to enhance Maloo to support importing those logs, and then remove the code for archiving the logs.

Comment by Robert Read (Inactive) [ 25/Apr/11 ]

I'd like to see the recovery-*-scale tests rewritten anyway. I don't trust them to produce reliable results.

Comment by Chris Gearing (Inactive) [ 26/Apr/11 ]

What in particular do you not like about the recovery-*-scale tests?

Comment by Chris Gearing (Inactive) [ 04/Oct/11 ]

LU-734 can track the changes to recovery.

Comment by Chris Gearing (Inactive) [ 04/Oct/11 ]

Change made to remove the Russian dolls whilst keeping the behaviour of the recovery-*-scale tests.
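
For illustration only, and not the patch that landed: one way a script that still wants a tarball could avoid nesting is to skip anything that is already an archive when picking files. The helper name gather_logs_flat and its three arguments are hypothetical, and plain tar is used here so the sketch does not depend on bzip2.

```shell
# Hypothetical helper, not the landed change: archive matching logs for
# one failure while skipping archives produced by earlier failures.
gather_logs_flat() {
    local logdir="$1" suite="$2" ts="$3" f name
    set --                      # reuse "$@" as the list of files to archive
    for f in "$logdir"/*; do
        name=${f##*/}
        case $name in
            *.tar|*.tar.*) continue ;;   # never re-archive an archive
            *"$ts"*|*"$suite"*) [ -f "$f" ] && set -- "$@" "$name" ;;
        esac
    done
    [ "$#" -gt 0 ] || return 0  # nothing to archive
    tar -cf "$logdir/$suite-$ts.tar" -C "$logdir" "$@"
}
```

With this filter, repeated failures in the same suite produce archives that contain only log files, so the size grows with the amount of log data rather than doubling per failure.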

Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,server,el5,ofa #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-mds-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,server,el6,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/test-framework.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-random-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,client,el5,ofa #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-double-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,client,el6,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-mds-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,server,el5,ofa #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/test-framework.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,server,el5,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-mds-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,client,el5,inkernel #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/test-framework.sh
  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-random-scale.sh
Comment by Build Master (Inactive) [ 19/Jan/12 ]

Integrated in lustre-master » i686,client,el5,ofa #431
LU-233 test: Remove taring of log files in gather logs. (Revision 383acceb3a045098ea4b93ed07633f701b04f4fe)

Result = SUCCESS
Oleg Drokin : 383acceb3a045098ea4b93ed07633f701b04f4fe
Files :

  • lustre/tests/recovery-mds-scale.sh
  • lustre/tests/recovery-random-scale.sh
  • lustre/tests/recovery-double-scale.sh
  • lustre/tests/test-framework.sh
Comment by James A Simmons [ 16/Aug/16 ]

Looks like everything landed here long ago.

Generated at Sat Feb 10 01:05:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.