[LU-233] "Russian doll" test log tarballs Created: 22/Apr/11 Updated: 16/Aug/16 Resolved: 16/Aug/16 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | Lustre 2.2.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Li Wei (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 4763 |
| Description |
|
Test Framework uses the following code to collect logs when tests fail:

gather_logs () {
    # ...
    local archive=$LOGDIR/${TESTSUITE}-$ts.tar.bz2
    tar -jcf $archive $LOGDIR/*$ts* $LOGDIR/*${TESTSUITE}*
    # ...
}
Note that the names of the archives begin with $TESTSUITE, so they match the $LOGDIR/*${TESTSUITE}* pattern themselves. This means that if there are multiple failures in the same suite, the archive for each failure will contain all the prior archives. In the real world, I've encountered a test log directory over 6 GB in size, thanks to those Russian doll tarballs. The test results couldn't be uploaded to Maloo, probably because of the huge size. |
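One way to avoid the nesting is to filter earlier archives out of the file list before calling tar. The following is only a minimal sketch under stated assumptions, not the change that eventually landed for this ticket: the function name gather_logs_sketch and its three positional parameters are hypothetical stand-ins for the framework's $LOGDIR, $TESTSUITE and $ts variables.

```shell
#!/bin/bash
# Hypothetical sketch: archive the matching logs but skip earlier tarballs.
# Arguments (assumed for illustration): log directory, test suite name, timestamp.
gather_logs_sketch() {
    local logdir=$1 testsuite=$2 ts=$3
    local archive=$logdir/${testsuite}-$ts.tar.bz2
    local files=() f

    for f in "$logdir"/*"$ts"* "$logdir"/*"$testsuite"*; do
        [ -e "$f" ] || continue         # an unmatched glob stays literal; skip it
        case $f in
            *.tar.bz2) continue ;;      # never put an earlier archive inside a new one
        esac
        files+=("$f")
    done

    # Nothing to archive: do not create an empty tarball.
    [ ${#files[@]} -gt 0 ] || return 0

    tar -jcf "$archive" "${files[@]}" 2>/dev/null
}
```

As in the original snippet, a file that matches both patterns is passed to tar twice; the sketch only addresses the nesting of archives.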
| Comments |
| Comment by Jian Yu [ 25/Apr/11 ] |
|
I also hit the above issue while running sanity-quota.sh on b1_8 and encountering multiple failures in test 1:

There are 30 cycles in test 1 when SLOW=yes. From the above report, we can see that the test failed in cycle 1 but did not break out of the loop, which caused the following cycles to fail as well. For each failed cycle, gather_logs() was run to make a tarball of all of the previous logs, which caused tar to hang after cycle 16. Here is a list of the tarballs in the log dir:

[root@client-13 070850]# ls -hlt *.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 64G Apr 21 2011 sanity-quota-1303425009.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 32G Apr 21 15:30 sanity-quota-1303416508.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 16G Apr 21 13:08 sanity-quota-1303412214.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 8.0G Apr 21 11:56 sanity-quota-1303410060.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 4.0G Apr 21 11:20 sanity-quota-1303408971.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 2.0G Apr 21 11:02 sanity-quota-1303408402.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 1013M Apr 21 10:52 sanity-quota-1303408082.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 506M Apr 21 10:47 sanity-quota-1303407908.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 252M Apr 21 10:44 sanity-quota-1303407784.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 126M Apr 21 10:42 sanity-quota-1303407719.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 63M Apr 21 10:41 sanity-quota-1303407626.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 32M Apr 21 10:40 sanity-quota-1303407595.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 16M Apr 21 10:39 sanity-quota-1303407568.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 7.8M Apr 21 10:39 sanity-quota-1303407546.tar.bz2
-rw-r--r-- 1 nfsnobody nfsnobody 5.1M Apr 21 10:38 sanity-quota-1303407511.tar.bz2
|
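The roughly doubling sizes in the listing are what one would expect if each new archive holds the fresh logs plus every earlier archive, since already-compressed tarballs barely shrink when compressed again. The following is a rough model only, not a measurement: the function name archive_sizes and the fixed 5 MB of fresh logs per cycle are assumptions made for illustration.

```shell
#!/bin/bash
# Simplified model: every cycle archives a fixed amount of fresh logs
# plus all archives produced by earlier cycles (assumed incompressible).
archive_sizes() {
    local base_mb=$1 cycles=$2
    local total=0 size n
    for ((n = 1; n <= cycles; n++)); do
        size=$((base_mb + total))   # fresh logs + all prior archives
        total=$((total + size))     # running total of archives on disk
        echo "cycle $n: ${size} MB"
    done
}

archive_sizes 5 15
```

Under this model the archive size doubles with every failed cycle, so 15 failures starting from a few megabytes already reach tens of gigabytes, the same order of magnitude as the listing above.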
| Comment by Chris Gearing (Inactive) [ 25/Apr/11 ] |
|
If we are just dealing with the case of yaml and Maloo, then I do not think we need these files at all. Does anybody know the purpose of tar'ing the log files like this? |
| Comment by Jian Yu [ 25/Apr/11 ] |
|
Here is the result after searching the CVS history:

grev 2009/09/15 07:46:24 GMT
Modified: tests test-framework.sh recovery-mds-scale.sh
tests/cfg local.sh
Log:
b=20237
i=Manoj.Joseph
i=Robert.Read
gather and archive the logs

The original purpose of gather_logs() was to collect and archive the logs for recovery-*-scale tests and it was only used in the recovery-*-scale.sh scripts. Bug 20237 contains the detailed info. The landing of auto-vetting test-framework modified error_noexit() to use gather_logs, which made it widely used among the acc-sm tests.

On master branch:

commit e4cf956f93a4384d19ea73e601a6651710703492
Author: Manoj Joseph <manoj.joseph@sun.com>
Date: Wed Jan 20 02:06:26 2010 -0700
b=20057 Autovetting and test-framework enhancements
Test-framework and script changes to support autovetting and buffalo V2
i=rread
i=grev

On b1_8 branch:

commit 30d9df1a69d325c416ed7027ddd34464f097396f
Author: root <root@murdoch.sodor>
Date: Tue Dec 21 14:00:06 2010 +0000
LU-123 Port yaml and auster to b1_8
Changes to add the yaml data logging from the 2.0 branch in the 1.8 branch, this
patch was created by applying the 2.0 yml patch to 1.8 and then resolving the issues.
<~snip~>
Change-Id: I602a3534f17544d857aa0a9f9f82d2873fb73a39
Signed-off-by: Chris Gearing <chris@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/421
|
| Comment by Jian Yu [ 25/Apr/11 ] |
|
IMHO, for non-recovery-*-scale tests, the code for archiving the logs is not needed. For recovery-*-scale tests, since they produce specific logs (*run_{dd,tar,dbench,iozone}.sh*), we need to enhance Maloo to support importing those logs first, and then remove the code for archiving the logs. |
| Comment by Robert Read (Inactive) [ 25/Apr/11 ] |
|
I'd like to see the recovery-*-scale tests rewritten anyway. I don't trust them to produce reliable results. |
| Comment by Chris Gearing (Inactive) [ 26/Apr/11 ] |
|
What in particular do you not like about the recovery-*-scale tests? |
| Comment by Chris Gearing (Inactive) [ 04/Oct/11 ] |
|
|
| Comment by Chris Gearing (Inactive) [ 04/Oct/11 ] |
|
Change made to remove Russian dolls whilst keeping behaviour of recovery-*-scale tests |
| Comment by Build Master (Inactive) [ 19/Jan/12 ] |
|
Integrated in Result = SUCCESS
|
| Comment by James A Simmons [ 16/Aug/16 ] |
|
Looks like everything landed here long ago. |