[LU-3809] replay-single test_58c does not clean up after failure Created: 21/Aug/13  Updated: 05/Sep/13  Resolved: 05/Sep/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.5.0

Type: Bug Priority: Critical
Reporter: James Nunez (Inactive) Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: zfs

Issue Links:
Related
is related to LU-3787 replay-single test_90: wrong stripe: ... Resolved
Severity: 3
Rank (Obsolete): 9841

 Description   

This issue relates to the following test suite run: https://maloo.whamcloud.com/sub_tests/67f742f0-097f-11e3-901c-52540035b04c
The sub-test test_90 failed with the following error:
wrong stripe: f0, uuid: lustre-OST0000_UUID lustre-OST0000_UUID

From the client test log, we see

== replay-single test 90: lfs find identifies the missing striped file segments == 14:57:00 (1376949420)
Create the files
err: find: filename|dirname must either precede options or follow options
find files matching given attributes recursively in directory tree.
usage: find <directory|filename> ...
     [[!] --atime|-A [+-]N] [[!] --ctime|-C [+-]N]
     [[!] --mtime|-M [+-]N] [[!] --mdt|-m <uuid|index,...>]
     [--maxdepth|-D N] [[!] --name|-n <pattern>]
     [[!] --ost|-O <uuid|index,...>] [--print|-p] [--print0|-P]
     [[!] --size|-s [+-]N[bkMGTPE]]
     [[!] --stripe-count|-c [+-]<stripes>]
     [[!] --stripe-index|-i <index,...>]
     [[!] --stripe-size|-S [+-]N[kMGT]] [[!] --type|-t <filetype>]
     [[!] --gid|-g|--group|-G <gid>|<gname>]
     [[!] --uid|-u|--user|-U <uid>|<uname>] [[!] --pool <pool>]
     [[!] --layout|-L released,raid0]
	 !: used before an option indicates 'NOT' requested attribute
	 -: used before a value indicates 'AT MOST' requested value
	 +: used before a value indicates 'AT LEAST' requested value

error opening /mnt/lustre/d0.replay-single/d90/file: No such file or directory (2)
llapi_semantic_traverse: Failed to open '/mnt/lustre/d0.replay-single/d90/file': No such file or directory (2)
error: getstripe failed for /mnt/lustre/d0.replay-single/d90/file.
 replay-single test_90: @@@@@@ FAIL: wrong stripe: f0, uuid: lustre-OST0000_UUID lustre-OST0000_UUID 

There are two problems here. The first is that the variable file is missing the “$” in front of it, which causes getstripe to fail. This is taken care of in LU-3787.
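
The first problem can be illustrated with a minimal shell sketch. The directory is the one from the log above; the file name f0 is an assumption taken from the failure message, and the variable names are illustrative rather than copied from the test script.

```shell
dir=/mnt/lustre/d0.replay-single/d90
file=f0   # assumed name, taken from "wrong stripe: f0"

# Without the "$", the shell passes the literal word "file".
wrong_path="$dir/file"
# With the "$", the variable is expanded to the intended name.
right_path="$dir/$file"

echo "$wrong_path"
echo "$right_path"
```

The first path is the one getstripe could not open in the log; the second is the path the test presumably meant to check.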

The reason for opening this ticket is that the UUID for the OST is returned from lfs_osts as “lustre-OST0000_UUID lustre-OST0000_UUID”. The UUID should be “lustre-OST0000_UUID”.

Looking back at the last 12 occurrences of this error, all of them are with ZFS.



 Comments   
Comment by Nathaniel Clark [ 21/Aug/13 ]

This is due to test_58c failing without cleaning up after itself. Because 58c mounts the fs a second time, there will be two UUIDs provided by ostuuid_from_index. There are two options for fixing this issue: have 58c trap failures and umount $MOUNT2, and/or call ostuuid_from_index with the directory as the second argument and thus get only one UUID.
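
A minimal sketch of the first option follows, assuming nothing about the actual patch. The helpers mount_second and umount_second are hypothetical stand-ins that only record state, so the sketch runs without a Lustre filesystem; the real test would call the test framework's own mount and umount helpers on $MOUNT2.

```shell
#!/bin/bash
MOUNT2=${MOUNT2:-/mnt/lustre2}   # assumed default, for illustration only
SECOND_MOUNTED=0

# Hypothetical stand-ins for the real mount/umount of $MOUNT2.
mount_second()  { SECOND_MOUNTED=1; }
umount_second() { SECOND_MOUNTED=0; }

# Run a test body with the second mount in place, and always unmount
# afterwards, whether the body passed or failed.
run_with_cleanup() {
	local rc=0
	mount_second "$MOUNT2"
	# Record the failure status instead of returning early.
	"$@" || rc=$?
	umount_second "$MOUNT2"
	return $rc
}
```

Calling run_with_cleanup with a failing body still leaves SECOND_MOUNTED at 0, so a later test that looks up OST UUIDs would see only one mount.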

Comment by Andreas Dilger [ 21/Aug/13 ]

The LU-3787 ticket can be related to improving the error message for test_90(), while this bug can focus on fixing test_58c to clean up after itself on failure.

Comment by Nathaniel Clark [ 22/Aug/13 ]

http://review.whamcloud.com/7419

Comment by Jodi Levi (Inactive) [ 05/Sep/13 ]

Patch landed to master so closing ticket.
