[LU-3657] sanity test_27A: setstripe bad address Created: 29/Jul/13  Updated: 05/Sep/17  Resolved: 05/Sep/17

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Maloo Assignee: James Nunez (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9431

 Description   

This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/7a96f5b6-f68f-11e2-b7e5-52540035b04c.

This test failure looks like it was caused by repeated loss of connection between the client and the servers. The debug log contains many instances like:

Lustre: lustre-OST0000-osc-ffff88007cf0dc00: Connection to lustre-OST0000 (at 10.10.4.119@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-OST0000-osc-ffff88007cf0dc00: Connection restored to lustre-OST0000 (at 10.10.4.119@tcp)

The sub-test test_27A failed with the following error:

test_27A failed with 14

Info required for matching: sanity 27a



 Comments   
Comment by nasf (Inactive) [ 30/Jul/13 ]

Another failure instance:

https://maloo.whamcloud.com/test_sets/540f8052-f897-11e2-a8b3-52540035b04c

Comment by James Nunez (Inactive) [ 06/Aug/13 ]

Another failure at https://maloo.whamcloud.com/test_sets/75b44ecc-fbd0-11e2-8c6e-52540035b04c .

The test_log says:
== sanity test 27A: check filesystem-wide default LOV EA values == 19:07:21 (1375409241)
error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
error: setstripe: create stripe file '/mnt/lustre' failed
error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
error: setstripe: create stripe file '/mnt/lustre' failed
sanity test_27A: @@@@@@ FAIL: test_27A failed with 14
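
For reference, the status 14 in the FAIL line appears to be the errno behind the "Bad address" text. A minimal Python check, assuming a Linux client where errno 14 is EFAULT (the variable names are illustrative only):

```python
import errno
import os

# Status reported in the FAIL line of the test_log (assumed to be an errno).
status = 14

# Map the number to its symbolic errno name and its message text.
name = errno.errorcode.get(status, "unknown")
message = os.strerror(status)

print(f"{status} -> {name}: {message}")
```

If the name comes back as EFAULT, the "failed with 14" status and the "Bad address" message from the ioctl describe the same error.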

Comment by James Nunez (Inactive) [ 06/Aug/13 ]

In all of the roughly 12 cases I've looked at, when test 27A fails with this error, tests 65i and 65j also fail with similar error messages. From the test 65i test_log at https://maloo.whamcloud.com/test_sets/314c6232-f8b6-11e2-a8b3-52540035b04c :

== sanity test 65i: set non-default striping on root directory (bug 6367)=== 06:01:24 (1375102884)
error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
error: setstripe: create stripe file '/mnt/lustre' failed
sanity test_65i: @@@@@@ FAIL: test_65i failed with 14

Comment by James Nunez (Inactive) [ 14/Aug/13 ]

Several tests in sanity hit the same "error on ioctl", but some of them do not fail because the test does not call error when setstripe fails. In all of these cases, the test is calling setstripe on the mount point /mnt/lustre.
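
The request code 0x4008669a that recurs in these messages can be unpacked into its fields. The sketch below assumes the generic Linux ioctl encoding (as used on x86_64: 8 command bits, 8 type bits, 14 size bits, 2 direction bits). The helper name decode_ioctl is hypothetical, and matching the result to a specific Lustre macro should be verified against the Lustre headers.

```python
# Bit widths of the generic Linux ioctl request encoding (assumed here).
IOC_NRBITS, IOC_TYPEBITS, IOC_SIZEBITS = 8, 8, 14
IOC_DIRS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request: int) -> dict:
    """Split an ioctl request number into direction, size, type and nr."""
    nr = request & ((1 << IOC_NRBITS) - 1)
    type_code = (request >> IOC_NRBITS) & ((1 << IOC_TYPEBITS) - 1)
    size = (request >> (IOC_NRBITS + IOC_TYPEBITS)) & ((1 << IOC_SIZEBITS) - 1)
    direction = (request >> (IOC_NRBITS + IOC_TYPEBITS + IOC_SIZEBITS)) & 0x3
    return {
        "direction": IOC_DIRS[direction],
        "size": size,
        "type": chr(type_code),
        "nr": nr,
    }


decoded = decode_ioctl(0x4008669A)
print(decoded)
```

Under that encoding the code decodes to a write-direction request of type 'f', command number 154, with an 8-byte argument. That is consistent with a setstripe-style ioctl that passes a user-space pointer, which is the kind of call that can return EFAULT.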

Here is the suite_stdout output for all tests that hit the setstripe error:

16:56:14:== sanity test 27A: check filesystem-wide default LOV EA values == 16:56:07 (1374882967)
16:56:14:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
16:56:14:error: setstripe: create stripe file '/mnt/lustre' failed
16:56:15:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
16:56:15:error: setstripe: create stripe file '/mnt/lustre' failed
16:56:15: sanity test_27A: @@@@@@ FAIL: test_27A failed with 14 

17:12:17:== sanity test 56a: check /usr/bin/lfs getstripe == 17:12:02 (1374883922)
17:12:17:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:12:17:error: setstripe: create stripe file '/mnt/lustre' failed
17:12:17:/usr/bin/lfs getstripe --recursive passed.

17:12:17:== sanity test 56g: check lfs find -name =============================== 17:12:04 (1374883924)
17:12:17:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:12:17:error: setstripe: create stripe file '/mnt/lustre' failed

17:12:18:== sanity test 56h: check lfs find ! -name =============================== 17:12:05 (1374883925)
17:12:18:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:12:18:error: setstripe: create stripe file '/mnt/lustre' failed

17:17:01:== sanity test 65i: set non-default striping on root directory (bug 6367)=== 17:16:51 (1374884211)
17:17:01:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:17:01:error: setstripe: create stripe file '/mnt/lustre' failed
17:17:01: sanity test_65i: @@@@@@ FAIL: test_65i failed with 14 

17:17:38:== sanity test 65j: set default striping on root directory (bug 6367)=== 17:17:12 (1374884232)
17:17:38:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:17:38:error: setstripe: create stripe file '/mnt/lustre' failed
17:17:38: sanity test_65j: @@@@@@ FAIL: setstripe failed 

17:28:01:== sanity test 118j: Simulate unrecoverable OST side error ============ 17:27:58 (1374884878)
17:28:01:7+0 records in
17:28:01:7+0 records out
17:28:01:458752 bytes (459 kB) copied, 0.00454288 s, 101 MB/s
17:28:01:CMD: client-24vm4 lctl set_param fail_loc=0x220
17:28:01:fail_loc=0x220
17:28:01:write: Bad address

Note: test 118j gets a "Bad address" on write, not on setstripe.
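
A listing like the one above can be produced mechanically. The following is a rough sketch, not the tool used for this ticket: it scans suite_stdout lines, groups them by test header, and records which tests printed "Bad address" and which of those actually reached a FAIL line. The function name summarize and the embedded SAMPLE are illustrative; the sample lines are copied from the output above.

```python
import re

# Matches the banner line that starts each sanity sub-test.
HEADER = re.compile(r"== sanity test (\w+):")


def summarize(lines):
    """Return {test: {"bad_address": bool, "failed": bool}} per sub-test."""
    results = {}
    current = None
    for line in lines:
        match = HEADER.search(line)
        if match:
            current = match.group(1)
            results[current] = {"bad_address": False, "failed": False}
            continue
        if current is None:
            continue  # ignore output before the first test banner
        if "Bad address" in line:
            results[current]["bad_address"] = True
        if "@@@@@@ FAIL" in line:
            results[current]["failed"] = True
    return results


SAMPLE = """\
17:12:17:== sanity test 56a: check /usr/bin/lfs getstripe == 17:12:02 (1374883922)
17:12:17:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:12:17:error: setstripe: create stripe file '/mnt/lustre' failed
17:12:17:/usr/bin/lfs getstripe --recursive passed.
17:17:01:== sanity test 65i: set non-default striping on root directory (bug 6367)=== 17:16:51 (1374884211)
17:17:01:error on ioctl 0x4008669a for '/mnt/lustre' (3): Bad address
17:17:01:error: setstripe: create stripe file '/mnt/lustre' failed
17:17:01: sanity test_65i: @@@@@@ FAIL: test_65i failed with 14
""".splitlines()

summary = summarize(SAMPLE)
for test, flags in summary.items():
    print(test, flags)
```

On this sample, 56a is reported with the error but no failure, while 65i is reported with both, which is the distinction drawn above between tests that call error and tests that do not.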

Comment by James Nunez (Inactive) [ 21/Aug/17 ]

I've looked through the sanity logs in Maloo for the past year. The only time I see the "Bad address" message is for sanity test 118j, and I believe this is due to the fail_loc set in the test, OBD_FAIL_BRW_WRITE_BULK2.

Does anyone else have recent examples, say from the last six months, of this "Bad address" error message occurring in our testing? If not, I will close this ticket.

Comment by James Nunez (Inactive) [ 05/Sep/17 ]

We can reopen this ticket or open a new ticket if we see errors like this again.
