[LU-5373] Failure on test suite sanity test_33b: FAIL: test_33b failed with 2 Created: 18/Jul/14  Updated: 16/Jan/15  Resolved: 16/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Major
Reporter: Maloo Assignee: Bob Glossman (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

server: lustre-b2_6-rc2 RHEL6 ldiskfs
client: SLES11 SP3


Severity: 3
Rank (Obsolete): 14978

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/cd5613b2-0dd2-11e4-972c-5254006e85c2.

The sub-test test_33b failed with the following error:

test_33b failed with 2

== sanity test 33b: test open file with malformed flags (No panic) =================================== 18:24:17 (1405473857)
running as uid/gid/euid/egid 500/500/500/500, groups:
 [openfile] [-f] [1286739555] [/mnt/lustre/d33/f33]
Error in opening file "/mnt/lustre/d33/f33"(flags=1286739555) 2: No such file or directory
 sanity test_33b: @@@@@@ FAIL: test_33b failed with 2 


 Comments   
Comment by Oleg Drokin [ 21/Jul/14 ]

So mkdir and chown worked fine and yet later file creation in that dir failed? That's kind of strange.

Comment by Bob Glossman (Inactive) [ 03/Sep/14 ]

another instance:
https://testing.hpdd.intel.com/test_sets/59550492-3329-11e4-bf24-5254006e85c2

this is also with sles11sp3 client. I wonder if this is only seen on sles11sp3.

Comment by Minh Diep [ 04/Sep/14 ]

It's also seen here with an el7 client:
https://testing.hpdd.intel.com/test_sets/06458494-2fc7-11e4-957a-5254006e85c2

Comment by Bob Glossman (Inactive) [ 15/Nov/14 ]

seen again in master, el7 client:
https://testing.hpdd.intel.com/test_sets/31a088ae-6c81-11e4-8bd3-5254006e85c2

Comment by Bob Glossman (Inactive) [ 15/Nov/14 ]

I'm starting to think this test is invalid in kernels newer than 2.6. I see consistent, repeatable failures in any manual invocation of openfile with any illegal combination of open flags. The set of illegal flags in test 33b is just one example. I see similar errors with opens not on lustre, in /tmp or other directories. It always fails and returns ENOENT. This is seen in sles11sp3 (3.0 kernels), sles12 (3.12 kernels), and el7 (3.10 kernels).

I strongly suspect the generic open code is pickier about open flags in newer kernels and is returning an error before ever reaching lustre.
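
For anyone reasoning about which bits the generic open path might be rejecting or reinterpreting, the flags value from the log can be decoded with plain shell arithmetic. This is only a sketch: the O_ACCMODE, O_CREAT and O_PATH values below are assumed to be the Linux asm-generic ones and can differ by architecture, and nothing here calls open() or confirms which bit actually leads to ENOENT.

```shell
#!/bin/bash
# Flags value taken from the test_33b log above.
flags=1286739555

# ASSUMPTION: Linux asm-generic constants; other architectures may differ.
O_ACCMODE=$((0x3))
O_CREAT=$((0x40))
O_PATH=$((0x200000))

# Hex form of the flags, plus a few individual bit tests.
hex=$(printf '0x%X' "$flags")
accmode=$(( flags & O_ACCMODE ))
has_creat=$(( (flags & O_CREAT) != 0 ))
has_path=$(( (flags & O_PATH) != 0 ))

echo "flags=$hex accmode=$accmode O_CREAT=$has_creat O_PATH=$has_path"
```

Under those assumed constants the access-mode field comes out as 3, which is none of O_RDONLY (0), O_WRONLY (1) or O_RDWR (2).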

Comment by Andreas Dilger [ 08/Dec/14 ]

Bob, is this just a matter of fixing the test to ignore the return code of the openfile command? The test comment is "No panic", which we would detect via "/proc/sys/lnet/catastrophe", so it might be enough to add "|| true" at the end.
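
A minimal sketch of the pattern suggested here, not the actual sanity.sh code: fake_openfile and no_catastrophe are hypothetical helpers invented for illustration, with fake_openfile standing in for the failing $OPENFILE invocation. Treating a missing catastrophe file as "no panic" is also an assumption, made so the sketch runs on a machine without lnet loaded.

```shell
#!/bin/bash
# Hypothetical stand-in for the $OPENFILE call; it always fails with 2,
# mimicking the "test_33b failed with 2" result in the log.
fake_openfile() {
	return 2
}

# Hypothetical helper: succeeds if the catastrophe flag file is absent
# (assumption: no lnet loaded means nothing to report) or contains 0.
no_catastrophe() {
	local flag_file=${1:-/proc/sys/lnet/catastrophe}
	if [ ! -r "$flag_file" ]; then
		return 0
	fi
	[ "$(cat "$flag_file")" -eq 0 ]
}

# The "|| true" discards the command's exit status, so a failing open
# no longer fails the test by itself.
fake_openfile || true
status=$?
echo "status after '|| true': $status"

# A panic would still be caught separately through the flag file.
if no_catastrophe; then
	echo "no catastrophe reported"
fi
```

With this shape the test result depends only on whether a panic was flagged, not on whether the open succeeded.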

Comment by Bob Glossman (Inactive) [ 08/Dec/14 ]

Andreas, yes, you are correct. I think it may be as simple as teaching the test to ignore the return value of the $OPENFILE command. As far as I can tell this test is just trying to check that a panic doesn't happen; it doesn't really care if the command succeeds or not. Or at least it shouldn't care.

I'm just wondering if it's even worth it to keep this test around. I've never seen it actually cause the panic it's checking against. Suspect it went in a long time ago to look for a problem that was fixed a long time ago.

Comment by Gerrit Updater [ 08/Dec/14 ]

Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/12992
Subject: LU-5373 test: ignore command return value in sanity test_33b
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8eb4a5d72b3bfbe4febc921391da0fbb329832e8

Comment by Bob Glossman (Inactive) [ 08/Dec/14 ]

Following Andreas' suggestion, I've pushed a simple fix that just ignores the command return value. If anybody objects to this solution, please add a review comment.

Still think it might be better to just delete the test since it doesn't seem to be testing anything useful.

Comment by Andreas Dilger [ 09/Dec/14 ]

Bob, I'd prefer to keep the test around. The whole point of a test that always passes is that you know when it fails in the future.

Comment by Gerrit Updater [ 10/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12992/
Subject: LU-5373 test: ignore command return value in sanity test_33b
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9b1569f56a1504e89e29c769900fedcbaad4abe7

Comment by Jodi Levi (Inactive) [ 16/Jan/15 ]

Patch landed to Master.
