[LU-2872] Test timeout failure on test suite sanity-quota test_1 Created: 26/Feb/13  Updated: 29/Oct/14  Resolved: 29/Oct/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.7.0

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Nathaniel Clark
Resolution: Fixed
Votes: 0
Labels: performance, zfs

Issue Links:
Related
is related to LU-2887 sanity-quota test_12a: slow due to ZF... Resolved
Severity: 3
Rank (Obsolete): 6944

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/72650e6c-8071-11e2-b777-52540035b04c.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

Info required for matching: sanity-quota 1

Client console log:

13:43:31:Lustre: DEBUG MARKER: == sanity-quota test 1: Block hard limit (normal use and out of quota) == 13:43:26 (1361915006)
13:43:42:Lustre: DEBUG MARKER: /usr/sbin/lctl mark User quota \(block hardlimit:10 MB\)
13:43:42:Lustre: DEBUG MARKER: User quota (block hardlimit:10 MB)
13:43:42:Lustre: DEBUG MARKER: /usr/sbin/lctl mark Write...
13:43:42:Lustre: DEBUG MARKER: Write...
13:53:56:Lustre: DEBUG MARKER: /usr/sbin/lctl mark Write out of block quota ...
13:53:56:Lustre: DEBUG MARKER: Write out of block quota ...
14:11:38:LustreError: 6386:0:(vvp_io.c:1085:vvp_io_commit_write()) Write page 2528 of inode ffff88007bb40b38 failed -122
14:11:38:LustreError: 6386:0:(vvp_io.c:1085:vvp_io_commit_write()) Write page 2528 of inode ffff88007bb40b38 failed -122
14:11:38:Lustre: DEBUG MARKER: cancel_lru_locks osc start
14:11:38:Lustre: DEBUG MARKER: cancel_lru_locks osc stop
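
The timestamps in the console log above give a rough idea of where the time went. The following is a minimal sketch, not part of the test suite: it takes the timestamp prefixes copied from the log lines and computes the elapsed time between consecutive markers. The names `MARKERS`, `elapsed_seconds`, `phase_durations` and `errno_name` are hypothetical helpers introduced for illustration. It assumes the prefixes are console capture times on the same day, so the results are approximate. It also decodes the `-122` return code, which on Linux is the negated errno `EDQUOT` (disk quota exceeded).

```python
import errno
from datetime import datetime

# Timestamp prefixes copied from the client console log above.
# The labels are illustrative descriptions, not log fields.
MARKERS = [
    ("test start", "13:43:31"),
    ("Write...", "13:43:42"),
    ("Write out of block quota ...", "13:53:56"),
    ("first -122 write failure", "14:11:38"),
]


def elapsed_seconds(start, end):
    """Seconds between two HH:MM:SS stamps, assumed to be on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def phase_durations(markers):
    """Pair each marker label with the seconds until the next marker."""
    return [
        (markers[i][0], elapsed_seconds(markers[i][1], markers[i + 1][1]))
        for i in range(len(markers) - 1)
    ]


def errno_name(rc):
    """Map a negated kernel return code (e.g. -122) to its errno symbol."""
    return errno.errorcode.get(abs(rc), "unknown")


if __name__ == "__main__":
    for label, seconds in phase_durations(MARKERS):
        print(f"{label}: {seconds} s")
    # Numeric errno values are platform specific; 122 is EDQUOT on Linux.
    print("-122 ->", errno_name(-122))
```

By this reading, the within-quota write phase took about 614 s and the out-of-quota phase about 1062 s before the first write failed with `-122`.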


 Comments   
Comment by Nathaniel Clark [ 27/Mar/13 ]

Recent failures:

https://maloo.whamcloud.com/test_sets/27a52b52-9677-11e2-8c64-52540035b04c
https://maloo.whamcloud.com/test_sets/eca93144-9665-11e2-9abb-52540035b04c
https://maloo.whamcloud.com/test_sets/44c758b0-9616-11e2-9abb-52540035b04c

In all of these failures, test_0 fails beforehand (dd runs at ~600 KB/s).

In one run, test_0 fails (487 KB/s) and test_1 completes in 3398 s:
https://maloo.whamcloud.com/test_sets/3d0b7658-9641-11e2-9abb-52540035b04c

I think this may be a slow zfs performance issue (a la LU-2887)

When this test completes in "normal" time on zfs it still takes ~1900 seconds.

Comment by Nathaniel Clark [ 28/Mar/13 ]

Reduce fail dd rate for test 0 and mark test 1 SLOW for zfs
http://review.whamcloud.com/5876

Comment by Christopher Morrone [ 17/Apr/13 ]

If there is a problem that makes the zfs version that slow, I would consider it a bug. I'd very much prefer to have the bug fixed than to ignore the problem. Either that or I would like a very good explanation about why we don't care about it being that slow.

Comment by Nathaniel Clark [ 17/Apr/13 ]

Chris, the bug isn't being ignored. Performance is an issue under certain circumstances, and that is being tracked in LU-2887. This bug (and a bunch of others) is waiting on a resolution to that issue. I should have linked these bugs sooner. Sorry for the confusion.

Comment by Jodi Levi (Inactive) [ 22/Apr/13 ]

Nathaniel,
Can this ticket now be closed, with Change 5876 landing and the remaining performance issues being tracked under LU-2887? Or is additional work needed on this ticket?

Comment by Andreas Dilger [ 01/Oct/14 ]

sanity-quota.sh test_1 is still being skipped for ZFS filesystems.

Comment by Nathaniel Clark [ 01/Oct/14 ]

Re-enable for ZFS
http://review.whamcloud.com/12157

Comment by Nathaniel Clark [ 29/Oct/14 ]

Patch landed on master (prior to 2.6.54).
