[LU-4425] Test failure on test suite conf-sanity, subtest test_28 Created: 02/Jan/14  Updated: 16/May/14  Resolved: 16/May/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 12154

 Description   

This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/55b7cddc-7304-11e3-9955-52540035b04c.

The sub-test test_28 failed with the following error:

check lustre.llite.max_read_ahead_whole_mb failed!

Info required for matching: conf-sanity 28



 Comments   
Comment by James A Simmons [ 02/Jan/14 ]

To repeat what was posted on LU-3319.

Test conf-sanity 28 fails 4% of the time with the following error:

LustreError: 18595:0:(lproc_llite.c:367:ll_max_read_ahead_whole_mb_seq_write()) can't set max_read_ahead_whole_mb more than max_read_ahead_per_file_mb: 0
LustreError: 18595:0:(obd_config.c:1443:class_process_proc_seq_param()) writing proc entry max_read_ahead_whole_mb err -34

Looking at the code, this means the variable ra_max_pages is being set to zero. Looking at llite_lib.c,
we see that

pages = si.totalram - si.totalhigh;
...
sbi->ll_ra_info.ra_max_pages_per_file = min(pages / 32,
                                            SBI_DEFAULT_READAHEAD_MAX);

so for some reason si_meminfo() is reporting no free memory (pages / 32 is zero only when si.totalram - si.totalhigh is fewer than 32 pages).
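
The arithmetic above can be sketched in plain C. This is only an illustrative model, not the Lustre source: EXAMPLE_READAHEAD_MAX_PAGES is a hypothetical stand-in for SBI_DEFAULT_READAHEAD_MAX (assumed here to be 40 MiB of 4 KiB pages), and set_ra_whole_pages() is a simplified guess at the range check behind the "err -34" (-ERANGE) log line.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for SBI_DEFAULT_READAHEAD_MAX, in pages.
 * The value (40 MiB of 4 KiB pages) is an assumption for illustration. */
#define EXAMPLE_READAHEAD_MAX_PAGES (40UL << (20 - 12))

/* Mirrors the quoted expression: min(pages / 32, SBI_DEFAULT_READAHEAD_MAX),
 * where pages = totalram - totalhigh. */
static unsigned long default_ra_pages_per_file(unsigned long totalram,
                                               unsigned long totalhigh)
{
    unsigned long pages, share;

    assert(totalram >= totalhigh);  /* high memory is part of total RAM */
    pages = totalram - totalhigh;
    share = pages / 32;             /* integer division: 0 when pages < 32 */
    return share < EXAMPLE_READAHEAD_MAX_PAGES ? share
                                               : EXAMPLE_READAHEAD_MAX_PAGES;
}

/* Simplified model of the check reported in the log: a whole-file
 * read-ahead request larger than the per-file limit is rejected. */
static int set_ra_whole_pages(unsigned long requested,
                              unsigned long per_file_max)
{
    if (requested > per_file_max)
        return -ERANGE;             /* -34 on Linux */
    return 0;
}
```

Under these assumptions, a per-file limit of zero makes every nonzero max_read_ahead_whole_mb request fail with -ERANGE, which matches the error in the log.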

One possible idea is for test 28 to set max_read_ahead_per_file_mb before the test starts, since this is the reason the test fails.

Comment by Keith Mannthey (Inactive) [ 02/Jan/14 ]

meminfo can report 0 memory if memory is exhausted. Memory is a limited resource and the kernel keeps the last little bit protected. Are there any other signs of memory pressure on the system?

Comment by Andreas Dilger [ 09/Jan/14 ]

Is there any correlation between the test failures and running on SLES clients? I'm wondering if there is any chance that this could be related to seq_file changes somehow?

Comment by James A Simmons [ 09/Jan/14 ]

This is showing up with an earlier patch for 7290. Waiting to see if the new patch has this problem still.

Comment by James A Simmons [ 21/Jan/14 ]

This problem seems to have gone away. We can close this ticket if that is the case, but it can be reopened if the problem appears again.

Comment by James A Simmons [ 04/Feb/14 ]

Does this problem still happen? If not, can we close this ticket?

Comment by James A Simmons [ 16/May/14 ]

The llite proc bug no longer shows up. Peter, you can close this ticket.

Comment by Peter Jones [ 16/May/14 ]

As you wish

Generated at Sat Feb 10 01:42:37 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.