[LU-2245] failure on sanity.sh test_101c: Small 4k read IO 1!

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Affects Version/s: Lustre 2.4.0
    • Environment: server/client are both RHEL6
    • Severity: 3

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/34cdc80e-21c2-11e2-b552-52540035b04c.

      The sub-test test_101c failed with the following error:

      Small 4k read IO 1!

      == sanity test 101c: check stripe_size aligned read-ahead =================== 00:07:32 (1351494452)
      CMD: client-21-ib /usr/sbin/lctl set_param -n obdfilter.*.read_cache_enable=0
      client-21-ib: error: set_param: /proc/{fs,sys}/{lnet,lustre}/obdfilter/*/read_cache_enable: Found no match
      CMD: client-21-ib /usr/sbin/lctl set_param -n obdfilter.*.writethrough_cache_enable=0
      client-21-ib: error: set_param: /proc/{fs,sys}/{lnet,lustre}/obdfilter/*/writethrough_cache_enable: Found no match
      osc.lustre-OST0000-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0001-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0002-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0003-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0004-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0005-osc-ffff8803348f2000.rpc_stats=0
      osc.lustre-OST0006-osc-ffff8803348f2000.rpc_stats=0
      
      6.364788s, 102.967MB/s
      osc.lustre-OST0000-osc-ffff8803348f2000 rpc check passed!
      osc.lustre-OST0001-osc-ffff8803348f2000 rpc check passed!
       sanity test_101c: @@@@@@ FAIL: Small 4k read IO 1! 
        Trace dump:
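
      For reference, the "Small 4k read IO 1!" message comes from the rpc_stats check
      the test runs after its reads: every read RPC reported by the OSCs is expected to
      cover a full stripe_size worth of pages, and any smaller RPC is counted and
      reported. A rough sketch of that kind of check (not the actual sanity.sh code;
      the helper name, the rpc_stats column layout, and the 4 KB page size are
      assumptions):

      # Hypothetical helper, not the real sanity.sh implementation.  Assumes the
      # "pages per rpc" table in rpc_stats carries the read RPC count in the
      # second column and that pages are 4 KB.
      check_read_rpcs() {
              local osc=$1           # e.g. osc.lustre-OST0000-osc-*
              local expect_pages=$2  # stripe_size / page_size, e.g. 16 for 64k
              lctl get_param -n ${osc}.rpc_stats | awk -v want=$expect_pages '
                      /pages per rpc/ { in_table = 1; next }
                      in_table && $1 ~ /^[0-9]+:$/ {
                              pages = $1; sub(/:/, "", pages)
                              if (pages + 0 < want && $2 + 0 > 0) {
                                      printf "Small %dk read IO %d!\n", pages * 4, $2
                                      bad = 1
                              }
                              next
                      }
                      in_table { in_table = 0 }
                      END { exit bad }'
      }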
      


          Activity

            adilger Andreas Dilger added a comment - Closing this issue since we haven't seen this failure recently, outside of the previously reported and unrelated issue.

            adilger Andreas Dilger added a comment - This recent burst of failures was caused by patch 7803 landing.

            jamesanunez James Nunez (Inactive) added a comment - I'm hitting this error with different values for the RPCs at https://maloo.whamcloud.com/test_sets/ecfe391a-c41c-11e3-a793-52540035b04c

            Small 32k read IO 240 !

            adilger Andreas Dilger added a comment - Found only two failures of test_101c() in the past 4 weeks:
            https://maloo.whamcloud.com/sub_tests/313362c4-6bac-11e2-a7b9-52540035b04c
            https://maloo.whamcloud.com/sub_tests/9e3f4c7a-5d83-11e2-8199-52540035b04c

            Lowered priority, since there are more important bugs to fix for now.

            di.wang Di Wang (Inactive) added a comment - I just checked these failures; sorry for the delay. The test failed because there were single-page RPCs during read-ahead. One possible reason: since we do random 64k reads here, if some pages are evicted under memory pressure and those 64k areas are read again, we can see single-page RPCs. Probably we should add a no-duplicate option to the reads, i.e. no area should be read twice even for random reads.

            But I am not sure that is the real reason here, since there is not enough debug log. Is this reproducible?
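
            To illustrate the no-duplicate idea, a minimal sketch of a read loop that never
            reads the same 64k area twice (the file name, file size, and block count below
            are placeholders, not the values the test actually uses):

            file=$DIR/f101c.no_dup          # $DIR is the test suite's mount point
            iosize=$((64 * 1024))
            nblocks=256                     # file covers nblocks * iosize bytes
            # Shuffle the distinct 64k-aligned block indexes so every area is read
            # exactly once while the order of the reads stays random.
            for idx in $(seq 0 $((nblocks - 1)) | shuf); do
                    dd if=$file of=/dev/null bs=$iosize count=1 skip=$idx 2>/dev/null
            done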

            yujian Jian Yu added a comment - More instances:
            https://maloo.whamcloud.com/test_sets/313ec4a8-2fe2-11e2-866f-52540035b04c
            https://maloo.whamcloud.com/test_sets/669d8a72-2fb4-11e2-b30e-52540035b04c

            liwei Li Wei (Inactive) added a comment - Wang Di, could you take a look at this?

            sarah Sarah Liu added a comment - SLES11 SP2 client also has this issue: https://maloo.whamcloud.com/test_sets/d0072b9c-2534-11e2-9e7c-52540035b04c

            sarah Sarah Liu added a comment - edited - {comment removed, unrelated to this bug}

            liwei Li Wei (Inactive) added a comment - The procfs access problems should be fixed by updating the test to use set_obdfilter_param(), as the parameters in question are now exported by OSDs.
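
            A sketch of what that change could look like in the test script; the exact
            argument order of set_obdfilter_param() is assumed here rather than taken
            from test-framework.sh:

            # Per the log above, the test currently runs the following on the servers,
            # which fails with "Found no match" once the parameter is exported by the
            # OSD rather than by obdfilter:
            #   lctl set_param -n obdfilter.*.read_cache_enable=0
            #
            # Going through the helper (argument order is an assumption) lets the test
            # work regardless of which namespace the server exports:
            set_obdfilter_param read_cache_enable 0
            set_obdfilter_param writethrough_cache_enable 0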

            People

              Assignee: liwei Li Wei (Inactive)
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 8
