[LU-4421] Failure on test suite sanity test_120e: 1 blocking RPC occured
Created: 30/Dec/13  Updated: 05/May/14  Resolved: 05/May/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: dne

Issue Links:
Duplicate
duplicates LU-4206 Sanity test_120e fails with 1 blockin... Resolved
Related
is related to LU-4747 sanity tests 120* call to "lctl get_p... Resolved
Severity: 3
Rank (Obsolete): 12139

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/8cbadcc4-706c-11e3-a3b4-52540035b04c.

The sub-test test_120e failed with the following error:

1 blocking RPC occured.

The test log shows:

== sanity test 120e: Early Lock Cancel: unlink test == 10:25:53 (1388255153)
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-MDT0002-mdc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-MDT0003-mdc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0000-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0001-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0002-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0003-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0004-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0005-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0006-osc-ffff8800697aa400.lru_size=200
ldlm.namespaces.lustre-OST0007-osc-ffff8800697aa400.lru_size=200
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00208266 s, 246 kB/s
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00540223 s, 94.8 kB/s
error: get_param: /proc/{fs,sys}/{lnet,lustre}/ldlm/services/ldlm_canceld/stats: Found no match
error: get_param: /proc/{fs,sys}/{lnet,lustre}/ldlm/services/ldlm_canceld/stats: Found no match
 sanity test_120e: @@@@@@ FAIL: 1 blocking RPC occured. 

I hit this issue while testing patch http://review.whamcloud.com/#/c/7087/. I am not sure whether this is a problem on master or specific to the patch.



 Comments   
Comment by Jodi Levi (Inactive) [ 08/Jan/14 ]

Fan Yong,
Can you look into this one and comment?

Comment by Swapnil Pimpale (Inactive) [ 30/Mar/14 ]

Another instance where this bug was hit: https://maloo.whamcloud.com/test_sets/a9708b10-b799-11e3-97ab-52540035b04c

Comment by Bob Glossman (Inactive) [ 01/Apr/14 ]

Another instance:
https://maloo.whamcloud.com/test_sets/42cc4338-b92c-11e3-a578-52540035b04c

Comment by James Nunez (Inactive) [ 14/Apr/14 ]

I've hit the same error at https://maloo.whamcloud.com/test_sets/805a8136-c27f-11e3-a886-52540035b04c in review-dne-part-1. The error message is the same but, as the log below shows, the get_param error does not appear:

...
ldlm.namespaces.lustre-OST0005-osc-ffff88007e7b4c00.lru_size=200
ldlm.namespaces.lustre-OST0006-osc-ffff88007e7b4c00.lru_size=200
ldlm.namespaces.lustre-OST0007-osc-ffff88007e7b4c00.lru_size=200
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00499895 s, 102 kB/s
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00506382 s, 101 kB/s
CMD: client-32vm3 /usr/sbin/lctl get_param -n ldlm.services.ldlm_canceld.stats
CMD: client-32vm3 /usr/sbin/lctl get_param -n ldlm.services.ldlm_canceld.stats
 sanity test_120e: @@@@@@ FAIL: 1 blocking RPC occured. 

Looking at the two cases above, https://maloo.whamcloud.com/test_sets/a9708b10-b799-11e3-97ab-52540035b04c and https://maloo.whamcloud.com/test_sets/42cc4338-b92c-11e3-a578-52540035b04c, there is no get_param error in those logs either. So I think the get_param error was corrected in LU-4747, but there is still an issue with the blocking RPC.
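
To make the distinction above concrete, here is a minimal sketch that checks a test log for the two signatures separately: the get_param "Found no match" error and the "blocking RPC occured" failure. The function name classify_log and the two sample excerpts are hypothetical helpers for illustration only; the matched strings are copied from the logs quoted in this ticket, and the sketch says nothing about how sanity.sh itself counts blocking RPCs.

```python
import re

# Signatures copied verbatim from the logs quoted in this ticket
# (the spelling "occured" is what the test prints).
GET_PARAM_ERROR = re.compile(
    r"error: get_param: .*ldlm_canceld/stats: Found no match"
)
BLOCKING_RPC_FAIL = re.compile(r"FAIL: (\d+) blocking RPC occured")


def classify_log(text):
    """Report which of the two failure signatures appear in a test log.

    Hypothetical helper, not part of the Lustre test framework.
    """
    fail = BLOCKING_RPC_FAIL.search(text)
    return {
        "get_param_error": bool(GET_PARAM_ERROR.search(text)),
        "blocking_rpcs": int(fail.group(1)) if fail else 0,
    }


# Shortened excerpts of the two logs quoted in this ticket.
december_log = (
    "error: get_param: /proc/{fs,sys}/{lnet,lustre}/ldlm/services/"
    "ldlm_canceld/stats: Found no match\n"
    " sanity test_120e: @@@@@@ FAIL: 1 blocking RPC occured.\n"
)
april_log = (
    "CMD: client-32vm3 /usr/sbin/lctl get_param -n "
    "ldlm.services.ldlm_canceld.stats\n"
    " sanity test_120e: @@@@@@ FAIL: 1 blocking RPC occured.\n"
)

print(classify_log(december_log))
print(classify_log(april_log))
```

Run against the two excerpts, only the December log should report the get_param error, while both report one blocking RPC, which matches the observation that the two symptoms are independent.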

Should we close this ticket and open a new one with the slightly modified client test log?

Comment by Andreas Dilger [ 05/May/14 ]

Duplicate of LU-4206.
