[LU-581] 1.8<->2.1 interop: sanity test 120: FAIL: 1 blocking RPC occured Created: 09/Aug/11 Updated: 27/May/15 Resolved: 27/May/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.0.0, Lustre 2.1.1, Lustre 2.1.2, Lustre 1.8.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | Lai Siyao |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Old Lustre Version: 1.8.6-wc1
New Lustre Version: 2.0.66.0
Clean upgrade (Lustre servers and clients were upgraded all at once) from Lustre 1.8.6-wc1 to Lustre 2.0.66.0 under the following configuration:
OSS1: RHEL5/x86_64 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Bugzilla ID: | 23338 | ||||||||
| Rank (Obsolete): | 4383 | ||||||||
| Description |
|
After the upgrade, sanity test 120a failed on Lustre 2.0.66.0 as follows:

== sanity test 120a: Early Lock Cancel: mkdir test == 03:12:14 (1312884734)
ldlm.namespaces.lustre-MDT0000-mdc-ffff88031cc93400.lru_size=400
ldlm.namespaces.lustre-OST0000-osc-ffff88031cc93400.lru_size=400
ldlm.namespaces.lustre-OST0001-osc-ffff88031cc93400.lru_size=400
sanity test_120a: @@@@@@ FAIL: 1 blocking RPC occured.

Please refer to the Maloo report for more logs: https://maloo.whamcloud.com/test_sets/a570d34e-c278-11e0-8bdf-52540025f9af

Sanity tests 120{c,d,e,f} also failed. This is a known issue: bug 23338. |
| Comments |
| Comment by Jian Yu [ 16/Aug/11 ] |
|
Lai Siyao will work on this ticket. |
| Comment by Lai Siyao [ 31/Aug/11 ] |
|
The cause is that the ELC (early lock cancel) flags are set in the LMV layer, but in the 1.8 <-> 2.1 case LMV doesn't exist. I will move the ELC flag setting into llite for 2.1. |
| Comment by Jian Yu [ 05/Sep/11 ] |
|
The same issue occurred while running sanity tests after a clean upgrade from Lustre 1.8.5/1.8.6-wc1 to 2.1.0: |
| Comment by Lai Siyao [ 09/Oct/11 ] |
|
The patch is under review at http://review.whamcloud.com/#change,1339 |
| Comment by Jian Yu [ 24/Feb/12 ] |
|
The same issue occurred while running sanity tests after a clean upgrade from Lustre 1.8.7-wc1 to 2.1.1: |
| Comment by Jian Yu [ 07/Jun/12 ] |
|
The same issue occurred while running sanity tests after a clean upgrade from Lustre 1.8.8-wc1 to 2.1.2:

client-1: == sanity test 120a: Early Lock Cancel: mkdir test == 04:40:35 (1339069235)
client-1: ldlm.namespaces.lustre-MDT0000-mdc-ffff8802a90f9800.lru_size=400
client-1: ldlm.namespaces.lustre-OST0000-osc-ffff8802a90f9800.lru_size=400
client-1: ldlm.namespaces.lustre-OST0001-osc-ffff8802a90f9800.lru_size=400
client-1: sanity test_120a: @@@@@@ FAIL: 1 blocking RPC occured.
client-1: Dumping lctl log to /home/yujian/test_logs/2012-06-07/043300/sanity.test_120a.*.1339069236.log
client-1: FAIL 120a (3s)

Maloo report: https://maloo.whamcloud.com/test_sets/c250c2d6-b0d3-11e1-99ce-52540035b04c |
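For comparing runs like the ones quoted in this ticket, a small helper can pull the failing subtest and its message out of a test log. This is only a sketch: the function name `find_failures` is hypothetical, and the pattern assumes failure lines look like the `sanity test_120a: @@@@@@ FAIL: ...` lines above, with or without a `client-1:` prefix.

```python
import re

# Matches lines such as:
#   client-1: sanity test_120a: @@@@@@ FAIL: 1 blocking RPC occured.
# The pattern is an assumption based on the log lines quoted in this ticket.
FAIL_RE = re.compile(r"sanity (test_\w+): @@@@@@ FAIL: (.+)")


def find_failures(log_text):
    """Return a list of (subtest, message) tuples, one per FAIL line."""
    failures = []
    for line in log_text.splitlines():
        match = FAIL_RE.search(line)
        if match:
            failures.append((match.group(1), match.group(2).strip()))
    return failures


if __name__ == "__main__":
    sample = (
        "client-1: == sanity test 120a: Early Lock Cancel: mkdir test ==\n"
        "client-1: sanity test_120a: @@@@@@ FAIL: 1 blocking RPC occured.\n"
        "client-1: FAIL 120a (3s)\n"
    )
    for subtest, message in find_failures(sample):
        print(subtest, "->", message)
```

The helper expects one log record per line, so a log pasted as a single flattened line would need to be split first.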
| Comment by Lai Siyao [ 04/Jul/12 ] |
|
Hi Yujian, the main cause is that the 2.x config contains a section for LMV but the 1.8 config does not, and some ELC logic is implemented in the LMV layer. A simple way to fix it is to regenerate the config logs; please refer to section 4.3.11 of http://wiki.lustre.org/manual/LustreManual18_HTML/ConfiguringLustre.html. If this works, do you think it's acceptable? IIRC, regenerating config logs is a common way to fix config issues after an upgrade, isn't it? |
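The workaround suggested above can be sketched as a dry-run planner that only prints commands. It assumes the referenced manual section describes the usual `tunefs.lustre --writeconf` procedure: with all clients and servers unmounted, run it on the MDT device first and then on each OST device, then remount servers before clients. The function name `writeconf_plan` and the device paths are hypothetical placeholders, not values from this ticket.

```python
def writeconf_plan(mdt_device, ost_devices):
    """Return tunefs.lustre commands in the order MDT first, then OSTs.

    Nothing is executed; the caller is expected to review the commands
    and run them only after the whole filesystem has been unmounted.
    """
    if not ost_devices:
        raise ValueError("at least one OST device is required")
    commands = [f"tunefs.lustre --writeconf {mdt_device}"]
    commands += [f"tunefs.lustre --writeconf {dev}" for dev in ost_devices]
    return commands


if __name__ == "__main__":
    # Hypothetical device paths, for illustration only.
    for command in writeconf_plan("/dev/mdt_dev", ["/dev/ost0_dev", "/dev/ost1_dev"]):
        print(command)
```

Printing rather than executing keeps the sketch safe to run anywhere; whether regenerated config logs actually clear the sanity test 120 failures is not confirmed in this ticket.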
| Comment by Jian Yu [ 05/Jul/12 ] |
|
Hi Lai, |
| Comment by Andreas Dilger [ 27/May/15 ] |
|
Haven't seen this in a long time. |