[LU-2864] 2.1.4<->2.4.0 interop: replay-dual test_16: FAIL: post-failover df: 1 Created: 26/Feb/13 Updated: 18/Mar/13 Resolved: 18/Mar/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.1.4 |
| Fix Version/s: | Lustre 2.1.5 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/176 |
||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 6930 | ||||||||
| Description |
|
replay-dual test 16 failed as follows:

client-19vm2: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
replay-dual test_16: @@@@@@ FAIL: post-failover df: 1

Console log on the client node client-19vm1 showed that:

08:21:52:Lustre: Server MGS version (2.3.61.0) is much newer than client version. Consider upgrading client (2.1.4)
08:21:52:Lustre: Skipped 39 previous similar messages
08:22:53:LustreError: 8555:0:(client.c:2631:ptlrpc_replay_interpret()) @@@ status 0, old was -19 req@ffff88007b69e800 x1427685198043914/t73014444058(73014444058) o36->lustre-MDT0000-mdc-ffff880075efa000@10.10.4.222@tcp:12/10 lens 488/416 e 5 to 0 dl 1361550203 ref 2 fl Interpret:R/4/0 rc 0/0
08:23:46:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-dual test_16: @@@@@@ FAIL: post-failover df: 1

Console log on the MDS client-19vm3 showed that:

08:22:08:LustreError: 16656:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid. req@ffff880066c32850 x1427685267096850/t0(4295075319) o101->7f085d1a-9095-4317-769f-312b8fd60840@10.10.4.221@tcp:0/0 lens 528/1136 e 0 to 0 dl 1361550149 ref 1 fl Interpret:/4/0 rc 0/0
08:22:08:LustreError: 16657:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid. req@ffff88005ac70050 x1427685267111916/t0(4295084317) o101->7f085d1a-9095-4317-769f-312b8fd60840@10.10.4.221@tcp:0/0 lens 544/1152 e 0 to 0 dl 1361550149 ref 1 fl Interpret:/4/0 rc 0/0
08:22:08:LustreError: 16656:0:(mdt_recovery.c:417:mdt_last_rcvd_update()) Trying to overwrite bigger transno:on-disk: 68719476750, new: 4295139013 replay: 1. see LU-617.
08:22:08:LustreError: 16656:0:(osd_handler.c:829:osd_trans_stop()) Failure in transaction hook: -75
08:22:20:LustreError: 16656:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid. req@ffff880062863850 x1427685266970737/t0(4294986536) o101->7289a3d0-4e8e-df25-b688-f949ec08245d@10.10.4.221@tcp:0/0 lens 544/1152 e 0 to 0 dl 1361550154 ref 1 fl Interpret:/4/0 rc 0/0
08:22:20:LustreError: 16655:0:(mdt_recovery.c:417:mdt_last_rcvd_update()) Trying to overwrite bigger transno:on-disk: 4295147786, new: 4295080543 replay: 1. see LU-617.
08:22:20:LustreError: 16655:0:(osd_handler.c:829:osd_trans_stop()) Failure in transaction hook: -75
08:23:01:Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
08:23:01:Lustre: lustre-MDT0000: disconnecting 1 stale clients
08:23:44:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-dual test_16: @@@@@@ FAIL: post-failover df: 1

Maloo report: https://maloo.whamcloud.com/test_sets/a19d48ee-7d78-11e2-85d0-52540035b04c |
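For triage, the "Trying to overwrite bigger transno" messages in the MDS console log above can be pulled out and compared mechanically. The following is a minimal Python sketch, not part of Lustre or its test framework: the function names `split_transno` and `find_transno_conflicts` are hypothetical, and the 32-bit epoch/counter split is an assumption about the transno layout that this ticket does not state. (On Linux, the -75 reported by osd_trans_stop() corresponds to EOVERFLOW.)

```python
import re

# Pattern for the mdt_last_rcvd_update() message quoted in this ticket.
TRANSNO_RE = re.compile(
    r"Trying to overwrite bigger transno:\s*on-disk:\s*(\d+),"
    r"\s*new:\s*(\d+)\s*replay:\s*(\d+)"
)

# ASSUMPTION (not stated in the ticket): the upper 32 bits of a transno
# carry a boot epoch and the lower 32 bits a per-epoch counter.
EPOCH_BITS = 32


def split_transno(transno):
    """Return (epoch, counter) under the EPOCH_BITS assumption."""
    return transno >> EPOCH_BITS, transno & ((1 << EPOCH_BITS) - 1)


def find_transno_conflicts(log_text):
    """Return one dict per 'overwrite bigger transno' message found."""
    conflicts = []
    for match in TRANSNO_RE.finditer(log_text):
        on_disk, new, replay = (int(g) for g in match.groups())
        conflicts.append({
            "on_disk": on_disk,
            "new": new,
            "replay": bool(replay),
            "gap": on_disk - new,  # how far the replayed transno lags
            "on_disk_split": split_transno(on_disk),
            "new_split": split_transno(new),
        })
    return conflicts


if __name__ == "__main__":
    # Sample line copied from the MDS console log in the description.
    sample = (
        "08:22:08:LustreError: 16656:0:(mdt_recovery.c:417:"
        "mdt_last_rcvd_update()) Trying to overwrite bigger transno:"
        "on-disk: 68719476750, new: 4295139013 replay: 1. see LU-617."
    )
    for conflict in find_transno_conflicts(sample):
        print(conflict)
```

Under the stated assumption, the first message splits into on-disk (16, 14) versus new (1, 171717), while both values in the second message fall in epoch 1. Treat the split as a reading aid only, since the layout is assumed here rather than taken from the ticket.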
| Comments |
| Comment by Jian Yu [ 27/Feb/13 ] |
|
Hi Yang Sheng, Could you please look into the console logs of the failure in this ticket to see whether the patch for |
| Comment by Yang Sheng [ 27/Feb/13 ] |
|
Yes, yujian. We can port it to b2_1 if the LU-951 patch passes inspection and lands. |
| Comment by Jian Yu [ 16/Mar/13 ] |
|
Hi Yang Sheng, Could you please backport http://review.whamcloud.com/5531 to Lustre b2_1 branch? Thanks. |
| Comment by Yang Sheng [ 16/Mar/13 ] |
|
Patch committed to: http://review.whamcloud.com/5743 |
| Comment by Peter Jones [ 18/Mar/13 ] |
|
Landed for 2.1.5 |