[LU-2864] 2.1.4<->2.4.0 interop: replay-dual test_16: FAIL: post-failover df: 1 Created: 26/Feb/13  Updated: 18/Mar/13  Resolved: 18/Mar/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.4
Fix Version/s: Lustre 2.1.5

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: Yang Sheng
Resolution: Fixed
Votes: 0
Labels: None
Environment:

Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/176
Lustre master server build: http://build.whamcloud.com/job/lustre-master/1269
Distro/Arch: RHEL6.3/x86_64


Issue Links:
Related
is related to LU-951 Test failure on test suite replay-sin... Resolved
Severity: 3
Rank (Obsolete): 6930

 Description   

replay-dual test 16 failed as follows:

client-19vm2: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
 replay-dual test_16: @@@@@@ FAIL: post-failover df: 1 

Console log on the client node client-19vm1 showed that:

08:21:52:Lustre: Server MGS version (2.3.61.0) is much newer than client version. Consider upgrading client (2.1.4)
08:21:52:Lustre: Skipped 39 previous similar messages
08:22:53:LustreError: 8555:0:(client.c:2631:ptlrpc_replay_interpret()) @@@ status 0, old was -19  req@ffff88007b69e800 x1427685198043914/t73014444058(73014444058) o36->lustre-MDT0000-mdc-ffff880075efa000@10.10.4.222@tcp:12/10 lens 488/416 e 5 to 0 dl 1361550203 ref 2 fl Interpret:R/4/0 rc 0/0
08:23:46:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-dual test_16: @@@@@@ FAIL: post-failover df: 1 

Console log on the MDS client-19vm3 showed that:

08:22:08:LustreError: 16656:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid.  req@ffff880066c32850 x1427685267096850/t0(4295075319) o101->7f085d1a-9095-4317-769f-312b8fd60840@10.10.4.221@tcp:0/0 lens 528/1136 e 0 to 0 dl 1361550149 ref 1 fl Interpret:/4/0 rc 0/0
08:22:08:LustreError: 16657:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid.  req@ffff88005ac70050 x1427685267111916/t0(4295084317) o101->7f085d1a-9095-4317-769f-312b8fd60840@10.10.4.221@tcp:0/0 lens 544/1152 e 0 to 0 dl 1361550149 ref 1 fl Interpret:/4/0 rc 0/0
08:22:08:LustreError: 16656:0:(mdt_recovery.c:417:mdt_last_rcvd_update()) Trying to overwrite bigger transno:on-disk: 68719476750, new: 4295139013 replay: 1. see LU-617.
08:22:08:LustreError: 16656:0:(osd_handler.c:829:osd_trans_stop()) Failure in transaction hook: -75
08:22:20:LustreError: 16656:0:(mdt_open.c:1468:mdt_reint_open()) @@@ OPEN & CREAT not in open replay/by_fid.  req@ffff880062863850 x1427685266970737/t0(4294986536) o101->7289a3d0-4e8e-df25-b688-f949ec08245d@10.10.4.221@tcp:0/0 lens 544/1152 e 0 to 0 dl 1361550154 ref 1 fl Interpret:/4/0 rc 0/0
08:22:20:LustreError: 16655:0:(mdt_recovery.c:417:mdt_last_rcvd_update()) Trying to overwrite bigger transno:on-disk: 4295147786, new: 4295080543 replay: 1. see LU-617.
08:22:20:LustreError: 16655:0:(osd_handler.c:829:osd_trans_stop()) Failure in transaction hook: -75
08:23:01:Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
08:23:01:Lustre: lustre-MDT0000: disconnecting 1 stale clients
08:23:44:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-dual test_16: @@@@@@ FAIL: post-failover df: 1 
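For triage, the two MDS messages that recur above (the mdt_last_rcvd_update() transno warning and the osd_trans_stop() hook failure) can be pulled out of a console log mechanically. The following is a minimal sketch, not part of the Lustre test framework: the function names and regular expressions are hypothetical and are modeled only on the message text quoted in this ticket, and the errno table is hardcoded on the assumption of Linux errno numbering (where 75 is EOVERFLOW, 19 is ENODEV, and 4 is EINTR).

```python
import re

# Assumption: Linux errno numbering, hardcoded so the sketch behaves the
# same on any platform it is run on.
LINUX_ERRNO = {4: "EINTR", 19: "ENODEV", 75: "EOVERFLOW"}

# Hypothetical patterns, modeled on the message text quoted in this ticket.
TRANSNO_RE = re.compile(r"on-disk:\s*(\d+),\s*new:\s*(\d+)\s+replay:\s*(\d+)")
HOOK_RE = re.compile(r"Failure in transaction hook:\s*(-?\d+)")


def parse_transno_conflict(line):
    """Return the transno fields of an mdt_last_rcvd_update() warning, or None."""
    m = TRANSNO_RE.search(line)
    if m is None:
        return None
    on_disk, new, replay = (int(g) for g in m.groups())
    return {
        "on_disk": on_disk,
        "new": new,
        "replay": bool(replay),
        # True when the replayed transno is lower than the one already on disk.
        "regressed": new < on_disk,
    }


def parse_hook_failure(line):
    """Return (rc, errno_name) for an osd_trans_stop() hook failure, or None."""
    m = HOOK_RE.search(line)
    if m is None:
        return None
    rc = int(m.group(1))
    return rc, LINUX_ERRNO.get(-rc, "unknown")


if __name__ == "__main__":
    # Two lines copied from the MDS console log above.
    sample = [
        "08:22:08:LustreError: 16656:0:(mdt_recovery.c:417:mdt_last_rcvd_update()) "
        "Trying to overwrite bigger transno:on-disk: 68719476750, new: 4295139013 "
        "replay: 1. see LU-617.",
        "08:22:08:LustreError: 16656:0:(osd_handler.c:829:osd_trans_stop()) "
        "Failure in transaction hook: -75",
    ]
    for line in sample:
        print(parse_transno_conflict(line) or parse_hook_failure(line))
```

Applied to the log above, both mdt_last_rcvd_update() lines report a new transno lower than the on-disk one during replay, and each is followed by a hook failure whose code matches EOVERFLOW under the assumed numbering.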

Maloo report: https://maloo.whamcloud.com/test_sets/a19d48ee-7d78-11e2-85d0-52540035b04c



 Comments   
Comment by Jian Yu [ 27/Feb/13 ]

Hi Yang Sheng,

Could you please look into the console logs of the failure in this ticket and check whether the patch for LU-951 needs to be cherry-picked or ported to the Lustre b2_1 branch? Thanks.

Comment by Yang Sheng [ 27/Feb/13 ]

Yes, yujian. We can port it to b2_1 once the LU-951 patch passes inspection and lands.

Comment by Jian Yu [ 16/Mar/13 ]

Hi Yang Sheng,

Could you please backport http://review.whamcloud.com/5531 to the Lustre b2_1 branch? Thanks.

Comment by Yang Sheng [ 16/Mar/13 ]

Patch committed to: http://review.whamcloud.com/5743

Comment by Peter Jones [ 18/Mar/13 ]

Landed for 2.1.5
