[LU-16431] Close request is dropped during replay Created: 23/Dec/22  Updated: 03/Feb/23  Resolved: 03/Feb/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Andriy Skulysh Assignee: Andriy Skulysh
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The problem reproduces with replay-dual test 26. Debug log from lustre-MDT0000 during recovery:

00010000:00080000:0.0:1647814116.268086:0:14276:0:(ldlm_lib.c:2020:check_for_next_transno()) waking for next (8589940272)
00010000:00000001:0.0:1647814116.268087:0:14276:0:(ldlm_lib.c:2380:get_next_transno()) Process entered
00000020:00080000:0.0:1647814116.268087:0:14276:0:(update_recovery.c:605:distribute_txn_get_next_transno()) lustre-MDT0000: Next update transno 8589940272
00010000:00000001:0.0:1647814116.268088:0:14276:0:(ldlm_lib.c:2397:get_next_transno()) Process leaving (rc=8589940272 : 8589940272 : 200001630)
00000020:00000001:0.0:1647814116.268091:0:14276:0:(update_recovery.c:1313:distribute_txn_replay_handle()) Process entered
00000020:00080000:0.0:1647814116.268098:0:14276:0:(update_records.c:74:update_records_dump()) master transno = 8589940272 batchid = 4294967706 flags = 0 ops = 4 params = 3
00000020:00080000:0.0:1647814116.268101:0:14276:0:(update_records.c:93:update_records_dump()) update 0th [0x200000403:0x3:0x0] attr_set params_count = 1
00000020:00080000:0.0:1647814116.268103:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
00000020:00080000:0.0:1647814116.268106:0:14276:0:(update_records.c:93:update_records_dump()) update 1th [0x200000400:0xa:0x0] attr_set params_count = 1
00000020:00080000:0.0:1647814116.268108:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
00000020:00080000:0.0:1647814116.268109:0:14276:0:(update_records.c:93:update_records_dump()) update 2th [0x240000401:0xa:0x0] attr_set params_count = 1
00000020:00080000:0.0:1647814116.268111:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b568a 0th off = 0 size = 208
00000020:00080000:0.0:1647814116.268113:0:14276:0:(update_records.c:93:update_records_dump()) update 3th [0x200000001:0x15:0x0] write params_count = 2
00000020:00080000:0.0:1647814116.268115:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b5762 0th off = 1 size = 32
00000020:00080000:0.0:1647814116.268116:0:14276:0:(update_records.c:108:update_records_dump()) param = ffffa00a600b578a 1th off = 2 size = 8
(update_recovery.c:716:update_is_committed()) Update of [0x200000403:0x3:0x0]on MDT0 is not committed
00010000:00080000:0.0:1647814116.269438:0:14276:0:(ldlm_lib.c:2016:check_for_next_transno()) waking for duplicate req (8589940272)
00010000:00000001:0.0:1647814116.269439:0:14276:0:
00010000:00080000:0.0:1647814116.269450:0:14276:0:(ldlm_lib.c:2418:drop_duplicate_replay_req()) @@@ remove t8589940272 from 192.168.101.19@tcp because of duplicate update records are found.
  req@ffffa00af4688050 x1727858180388096/t0(8589940272) o35->c9ceeeb0-66fe-f7a3-7273-2f5ddf46ff11@192.168.101.19@tcp:197/0 lens 392/0 e 0 to 0 dl 1647814122 ref 1 fl Complete:/4/ffffffff rc 0/-1 job:'dbench.0'
00010000:00020000:0.0:1647814116.269456:0:14276:0:(ldlm_lib.c:2432:drop_duplicate_replay_req()) @@@ wrong opc 35 from 192.168.101.19@tcp
  req@ffffa00af4688050 x1727858180388096/t0(8589940272) o35->c9ceeeb0-66fe-f7a3-7273-2f5ddf46ff11@192.168.101.19@tcp:197/0 lens 392/0 e 0 to 0 dl 1647814122 ref 1 fl Complete:/4/ffffffff rc 0/-1 job:'dbench.0'
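
The log above ends with two drop_duplicate_replay_req() messages for the same request: one removing t8589940272 because duplicate update records were found, and one reporting "wrong opc 35", the opcode of the close request named in the ticket title. As a minimal sketch for spotting the same pattern in other debug logs, the following Python helper pairs the two messages by peer. The function name, the regular expressions and the pairing rule are illustrative assumptions based only on the message text quoted in this ticket; they are not part of Lustre or its tooling.

```python
import re

# Hypothetical patterns, derived only from the two messages quoted in this ticket.
REMOVE_RE = re.compile(
    r"drop_duplicate_replay_req\(\)\) @@@ remove t(\d+) from (\S+)")
WRONG_OPC_RE = re.compile(
    r"drop_duplicate_replay_req\(\)\) @@@ wrong opc (\d+) from (\S+)")


def find_dropped_replays(lines):
    """Collect replay requests that were dropped as duplicates.

    Each entry records the transno and peer from the "remove t..." message.
    If a "wrong opc" message from the same peer follows, its opcode is
    attached to the most recent entry (an assumed pairing rule).
    """
    dropped = []
    for line in lines:
        m = REMOVE_RE.search(line)
        if m:
            dropped.append({"transno": int(m.group(1)),
                            "peer": m.group(2),
                            "wrong_opc": None})
            continue
        m = WRONG_OPC_RE.search(line)
        if m and dropped and dropped[-1]["peer"] == m.group(2):
            dropped[-1]["wrong_opc"] = int(m.group(1))
    return dropped


# Shortened copies of the relevant lines from the log in this ticket.
SAMPLE_LOG = [
    "(ldlm_lib.c:2016:check_for_next_transno()) waking for duplicate req (8589940272)",
    "(ldlm_lib.c:2418:drop_duplicate_replay_req()) @@@ remove t8589940272 "
    "from 192.168.101.19@tcp because of duplicate update records are found.",
    "(ldlm_lib.c:2432:drop_duplicate_replay_req()) @@@ wrong opc 35 "
    "from 192.168.101.19@tcp",
]

if __name__ == "__main__":
    for entry in find_dropped_replays(SAMPLE_LOG):
        print("t{transno} (0x{transno:x}) from {peer}, opc {wrong_opc}".format(**entry))
```

Run against the sample lines, the helper reports a single dropped request with transno 8589940272 (0x200001630, matching the hexadecimal value printed by get_next_transno() in the log) and opcode 35.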


 Comments   
Comment by Gerrit Updater [ 23/Dec/22 ]

"Andriy Skulysh <andriy.skulysh@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49506
Subject: LU-16431 mds: Close request is dropped during replay
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: edf894b57a493e53f3bec3b7ddcc72b3aaf65a61

Comment by Gerrit Updater [ 03/Feb/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49506/
Subject: LU-16431 mds: Close request is dropped during replay
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a801cee0ce9add2cc652b3c5f1da1a14d43748e9

Comment by Peter Jones [ 03/Feb/23 ]

Landed for 2.16
