Details
-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
None
-
3
-
9223372036854775807
Description
We ran IOR with LNet router failure simulation and encountered data corruption which seems to be reproducible on master.
The following scenario happens:
- a client thread writes some data to page N of file X
- page N is transfered to the OSS, the processing thread sleeps somewhere
- the original BRW request timeouts and the client resends page N
- page N is successfully written to disk, the client receives the reply and clears PG_Writeback
- a client thread writes different data to the same page N of file X
- page N with the new data is successfully written to disk, the client receives the reply and clears PG_Writeback
- the OSS thread from step 2 wakes up and writes stale data to disk [data corruption]
A reproducer will be uploaded shortly.
Attachments
Activity
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32281/
Subject: LU-10958 ofd: data corruption due to RPC reordering
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 35679a730bf0b7a8d4ce84cadc3ecc7c289ef491
Proposed for 2.14.0. With -RC1 already available, I realize that its candidacy might not hold.
No, I don't think so. There's nothing wrong in the md layer except the delay itself which makes it possible for the resent RPC and the RPC after it to complete before the initial delayed RPC. This delay is the analogue of OBD_FAIL_OST_BRW_PAUSE_BULK2 from https://review.whamcloud.com/#/c/32165/6/lustre/tests/recovery-small.sh
Delay can happen anywhere on a non-RTOS system.
In our scenario the real delay happens in the dm/mdraid layer after the bulk transfer succeeded.
Isn't that a problem of the DM/mdraid later that it is reordering writes incorrectly? If the OST thread submit writes to disk as A, A', B, but the disk writes A', B, A because A was blocked in the IO stack, then there isn't much we can do about it.
Hi Peter,
Can we get some attention on this bug? It's low hanging fruit.
-cf
the logic should be generic enough. I have a passing memory about bulk xids being incremented on resend, but that should not affect detection of double request processing.
This topic was raised again during review. It seems like ptlrpc_server_check_resend_in_progress() was written in such a way as not to drop brw resends in progress since bulk transfer may fail after match bits get changed. In our scenario the real delay happens in the dm/mdraid layer after the bulk transfer succeeded.
No -EBUSY from ptlrpc_server_request_add() calls in the attached log:
root@panda:/tmp/134# egrep "server_request_add.*leaving" recovery-small.test_134.debug_log.panda-testbox.1524758128.log 00000100:00000001:1.0:1524758084.215027:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758084.296625:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.372998:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.374387:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.375747:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.377223:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.378622:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.379762:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.381124:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758084.618182:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) ... 00000100:00000001:2.0:1524758109.428916:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758109.429419:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758109.429583:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758109.429676:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758109.429710:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758109.429816:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758109.429839:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758109.429948:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758109.430084:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758109.430352:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758109.430510:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758109.491963:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758111.940049:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758111.940090:0:28570:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.429713:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.429848:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758114.429912:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758114.430457:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.430567:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758114.430693:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.430713:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758114.431758:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.431789:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758114.432571:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758114.433376:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758114.433438:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758114.499932:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758116.948122:0:29382:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758116.948779:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758119.430075:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.430712:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.430857:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758119.430972:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758119.431110:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758119.431420:0:28566:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758119.432337:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758119.432462:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758119.432919:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.433354:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.433531:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.434360:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758119.508004:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758121.956131:0:29382:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758121.956785:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.431106:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.431686:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.431826:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.432540:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.432774:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.432916:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.433065:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.433192:0:28957:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758124.433610:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.433857:0:28958:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.435074:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758124.435867:0:28839:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758124.515970:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.464702:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.464840:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.465510:0:28256:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.465654:0:28256:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758126.465754:0:28255:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.465784:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.465793:0:28256:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.465928:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.466064:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.466192:0:28568:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.468126:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.468374:0:28252:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.481228:0:28573:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.483509:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.521548:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758126.591581:0:28259:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758126.599733:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758126.603148:0:29095:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758126.606446:0:30161:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758126.613821:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.629660:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758126.651194:0:28259:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758126.964226:0:29382:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1524758126.964243:0:30160:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758128.709591:0:29095:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758128.759534:0:28588:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758128.762873:0:29095:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:3.0:1524758128.763899:0:28567:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758128.767953:0:28572:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1524758128.780653:0:28572:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1524758128.785932:0:28258:0:(service.c:1732:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0)
Andrew Perepechko (c17827@cray.com) uploaded a new patch: https://review.whamcloud.com/32281
Subject: LU-10958 ofd: data corruption due to RPC reordering
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f0f14e60a61c227734c8cebfe2cb1c0cd2397f20
I uploaded the logs from my text box as 134.tar.bz2. Something like the following happens:
root@panda:/tmp/134# egrep "tgt_brw_write|tgt_brw_read" recovery-small.test_134.debug_log.panda-testbox.1524758128.log (1) writing a file of 4096 zeros 00000020:00000001:1.0:1524758106.298943:0:28572:0:(tgt_handler.c:2270:tgt_brw_write()) Process entered 00000020:00000040:1.0:1524758106.299164:0:28572:0:(tgt_handler.c:2385:tgt_brw_write()) Client use short io for data transfer, size = 4096 00000020:00000040:1.0:1524758106.299205:0:28572:0:(tgt_handler.c:2446:tgt_brw_write()) Checksum 1 from 12345-0@lo OK: 10000001 00000020:00000001:1.0:1524758106.341298:0:28572:0:(tgt_handler.c:2514:tgt_brw_write()) Process leaving (rc=0 : 0 : 0) (2) copying this file to /tmp for later cmp 00000020:00000001:1.0:1524758106.362982:0:28572:0:(tgt_handler.c:1956:tgt_brw_read()) Process entered 00000020:00008000:1.0:1524758106.378496:0:28572:0:(tgt_handler.c:2089:tgt_brw_read()) checksum at read origin: 10000001 00000020:00000001:1.0:1524758106.378764:0:28572:0:(tgt_handler.c:2174:tgt_brw_read()) Process leaving (rc=4096 : 4096 : 1000) 00000020:00000001:1.0:1524758106.381031:0:28572:0:(tgt_handler.c:1956:tgt_brw_read()) Process entered 00000020:00008000:1.0:1524758106.381160:0:28572:0:(tgt_handler.c:2089:tgt_brw_read()) checksum at read origin: 1 00000020:00000001:1.0:1524758106.381374:0:28572:0:(tgt_handler.c:2174:tgt_brw_read()) Process leaving (rc=0 : 0 : 0) (3) initial brw rpc 00000020:00000001:1.0:1524758106.521557:0:28572:0:(tgt_handler.c:2270:tgt_brw_write()) Process entered 00000020:00000040:1.0:1524758106.521743:0:28572:0:(tgt_handler.c:2385:tgt_brw_write()) Client use short io for data transfer, size = 4096 00000020:00000040:1.0:1524758106.521785:0:28572:0:(tgt_handler.c:2446:tgt_brw_write()) Checksum 2 from 12345-0@lo OK: 4ebccd1f (4) here things get stalled and the client resends this rpc 00000020:00000001:1.0:1524758126.481399:0:30161:0:(tgt_handler.c:2270:tgt_brw_write()) Process entered 00000020:00000020:1.0:1524758126.481435:0:30161:0:(tgt_handler.c:2361:tgt_brw_write()) @@@ clear resent/replay req grant info req@ffff880125623c50 x1598824716964176/t0(0) o4->60f13dec-9165-f217-961e-91740a2150d5@0@lo:0/0 lens 4704/448 e 0 to 0 dl 1524758146 ref 1 fl Interpret:/2/0 rc 0/0 00000020:00000040:1.0:1524758126.481634:0:30161:0:(tgt_handler.c:2385:tgt_brw_write()) Client use short io for data transfer, size = 4096 00000020:00000001:1.0:1524758126.482205:0:30161:0:(tgt_handler.c:2514:tgt_brw_write()) Process leaving (rc=0 : 0 : 0) (5) the client is now able to rewrite old data 00000020:00000001:0.0:1524758126.606629:0:30161:0:(tgt_handler.c:2270:tgt_brw_write()) Process entered 00000020:00000040:0.0:1524758126.606824:0:30161:0:(tgt_handler.c:2385:tgt_brw_write()) Client use short io for data transfer, size = 4096 00000020:00000040:0.0:1524758126.606865:0:30161:0:(tgt_handler.c:2446:tgt_brw_write()) Checksum 4 from 12345-0@lo OK: 10000001 00000020:00000001:0.0:1524758126.612424:0:30161:0:(tgt_handler.c:2514:tgt_brw_write()) Process leaving (rc=0 : 0 : 0) (6) the data from the initial brw rpc get written to disk (corruption) 00000020:00000001:1.0:1524758128.533347:0:28572:0:(tgt_handler.c:2514:tgt_brw_write()) Process leaving (rc=0 : 0 : 0) (7) the final data verification 00000020:00000001:1.0:1524758128.768109:0:28572:0:(tgt_handler.c:1956:tgt_brw_read()) Process entered 00000020:00008000:1.0:1524758128.778266:0:28572:0:(tgt_handler.c:2089:tgt_brw_read()) checksum at read origin: 4ebccd1f 00000020:00000001:1.0:1524758128.778548:0:28572:0:(tgt_handler.c:2174:tgt_brw_read()) Process leaving (rc=4096 : 4096 : 1000) 00000020:00000001:1.0:1524758128.780809:0:28572:0:(tgt_handler.c:1956:tgt_brw_read()) Process entered 00000020:00008000:1.0:1524758128.780932:0:28572:0:(tgt_handler.c:2089:tgt_brw_read()) checksum at read origin: 1 00000020:00000001:1.0:1524758128.781146:0:28572:0:(tgt_handler.c:2174:tgt_brw_read()) Process leaving (rc=0 : 0 : 0
Hello,
Is a backport planned for the b2_12 branch for this issue ?