Details
- Bug
- Resolution: Fixed
- Minor
Description
The failure is reproducible with the DoM test from sanity (test 273b), but not on every run.
The error relates to the DoM component and shows up as ESTALE on a read.
Jan 20 17:21:38 devvm-rk1 kernel: Lustre: DEBUG MARKER: == sanity test 273b: DoM: race writeback and object destroy ========================================================== 17:21:38 (1737390098)
Jan 20 17:21:41 devvm-rk1 kernel: LustreError: lustre-MDT0000-mdc-ffffa0816bceb000: operation ost_read to node 192.168.122.55@tcp failed: rc = -116
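The return code in the console message can be decoded mechanically. The following is a minimal sketch, not part of the original report: the helper name `decode_failed_op` and the regular expression are illustrative assumptions, and the errno name comes from Python's standard `errno` table for the host platform (on Linux, 116 is ESTALE).

```python
import errno
import re

# Illustrative pattern for the LustreError console line quoted above.
FAILED_OP = re.compile(r"operation (\w+) to node (\S+) failed: rc = (-?\d+)")


def decode_failed_op(line):
    """Return (operation, node, errno name), or None if the line does not match."""
    m = FAILED_OP.search(line)
    if m is None:
        return None
    op, node, rc = m.group(1), m.group(2), int(m.group(3))
    # Kernel return codes are negated errno values, so flip the sign first.
    name = errno.errorcode.get(-rc, "unknown")
    return op, node, name


line = ("LustreError: lustre-MDT0000-mdc-ffffa0816bceb000: operation ost_read "
        "to node 192.168.122.55@tcp failed: rc = -116")
print(decode_failed_op(line))
```

On a Linux host the third element of the result should be the name ESTALE, matching the description above.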
The sequence of events was as follows:
- the client closes a file
- the MDT sends a blocking AST to the client for an IBIT 0x40 lock
- the client processes the blocking AST (BLAST) and calls
mdc_lock_flush()->mdc_lock_discard_pages()->osc_page_gang_lookup()->osc_brw_prep_request()
This brw is for a single byte and lies outside the DoM component size, so the server does not find a lock for it and returns ESTALE.
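The condition described above can be sketched as a simple range check. This is an illustration only, not Lustre code: `brw_beyond_dom` is a hypothetical helper, and the 1 MiB DoM component size is an assumption inferred from the offset in the debug log, not a value stated in the report.

```python
def brw_beyond_dom(offset, dom_size):
    """True if a brw starting at `offset` lies entirely past the DoM component.

    The DoM component covers bytes [0, dom_size), so any request that starts
    at or after dom_size touches no byte of the component.
    """
    return offset >= dom_size


# Assumption: a 1 MiB DoM component (hypothetical, inferred from the log offset).
DOM_SIZE = 1048576

print(brw_beyond_dom(1048576, DOM_SIZE))  # True: the one-byte read from the log
print(brw_beyond_dom(0, DOM_SIZE))        # False: a read at the start of the file
```

Under that assumption, the one-byte read at offset 1048576 starts exactly at the first byte past the DoM component, which is the case where the server has no matching lock.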
The interesting messages are the following:
0000008:00000020:2.0:1737390101.104518:0:1928870:0:(osc_request.c:1833:osc_brw_prep_request()) Using short io for data transfer, size = 1
00000008:00000020:2.0:1737390101.104520:0:1928870:0:(osc_request.c:741:osc_announce_cached()) lustre-MDT0000-mdc-ffffa0816bceb000: dirty: 0 undirty: 1008074752 dropped 0 grant: 30810112 cl_lost_grant 0
00000008:00100000:2.0:1737390101.104522:0:1928870:0:(osc_request.c:2003:osc_brw_prep_request()) brw rpc ffffa08172c109c0 - object 0x200004284:2265 offset 1048576<>1048577
00000008:00000001:2.0:1737390101.104522:0:1928870:0:(osc_request.c:2006:osc_brw_prep_request()) Process leaving (rc=0 : 0 : 0)
00000100:00000001:2.0:1737390101.104523:0:1928870:0:(jobid.c:937:lustre_get_jobid()) Process entered
00000100:00000001:2.0:1737390101.104524:0:1928870:0:(jobid.c:977:lustre_get_jobid()) Process leaving (rc=0 : 0 : 0)
00000100:00000040:2.0:1737390101.104527:0:1928870:0:(ptlrpcd.c:300:ptlrpcd_add_req()) @@@ add req [00000000d3889700] to pc [ptlrpcd_01_01+1] req@ffffa08172c109c0 x1821780884276864/t0(0) o3->lustre-MDT0000-mdc-ffffa0816bceb000@192.168.122.55@tcp:13/10 lens 488/448 e 0 to 0 dl 0 ref 1 fl New:NQU/200/ffffffff rc 0/-1 job:'ldlm_bl.0' uid:0 gid:0
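To pull the object and byte range out of the "brw rpc" message when scanning a larger debug log, a small parser along these lines may help. It is a sketch under assumptions: the helper name `parse_brw` and the regular expression are illustrative, the two parts of the object identifier are labeled `seq` and `oid` as a guess at their meaning, and the end offset is treated as exclusive, which is consistent with the "size = 1" reported for the same request.

```python
import re

# Illustrative pattern for the "brw rpc" debug message quoted above.
BRW_RPC = re.compile(r"object (0x[0-9a-fA-F]+):(\d+) offset (\d+)<>(\d+)")


def parse_brw(line):
    """Return the object id parts and byte range of a brw message, or None."""
    m = BRW_RPC.search(line)
    if m is None:
        return None
    start, end = int(m.group(3)), int(m.group(4))
    return {
        "seq": int(m.group(1), 16),  # assumed meaning of the first id part
        "oid": int(m.group(2)),      # assumed meaning of the second id part
        "start": start,
        "end": end,
        "length": end - start,       # end offset treated as exclusive
    }


msg = ("(osc_request.c:2003:osc_brw_prep_request()) brw rpc ffffa08172c109c0 "
       "- object 0x200004284:2265 offset 1048576<>1048577")
print(parse_brw(msg))
```

For the message above, the computed length is 1 and the start offset is 1048576, the two values that identify the stray one-byte read.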