[LU-11150] Use LCK_CR lock mode when send change layout intent to MDS Created: 17/Jul/18  Updated: 16/Aug/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Qian Yingjin (Inactive) Assignee: Qian Yingjin
Resolution: Unresolved Votes: 0
Labels: None

Attachments: HTML File log    

 Description   

In the current implementation, the client uses LCK_EX mode for the layout lock when it issues a layout change intent RPC to the MDS, so the MDS grants an LCK_EX layout lock to the client. However, when the I/O is subsequently restarted, it refreshes the layout, and the refresh only matches a locally cached layout lock in LCK_CR | LCK_CW | LCK_PR | LCK_PW mode. As a result, the client reacquires a CR layout lock from the MDS, and the previously granted EX layout lock is cancelled via the lock blocking callback.
This patch avoids this kind of unnecessary lock conflict by using LCK_CR mode directly when issuing the layout change intent RPC to the MDS. When the MDS receives the request, it first takes an EX layout lock, changes the layout according to the write intent, and releases the EX layout lock. It then returns a CR layout lock together with the layout information to the client for later I/O.



 Comments   
Comment by Gerrit Updater [ 17/Jul/18 ]

Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/32824
Subject: LU-11150 llite: Use LCK_CR lock mode when change layout
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6870ba724736c4876a68b0e093f7445d9e02d965

Comment by Qian Yingjin (Inactive) [ 18/Jul/18 ]

The attachment contains the log of an FLR write.

The commands to collect the log are as follows:

 

lfs mirror create -N2 /mnt/lustre/test
lctl dk
echo "QQQQ" > /mnt/lustre/test
lctl dk > log

 

From the log, we can see:

00000002:00010000:0.0:1531750260.797779:0:32684:0:(mdc_locks.c:722:mdc_finish_enqueue()) ### layout lock returned by: layout, lvb_len: 240 ns: lustre-MDT0000-mdc-ffff88001fc95000 lock: ffff88001b094ec0/0x7957eded4f46b416 lrc: 3/0,1 mode: EX/EX res: [0x200000401:0x1:0x0].0x0 bits 0x8/0x0 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x7957eded4f46b41d expref: -99 pid: 32684 timeout: 0 lvb_type: 3

 

00000002:00010000:0.0:1531750260.797782:0:32684:0:(mdc_locks.c:1049:mdc_finish_intent_lock()) ### matching against this ns: lustre-MDT0000-mdc-ffff88001fc95000 lock: ffff88001b094ec0/0x7957eded4f46b416 lrc: 3/0,1 mode: EX/EX res: [0x200000401:0x1:0x0].0x0 bits 0x8/0x0 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x7957eded4f46b41d expref: -99 pid: 32684 timeout: 0 lvb_type: 3

The MDS returns the EX layout lock to the client, and this granted layout lock is cancelled later when the layout is refreshed.

Comment by Jinshan Xiong [ 18/Jul/18 ]

That means there is a problem in this particular code path. Please go to the MDT, find the corresponding code, and make the fix there.

The reason the client requests an EX lock is for early cancel.
