[LU-14711] Canceling lock with a lot of cached data can take a lot of time Created: 26/May/21 Updated: 20/Jan/23 Resolved: 04/Oct/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.0 |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Oleg Drokin | Assignee: | Oleg Drokin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
On clients with large amounts of RAM it's possible for a large, thinly-striped file to end up with a single object holding a lot of cached pages, all covered by one DLM lock. When such a lock is then canceled, iterating over all of those pages takes a long time during which there are no RPCs to be sent (e.g. because we are truncating the file, or because the lock is PR, so pages are simply discarded rather than written out). Here's a simple testcase:

lfs setstripe /mnt/lustre -c 2
dd if=/dev/zero of=/mnt/lustre/testfile1 bs=4096k count=1
dd if=/dev/zero of=/mnt/lustre/testfile2 bs=4096k count=800
mv /mnt/lustre/testfile1 /mnt/lustre/testfile2

Now the destroy of the 3.2G file causes the objects on both stripes to be destroyed, and according to the logs, even at the default log level, the process takes 4.7s. So if the file were 30x bigger (100G), we would already spend 141 seconds just iterating over pages on this particular machine.

00010000:00010000:0.0:1622008589.369887:0:5816:0:(ldlm_request.c:1150:ldlm_cli_cancel_local()) ### client-side cancel ns: lustre-OST0001-osc-ffff880316ae0800 lock: ffff88039a18cd80/0xfe254c0b2e6873ba lrc: 3/0,0 mode: PW/PW res: [0x19:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x428400010000 nid: local remote: 0xfe254c0b2e6873c1 expref: -99 pid: 11550 timeout: 0 lvb_type: 1
00000080:00200000:0.0:1622008589.369896:0:5816:0:(vvp_io.c:1717:vvp_io_init()) [0x200000401:0x18:0x0] ignore/verify layout 1/0, layout version 0 restore needed 0
00000080:00200000:0.0:1622008594.161234:0:5816:0:(vvp_io.c:313:vvp_io_fini()) [0x200000401:0x18:0x0] ignore/verify layout 1/0, layout version 0 need write layout 0, restore needed 0
00010000:00010000:0.0:1622008594.161266:0:5816:0:(ldlm_request.c:1209:ldlm_cancel_pack()) ### packing ns: lustre-OST0001-osc-ffff880316ae0800 lock: ffff88039a18cd80/0xfe254c0b2e6873ba lrc: 2/0,0 mode: --/PW res: [0x19:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x4c69400010000 nid: local remote: 0xfe254c0b2e6873c1 expref: -99 pid: 11550 timeout: 0 lvb_type: 1

We need to send something to the server if the cancel is taking a long time, just to prolong the lock and indicate that we are still here. This is not ideal: an instant-cancel RPC sounds better on the surface, but it is trickier to implement in all cases except DESTROY, where we are sure no more data can be added to the mapping. |
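For illustration, here is a minimal sketch of the "send something to the server while the cancel is still iterating" idea. This is not the actual Lustre code: the helpers next_cached_page(), discard_page() and send_lock_keepalive(), and the 30-second threshold, are all hypothetical stand-ins.

```c
#include <time.h>

#define KEEPALIVE_INTERVAL 30	/* seconds; hypothetical threshold */

struct lock;		/* stands in for a client-side DLM lock */
struct page;		/* stands in for a cached page */
struct page_list;	/* stands in for the object's page cache */

/* hypothetical helpers, not real Lustre functions */
extern struct page *next_cached_page(struct page_list *pl);
extern void discard_page(struct page *pg);
extern void send_lock_keepalive(struct lock *lk);

static void discard_pages_with_keepalive(struct lock *lk,
					 struct page_list *pl)
{
	time_t last = time(NULL);
	struct page *pg;

	while ((pg = next_cached_page(pl)) != NULL) {
		discard_page(pg);

		/* A long scan with no RPC traffic looks like a dead
		 * client to the server; ping it so the pending cancel
		 * is not timed out. */
		if (time(NULL) - last >= KEEPALIVE_INTERVAL) {
			send_lock_keepalive(lk);
			last = time(NULL);
		}
	}
}
```

The point is simply that the keepalive piggybacks on the existing page scan, costing one time check per page rather than a separate thread or timer.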
| Comments |
| Comment by Oleg Drokin [ 26/May/21 ] |
|
Tangentially related to speeding up the processing is |
| Comment by Andreas Dilger [ 26/May/21 ] |
|
Per earlier discussion, it may be possible that sending a zero-byte read or write to the OST with the cancelling DLM lock handle would be enough to prolong the lock timeout on the OSS, and avoid eviction. However, reducing the time that page eviction takes would also be desirable, such as |
| Comment by Oleg Drokin [ 28/May/21 ] |
|
Zero-sized IO sadly does not work, so I'll do a 1-byte IO with a "discard me" flag: old servers not aware of the flag will do the IO, new servers will discard the IO altogether. As I am adding a patch here, I just realized that merely prolonging the lock from the client side is still only a half measure: the client that sent the lock cancel is still going to time out in 600 seconds (at_max). They will resend, so at least there are no evictions, but the chatter in the logs will be substantial. Something to keep in mind. |
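As a rough sketch of the server-side half of that scheme (hypothetical names throughout: OBD_BRW_DISCARD, prolong_locks(), do_brw_io(); the actual change is in the Gerrit patches below), the backward compatibility falls out of old servers simply never testing the new flag:

```c
/* Hypothetical wire flag: tells a new server to drop the payload. */
#define OBD_BRW_DISCARD 0x1000

struct brw_request {
	unsigned int flags;
	/* ... handle of the lock being canceled, 1-byte payload ... */
};

/* hypothetical helpers, not real Lustre functions */
extern void prolong_locks(struct brw_request *req);
extern int do_brw_io(struct brw_request *req);

static int handle_brw(struct brw_request *req)
{
	/* The arrival of the RPC itself is the "still alive" signal
	 * from the cancelling client: refresh the lock timeout first. */
	prolong_locks(req);

	/* A new server understands the flag and skips the IO; an old
	 * server never checks it and just performs the harmless
	 * 1-byte write. */
	if (req->flags & OBD_BRW_DISCARD)
		return 0;

	return do_brw_io(req);
}
```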
| Comment by Gerrit Updater [ 28/May/21 ] |
|
Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43857 |
| Comment by Gerrit Updater [ 29/May/21 ] |
|
Oleg Drokin (green@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43869 |
| Comment by Gerrit Updater [ 14/Jun/21 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43857/ |
| Comment by Gerrit Updater [ 13/Aug/21 ] |
|
"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44654 |
| Comment by Gerrit Updater [ 17/Sep/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/43869/ |
| Comment by Gerrit Updater [ 04/Oct/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44654/ |
| Comment by Peter Jones [ 04/Oct/21 ] |
|
Landed for 2.15 |