[LU-14711] Canceling lock with a lot of cached data can take a lot of time - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.15.0
Affects Version/s: Lustre 2.15.0
Labels:
None

Severity:
3

Description

On clients with large amounts of RAM, a large thinly-striped file can end up with a single object that has a lot of pages cached.

When such a lock is then canceled, iterating over all of those pages takes a long time, during which there are no RPCs to be sent (e.g. because we are truncating the lock or because the lock is PR).

Here's a simple test case:

lfs setstripe /mnt/lustre -c 2
dd if=/dev/zero of=/mnt/lustre/testfile1 bs=4096k count=1
dd if=/dev/zero of=/mnt/lustre/testfile2 bs=4096k count=800
mv /mnt/lustre/testfile1 /mnt/lustre/testfile2

Now the destroy of the 3.2G file causes both stripes to be destroyed, and according to the logs, even at the default log level, the process takes 4.7s. If the file were 30x bigger (100G), we would already spend 141 seconds just iterating over pages on this particular machine.

00010000:00010000:0.0:1622008589.369887:0:5816:0:(ldlm_request.c:1150:ldlm_cli_cancel_local()) ### client-side cancel ns: lustre-OST0001-osc-ffff880316ae0800 lock: ffff88039a18cd80/0xfe254c0b2e6873ba lrc: 3/0,0 mode: PW/PW res: [0x19:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x428400010000 nid: local remote: 0xfe254c0b2e6873c1 expref: -99 pid: 11550 timeout: 0 lvb_type: 1
00000080:00200000:0.0:1622008589.369896:0:5816:0:(vvp_io.c:1717:vvp_io_init()) [0x200000401:0x18:0x0] ignore/verify layout 1/0, layout version 0 restore needed 0
00000080:00200000:0.0:1622008594.161234:0:5816:0:(vvp_io.c:313:vvp_io_fini()) [0x200000401:0x18:0x0] ignore/verify layout 1/0, layout version 0 need write layout 0, restore needed 0
00010000:00010000:0.0:1622008594.161266:0:5816:0:(ldlm_request.c:1209:ldlm_cancel_pack()) ### packing ns: lustre-OST0001-osc-ffff880316ae0800 lock: ffff88039a18cd80/0xfe254c0b2e6873ba lrc: 2/0,0 mode: --/PW res: [0x19:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->1048575) flags: 0x4c69400010000 nid: local remote: 0xfe254c0b2e6873c1 expref: -99 pid: 11550 timeout: 0 lvb_type: 1
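
As a cross-check of the 4.7s figure, the gap between the vvp_io_init() and vvp_io_fini() timestamps in the log above can be computed directly. This is only a minimal sketch: the helper name elapsed_us is made up for illustration, and the 30x scaling assumes the time grows linearly with the amount of cached data, as the estimate above does.

```shell
#!/bin/sh
# Timestamps (seconds.microseconds) copied from the vvp_io_init() and
# vvp_io_fini() lines of the debug log above.
start=1622008589.369896
end=1622008594.161234

# elapsed_us START END: print END - START in whole microseconds.
# (Hypothetical helper, not part of Lustre.) Integer arithmetic avoids
# floating-point rounding on epoch-sized values.
elapsed_us() {
    s_sec=${1%.*}; s_usec=${1#*.}
    e_sec=${2%.*}; e_usec=${2#*.}
    # Prefixing "1" and subtracting 1000000 drops leading zeros so the
    # 6-digit fraction is never parsed as octal.
    echo $(( ($e_sec - $s_sec) * 1000000 + (1$e_usec - 1000000) - (1$s_usec - 1000000) ))
}

us=$(elapsed_us "$start" "$end")
printf 'elapsed: %d.%06d s\n' $((us / 1000000)) $((us % 1000000))

# Assumption: linear scaling with file size (30x the 3.2G test file).
scaled=$((us * 30))
printf '30x estimate: %d.%06d s\n' $((scaled / 1000000)) $((scaled % 1000000))
```

The unrounded gap is about 4.79s, so the 30x estimate comes out slightly above the 141 seconds obtained from the rounded 4.7s.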

We need to send something to the server if the cancel is taking a long time, just to prolong the lock and indicate that we are still there. This is not ideal: sending the cancel RPC immediately sounds better on the surface, but it is trickier to implement in every case except DESTROY, where we are sure no more data can be added to the mapping.

Issue Links

is duplicated by

LU-14885 sanity test_903: __ptlrpc_prep_bulk_page()) ASSERTION( page != ((void *)0) ) failed

Resolved

is related to

LU-11290 Batch callbacks in osc_page_gang_lookup

Resolved

LU-13134 try to use slab allocation for cl_page

Resolved

People

Assignee: Oleg Drokin

Reporter: Oleg Drokin

Votes: 0

Watchers: 9

Dates

Created: 26/May/21 7:24 PM

Updated: 03/Jul/24 4:33 AM

Resolved: 04/Oct/21 5:14 PM