Details
-
Bug
-
Resolution: Unresolved
-
Medium
Description
Add a sanity test (test_101k) that verifies drop_caches actually evicts Lustre client page cache pages.
Original investigation reframed: Lustre's drop_caches behavior is correct as designed. The kernel-level reclaim path (mapping_evict_folio in mm/truncate.c) refuses to evict a folio whose refcount exceeds (nr_pages + has_private + 1); otherwise it goes on to call release_folio. With Lustre's normal page state, a folio in the page cache with no in-flight RPC has refcount 3 (page cache + vvp_page_init's get_page + find_lock_entries elevation). That does not exceed the threshold, so the folio is evicted via ll_release_folio -> do_release_page.
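To make the arithmetic concrete, here is a toy Python model of the refcount gate described above. It is only a sketch: the function names are hypothetical, has_private=1 is assumed because that is what makes the refcount-3 case pass, and the other conditions checked by the real kernel function are left out.

```python
def evict_threshold(nr_pages: int, has_private: bool) -> int:
    """Highest refcount at which the folio is still considered evictable."""
    return nr_pages + int(has_private) + 1


def can_evict(refcount: int, nr_pages: int = 1, has_private: bool = True) -> bool:
    """Toy model of the refcount gate only; not the kernel implementation."""
    return refcount <= evict_threshold(nr_pages, has_private)


if __name__ == "__main__":
    # Idle Lustre page as described in the ticket: page cache reference,
    # vvp_page_init's get_page, and the find_lock_entries elevation.
    idle_refcount = 3
    print("idle page evictable:", can_evict(idle_refcount))
    # One additional reference is enough to fail the gate.
    print("one extra reference evictable:", can_evict(idle_refcount + 1))
```

For a single-page folio with private data the threshold works out to 1 + 1 + 1 = 3, which is why the idle refcount of 3 is accepted and any extra reference is not.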
What was confusing: write pages stay pinned for replay until the OST commits the write, because osc_brw_prep_request uses ptlrpc_bulk_kiov_pin_ops, which holds an extra get_page on every page in the bulk descriptor until ptlrpc_release_bulk_page_pin runs. With that extra reference the refcount exceeds the threshold above, so mapping_evict_folio skips the page. This is intentional and required for write replay during recovery – the window between RPC completion and commit can be arbitrarily long.
That means:
- Pages from a read-only or already-committed file: drop_caches works.
- Pages from a freshly-written file (commit not yet received): drop_caches DOES NOT evict them, by design.
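The two cases above can be sketched as a small state model in Python. This is an illustration under stated assumptions, not Lustre code: ModelPage and its methods are hypothetical names, the bulk pin is assumed to add exactly one reference, and the threshold of 3 assumes a single-page folio with private data.

```python
from dataclasses import dataclass

# nr_pages (1) + has_private (1) + 1, assuming a single-page folio.
EVICT_THRESHOLD = 3


@dataclass
class ModelPage:
    """Hypothetical stand-in for one cached Lustre page; not a kernel type."""

    refcount: int = 3          # idle state described in the ticket
    bulk_pinned: bool = False

    def start_write_rpc(self) -> None:
        # Models the extra get_page taken via ptlrpc_bulk_kiov_pin_ops.
        if not self.bulk_pinned:
            self.refcount += 1
            self.bulk_pinned = True

    def ost_committed(self) -> None:
        # Models ptlrpc_release_bulk_page_pin dropping the pin after commit.
        if self.bulk_pinned:
            self.refcount -= 1
            self.bulk_pinned = False

    def drop_caches_evicts(self) -> bool:
        # Same refcount gate as the reclaim path: evict only at or below
        # the threshold.
        return self.refcount <= EVICT_THRESHOLD


if __name__ == "__main__":
    page = ModelPage()
    page.start_write_rpc()
    print("freshly written, uncommitted:", page.drop_caches_evicts())
    page.ost_committed()
    print("after OST commit:", page.drop_caches_evicts())
```

The model deliberately has no timer: the pin is released only by the commit event, which matches the point that the uncommitted window has no upper bound.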
There is no bug. test_101k validates the working behavior:
1. Write 4 MiB + conv=fsync (forces server commit -> bulk pin released)
2. drop_caches=1
3. Re-read and confirm at least one ost_read happened (cache miss)
4. Same flow without writing: cancel_lru_locks osc, read, drop, re-read, confirm cache miss
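Steps 3 and 4 both come down to detecting that the re-read reached the OST. Below is a minimal Python sketch of that check, comparing ost_read sample counts taken before and after the re-read. The helper names are hypothetical, and the stats line layout ("ost_read <count> samples ...") is an assumption about the osc stats output, not something taken from the test itself.

```python
import re

# Assumed line layout: "ost_read   <count> samples [unit] ..."
_OST_READ_RE = re.compile(r"^\s*ost_read\s+(\d+)\s+samples\b")


def ost_read_count(stats_text: str) -> int:
    """Sum the ost_read sample counts found in a stats dump.

    A dump may cover several OSC devices, so every matching line is added.
    """
    total = 0
    for line in stats_text.splitlines():
        match = _OST_READ_RE.match(line)
        if match:
            total += int(match.group(1))
    return total


def reread_missed_cache(stats_before: str, stats_after: str) -> bool:
    """True if at least one new ost_read RPC was issued between snapshots."""
    return ost_read_count(stats_after) - ost_read_count(stats_before) >= 1
```

Taking a before/after difference rather than clearing the counters keeps the check independent of whatever other reads happened earlier in the run.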
Side note (worth a separate ticket if anyone cares): global sync(2) does NOT drain Lustre's bulk pins. Only fsync(fd) (or per-file conv=fsync) does, because only fsync goes through ll_fsync -> OST_SYNC. Making sync(2) wait for OST commit would be expensive on every call, so leaving it as-is.
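For callers that need their pages to become evictable, the practical consequence is to fsync the specific file rather than rely on a global sync. A minimal Python sketch of the per-file form follows; write_and_fsync is a hypothetical helper name. os.fsync and os.sync are the standard wrappers for fsync(2) and sync(2), and the Lustre-specific effect described above is only observable on a Lustre mount.

```python
import os


def write_and_fsync(path: str, data: bytes) -> int:
    """Write data to path and fsync that file descriptor before returning.

    Per the ticket, on Lustre it is the per-file fsync that goes through
    ll_fsync -> OST_SYNC; a global os.sync() would not release the bulk pins.
    """
    with open(path, "wb") as f:
        written = f.write(data)
        f.flush()               # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())    # per-file fsync(2), like dd conv=fsync
    return written
```

Calling os.sync() afterwards is harmless, but per the note above it should not be treated as a substitute for the per-file call.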
The original symptom report was investigated under https://review.whamcloud.com/65222 (now abandoned – the proposed fix was a no-op in dead code, see review for details).