Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.11.0
-
None
-
RHEL 7 + master 063a83ab1fe518e52dbc7fb5f6e9d092b20f44e9.
-
3
Description
While testing my own patch I hit this panic.
As far as I can see, it is the result of unstable pages not being counted atomically.
osc_io_commit_async adds the pages to the cache and unlocks the osc object, and the size is updated shortly afterwards. osc_io_unplug, called from brw_queue_work, finds the pages in the cache, and the panic is hit when it tries to count the number of unstable pages.
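To make the suspected window easier to see, here is a minimal toy model in C. It is only a sketch of the general pattern (publish under a lock, do the follow-up update after unlocking), not Lustre code: every name in it (toy_object, toy_commit_publish, toy_commit_account, toy_commit_atomic, toy_flusher_sees_consistent) is hypothetical, and the two counters are assumed stand-ins for "pages in cache" and "state updated shortly afterwards".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an object with cached pages; NOT a Lustre structure. */
struct toy_object {
	pthread_mutex_t lock;
	size_t cached_pages;    /* pages already visible to the flusher */
	size_t accounted_pages; /* pages covered by the follow-up update */
};

/* Invariant the flusher relies on: every cached page is accounted for. */
bool toy_flusher_sees_consistent(struct toy_object *obj)
{
	bool ok;

	pthread_mutex_lock(&obj->lock);
	ok = obj->cached_pages == obj->accounted_pages;
	pthread_mutex_unlock(&obj->lock);
	return ok;
}

/* Racy shape, step 1: publish the page under the lock, then unlock. */
void toy_commit_publish(struct toy_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->cached_pages++;
	pthread_mutex_unlock(&obj->lock);
	/* window: a flusher running here sees an unaccounted page */
}

/* Racy shape, step 2: the follow-up update arrives a little later. */
void toy_commit_account(struct toy_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->accounted_pages++;
	pthread_mutex_unlock(&obj->lock);
}

/* Atomic shape: both updates happen in one critical section. */
void toy_commit_atomic(struct toy_object *obj)
{
	pthread_mutex_lock(&obj->lock);
	obj->cached_pages++;
	obj->accounted_pages++;
	pthread_mutex_unlock(&obj->lock);
}

/* Replays the interleaving by hand; returns true if the flusher
 * observed the inconsistent state between the two racy steps. */
bool toy_demo_window_observed(void)
{
	struct toy_object obj = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };
	bool seen;

	toy_commit_publish(&obj);
	seen = !toy_flusher_sees_consistent(&obj); /* flusher runs in the window */
	toy_commit_account(&obj);
	assert(toy_flusher_sees_consistent(&obj)); /* consistent again afterwards */
	return seen;
}
```

The interleaving is replayed by hand rather than with real threads so that the result is deterministic. In this model, a flusher that asserts the invariant instead of returning a flag would abort inside the window, which is the shape of failure described above.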
The panic can be reproduced easily by delaying the size update with this patch:
diff --git a/lustre/osc/osc_io.c b/lustre/osc/osc_io.c
index 3d353324f1..5471b96ec1 100644
--- a/lustre/osc/osc_io.c
+++ b/lustre/osc/osc_io.c
@@ -267,7 +267,7 @@ int osc_io_commit_async(const struct lu_env *env,
 	struct osc_object *osc = cl2osc(ios->cis_obj);
 	struct cl_page *page;
 	struct cl_page *last_page;
-	struct osc_page *opg;
+	struct osc_page *opg = NULL;
 	int result = 0;
 	ENTRY;
 
@@ -311,9 +311,6 @@ int osc_io_commit_async(const struct lu_env *env,
 			break;
 		}
 
-		osc_page_touch_at(env, osc2cl(osc), osc_index(opg),
-				  page == last_page ? to : PAGE_SIZE);
-
 		cl_page_list_del(env, qin, page);
 
 		(*cb)(env, io, page);
@@ -321,6 +318,9 @@ int osc_io_commit_async(const struct lu_env *env,
 		 * complete at any time. */
 	}
 
+	osc_page_touch_at(env, osc2cl(osc), osc_index(opg),
+			  page == last_page ? to : PAGE_SIZE);
+
 	/* for sync write, kernel will wait for this page to be flushed before
 	 * osc_io_end() is called, so release it earlier.
 	 * for mkwrite(), it's known there is no further pages. */
With this patch applied, the panic is hit consistently with
- ONLY=42 REFORMAT=yes sh sanity.sh
typically in test 42e.