Lustre / LU-10407

osc_cache.c:1141:osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed:

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Lustre 2.11.0
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      RHEL 7 + master 063a83ab1fe518e52dbc7fb5f6e9d092b20f44e9.
    • Severity:
      3

      Description

      While testing a patch of my own, I hit this panic.
      As far as I can see, it is the result of unstable pages not being counted atomically.
      osc_io_commit_async() adds the pages to the cache and unlocks the osc object, and the size is updated only shortly afterwards. osc_io_unplug(), called from brw_queue_work(), finds the pages in the cache, and the panic hits when it tries to count the number of unstable pages.
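
      The ordering described above can be illustrated with a toy model in plain C. This is only a sketch and not Lustre code: the toy_* names are hypothetical, and it assumes that the byte count of the last cached page is derived from the object's known size, so a page that is cached before the size update yields a count of 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* hypothetical page size for the model */

/* Toy stand-in for an osc object: only the known size matters here. */
struct toy_object {
	uint64_t size;		/* known object size in bytes */
};

/* Bytes of page `index` that lie below the known size (0 if none). */
unsigned int toy_refresh_count(const struct toy_object *obj, uint64_t index)
{
	uint64_t start = index * TOY_PAGE_SIZE;

	if (start >= obj->size)
		return 0;	/* page lies entirely beyond the known size */
	if (start + TOY_PAGE_SIZE > obj->size)
		return (unsigned int)(obj->size - start); /* partial last page */
	return TOY_PAGE_SIZE;
}

/* Writer step: extend the known size to cover `to` bytes of page `index`. */
void toy_page_touch_at(struct toy_object *obj, uint64_t index, unsigned int to)
{
	uint64_t end = index * TOY_PAGE_SIZE + to;

	if (end > obj->size)
		obj->size = end;
}

/*
 * Flusher step: models the check behind the failed assertion.
 * Returns true when the count of the last page is greater than 0.
 */
bool toy_make_ready(const struct toy_object *obj, uint64_t last_index)
{
	return toy_refresh_count(obj, last_index) > 0;
}
```

      In this model, calling toy_make_ready() for a cached page before toy_page_touch_at() has run corresponds to the failing check; calling it after the size update passes.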

      The panic is easy to reproduce if the size update is delayed with the following patch:

      diff --git a/lustre/osc/osc_io.c b/lustre/osc/osc_io.c
      index 3d353324f1..5471b96ec1 100644
      --- a/lustre/osc/osc_io.c
      +++ b/lustre/osc/osc_io.c
      @@ -267,7 +267,7 @@ int osc_io_commit_async(const struct lu_env *env,
       	struct osc_object *osc = cl2osc(ios->cis_obj);
       	struct cl_page  *page;
       	struct cl_page  *last_page;
      -	struct osc_page *opg;
      +	struct osc_page *opg = NULL;
       	int result = 0;
       	ENTRY;
       
      @@ -311,9 +311,6 @@ int osc_io_commit_async(const struct lu_env *env,
       				break;
       		}
       
      -		osc_page_touch_at(env, osc2cl(osc), osc_index(opg),
      -				  page == last_page ? to : PAGE_SIZE);
      -
       		cl_page_list_del(env, qin, page);
       
       		(*cb)(env, io, page);
      @@ -321,6 +318,9 @@ int osc_io_commit_async(const struct lu_env *env,
       		 * complete at any time. */
       	}
       
      +	osc_page_touch_at(env, osc2cl(osc), osc_index(opg),
      +			  page == last_page ? to : PAGE_SIZE);
      +
       	/* for sync write, kernel will wait for this page to be flushed before
       	 * osc_io_end() is called, so release it earlier.
       	 * for mkwrite(), it's known there is no further pages. */
      

      The panic hits consistently with

      ONLY=42 REFORMAT=yes sh sanity.sh

      typically in test 42e.

              People

              • Assignee:
                hongchao.zhang Hongchao Zhang
                Reporter:
                shadow Alexey Lyashkov