Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8435

LBUG (osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) )

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.11.0
    • Lustre 2.7.0
    • Bull Lustre distribution based on Lustre 2.7.2
    • 3
    • 9223372036854775807

    Description

      In the last month one of our customer hit more than 100 times a crash with the following signature:

      [506626.555125] SLUB: Unable to allocate memory on node -1 (gfp=0x80c0)
      [506626.562216]   cache: kvm_mmu_page_header(22:step_batch), object size: 168,
      buffer size: 168, default order: 1, min order: 0
      [506626.574729]   node 0: slabs: 0, objs: 0, free: 0
      [506626.579974]   node 1: slabs: 0, objs: 0, free: 0
      [506626.585219]   node 2: slabs: 60, objs: 2880, free: 0
      [506626.590852]   node 3: slabs: 0, objs: 0, free: 0
      [506626.596112] LustreError: 41604:0:(osc_cache.c:1290:osc_completion())
      ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) ) failed:
      cp_state:0, cmd:1
      [506626.612512] LustreError: 41604:0:(osc_cache.c:1290:osc_completion()) LBUG
      [506626.620186] Pid: 41604, comm: cat
      [506626.623978]
                      Call Trace:
      [506626.628573]  [<ffffffffa05eb853>] libcfs_debug_dumpstack+0x53/0x80
      [libcfs]
      [506626.636448]  [<ffffffffa05ebdf5>] lbug_with_loc+0x45/0xc0 [libcfs]
      [506626.643456]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0
      [osc]
      [506626.651526]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.659108]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.666037]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.673531]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.680450]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.687961]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.694964]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.702078]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.709674]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.716785]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.723797]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.731477]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.738962]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.745962]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.751700]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.757345]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      [506626.764138]
      [506626.765990] Kernel panic - not syncing: LBUG
      [506626.770850] CPU: 53 PID: 41604 Comm: cat Tainted: G           OE 
      ------------   3.10.0-327.22.2.el7.x86_64 #1
      [506626.782104] Hardware name: BULL bullx blade/CHPU, BIOS BIOSX07.037.01.003
      10/23/2015
      [506626.790838]  ffffffffa0610ced 000000000f6a3070 ffff8817799eb8c0
      ffffffff816360f4
      [506626.799228]  ffff8817799eb940 ffffffff8162f96a ffffffff00000008
      ffff8817799eb950
      [506626.807618]  ffff8817799eb8f0 000000000f6a3070 ffffffffa0e01466
      0000000000000246
      [506626.816005] Call Trace:
      [506626.818839]  [<ffffffff816360f4>] dump_stack+0x19/0x1b
      [506626.824668]  [<ffffffff8162f96a>] panic+0xd8/0x1e7
      [506626.830128]  [<ffffffffa05ebe5b>] lbug_with_loc+0xab/0xc0 [libcfs]
      [506626.837129]  [<ffffffffa0dea859>] osc_ap_completion.isra.30+0x4d9/0x5b0
      [osc]
      [506626.845192]  [<ffffffffa0df558d>] osc_queue_sync_pages+0x2dd/0x350 [osc]
      [506626.852766]  [<ffffffffa0de750f>] osc_io_submit+0x42f/0x530 [osc]
      [506626.859702]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.867184]  [<ffffffffa0b8d257>] lov_io_submit+0x2a7/0x420 [lov]
      [506626.874099]  [<ffffffffa086fbd6>] cl_io_submit_rw+0x66/0x170 [obdclass]
      [506626.881611]  [<ffffffffa0c67f70>] ll_readpage+0x2d0/0x560 [lustre]
      [506626.888609]  [<ffffffff8116af87>] generic_file_aio_read+0x3b7/0x750
      [506626.895721]  [<ffffffffa0c98485>] vvp_io_read_start+0x3c5/0x470 [lustre]
      [506626.903322]  [<ffffffffa086f965>] cl_io_start+0x65/0x130 [obdclass]
      [506626.910418]  [<ffffffffa0872f85>] cl_io_loop+0xa5/0x190 [obdclass]
      [506626.917420]  [<ffffffffa0c34e8c>] ll_file_io_generic+0x5fc/0xae0 [lustre]
      [506626.925091]  [<ffffffffa0c35db2>] ll_file_aio_read+0x192/0x530 [lustre]
      [506626.932575]  [<ffffffffa0c3621b>] ll_file_read+0xcb/0x1e0 [lustre]
      [506626.939569]  [<ffffffff811dea1c>] vfs_read+0x9c/0x170
      [506626.945300]  [<ffffffff811df56f>] SyS_read+0x7f/0xe0
      [506626.950938]  [<ffffffff81646889>] system_call_fastpath+0x16/0x1b
      

      The customer being a black site, we can't provide the crashdump, but will happily provide any text output you would find useful.

      Attachments

        1. crash_output.txt
          24 kB
        2. foreach_bt_merge.txt
          152 kB
        3. struct_analyze1.txt
          50 kB

        Issue Links

          Activity

            [LU-8435] LBUG (osc_cache.c:1290:osc_completion()) ASSERTION( equi(page->cp_state == CPS_PAGEIN, cmd == OBD_BRW_READ) )
            simmonsja James A Simmons added a comment - - edited

            Peter the original fix https://review.whamcloud.com/#/c/13956 has already landed to master. I think this is safe to close. Or do you mean https://review.whamcloud.com/#/c/21745 ?

            simmonsja James A Simmons added a comment - - edited Peter the original fix https://review.whamcloud.com/#/c/13956  has already landed to master. I think this is safe to close. Or do you mean https://review.whamcloud.com/#/c/21745  ?
            pjones Peter Jones added a comment -

            I think that we need the ticket to remain open until the original patch has landed to master

            pjones Peter Jones added a comment - I think that we need the ticket to remain open until the original patch has landed to master

            Hi Bruno,

            Thanks for the backport.
            You can close.

            Regards.

            spiechurski Sebastien Piechurski added a comment - Hi Bruno, Thanks for the backport. You can close. Regards.

            Hello Sebastien,
            Patch has been back-ported and has landed on b2_7_fe branch, do you agree to close ?

            bfaccini Bruno Faccini (Inactive) added a comment - Hello Sebastien, Patch has been back-ported and has landed on b2_7_fe branch, do you agree to close ?

            I'm running a test on this now. Seems good, no crashes ... yet.

            bevans Ben Evans (Inactive) added a comment - I'm running a test on this now. Seems good, no crashes ... yet.

            My bad, I miss your last comment . That's the problem working too often with 2.7.2 only, you sometimes miss what was added in master.

            I've pushed a reproducer for this in sanity. Do whatever you want with it.

            adegremont Aurelien Degremont (Inactive) added a comment - My bad, I miss your last comment . That's the problem working too often with 2.7.2 only, you sometimes miss what was added in master. I've pushed a reproducer for this in sanity. Do whatever you want with it.

            Aurelien Degremont (aurelien.degremont@cea.fr) uploaded a new patch: http://review.whamcloud.com/21745
            Subject: LU-8435 tests: slab alloc error does not LBUG
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0e8a912a7dfc65d14d1f252efea043637620f518

            gerrit Gerrit Updater added a comment - Aurelien Degremont (aurelien.degremont@cea.fr) uploaded a new patch: http://review.whamcloud.com/21745 Subject: LU-8435 tests: slab alloc error does not LBUG Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 0e8a912a7dfc65d14d1f252efea043637620f518

            Well in fact, I just found that this has already been fixed in master by Andreas fix for LU-6215 (http://review.whamcloud.com/13956, commit eaba382df06cd6f570b16c5f759f612b057e5324). Thus I will simply back-port it to b2_7!

            bfaccini Bruno Faccini (Inactive) added a comment - Well in fact, I just found that this has already been fixed in master by Andreas fix for LU-6215 ( http://review.whamcloud.com/13956 , commit eaba382df06cd6f570b16c5f759f612b057e5324). Thus I will simply back-port it to b2_7!

            And use of list_for_each_entry_safe() + list_del() looks like a better one
            This will be my first patch attempt to fix this, I will then try to expose it to your reproducer to see how this works then.

            bfaccini Bruno Faccini (Inactive) added a comment - And use of list_for_each_entry_safe() + list_del() looks like a better one This will be my first patch attempt to fix this, I will then try to expose it to your reproducer to see how this works then.

            I agree with this analysis. cp_state is CPS_PAGEIN in almost all our crashdumps. Except for 1 or 2 but this could means that the page was freed between the 2 iterations IMO?

            list_for_each_entry + list_del_init seems a really bad idea.

            adegremont Aurelien Degremont (Inactive) added a comment - I agree with this analysis. cp_state is CPS_PAGEIN in almost all our crashdumps. Except for 1 or 2 but this could means that the page was freed between the 2 iterations IMO? list_for_each_entry + list_del_init seems a really bad idea.

            Well having a better look to the code involved in error path, my guess is that the problem may be caused by the following piece:

            2621 int osc_queue_sync_pages(const struct lu_env *env, struct osc_object *obj,
            2622                          struct list_head *list, int cmd, int brw_flags)
            2623 {
            2624         struct client_obd     *cli = osc_cli(obj);
            2625         struct osc_extent     *ext;
            2626         struct osc_async_page *oap;
            2627         int     page_count = 0;
            2628         int     mppr       = cli->cl_max_pages_per_rpc;
            2629         pgoff_t start      = CL_PAGE_EOF;
            2630         pgoff_t end        = 0;
            2631         ENTRY;
            2632 
            2633         list_for_each_entry(oap, list, oap_pending_item) {
            2634                 pgoff_t index = osc_index(oap2osc(oap));
            2635                 if (index > end)
            2636                         end = index;
            2637                 if (index < start)
            2638                         start = index;
            2639                 ++page_count;
            2640                 mppr <<= (page_count > mppr);
            2641         }
            2642 
            2643         ext = osc_extent_alloc(obj);   <<<<<<<<< Memory allocation failure can occur here
            2644         if (ext == NULL) {
            2645                 list_for_each_entry(oap, list, oap_pending_item) {
            2646                         list_del_init(&oap->oap_pending_item);     <<<<<<<<<<<<
            2647                         osc_ap_completion(env, cli, oap, 0, -ENOMEM);
            2648                 }
            2649                 RETURN(-ENOMEM);
            2650         }
            

            the list_del_init() should cause the current oap to have its next/prev oap_pending_item list pointers to point on itself and thus to be again selected by list-for_each_entry() to be passed for a 2nd call to osc_ap_completion() when its processing has already been done during 1st call to osc_ap_completion() for it, and particularly its associated cl_page->cp_state has already been set to CPS_CACHED in osc_ap_completion()>osc_completion()>cl_page_completion(), hence the Assert du to no longer CPS_PAGEIN.

            I am working on a patch to fix this and to be pushed soon.

            bfaccini Bruno Faccini (Inactive) added a comment - Well having a better look to the code involved in error path, my guess is that the problem may be caused by the following piece: 2621 int osc_queue_sync_pages(const struct lu_env *env, struct osc_object *obj, 2622 struct list_head *list, int cmd, int brw_flags) 2623 { 2624 struct client_obd *cli = osc_cli(obj); 2625 struct osc_extent *ext; 2626 struct osc_async_page *oap; 2627 int page_count = 0; 2628 int mppr = cli->cl_max_pages_per_rpc; 2629 pgoff_t start = CL_PAGE_EOF; 2630 pgoff_t end = 0; 2631 ENTRY; 2632 2633 list_for_each_entry(oap, list, oap_pending_item) { 2634 pgoff_t index = osc_index(oap2osc(oap)); 2635 if (index > end) 2636 end = index; 2637 if (index < start) 2638 start = index; 2639 ++page_count; 2640 mppr <<= (page_count > mppr); 2641 } 2642 2643 ext = osc_extent_alloc(obj); <<<<<<<<< Memory allocation failure can occur here 2644 if (ext == NULL) { 2645 list_for_each_entry(oap, list, oap_pending_item) { 2646 list_del_init(&oap->oap_pending_item); <<<<<<<<<<<< 2647 osc_ap_completion(env, cli, oap, 0, -ENOMEM); 2648 } 2649 RETURN(-ENOMEM); 2650 } the list_del_init() should cause the current oap to have its next/prev oap_pending_item list pointers to point on itself and thus to be again selected by list-for_each_entry() to be passed for a 2nd call to osc_ap_completion() when its processing has already been done during 1st call to osc_ap_completion() for it, and particularly its associated cl_page->cp_state has already been set to CPS_CACHED in osc_ap_completion() >osc_completion() >cl_page_completion(), hence the Assert du to no longer CPS_PAGEIN. I am working on a patch to fix this and to be pushed soon.

            People

              bfaccini Bruno Faccini (Inactive)
              spiechurski Sebastien Piechurski
              Votes:
              0 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: