Lustre / LU-2720

osc_page_delete()) ASSERTION(0) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version: Lustre 2.4.0
    • Fix Version: Lustre 2.4.0
    • 3
    • 6614

    Description

      This was already posted to LU-1723, but it seems to be a different issue from the original one in that ticket, so I am moving it here.

      2012-11-06T14:57:10.872333-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_cache.c:2367:osc_teardown_async_page()) extent ffff88060eedfe58@{[23 -> 23/255], [2|0|-|cache|wi|ffff88020e18f8c8], [4096|1|+|-|ffff8801f9ed9c18|256| (null)]} trunc at 23.
      2012-11-06T14:57:10.872389-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) page@ffff88080e8c8bc0[2 ffff88020cefad08:23 ^ (null)_ffff88080e8c8b00 4 0 1 (null) (null) 0x0]
      2012-11-06T14:57:10.902445-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) page@ffff88080e8c8b00[1 ffff88020d8c1f58:23 ^ffff88080e8c8bc0_ (null) 4 0 1 (null) (null) 0x0]
      2012-11-06T14:57:10.902494-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) vvp-page@ffff88080e8cf5a0(0:0:0) vm@ffffea001c6a99b8 e00000000000063 7:0 0 23 lru
      2012-11-06T14:57:10.902508-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) lov-page@ffff8808050a7888
      2012-11-06T14:57:10.958005-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) osc-page@ffff88080d78ed20: 1< 0x845fed 258 0 + - > 2< 94208 0 4096 0x0 0x520 | (null) ffff880804350700 ffff88020e18f8c8 > 3< + ffff8801f8f577b0 0 0 0 > 4< 0 0 8 33824768 - | - - + - > 5< - - + - | 0 - | 1 - ->
      2012-11-06T14:57:10.958049-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) end page@ffff88080e8c8bc0
      2012-11-06T14:57:10.983504-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:411:osc_page_delete()) Trying to teardown failed: -16
      2012-11-06T14:57:10.983536-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:412:osc_page_delete()) ASSERTION( 0 ) failed:
      2012-11-06T14:57:10.983549-06:00 c0-0c1s6n0 LustreError: 5270:0:(osc_page.c:412:osc_page_delete()) LBUG
      2012-11-06T14:57:10.983570-06:00 c0-0c1s6n0 Pid: 5270, comm: fsx-linux
      2012-11-06T14:57:10.983582-06:00 c0-0c1s6n0 Call Trace:
      2012-11-06T14:57:10.983605-06:00 c0-0c1s6n0 [<ffffffff810063b1>] try_stack_unwind+0x161/0x1a0
      2012-11-06T14:57:11.009114-06:00 c0-0c1s6n0 [<ffffffff81004bf9>] dump_trace+0x89/0x440
      2012-11-06T14:57:11.009138-06:00 c0-0c1s6n0 [<ffffffffa014e887>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
      2012-11-06T14:57:11.009161-06:00 c0-0c1s6n0 [<ffffffffa014ede7>] lbug_with_loc+0x47/0xc0 [libcfs]
      2012-11-06T14:57:11.009173-06:00 c0-0c1s6n0 [<ffffffffa0671d41>] osc_page_delete+0x2d1/0x2e0 [osc]
      2012-11-06T14:57:11.034700-06:00 c0-0c1s6n0 [<ffffffffa02b0095>] cl_page_delete0+0xd5/0x4e0 [obdclass]
      2012-11-06T14:57:11.034737-06:00 c0-0c1s6n0 [<ffffffffa02b04e2>] cl_page_delete+0x42/0x120 [obdclass]
      2012-11-06T14:57:11.034762-06:00 c0-0c1s6n0 [<ffffffffa07f2e2d>] ll_invalidatepage+0x8d/0x170 [lustre]
      2012-11-06T14:57:11.034774-06:00 c0-0c1s6n0 [<ffffffffa07ea290>] ll_page_mkwrite+0x7c0/0x840 [lustre]
      2012-11-06T14:57:11.034820-06:00 c0-0c1s6n0 [<ffffffff81107cb7>] __do_fault+0xe7/0x570
      2012-11-06T14:57:11.034833-06:00 c0-0c1s6n0 [<ffffffff811081e4>] handle_pte_fault+0xa4/0xcd0
      2012-11-06T14:57:11.060561-06:00 c0-0c1s6n0 [<ffffffff81108fbe>] handle_mm_fault+0x1ae/0x240
      2012-11-06T14:57:11.060588-06:00 c0-0c1s6n0 [<ffffffff81025471>] do_page_fault+0x191/0x410
      2012-11-06T14:57:11.060600-06:00 c0-0c1s6n0 [<ffffffff81301b5f>] page_fault+0x1f/0x30
      2012-11-06T14:57:11.060647-06:00 c0-0c1s6n0 [<00000000200422b3>] 0x200422b3
      2012-11-06T14:57:11.060660-06:00 c0-0c1s6n0 Kernel panic - not syncing: LBUG

Attachments

Issue Links

Activity

            spitzcor Cory Spitz added a comment -

            Thanks, it is LU-3217.

            pjones Peter Jones added a comment -

            Cory,

            It would be best to open a new ticket with details of the CPU spinning issue that you are seeing

            Peter

            spitzcor Cory Spitz added a comment -

            Ah, thanks Oleg. Sure, we'll keep looking at our spin issue.

            green Oleg Drokin added a comment -

            Vitaly's comment relates to patch set #2; he has since added patch set #3, which fixes the potential issue discussed.
            As for the CPU spinning you are seeing, we need more details to make any educated guess there.

            spitzcor Cory Spitz added a comment -

            Can someone please comment about change #5222 landing considering the comments and questions from 08/Feb/13? Cray is seeing significant CPU spinning in 2.4 RC testing, but Wally would have to confirm if change #5222 is the cause.

            pjones Peter Jones added a comment -

            Landed for 2.4

            keith Keith Mannthey (Inactive) added a comment -

            Are you going to drop http://review.whamcloud.com/5222 ?

            vitaly_fertman Vitaly Fertman added a comment -

            After a talk with Jay, we decided not to change cl_lock_peek, because it may return REPEAT only during lock canceling or a glimpse AST. At the same time, lock canceling may take a long time, and we do not sleep here, so this looping will consume CPU resources. Since after this patch cl_lock_peek is used for SOM only, the result may be that the only ioepoch holder does not provide an attribute update in done_writing, and the MDS re-asks for it. If there is ever a need to minimize the number of these RPCs, a sleeping version of cl_lock_peek could be implemented.

            wang Wally Wang (Inactive) added a comment -

            We hit this often when running fsx-linux from LTP. It usually happens during a stress run within an hour or two. We haven't seen this bug after applying this patch together with the LU-2722/LU-2723 patches fixing the DIO issues.

            adilger Andreas Dilger added a comment -

            Vitaly, can you please include some information about how this problem was initially hit (e.g. test load, frequency of occurrence, etc.)?

            People

              Assignee: keith Keith Mannthey (Inactive)
              Reporter: vitaly_fertman Vitaly Fertman
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved: