fallocate: osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.15.0

    Description

      After adding a fallocate call to racer, the following crash is observed very often:

      [  156.954031] LustreError: 4652:0:(osc_cache.c:1141:osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed: last_oap_count 0
      [  156.956535] LustreError: 4652:0:(osc_cache.c:1141:osc_extent_make_ready()) LBUG
      [  156.957881] Pid: 4652, comm: ldlm_bl_01 3.10.0-7.9-debug #1 SMP Mon Feb 1 17:33:41 EST 2021
      [  156.959725] Call Trace:
      [  156.960375]  [<ffffffffa017873c>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [  156.962003]  [<ffffffffa0178a5c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [  156.963391]  [<ffffffffa0839b26>] osc_extent_make_ready+0xb66/0xe60 [osc]
      [  156.965016]  [<ffffffffa083c383>] osc_io_unplug0+0xee3/0x1900 [osc]
      [  156.966261]  [<ffffffffa0840a60>] osc_cache_writeback_range+0x9a0/0xfd0 [osc]
      [  156.968027]  [<ffffffffa082b985>] osc_lock_flush+0x195/0x290 [osc]
      [  156.969414]  [<ffffffffa082be58>] osc_ldlm_blocking_ast+0x2f8/0x3e0 [osc]
      [  156.970970]  [<ffffffffa05bbe54>] ldlm_cancel_callback+0x84/0x320 [ptlrpc]
      [  156.972552]  [<ffffffffa05d4011>] ldlm_cli_cancel_local+0xd1/0x420 [ptlrpc]
      [  156.974515]  [<ffffffffa05da24c>] ldlm_cli_cancel+0x10c/0x560 [ptlrpc]
      [  156.976004]  [<ffffffffa082bcda>] osc_ldlm_blocking_ast+0x17a/0x3e0 [osc]
      [  156.977297]  [<ffffffffa05e6435>] ldlm_handle_bl_callback+0xc5/0x3e0 [ptlrpc]
      [  156.978932]  [<ffffffffa05e6d0f>] ldlm_bl_thread_main+0x5bf/0xae0 [ptlrpc]
      [  156.980417]  [<ffffffff810ba124>] kthread+0xe4/0xf0
      [  156.981842]  [<ffffffff817eee5d>] ret_from_fork_nospec_begin+0x7/0x21
      [  156.983639]  [<ffffffffffffffff>] 0xffffffffffffffff
      [  156.984738] Kernel panic - not syncing: LBUG
      

      This is the result of a wrong assumption about how the fallocate range relates to the file range. The assumption is made too early in ll_fallocate(), so a pending write or truncate may change the file size afterwards, and fallocate could then change the file size incorrectly.
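
      The check-too-early pattern described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Lustre code or the actual fix: struct toy_file, toy_truncate(), fallocate_early_check() and fallocate_late_check() are hypothetical names, and the racing truncate is injected deterministically instead of running on another thread.

```c
/* Hypothetical stand-in for a file's size state; not a Lustre structure. */
struct toy_file {
	long long size;
};

/* Models a pending truncate that completes at some point. */
void toy_truncate(struct toy_file *f, long long new_size)
{
	f->size = new_size;
}

/*
 * Racy pattern: the new size is decided from a snapshot of the file size,
 * then a pending truncate completes, then the stale decision is applied.
 */
long long fallocate_early_check(struct toy_file *f, long long offset,
				long long len, long long racing_truncate_to)
{
	long long snapshot = f->size;	/* range compared to file size here */
	long long end = offset + len;
	long long new_size = end > snapshot ? end : snapshot;

	toy_truncate(f, racing_truncate_to);	/* truncate lands in between */

	f->size = new_size;		/* stale decision overwrites the size */
	return f->size;
}

/*
 * Safer pattern (assumption: pending operations are drained first): the
 * range is compared with the size only after the truncate has completed.
 */
long long fallocate_late_check(struct toy_file *f, long long offset,
			       long long len, long long racing_truncate_to)
{
	long long end = offset + len;

	toy_truncate(f, racing_truncate_to);	/* pending truncate finishes */

	if (end > f->size)		/* decision uses the current size */
		f->size = end;
	return f->size;
}
```

      With an 8192-byte file, a fallocate of range [0, 4096) and a racing truncate to 1024, the early check leaves the size at 8192, which matches neither serial ordering of the two operations, while the late check yields 4096, the result of truncate followed by fallocate.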

          Activity

            [LU-14433] fallocate: osc_extent_make_ready()) ASSERTION( last_oap_count > 0 ) failed
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.0 [ 14791 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            tappro Mikhail Pershin made changes -
            Link New: This issue is related to LU-14287 [ LU-14287 ]
            tappro Mikhail Pershin made changes -
            Description: fixed the typo "observe" to "observed" and appended the root-cause note about {{ll_fallocate()}}; the crash log is identical to the one in the Description above.
            tappro Mikhail Pershin created issue -

            People

              Assignee: tappro Mikhail Pershin
              Reporter: tappro Mikhail Pershin
              Votes: 0
              Watchers: 4
