Encountered an assertion for the ols_state being set to an impossible state

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.5.0
    • Affects Version/s: Lustre 2.4.0, Lustre 2.4.1
    • Environment: Cray Lustre 2.4 clients running on SLES11 SP2.
    • 3
    • 8552

    Description

      During a test run using Lustre 2.4 one of our clients encountered this LBUG.

      [2013-05-29 18:39:47][c7-1c1s3n3]LustreError:16573:0:(osc_lock.c:1165:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 4
      [2013-05-29 18:39:47][c7-1c1s3n3]LustreError: 16573:0:(osc_lock.c:1165:osc_lock_enqueue()) LBUG
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81006451>] try_stack_unwind+0x161/0x1a0
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81004ca9>] dump_trace+0x89/0x440
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa013a897>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa013ade7>] lbug_with_loc+0x47/0xc0 [libcfs]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa0639f55>] osc_lock_enqueue+0x725/0x8b0 [osc]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032d4eb>] cl_enqueue_try+0xfb/0x320 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa06ccded>] lov_lock_enqueue+0x1fd/0x880 [lov]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032d4eb>] cl_enqueue_try+0xfb/0x320 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032e3bf>] cl_enqueue_locked+0x7f/0x1f0 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032efbe>] cl_lock_request+0x7e/0x270 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa0334274>] cl_io_lock+0x394/0x5c0 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa033453a>] cl_io_loop+0x9a/0x1a0 [obdclass]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074b90f>] ll_file_io_generic+0x33f/0x5f0 [lustre]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074c08b>] ll_file_aio_read+0x23b/0x290 [lustre]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074d002>] ll_file_read+0x1f2/0x280 [lustre]
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81135548>] vfs_read+0xc8/0x180
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113b799>] kernel_read+0x49/0x60
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113b885>] prepare_binprm+0xd5/0x100
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113c670>] do_execve_common+0x1c0/0x300
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113c83f>] do_execve+0x3f/0x50
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8100ae1e>] sys_execve+0x4e/0x80
      [2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81316a3c>] stub_execve+0x6c/0xc0
      [2013-05-29 18:39:47][c7-1c1s3n3] [<0000000020176437>] 0x20176437
      [2013-05-29 18:39:47][c7-1c1s3n3]Kernel panic - not syncing: LBUG
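
For anyone decoding the assertion message above: the number printed after "Impossible state" is the raw value of ols->ols_state. Below is a minimal sketch for mapping that value back to a name. It assumes the enum order shown matches enum osc_lock_state in the 2.4 client sources; that order is recalled from memory rather than taken from this ticket, so please check it against your own tree. The helper osc_lock_state_name() is hypothetical and not part of Lustre.

```c
/*
 * ASSUMPTION: this mirrors the order of enum osc_lock_state as recalled
 * from lustre/osc/osc_cl_internal.h in the 2.4 series. Verify against
 * the actual source before relying on the mapping.
 */
enum osc_lock_state {
	OLS_NEW,
	OLS_ENQUEUED,
	OLS_UPCALL_RECEIVED,
	OLS_GRANTED,
	OLS_RELEASED,
	OLS_BLOCKED,
	OLS_CANCELLED
};

/* Hypothetical helper: translate a raw ols_state value into its name. */
const char *osc_lock_state_name(int state)
{
	static const char *const names[] = {
		[OLS_NEW]             = "OLS_NEW",
		[OLS_ENQUEUED]        = "OLS_ENQUEUED",
		[OLS_UPCALL_RECEIVED] = "OLS_UPCALL_RECEIVED",
		[OLS_GRANTED]         = "OLS_GRANTED",
		[OLS_RELEASED]        = "OLS_RELEASED",
		[OLS_BLOCKED]         = "OLS_BLOCKED",
		[OLS_CANCELLED]       = "OLS_CANCELLED",
	};

	/* Reject values outside the table instead of reading past it. */
	if (state < 0 || state >= (int)(sizeof(names) / sizeof(names[0])))
		return "UNKNOWN";
	return names[state];
}
```

Under that assumed ordering, osc_lock_state_name(4) returns "OLS_RELEASED", which would mean osc_lock_enqueue() was handed a lock that had already been released rather than a freshly created one.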

      Attachments

        1. test.sh
          0.2 kB
          Patrick Farrell
        2. write-eintr.c
          3 kB
          Patrick Farrell

          Activity

            [LU-3433] Encountered an assertion for the ols_state being set to an impossible state
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-5910 [ LU-5910 ]
            pjones Peter Jones added a comment -

            Karsten

            This would certainly be under consideration for a fuller 2.4.x maintenance release. 2.4.3 had limited content because it was an unscheduled release, driven by the need to issue a 2.4.x release addressing the security vulnerability that was discovered (LU-4703/4704).

            Peter

            knweiss Karsten Weiss added a comment -

            Niu/Peter, will this backport be merged into b2_4 (for 2.4.4)? I saw this bug on three Lustre 2.4.2 clients and AFAICS your patch is not included in 2.4.3 (the version we're using now) or in b2_4.
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Reopened [ 4 ] New: Resolved [ 5 ]
            pjones Peter Jones made changes -
            Labels New: mn4
            pjones Peter Jones made changes -
            Labels Original: llnl
            niu Niu Yawei (Inactive) added a comment - ported LU-3889 fix to b2_4: http://review.whamcloud.com/9194
            pjones Peter Jones added a comment -

            Just timing. At the time the fix for LU-3889 was unproven so reverting to the previous known state was preferable. I think that moving forward with both fixes is a sound approach.

            niu Niu Yawei (Inactive) added a comment -

            The fix for LU-3889 has landed on master and b2_5. It looks to me that we should just re-add this patch and backport the LU-3889 fix to b2_4.

            Bob, what's your opinion? I see the revert patch was uploaded by you. Why did we revert this patch instead of backporting the LU-3889 fix?
            morrone Christopher Morrone (Inactive) made changes -
            Fix Version/s Original: Lustre 2.4.2 [ 10605 ]

            People

              niu Niu Yawei (Inactive)
              simmonsja James A Simmons