Lustre / LU-8175

conflicting PW & PR extent locks on a client

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.9.0
    • None
    • 3

    Description

      > [5034040.035051] Lustre: 16432:0:(client.c:1910:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1453393018/real 1453393018] req@ffff881f9d653c00 x1518811430048732/t0(0) o3->snx11091-OST0028-osc-ffff881fe6574800@172.17.47.209@o2ib1013:6/4 lens 488/432 e 0 to 1 dl 1453393778 ref 2 fl Rpc:XU/2/ffffffff rc -11/-1
      > [5034040.035057] Lustre: 16432:0:(client.c:1910:ptlrpc_expire_one_request()) Skipped 32 previous similar messages
      > [5034482.398979] Lustre: snx11091-OST000b-osc-ffff881fe6574800: Connection to snx11091-OST000b (at 172.17.47.201@o2ib1013) was lost; in progress operations using this service will wait for recovery to complete
      > [5034482.398984] Lustre: Skipped 7 previous similar messages
      > [5034482.399254] Lustre: snx11091-OST000b-osc-ffff881fe6574800: Connection restored to snx11091-OST000b (at 172.17.47.201@o2ib1013)
      > [5034482.399257] Lustre: Skipped 7 previous similar messages
      > [5034798.943798] Lustre: 16422:0:(client.c:1910:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1453393778/real 1453393778] req@ffff881fe4cc9000 x1518811430052084/t0(0) o4->snx11091-OST0028-osc-ffff881fe6574800@172.17.47.209@o2ib1013:6/4 lens 488/448 e 0 to 1 dl 1453394538 ref 2 fl Rpc:XU/2/ffffffff rc -11/-1
      > [5034798.943805] Lustre: 16442:0:(client.c:1910:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1453393778/real 1453393778] req@ffff881fe4cc9400 x1518811430052092/t0(0) o4->snx11091-OST0028-osc-ffff881fe6574800@172.17.47.209@o2ib1013:6/4 lens 488/448 e 0 to 1 dl 1453394538 ref 2 fl Rpc:XU/2/ffffffff rc 0/-1
      > [5034798.943811] Lustre: 16442:0:(client.c:1910:ptlrpc_expire_one_request()) Skipped 30 previous similar messages
      > [5035427.382998] Lustre: snx11091-OST002a-osc-ffff881fe6574800: Connection restored to snx11091-OST002a (at 172.17.47.210@o2ib1012)
      > [5035427.383001] Lustre: Skipped 7 previous similar messages
      > [5035429.345176] LustreError: 16406:0:(osc_cache.c:2421:osc_teardown_async_page()) extent ffff88071aac01e0@{[0 -> 255/255], [3|0|+|cache|wihuY|ffff880877eec198], [1048576|256|+|+|ffff880e6beab738|256| (null)]} trunc at 0.
      > [5035429.345183] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) page@ffff880973c33000[3 ffff88037416ae18 4 0 1 (null) (null) 0x0]
      > [5035429.345188] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) vvp-page@ffff880973c330a0(0:0:0) vm@ffffea0006449948 20000000001079 2:0 ffff880973c33000 0 lru
      > [5035429.345191] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) lov-page@ffff880973c330f8, raid0
      > [5035429.345198] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) osc-page@ffff880973c33160 0: 1< 0x845fed 2 0 + - > 2< 0 0 4096 0x0 0x420 | (null) ffff881fe6ae0620 ffff880877eec198 > 3< + ffff880768e26380 0 0 0 > 4< 0 9 8 0 - | + - + + > 5< + - + - | 0 - | 948 - +>
      > [5035429.345202] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) end page@ffff880973c33000
      > [5035429.345204] LustreError: 16406:0:(osc_page.c:333:osc_page_delete()) Trying to teardown failed: -16
      > [5035429.345206] LustreError: 16406:0:(osc_page.c:334:osc_page_delete()) ASSERTION( 0 ) failed:
      > [5035429.353732] LustreError: 16406:0:(osc_page.c:334:osc_page_delete()) LBUG
      > [5035429.360601] Pid: 16406, comm: ptlrpcd_3
      > [5035429.360602]
      > [5035429.360603] Call Trace:
      > [5035429.360612] [<ffffffff81004b95>] dump_trace+0x75/0x300
      > [5035429.360636] [<ffffffffa089c82a>] libcfs_debug_dumpstack+0x4a/0x70 [libcfs]
      > [5035429.360664] [<ffffffffa089cd5e>] lbug_with_loc+0x3e/0xb0 [libcfs]
      > [5035429.360678] [<ffffffffa1d35103>] osc_page_delete+0x393/0x3d0 [osc]
      > [5035429.360722] [<ffffffffa09f43fd>] cl_page_delete0+0x6d/0x200 [obdclass]
      > [5035429.360765] [<ffffffffa09f45c5>] cl_page_delete+0x35/0x120 [obdclass]
      > [5035429.360817] [<ffffffffa1e695c6>] ll_invalidatepage+0x96/0x160 [lustre]
      > [5035429.360850] [<ffffffffa1e7b45c>] vvp_page_discard+0xcc/0x170 [lustre]
      > [5035429.360887] [<ffffffffa09f2ce8>] cl_page_invoid+0x58/0x150 [obdclass]
      > [5035429.360918] [<ffffffffa1d4193e>] check_and_discard_cb+0x13e/0x190 [osc]
      > [5035429.360934] [<ffffffffa1d41b4d>] osc_page_gang_lookup+0x1bd/0x340 [osc]
      > [5035429.360951] [<ffffffffa1d41e0b>] osc_lock_discard_pages+0x13b/0x240 [osc]
      > [5035429.360966] [<ffffffffa1d37993>] osc_lock_flush+0xf3/0x270 [osc]
      > [5035429.360979] [<ffffffffa1d37c09>] osc_lock_cancel+0xf9/0x1e0 [osc]
      > [5035429.361005] [<ffffffffa09f6bc5>] cl_lock_cancel0+0x65/0x150 [obdclass]
      > [5035429.361050] [<ffffffffa09f9f76>] cl_lock_hold_release+0x1e6/0x2c0 [obdclass]
      > [5035429.361081] [<ffffffffa1d3a613>] osc_lock_upcall+0x223/0x460 [osc]
      > [5035429.361093] [<ffffffffa1d1b82d>] osc_enqueue_fini+0x9d/0x270 [osc]
      > [5035429.361102] [<ffffffffa1d1e883>] osc_enqueue_interpret+0xe3/0x1e0 [osc]
      > [5035429.361136] [<ffffffffa1c00152>] ptlrpc_check_set+0x562/0x1b60 [ptlrpc]
      > [5035429.361174] [<ffffffffa1c2bd5b>] ptlrpcd_check+0x52b/0x550 [ptlrpc]
      > [5035429.361219] [<ffffffffa1c2c39b>] ptlrpcd+0x32b/0x410 [ptlrpc]
      > [5035429.361244] [<ffffffff81083f16>] kthread+0x96/0xa0
      > [5035429.361249] [<ffffffff8146d964>] kernel_thread_helper+0x4/0x10
      > [5035429.361252]
      > [5035429.361378] Kernel panic - not syncing: LBUG

          People

            Assignee: jay Jinshan Xiong (Inactive)
            Reporter: askulysh Andriy Skulysh