[LU-6717] dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.8.0
    • Labels: None
    • Environment: Hyperion SWL test
    • Severity: 3

    Description

      During a test run, the DNE MDTs started dropping:

      Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:
      Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:
      Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) LBUG
      Jun 11 22:44:52 iws15 kernel: LustreError: 10346:0:(dt_object.c:512:dt_record_write()) LBUG
      Jun 11 22:44:52 iws15 kernel: Pid: 10346, comm: mdt_out00_007
      Jun 11 22:44:52 iws15 kernel:
      Jun 11 22:44:52 iws15 kernel: Call Trace:
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa04a2875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa04a2e77>] lbug_with_loc+0x47/0xb0 [libcfs]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa09ae15f>] dt_record_write+0xbf/0x130 [obdclass]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3c78e>] out_tx_write_exec+0x7e/0x2e0 [ptlrpc]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3b574>] ? out_tx_destroy_exec+0x24/0x1a0 [ptlrpc]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c36282>] out_tx_end+0xe2/0x5d0 [ptlrpc]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c3c216>] out_handle+0x9d6/0xed0 [ptlrpc]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0bcf2cc>] ? lustre_msg_get_opc+0x8c/0x100 [ptlrpc]
      Jun 11 22:44:52 iws15 kernel: [<ffffffffa0c323be>] tgt_request_handle+0x95e/0x10b0 [ptlrpc]
      Jun 11 22:44:53 iws15 kernel: [<ffffffffa0be1c51>] ptlrpc_main+0xe41/0x1970 [ptlrpc]
      Jun 11 22:44:53 iws15 kernel: [<ffffffffa0be0e10>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
      Jun 11 22:44:53 iws15 kernel: [<ffffffff8109e71e>] kthread+0x9e/0xc0
      Jun 11 22:44:53 iws15 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
      Jun 11 22:44:53 iws15 kernel: [<ffffffff8109e680>] ? kthread+0x0/0xc0
      Jun 11 22:44:53 iws15 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      Jun 11 22:44:53 iws15 kernel:
      

      The system panicked at this point; no log dump is available.
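
      The assertion in the trace above guards a call through a method pointer. The following is a minimal user-space sketch of that kind of precondition, not Lustre's actual code: the struct layouts, `toy_write`, `toy_body_ops`, `empty_body_ops`, `dt_can_write` and `toy_record_write` are hypothetical, simplified stand-ins, and only the names `dt_object`, `do_body_ops` and `dbo_write` come from the log message. It models an object whose operations table exists but has no write method, which is the state the assertion text describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, heavily simplified stand-ins for the types named in the
 * assertion message; the real Lustre definitions differ. */
struct dt_object;

struct dt_body_operations {
	ssize_t (*dbo_write)(struct dt_object *dt, const void *buf,
			     size_t len, off_t *pos);
};

struct dt_object {
	const struct dt_body_operations *do_body_ops;
	char data[64];
};

/* Toy write method: copy the record into the object's in-memory buffer. */
static ssize_t toy_write(struct dt_object *dt, const void *buf,
			 size_t len, off_t *pos)
{
	if (*pos < 0 || (size_t)*pos + len > sizeof(dt->data))
		return -1;
	memcpy(dt->data + *pos, buf, len);
	*pos += (off_t)len;
	return (ssize_t)len;
}

/* Operations table of a normal, writable object. */
static const struct dt_body_operations toy_body_ops = {
	.dbo_write = toy_write,
};

/* Operations table with no write method (assumed to model the failing
 * object in this report). */
static const struct dt_body_operations empty_body_ops = {
	.dbo_write = NULL,
};

/* Nonzero when the object has a usable write method. */
static int dt_can_write(const struct dt_object *dt)
{
	return dt != NULL && dt->do_body_ops != NULL &&
	       dt->do_body_ops->dbo_write != NULL;
}

/* Model of the failing check: calling through a missing method would
 * dereference NULL, so the precondition is asserted first. */
static int toy_record_write(struct dt_object *dt, const void *buf,
			    size_t len, off_t *pos)
{
	ssize_t rc;

	assert(dt_can_write(dt));
	rc = dt->do_body_ops->dbo_write(dt, buf, len, pos);
	return rc == (ssize_t)len ? 0 : -1;
}
```

      In this sketch, calling `toy_record_write()` on an object that points at `empty_body_ops` aborts at the `assert`, which is the user-space analogue of the LBUG seen here; a caller that wants to fail gracefully would test `dt_can_write()` first and return an error instead.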

          Activity


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15278/
            Subject: LU-6717 llog: Create update llog synchronously
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f4313edcb837429ccc7f501158514560d602ee85

            gerrit Gerrit Updater added a comment -

            wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/15278
            Subject: LU-6717 llog: Create update llog synchronously
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ca7ad20c675ce6b0cdaf96eb350bdc46cb0bae78

            di.wang Di Wang added a comment -

            Cliff: Could you please tell me which build you used for the test? Tag 2.7.55? Thanks.

            di.wang Di Wang added a comment -

            It looks like the llog object disappeared after recovery.

            di.wang Di Wang added a comment -

            Do you have a crash dump for this panic?

            cliffw Cliff White (Inactive) added a comment -

            Certainly. What information would you like?

            doug Doug Oucharek (Inactive) added a comment -

            Cliff: can you save the state of the filesystem for analysis?

            cliffw Cliff White (Inactive) added a comment -

            The system now hits the LBUG on every restart attempt. I have fsck'd the disks, but it
            continues to get errors:

            LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567847, ql: 1, comp: 288, conn: 289, next: 4317567849, last_committed: 4317566870)
            LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567852, ql: 1, comp: 288, conn: 289, next: 4317567854, last_committed: 4317566870)
            LustreError: 6442:0:(ldlm_lib.c:1751:check_for_next_transno()) lustre-MDT000a: waking for gap in transno, VBR is OFF (skip: 4317567857, ql: 1, comp: 288, conn: 289, next: 4317567859, last_committed: 4317566870)
            Lustre: lustre-MDT000a-osp-MDT0004: Connection restored to lustre-MDT000a (at 0@lo)
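
            When comparing many of the "waking for gap in transno" messages quoted above, it can help to pull the numeric fields out of each line. The following is a rough sketch that assumes the message tail has exactly the shape shown in this report; `struct transno_gap` and `parse_transno_gap` are hypothetical helper names, not part of Lustre, and the field names are simply copied from the log text.

```c
#include <stdio.h>
#include <string.h>

/* Fields printed in the parenthesised tail of the message; the names
 * mirror the labels in the log line. */
struct transno_gap {
	unsigned long long skip;
	int ql;
	int comp;
	int conn;
	unsigned long long next;
	unsigned long long last_committed;
};

/* Parse one "waking for gap in transno" line.
 * Returns 0 on success, -1 if the line does not match the assumed shape. */
static int parse_transno_gap(const char *line, struct transno_gap *g)
{
	/* Skip the "(ldlm_lib.c:...)" prefix by locating the field list. */
	const char *p = strstr(line, "(skip: ");

	if (p == NULL)
		return -1;
	if (sscanf(p, "(skip: %llu, ql: %d, comp: %d, conn: %d, "
		      "next: %llu, last_committed: %llu)",
		   &g->skip, &g->ql, &g->comp, &g->conn,
		   &g->next, &g->last_committed) != 6)
		return -1;
	return 0;
}
```

            Feeding each log line to `parse_transno_gap()` yields the values needed to compare `next` against `last_committed`, or `comp` against `conn`, across restarts without reading the raw messages by eye.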

            People

              Assignee: di.wang Di Wang
              Reporter: cliffw Cliff White (Inactive)