Lustre / LU-7391

(osp_md_object.c:1155:osp_md_write()) ASSERTION( obj->opo_ooa->ooa_attr.la_valid & LA_SIZE ) failed

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • Lustre 2.8.0
    • Lustre 2.8.0
    • lola
      build: 2.7.62-28-g0754bc8, 0754bc8f2623bea184111af216f7567608db35b6; soakbuild '20151104.1'
    • 3
    • 9223372036854775807

    Description

      The error occurred during soak testing of build '20151104.1' on cluster lola (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20151104.1). MDTs are formatted with ldiskfs and OSTs with zfs as the storage backend. DNE is enabled. The MDSes are set up in an HA failover configuration:

      • lola-8,9
        • mdt0/mgs, mdt1 primary node lola-8
        • mdt2,mdt3 primary node lola-9
      • lola-10,11
        • mdt4,mdt5 primary node lola-10
        • mdt6,mdt7 primary node lola-11
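
The layout above can be written down as a small lookup table, which makes it easier to see which targets are affected when a node restarts or hits an LBUG. This is only a sketch: it assumes that each "lola-N,M" line names a failover pair, and the names FAILOVER_PAIRS, PRIMARY_NODE, failover_partner and targets_on are hypothetical helpers, not part of Lustre or of the soak framework.

```python
# Hypothetical model of the failover layout listed in the description.
# Assumption: "lola-8,9" and "lola-10,11" each denote one HA failover pair.
FAILOVER_PAIRS = [("lola-8", "lola-9"), ("lola-10", "lola-11")]

# Primary node of each target, copied from the list in the description.
PRIMARY_NODE = {
    "mgs": "lola-8", "mdt0": "lola-8", "mdt1": "lola-8",
    "mdt2": "lola-9", "mdt3": "lola-9",
    "mdt4": "lola-10", "mdt5": "lola-10",
    "mdt6": "lola-11", "mdt7": "lola-11",
}


def failover_partner(node):
    """Return the other node of the failover pair that contains `node`."""
    for first, second in FAILOVER_PAIRS:
        if node == first:
            return second
        if node == second:
            return first
    raise KeyError(node)


def targets_on(node):
    """Return the targets whose primary node is `node`, sorted by name."""
    return sorted(t for t, n in PRIMARY_NODE.items() if n == node)


if __name__ == "__main__":
    for node in ("lola-9", "lola-11"):
        print(node, targets_on(node), "partner:", failover_partner(node))
```

Under these assumptions, lola-11 (the restarted node) is the primary for mdt6 and mdt7, and lola-9 (the node that hit the LBUG) is the primary for mdt2 and mdt3.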

      Event sequence:

      • 2015-11-04 17:15:28 restart of lola-11 finished
      • The following errors occurred on lola-11:
        lola-11.log:Nov  4 17:16:17 lola-11 kernel: LustreError: 5104:0:(llog.c:581:llog_process_thread()) soaked-MDT0007-osp-MDT0006 retry remote llog process
        

        INTL-156

        lola-11.log:Nov  4 17:16:18 lola-11 kernel: LustreError: 4984:0:(ldlm_lib.c:1883:check_for_next_transno()) soaked-MDT0007: waking for gap in transno, VBR is OFF (skip: 558345901253, ql: 5, comp: 11, conn: 16, next: 558345901263, next_update 0 last_committed: 558345900230)
        
      • lola-9 hit LBUG
        The error message reads:
        Nov  4 17:16:21 lola-9 kernel: LustreError: 6493:0:(osp_md_object.c:1155:osp_md_write()) ASSERTION( obj->opo_ooa->ooa_attr.la_valid & LA_SIZE ) failed:
        Nov  4 17:16:21 lola-9 kernel: LustreError: 6493:0:(osp_md_object.c:1155:osp_md_write()) LBUG
        Nov  4 17:16:21 lola-9 kernel: Pid: 6493, comm: mdt00_005
        Nov  4 17:16:21 lola-9 kernel:
        Nov  4 17:16:21 lola-9 kernel: Call Trace:
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa07d2875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa07d2e77>] lbug_with_loc+0x47/0xb0 [libcfs]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa14518cd>] osp_md_write+0x42d/0x4e0 [osp]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa092509d>] dt_record_write+0x3d/0x130 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa08e3398>] llog_osd_write_rec+0x768/0x1c50 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa08d1416>] llog_write_rec+0xb6/0x270 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa08da7e3>] llog_cat_add_rec+0x1c3/0x7b0 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa08d1229>] llog_add+0x89/0x1c0 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0bca124>] sub_updates_write+0x1b4/0x14a0 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0bcbe24>] top_trans_stop+0xa14/0xe30 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa136042e>] ? lod_attr_set+0x12e/0xaa0 [lod]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0940800>] ? lu_ucred+0x20/0x30 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa134391c>] lod_trans_stop+0x2bc/0x330 [lod]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa13edb8a>] mdd_trans_stop+0x1a/0xac [mdd]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa13dcf3a>] mdd_create+0x12ea/0x1600 [mdd]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa128c7a4>] ? mdt_version_save+0x84/0x1a0 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa128ef66>] mdt_reint_create+0xbb6/0xcc0 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0940800>] ? lu_ucred+0x20/0x30 [obdclass]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa126e675>] ? mdt_ucred+0x15/0x20 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa128785c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0b73cf2>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa128b99d>] mdt_reint_rec+0x5d/0x200 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa127777b>] mdt_reint_internal+0x62b/0xb80 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa127816b>] mdt_reint+0x6b/0x120 [mdt]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0bb60ec>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0b5d9e1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffff8152a39e>] ? thread_return+0x4e/0x7d0
        Nov  4 17:16:21 lola-9 kernel: [<ffffffffa0b5cba0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
        Nov  4 17:16:21 lola-9 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0
        Nov  4 17:16:21 lola-9 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
        Nov  4 17:16:21 lola-9 kernel: [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
        Nov  4 17:16:21 lola-9 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
        Nov  4 17:16:21 lola-9 kernel:
        Nov  4 17:16:21 lola-9 kernel: LustreError: dumping log to /tmp/lustre-log.1446686181.6493
        

        Attached: soak.log, the Lustre debug log, and the messages and console logs of node lola-9.
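
When comparing this trace with traces from other tickets, it can help to reduce it to a list of (function, module) pairs. The following is a minimal sketch, not an existing tool: FRAME_RE and parse_frames are hypothetical names, and the regular expression only assumes the frame format visible in the log lines above. Frames marked with "?" are ones the kernel's stack unwinder does not consider reliable, so they are skipped by default.

```python
import re

# Matches frames such as
#   [<ffffffffa14518cd>] osp_md_write+0x42d/0x4e0 [osp]
#   [<ffffffffa136042e>] ? lod_attr_set+0x12e/0xaa0 [lod]
#   [<ffffffff8109e78e>] kthread+0x9e/0xc0
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(lines, reliable_only=True):
    """Return (function, module) pairs for each call-trace frame in `lines`.

    Frames without a [module] suffix are reported with module "kernel".
    """
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a call-trace frame
        if reliable_only and match.group("unreliable"):
            continue  # skip frames flagged with "?"
        frames.append((match.group("func"), match.group("module") or "kernel"))
    return frames


# A few lines copied from the trace in this ticket.
sample = [
    "Nov  4 17:16:21 lola-9 kernel: Call Trace:",
    "Nov  4 17:16:21 lola-9 kernel: [<ffffffffa14518cd>] osp_md_write+0x42d/0x4e0 [osp]",
    "Nov  4 17:16:21 lola-9 kernel: [<ffffffffa092509d>] dt_record_write+0x3d/0x130 [obdclass]",
    "Nov  4 17:16:21 lola-9 kernel: [<ffffffffa136042e>] ? lod_attr_set+0x12e/0xaa0 [lod]",
    "Nov  4 17:16:21 lola-9 kernel: [<ffffffff8109e78e>] kthread+0x9e/0xc0",
]

if __name__ == "__main__":
    for func, module in parse_frames(sample):
        print(f"{module}:{func}")
```

Run against the full messages log, this kind of filter leaves the path mdt_reint_create, mdd_create, top_trans_stop, sub_updates_write, llog_add, llog_osd_write_rec, dt_record_write, osp_md_write, which is the chain that ends in the failed assertion.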

      Attachments

        1. soak.log.bz2
          39 kB
        2. messages-lola-9.log.bz2
          291 kB
        3. lustre-log.1446686181.6493.bz2
          13 kB
        4. console-lola-9.log.bz2
          217 kB
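
The name of the attached debug dump can be matched back to the LBUG in the messages log. The sketch below assumes the name follows the pattern lustre-log.<epoch seconds>.<PID>, which fits the values seen here (PID 6493 appears in the LBUG lines); parse_dump_name is a hypothetical helper.

```python
import re
from datetime import datetime, timezone


def parse_dump_name(name):
    """Split a dump name into (UTC timestamp, PID).

    Assumes the pattern lustre-log.<epoch seconds>.<pid>, optionally
    followed by a suffix such as ".bz2".
    """
    match = re.search(r"lustre-log\.(\d+)\.(\d+)", name)
    if match is None:
        raise ValueError(f"not a lustre-log dump name: {name!r}")
    epoch, pid = int(match.group(1)), int(match.group(2))
    return datetime.fromtimestamp(epoch, tz=timezone.utc), pid


if __name__ == "__main__":
    when, pid = parse_dump_name("lustre-log.1446686181.6493.bz2")
    print(when.isoformat(), pid)
```

For this attachment the timestamp decodes to 2015-11-05 01:16:21 UTC. That agrees with the syslog time of Nov 4 17:16:21 if the node's local time is eight hours behind UTC, which is an assumption about the cluster's time zone rather than something stated in the ticket.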

            People

              di.wang Di Wang
              heckes Frank Heckes (Inactive)