Lustre / LU-7720

osd_object.c:925:osd_attr_set()) ASSERTION( dt_object_exists(dt)

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • lola
      build: master branch, 2.7.65-38-g607f691 ; 607f6919ea67b101796630d4b55649a12ea0e859
    • 3

    Description

      The error happened during soak testing of build '20160126' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160126). DNE is enabled.
      MDTs had been formatted with ldiskfs, OSTs with zfs.
      No faults were injected during the soak test. Only application load and the execution of lfsck were imposed on the test cluster.

      Sequence of events:

      • Jan 27 05:44:56 - Started the lfsck command on the primary MDS (lola-8):
        lctl lfsck_start -M soaked-MDT0000 -s 1000 -t all -A 
        
      • Jan 27 05:49 - OSS node lola-5 hit several LBUGs of the form:
        Jan 27 05:49:11 lola-5 kernel: LustreError: 17617:0:(osd_object.c:925:osd_attr_set()) LBUG
        Jan 27 05:49:11 lola-5 kernel: Pid: 17617, comm: ll_ost_out03_00
        Jan 27 05:49:11 lola-5 kernel: 
        Jan 27 05:49:11 lola-5 kernel: Call Trace:
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa05c7875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa05c7e77>] lbug_with_loc+0x47/0xb0 [libcfs]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa0b27af5>] osd_attr_set+0xdd5/0xe40 [osd_zfs]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa0710795>] ? keys_fill+0xd5/0x1b0 [obdclass]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa02da916>] ? spl_kmem_alloc+0x96/0x1a0 [spl]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa09b4033>] out_tx_attr_set_exec+0xa3/0x480 [ptlrpc]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa09aa49a>] out_tx_end+0xda/0x5c0 [ptlrpc]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffffa09b0364>] out_handle+0x11c4/0x19a0 [ptlrpc]
        Jan 27 05:49:11 lola-5 kernel: [<ffffffff8152b83e>] ? mutex_lock+0x1e/0x50
        Jan 27 05:49:12 lola-5 kernel: [<ffffffffa099f6fa>] ? req_can_reconstruct+0x6a/0x120 [ptlrpc]
        
      • Jan 27 08:30 - lola-5 crashed after the oom-killer was invoked, most likely as an eventual consequence of the LBUGs; more than 600 ost_* threads were blocked.

      Attached files:

      • messages, console logs of lola-5
      • debug log files: lustre-log.1453902551.22690 lustre-log.1453902552.17617
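
      To triage the attached messages file, the LBUG lines can be pulled out and grouped by source location. The following is a minimal Python sketch, assuming the syslog line format quoted in the sequence of events above; the function names find_lbugs and count_by_location are hypothetical helpers, not part of Lustre or its tooling.

```python
import re
from collections import Counter

# Matches LBUG lines in the format quoted in this report, e.g.
# "Jan 27 05:49:11 lola-5 kernel: LustreError: 17617:0:(osd_object.c:925:osd_attr_set()) LBUG"
# Assumption: other nodes and builds log LBUGs in the same layout.
LBUG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: "
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) LBUG\s*$"
)


def find_lbugs(lines):
    """Return one dict per LBUG line (timestamp, host, pid, file, line, func)."""
    events = []
    for raw in lines:
        match = LBUG_RE.match(raw.strip())
        if match:
            events.append(match.groupdict())
    return events


def count_by_location(events):
    """Count LBUG events per (source file, line number, function)."""
    return Counter((e["file"], int(e["line"]), e["func"]) for e in events)


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this report.
    sample = [
        "Jan 27 05:49:11 lola-5 kernel: LustreError: 17617:0:"
        "(osd_object.c:925:osd_attr_set()) LBUG",
        "Jan 27 05:49:11 lola-5 kernel: Pid: 17617, comm: ll_ost_out03_00",
    ]
    for location, count in count_by_location(find_lbugs(sample)).items():
        print(location, count)
```

      In practice the lines would be read from the messages file instead of the inline sample; the per-location counts show whether all LBUGs on a node originate from the same assertion.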

          People

            wc-triage WC Triage
            heckes Frank Heckes (Inactive)