Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6846

dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • Lustre 2.8.0
    • 3
    • 9223372036854775807

    Description

      Found this in 24 hours failover test in OpenSFS cluster.

      Lustre: DEBUG MARKER: ==== Checking the clients loads AFTER failover -- failure NOT OK
      Lustre: DEBUG MARKER: mds7 has failed over 1 times, and counting...
      Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436839484/real 1436839484]  req@ffff88062c53a080 x1506632146392040/t0(0) o400->lustre-MDT0006-osp-MDT0001@192.168.2.127@o2ib:24/4 lens 224/224 e 1 to 1 dl 1436839486 ref 1 fl Rpc:X/c0/ffffffff rc 0/-1
      Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436839503/real 1436839503]  req@ffff88062c53a080 x1506632146392208/t0(0) o400->lustre-MDT0006-osp-MDT0001@192.168.2.127@o2ib:24/4 lens 224/224 e 1 to 1 dl 1436839505 ref 1 fl Rpc:X/c0/ffffffff rc 0/-1
      Lustre: 9887:0:(client.c:2018:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
      Lustre: lustre-MDT0006-osp-MDT0001: Connection restored to lustre-MDT0006 (at 192.168.2.127@o2ib)
      LustreError: 12030:0:(dt_object.c:512:dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed: 
      LustreError: 12030:0:(dt_object.c:512:dt_record_write()) LBUG
      Pid: 12030, comm: mdt_out03_005
      
      Call Trace:
       [<ffffffffa0506875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa0506e77>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa065978f>] dt_record_write+0xbf/0x130 [obdclass]
       [<ffffffffa08f4d0e>] out_tx_write_exec+0x7e/0x300 [ptlrpc]
       [<ffffffffa08ed30a>] out_tx_end+0xda/0x5d0 [ptlrpc]
       [<ffffffffa08f1e7b>] out_handle+0xd9b/0x17e0 [ptlrpc]
       [<ffffffffa083afb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
       [<ffffffffa08ea212>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
       [<ffffffffa0892891>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
       [<ffffffffa0891a50>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
       [<ffffffff8109abf6>] kthread+0x96/0xa0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109ab60>] ? kthread+0x0/0xa0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      LustreError: dumping log to /tmp/lustre-log.1436839511.12030
      
      

      Attachments

        Issue Links

          Activity

            [LU-6846] dt_record_write()) ASSERTION( dt->do_body_ops->dbo_write ) failed:

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16333/
            Subject: LU-6846 llog: create remote llog synchronously
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 36f59f94c06c74887a32f2a7757e7c962c6cf8dd

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16333/ Subject: LU-6846 llog: create remote llog synchronously Project: fs/lustre-release Branch: master Current Patch Set: Commit: 36f59f94c06c74887a32f2a7757e7c962c6cf8dd

            All have this what I posted to LU-6831 for the trace. The node just reboots with no kdumps or anything

            simmonsja James A Simmons added a comment - All have this what I posted to LU-6831 for the trace. The node just reboots with no kdumps or anything
            di.wang Di Wang added a comment -

            Hmm, I guess somewhere in ldiskfs still complain about this super giant credit reservation. Do you have stack trace when the panic happens?

            di.wang Di Wang added a comment - Hmm, I guess somewhere in ldiskfs still complain about this super giant credit reservation. Do you have stack trace when the panic happens?
            simmonsja James A Simmons added a comment - - edited

            Yes it is my series. Ah I haven't updates my local test scripts. Still blows up the node with 300k.

            simmonsja James A Simmons added a comment - - edited Yes it is my series. Ah I haven't updates my local test scripts. Still blows up the node with 300k.
            di.wang Di Wang added a comment -

            Hmm, I only saw 300g and 300h failed in your log.

            Sep 10 12:49:43 spoon17.ccs.ornl.gov kernel: [  239.577103] Lustre: DEBUG MARKER: == sanity test 300g: check default striped directory for normal directory == 12:49:43 (1441903783)
            Sep 10 12:49:44 spoon17.ccs.ornl.gov kernel: [  240.291185] Lustre: DEBUG MARKER: sanity test_300g: @@@@@@ FAIL: stripe count 1 != 0 for /lustre/lustre/d300g.sanity/normal_dir/test1
            Sep 10 12:49:45 spoon17.ccs.ornl.gov kernel: [  241.048817] Lustre: DEBUG MARKER: == sanity test 300h: check default striped directory for striped directory == 12:49:45 (1441903785)
            Sep 10 12:49:45 spoon17.ccs.ornl.gov kernel: [  241.837071] Lustre: DEBUG MARKER: sanity test_300h: @@@@@@ FAIL: stripe count 1 != 0 for 
            

            is http://review.whamcloud.com/#/c/16130/ in your series? I think there is a fix in that patch( see test_300_check_default_striped_dir) to deal with this failure.

            di.wang Di Wang added a comment - Hmm, I only saw 300g and 300h failed in your log. Sep 10 12:49:43 spoon17.ccs.ornl.gov kernel: [ 239.577103] Lustre: DEBUG MARKER: == sanity test 300g: check default striped directory for normal directory == 12:49:43 (1441903783) Sep 10 12:49:44 spoon17.ccs.ornl.gov kernel: [ 240.291185] Lustre: DEBUG MARKER: sanity test_300g: @@@@@@ FAIL: stripe count 1 != 0 for /lustre/lustre/d300g.sanity/normal_dir/test1 Sep 10 12:49:45 spoon17.ccs.ornl.gov kernel: [ 241.048817] Lustre: DEBUG MARKER: == sanity test 300h: check default striped directory for striped directory == 12:49:45 (1441903785) Sep 10 12:49:45 spoon17.ccs.ornl.gov kernel: [ 241.837071] Lustre: DEBUG MARKER: sanity test_300h: @@@@@@ FAIL: stripe count 1 != 0 for is http://review.whamcloud.com/#/c/16130/ in your series? I think there is a fix in that patch( see test_300_check_default_striped_dir) to deal with this failure.

            I posted the full log under ticket LU-6831. You can see it here : https://jira.hpdd.intel.com/secure/attachment/18890/kern-09-10-2015.log

            simmonsja James A Simmons added a comment - I posted the full log under ticket LU-6831 . You can see it here : https://jira.hpdd.intel.com/secure/attachment/18890/kern-09-10-2015.log
            di.wang Di Wang added a comment -

            Hi, James, this is a known issue, because credits required for creating a big striped directory is larger than maximum credits here. And I believe you can see this on current master as well.
            But it is just a warning for now, and this should not cause 300k fails. Did you see other error message there? thanks.

            di.wang Di Wang added a comment - Hi, James, this is a known issue, because credits required for creating a big striped directory is larger than maximum credits here. And I believe you can see this on current master as well. But it is just a warning for now, and this should not cause 300k fails. Did you see other error message there? thanks.

            Still failing for sanity test 300k. I'm getting the following back trace:

            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.502856] Lustre: DEBUG MARKER: == sanity test 300k: test large striped dir
            ectory == 12:35:19 (1441902919)
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.555631] Lustre: *** cfs_fail_loc=1703, val=0***
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.563480] Lustre: 29844:0:(osd_handler.c:1106:osd_trans_start()) lustre-MDT
            0000: too many transaction credits (51379 > 1024)
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.575080] Lustre: 29844:0:(osd_handler.c:1041:osd_trans_dump_creds()) cre
            ate: 146/1168/0, destroy: 0/0/0
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.585090] Lustre: 29844:0:(osd_handler.c:1048:osd_trans_dump_creds()) att
            r_set: 130/130/0, xattr_set: 259/16513/0
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.597106] Lustre: 29844:0:(osd_handler.c:1058:osd_trans_dump_creds()) wri
            te: 70/2146/0, punch: 0/0/0, quota 2/2/0
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.607934] Lustre: 29844:0:(osd_handler.c:1065:osd_trans_dump_creds()) ins
            ert: 932/30778/0, delete: 0/0/0
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.617951] Lustre: 29844:0:(osd_handler.c:1072:osd_trans_dump_creds()) ref
            _add: 642/642/0, ref_del: 0/0/0
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.627962] Pid: 29844, comm: mdt01_005
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.631922]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.631923] Call Trace:
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.636098] [<ffffffffa040c875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.643201] [<ffffffffa0e57b3f>] osd_trans_start+0x3ef/0x410 [osd_ldiskfs]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.650441] [<ffffffffa0837977>] top_trans_start+0x357/0x960 [ptlrpc]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.657134] [<ffffffffa1082c11>] lod_trans_start+0x61/0x70 [lod]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.663447] [<ffffffffa0f6c5c4>] mdd_trans_start+0x14/0x20 [mdd]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.669699] [<ffffffffa0f5b45d>] mdd_create+0x9cd/0x1280 [mdd]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.675797] [<ffffffffa0fcc77c>] ? mdt_version_save+0x8c/0x1a0 [mdt]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.682471] [<ffffffffa0fcef3c>] mdt_reint_create+0xbbc/0xcc0 [mdt]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.688988] [<ffffffffa05ac240>] ? lu_ucred+0x20/0x30 [obdclass]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.695246] [<ffffffffa0fae675>] ? mdt_ucred+0x15/0x20 [mdt]
            Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.701251] [<ffffffffa0fc785c>] ? mdt_root_squash+0x2c/0x3f0 [mdt]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.052812] [<ffffffffa07db532>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.059811] [<ffffffffa0f3699d>] mdt_reint_rec+0x5d/0x200 [mdt]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.065938] [<ffffffffa0f227ab>] mdt_reint_internal+0x62b/0xb80 [mdt]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.072622] [<ffffffffa0f2319b>] mdt_reint+0x6b/0x120 [mdt]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.078449] [<ffffffffa081d72c>] tgt_request_handle+0xa4c/0x1290 [ptlrpc]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.085555] [<ffffffffa07c53b1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.091995] [<ffffffffa07c4570>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.098371] [<ffffffff8109e88e>] kthread+0x9e/0xc0
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.103354] [<ffffffff8100c28a>] child_rip+0xa/0x20
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.108422] [<ffffffff8109e7f0>] ? kthread+0x0/0xc0
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.113536] [<ffffffff8100c280>] ? child_rip+0x0/0x20
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.118808]
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.120489] LustreError: 30312:0:(osd_internal.h:1030:osd_trans_exec_op()) lustre-MDT0000-osd: op = 3, rb = 4
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.423620] Lustre: 30312:0:(osd_handler.c:1041:osd_trans_dump_creds()) create: 146/1168/11, destroy: 0/0/0
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.433696] Lustre: 30312:0:(osd_handler.c:1041:osd_trans_dump_creds()) Skipped 419 previous similar messages
            Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.443770] Lustre: 30312:0:(osd_handler.c:1048:osd_trans_dump_creds()) attr_set: 130/130/0, xattr_set: 259/16513/0

            simmonsja James A Simmons added a comment - Still failing for sanity test 300k. I'm getting the following back trace: Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.502856] Lustre: DEBUG MARKER: == sanity test 300k: test large striped dir ectory == 12:35:19 (1441902919) Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.555631] Lustre: *** cfs_fail_loc=1703, val=0*** Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.563480] Lustre: 29844:0:(osd_handler.c:1106:osd_trans_start()) lustre-MDT 0000: too many transaction credits (51379 > 1024) Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.575080] Lustre: 29844:0:(osd_handler.c:1041:osd_trans_dump_creds()) cre ate: 146/1168/0, destroy: 0/0/0 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.585090] Lustre: 29844:0:(osd_handler.c:1048:osd_trans_dump_creds()) att r_set: 130/130/0, xattr_set: 259/16513/0 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.597106] Lustre: 29844:0:(osd_handler.c:1058:osd_trans_dump_creds()) wri te: 70/2146/0, punch: 0/0/0, quota 2/2/0 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.607934] Lustre: 29844:0:(osd_handler.c:1065:osd_trans_dump_creds()) ins ert: 932/30778/0, delete: 0/0/0 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.617951] Lustre: 29844:0:(osd_handler.c:1072:osd_trans_dump_creds()) ref _add: 642/642/0, ref_del: 0/0/0 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.627962] Pid: 29844, comm: mdt01_005 Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.631922] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.631923] Call Trace: Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.636098] [<ffffffffa040c875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.643201] [<ffffffffa0e57b3f>] osd_trans_start+0x3ef/0x410 [osd_ldiskfs] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.650441] [<ffffffffa0837977>] top_trans_start+0x357/0x960 [ptlrpc] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.657134] [<ffffffffa1082c11>] lod_trans_start+0x61/0x70 [lod] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.663447] [<ffffffffa0f6c5c4>] mdd_trans_start+0x14/0x20 [mdd] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.669699] [<ffffffffa0f5b45d>] mdd_create+0x9cd/0x1280 [mdd] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.675797] [<ffffffffa0fcc77c>] ? mdt_version_save+0x8c/0x1a0 [mdt] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.682471] [<ffffffffa0fcef3c>] mdt_reint_create+0xbbc/0xcc0 [mdt] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.688988] [<ffffffffa05ac240>] ? lu_ucred+0x20/0x30 [obdclass] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.695246] [<ffffffffa0fae675>] ? mdt_ucred+0x15/0x20 [mdt] Sep 10 12:35:19 spoon17.ccs.ornl.gov kernel: [ 283.701251] [<ffffffffa0fc785c>] ? mdt_root_squash+0x2c/0x3f0 [mdt] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.052812] [<ffffffffa07db532>] ? __req_capsule_get+0x162/0x6e0 [ptlrpc] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.059811] [<ffffffffa0f3699d>] mdt_reint_rec+0x5d/0x200 [mdt] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.065938] [<ffffffffa0f227ab>] mdt_reint_internal+0x62b/0xb80 [mdt] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.072622] [<ffffffffa0f2319b>] mdt_reint+0x6b/0x120 [mdt] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.078449] [<ffffffffa081d72c>] tgt_request_handle+0xa4c/0x1290 [ptlrpc] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.085555] [<ffffffffa07c53b1>] ptlrpc_main+0xe41/0x1910 [ptlrpc] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.091995] [<ffffffffa07c4570>] ? ptlrpc_main+0x0/0x1910 [ptlrpc] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.098371] [<ffffffff8109e88e>] kthread+0x9e/0xc0 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.103354] [<ffffffff8100c28a>] child_rip+0xa/0x20 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.108422] [<ffffffff8109e7f0>] ? kthread+0x0/0xc0 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.113536] [<ffffffff8100c280>] ? child_rip+0x0/0x20 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.118808] Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.120489] LustreError: 30312:0:(osd_internal.h:1030:osd_trans_exec_op()) lustre-MDT0000-osd: op = 3, rb = 4 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.423620] Lustre: 30312:0:(osd_handler.c:1041:osd_trans_dump_creds()) create: 146/1168/11, destroy: 0/0/0 Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.433696] Lustre: 30312:0:(osd_handler.c:1041:osd_trans_dump_creds()) Skipped 419 previous similar messages Sep 10 12:21:13 spoon17.ccs.ornl.gov kernel: [ 515.443770] Lustre: 30312:0:(osd_handler.c:1048:osd_trans_dump_creds()) attr_set: 130/130/0, xattr_set: 259/16513/0

            Much better!!

            simmonsja James A Simmons added a comment - Much better!!
            di.wang Di Wang added a comment -

            Hi, James, I just update patch. Thanks for updating.

            di.wang Di Wang added a comment - Hi, James, I just update patch. Thanks for updating.

            Just did. Now with sanity test 60 I get the follow exception:

            Sep 9 13:17:09 ninja11.ccs.ornl.gov kernel: [12210.969737] Lustre: 9313:0:(llog_test.c:1010:llog_test_7())
            7a: test llog_logid_rec
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.888955] LustreError: 9313:0:(llog_test.c:80:verify_handl
            e()) 7_sub: expected 1 active recs after write, found 2
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.899565] LustreError: 9313:0:(osd_handler.c:2488:osd_obje
            ct_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed:
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.911973] LustreError: 9313:0:(osd_handler.c:2488:osd_obje
            ct_destroy()) LBUG
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.919387] Pid: 9313, comm: lctl
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.922812]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.922813] Call Trace:
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.926946] [<ffffffffa0448875>] libcfs_debug_dumpstack+0x5
            5/0x80 [libcfs]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.933996] [<ffffffffa0448e77>] lbug_with_loc+0x47/0xb0 [l
            ibcfs]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.940297] [<ffffffffa0e9e80b>] osd_object_destroy+0x51b/0
            x5a0 [osd_ldiskfs]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.947683] [<ffffffffa0e9d177>] ? osd_object_ref_del+0x187
            /0x4e0 [osd_ldiskfs]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.955277] [<ffffffffa058ddba>] llog_osd_destroy+0x1ba/0x9
            d0 [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.962177] [<ffffffffa057d686>] llog_destroy+0x2b6/0x470 [
            obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.968712] [<ffffffffa0527ccc>] llog_test_7_sub+0x29c/0x92
            0 [llog_test]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.975597] [<ffffffffa0528eaf>] llog_test_7+0x13f/0xa10 [l
            log_test]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.982111] [<ffffffffa052dbb5>] llog_run_tests+0x1355/0x1d
            40 [llog_test]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.989099] [<ffffffffa0582b7e>] ? llog_setup+0x4ae/0x8b0 [
            obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.995643] [<ffffffffa052ed49>] llog_test_setup+0x7a9/0x8f
            0 [llog_test]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.002538] [<ffffffffa05b00fb>] obd_setup+0x19b/0x290 [obd
            class]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.008810] [<ffffffff81175113>] ? kmem_cache_alloc_trace+0
            x1b3/0x1c0
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.015463] [<ffffffffa0594a53>] ? class_new_export+0x883/0
            x9e0 [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.022553] [<ffffffffa05b044f>] class_setup+0x25f/0x940 [o
            bdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.029018] [<ffffffffa05b7631>] class_process_config+0x115
            1/0x2840 [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.036522] [<ffffffffa0575fd7>] ? obd_ioctl_getdata+0x127/
            0x1310 [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.043762] [<ffffffffa0454bb1>] ? libcfs_debug_msg+0x41/0x
            50 [libcfs]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.050477] [<ffffffff8117546c>] ? __kmalloc+0x21c/0x230
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.055985] [<ffffffffa058fbf2>] class_handle_ioctl+0xbb2/0
            x21e0 [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.063131] [<ffffffff8148b5a1>] ? netlink_create+0x191/0x2
            00
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.069075] [<ffffffffa05752ab>] obd_class_ioctl+0x4b/0x190
            [obdclass]
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.075785] [<ffffffff811a4222>] vfs_ioctl+0x22/0xa0
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.080930] [<ffffffff811acc5f>] ? generic_drop_inode+0x4f/
            0x80
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.087028] [<ffffffff811a43c4>] do_vfs_ioctl+0x84/0x580
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.092516] [<ffffffff81448871>] ? sock_release+0x71/0x90
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.098104] [<ffffffff814489bf>] ? __sock_create+0xff/0x240
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.103849] [<ffffffff811a4941>] sys_ioctl+0x81/0xa0
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.108989] [<ffffffff8100b0d2>] system_call_fastpath+0x16/
            0x1b
            Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.115100]

            simmonsja James A Simmons added a comment - Just did. Now with sanity test 60 I get the follow exception: Sep 9 13:17:09 ninja11.ccs.ornl.gov kernel: [12210.969737] Lustre: 9313:0:(llog_test.c:1010:llog_test_7()) 7a: test llog_logid_rec Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.888955] LustreError: 9313:0:(llog_test.c:80:verify_handl e()) 7_sub: expected 1 active recs after write, found 2 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.899565] LustreError: 9313:0:(osd_handler.c:2488:osd_obje ct_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed: Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.911973] LustreError: 9313:0:(osd_handler.c:2488:osd_obje ct_destroy()) LBUG Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.919387] Pid: 9313, comm: lctl Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.922812] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.922813] Call Trace: Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.926946] [<ffffffffa0448875>] libcfs_debug_dumpstack+0x5 5/0x80 [libcfs] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.933996] [<ffffffffa0448e77>] lbug_with_loc+0x47/0xb0 [l ibcfs] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.940297] [<ffffffffa0e9e80b>] osd_object_destroy+0x51b/0 x5a0 [osd_ldiskfs] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.947683] [<ffffffffa0e9d177>] ? osd_object_ref_del+0x187 /0x4e0 [osd_ldiskfs] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.955277] [<ffffffffa058ddba>] llog_osd_destroy+0x1ba/0x9 d0 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.962177] [<ffffffffa057d686>] llog_destroy+0x2b6/0x470 [ obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.968712] [<ffffffffa0527ccc>] llog_test_7_sub+0x29c/0x92 0 [llog_test] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.975597] [<ffffffffa0528eaf>] llog_test_7+0x13f/0xa10 [l log_test] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.982111] [<ffffffffa052dbb5>] llog_run_tests+0x1355/0x1d 40 [llog_test] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.989099] [<ffffffffa0582b7e>] ? llog_setup+0x4ae/0x8b0 [ obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12211.995643] [<ffffffffa052ed49>] llog_test_setup+0x7a9/0x8f 0 [llog_test] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.002538] [<ffffffffa05b00fb>] obd_setup+0x19b/0x290 [obd class] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.008810] [<ffffffff81175113>] ? kmem_cache_alloc_trace+0 x1b3/0x1c0 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.015463] [<ffffffffa0594a53>] ? class_new_export+0x883/0 x9e0 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.022553] [<ffffffffa05b044f>] class_setup+0x25f/0x940 [o bdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.029018] [<ffffffffa05b7631>] class_process_config+0x115 1/0x2840 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.036522] [<ffffffffa0575fd7>] ? obd_ioctl_getdata+0x127/ 0x1310 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.043762] [<ffffffffa0454bb1>] ? libcfs_debug_msg+0x41/0x 50 [libcfs] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.050477] [<ffffffff8117546c>] ? __kmalloc+0x21c/0x230 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.055985] [<ffffffffa058fbf2>] class_handle_ioctl+0xbb2/0 x21e0 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.063131] [<ffffffff8148b5a1>] ? netlink_create+0x191/0x2 00 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.069075] [<ffffffffa05752ab>] obd_class_ioctl+0x4b/0x190 [obdclass] Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.075785] [<ffffffff811a4222>] vfs_ioctl+0x22/0xa0 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.080930] [<ffffffff811acc5f>] ? generic_drop_inode+0x4f/ 0x80 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.087028] [<ffffffff811a43c4>] do_vfs_ioctl+0x84/0x580 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.092516] [<ffffffff81448871>] ? sock_release+0x71/0x90 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.098104] [<ffffffff814489bf>] ? __sock_create+0xff/0x240 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.103849] [<ffffffff811a4941>] sys_ioctl+0x81/0xa0 Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.108989] [<ffffffff8100b0d2>] system_call_fastpath+0x16/ 0x1b Sep 9 13:17:10 ninja11.ccs.ornl.gov kernel: [12212.115100]

            People

              di.wang Di Wang
              di.wang Di Wang
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: