osd_internal.h:1090:osd_trans_exec_check()) LBUG for osd_index_ea_delete()

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.8.0
    • Lustre 2.8.0
    • None
    • 3
    • 9223372036854775807

    Description

      Hit this LBUG during a local racer test run on master.

      5[77465]: segfault at 8 ip 00000031f720b3f3 sp 00007fff8f7c3e50 error 4 in ld-2.12.so[31f7200000+20000]
      LustreError: 20700:0:(mdd_object.c:70:mdd_la_get()) lustre-MDD0000: object [0x200000404:0x4774:0x0] not found: rc = -2
      LustreError: 20700:0:(mdd_object.c:70:mdd_la_get()) Skipped 1 previous similar message
      Lustre: 42406:0:(osd_internal.h:1087:osd_trans_exec_check()) op 9: used 10, used now 10, reserved 5
      Lustre: 42406:0:(osd_handler.c:902:osd_trans_dump_creds())   create: 0/0/0, destroy: 0/0/0
      Lustre: 42406:0:(osd_handler.c:909:osd_trans_dump_creds())   attr_set: 2/2/0, xattr_set: 1/64/0
      Lustre: 42406:0:(osd_handler.c:919:osd_trans_dump_creds())   write: 6/14/0, punch: 0/0/0, quota 4/4/0
      Lustre: 42406:0:(osd_handler.c:926:osd_trans_dump_creds())   insert: 0/0/0, delete: 1/5/10
      Lustre: 42406:0:(osd_handler.c:933:osd_trans_dump_creds())   ref_add: 0/0/0, ref_del: 1/1/0
      LustreError: 42406:0:(osd_internal.h:1090:osd_trans_exec_check()) LBUG
      Pid: 42406, comm: mdt_out00_004
      
      Call Trace:
       [<ffffffffa05b4875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
       [<ffffffffa05b4e77>] lbug_with_loc+0x47/0xb0 [libcfs]
       [<ffffffffa0f30631>] osd_index_ea_delete+0x7b1/0xe10 [osd_ldiskfs]
       [<ffffffffa0999f90>] out_obj_index_delete+0x150/0x370 [ptlrpc]
       [<ffffffffa099a1d8>] out_tx_index_delete_exec+0x28/0x190 [ptlrpc]
       [<ffffffffa098e0ca>] out_tx_end+0xda/0x5d0 [ptlrpc]
       [<ffffffffa09931df>] out_handle+0x7af/0x1950 [ptlrpc]
       [<ffffffffa05c0c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa098afc2>] tgt_request_handle+0xa42/0x1230 [ptlrpc]
       [<ffffffffa09331a1>] ptlrpc_main+0xe41/0x1920 [ptlrpc]
       [<ffffffffa0932360>] ? ptlrpc_main+0x0/0x1920 [ptlrpc]
       [<ffffffff8109e66e>] kthread+0x9e/0xc0
       [<ffffffff8100c20a>] child_rip+0xa/0x20
       [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      LustreError: dumping log to /tmp/lustre-log.1438790617.42406
      
      Message from syslogd@testnode at Aug  5 09:48:22 ...
       kernel:LustreError: 8393:0:(osd_internal.h:1090:osd_trans_exec_check()) LBUG
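
The credit dump above can be checked mechanically. The following is a minimal sketch, assuming each `osd_trans_dump_creds()` entry reads declared ops / reserved credits / used credits. That reading is inferred from the `used 10 ... reserved 5` line matching `delete: 1/5/10` and is not confirmed anywhere in this ticket. The names `overspent_ops` and `SAMPLE` are hypothetical.

```python
import re

# Matches "name: a/b/c" and also "quota 4/4/0", which the log prints without a colon.
CRED_RE = re.compile(r"(\w+):? (\d+)/(\d+)/(\d+)")

# Credit dump copied from the description of this ticket.
SAMPLE = """\
Lustre: 42406:0:(osd_handler.c:902:osd_trans_dump_creds())   create: 0/0/0, destroy: 0/0/0
Lustre: 42406:0:(osd_handler.c:909:osd_trans_dump_creds())   attr_set: 2/2/0, xattr_set: 1/64/0
Lustre: 42406:0:(osd_handler.c:919:osd_trans_dump_creds())   write: 6/14/0, punch: 0/0/0, quota 4/4/0
Lustre: 42406:0:(osd_handler.c:926:osd_trans_dump_creds())   insert: 0/0/0, delete: 1/5/10
Lustre: 42406:0:(osd_handler.c:933:osd_trans_dump_creds())   ref_add: 0/0/0, ref_del: 1/1/0
"""


def overspent_ops(log_text):
    """Return {op_name: (reserved, used)} for ops whose third field exceeds the second.

    Assumption: fields are ops/reserved/used, as described in the lead-in.
    """
    result = {}
    marker = "osd_trans_dump_creds())"
    for line in log_text.splitlines():
        if marker not in line:
            continue
        # Only parse the text after the function tag, so the PID and
        # file:line prefix cannot be mistaken for a credit entry.
        payload = line.split(marker, 1)[1]
        for name, _ops, reserved, used in CRED_RE.findall(payload):
            if int(used) > int(reserved):
                result[name] = (int(reserved), int(used))
    return result


if __name__ == "__main__":
    print(overspent_ops(SAMPLE))
```

On the dump from the description, only `delete` is flagged, which agrees with the `op 9: used 10, used now 10, reserved 5` message printed just before the LBUG.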
      

          Activity

            Sorry, my bad.

            vinayakh Vinayak (Inactive) added a comment

            Sorry, this is a different issue; it happened on an OST.

            bzzz Alex Zhuravlev added a comment

            The issue does not appear to be completely resolved. I hit this while running racer test_1 multiple times. Attaching the log file.

            Lustre: DEBUG MARKER: == racer test 1: racer on clients: fre0311,fre0312 DURATION=900 == 22:55:04 (1443740104)
            Lustre: 7246:0:(osd_handler.c:912:osd_trans_dump_creds())   create: 0/0/0, destroy: 0/0/0
            Lustre: 7246:0:(osd_handler.c:919:osd_trans_dump_creds())   attr_set: 1/1/0, xattr_set: 1/1/0
            Lustre: 7246:0:(osd_handler.c:929:osd_trans_dump_creds())   write: 2/12/0, punch: 1/4/0, quota 2/2/0
            Lustre: 7246:0:(osd_handler.c:936:osd_trans_dump_creds())   insert: 0/0/0, delete: 0/0/0
            Lustre: 7246:0:(osd_handler.c:943:osd_trans_dump_creds())   ref_add: 0/0/0, ref_del: 0/0/0
            LustreError: 7246:0:(osd_internal.h:1040:osd_trans_exec_op()) lustre-OST0001-osd: op = 7, rb = 7
            LustreError: 7246:0:(osd_internal.h:1048:osd_trans_exec_op()) LBUG
            Pid: 7246, comm: ll_ost_io00_009

            Call Trace:
             [<ffffffffa032a875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
             [<ffffffffa032ae77>] lbug_with_loc+0x47/0xb0 [libcfs]
             [<ffffffffa0b7e0be>] osd_write+0x41e/0x5b0 [osd_ldiskfs]
             [<ffffffffa0477c4d>] dt_record_write+0x3d/0x130 [obdclass]
             [<ffffffffa06f8545>] tgt_client_data_write+0x165/0x1b0 [ptlrpc]
             [<ffffffffa06f9517>] tgt_txn_stop_cb+0x477/0x1110 [ptlrpc]
             [<ffffffffa0477b1e>] dt_txn_hook_stop+0x5e/0x90 [obdclass]
             [<ffffffffa0b5b0ce>] osd_trans_stop+0x1ae/0x990 [osd_ldiskfs]
             [<ffffffffa0b6bd58>] ? osd_attr_set+0x148/0x620 [osd_ldiskfs]
             [<ffffffffa0ce7a7f>] ofd_trans_stop+0x1f/0x60 [ofd]
             [<ffffffffa0ce94aa>] ofd_object_punch+0x35a/0xa30 [ofd]
             [<ffffffffa0cd573e>] ofd_punch_hdl+0x36e/0xb20 [ofd]
             [<ffffffffa07084bc>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
             [<ffffffffa06afb41>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
             [<ffffffffa06aed00>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
             [<ffffffff8109abf6>] kthread+0x96/0xa0
             [<ffffffff8100c20a>] child_rip+0xa/0x20
             [<ffffffff8109ab60>] ? kthread+0x0/0xa0
             [<ffffffff8100c200>] ? child_rip+0x0/0x20

            Correct me if I am wrong.

            vinayakh Vinayak (Inactive) added a comment

            Landed for 2.8.0

            jgmitter Joseph Gmitter (Inactive) added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15924/
            Subject: LU-6969 osd: remove agent inodes in a separate transaction
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0887b89c0c4e2b7c5a7ba3365e758a7d94c667fa

            gerrit Gerrit Updater added a comment
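
For readers unfamiliar with credit accounting, here is a toy model of the kind of check that `osd_trans_exec_check()` appears to perform. It is not Lustre code. The class `ToyTransaction`, the exception `CreditOverrun`, and the 5 + 5 split of the used credits are all hypothetical. The only figures taken from the ticket are the totals "reserved 5" and "used 10", and the link to the patch subject ("remove agent inodes in a separate transaction") is an assumption about why a separate transaction would help.

```python
class CreditOverrun(Exception):
    """Toy stand-in for the point where the real code would LBUG."""


class ToyTransaction:
    """Tracks credits declared up front against credits consumed later."""

    def __init__(self):
        self.reserved = {}
        self.used = {}

    def declare(self, op, credits):
        # Reserve credits for an operation before the transaction starts.
        self.reserved[op] = self.reserved.get(op, 0) + credits

    def execute(self, op, credits):
        # Consume credits and fail as soon as usage exceeds the reservation.
        self.used[op] = self.used.get(op, 0) + credits
        reserved = self.reserved.get(op, 0)
        if self.used[op] > reserved:
            raise CreditOverrun(
                f"op {op}: used {self.used[op]}, reserved {reserved}"
            )


def delete_in_one_transaction():
    """Extra work charged to the same transaction overruns the reservation."""
    tx = ToyTransaction()
    tx.declare("delete", 5)
    tx.execute("delete", 5)  # work covered by the declaration
    tx.execute("delete", 5)  # hypothetical extra work that was never declared


def delete_in_two_transactions():
    """The same work, split so each transaction declares what it uses."""
    for _ in range(2):
        tx = ToyTransaction()
        tx.declare("delete", 5)
        tx.execute("delete", 5)
    return True


if __name__ == "__main__":
    try:
        delete_in_one_transaction()
    except CreditOverrun as err:
        print("overrun:", err)
    print("split ok:", delete_in_two_transactions())
```

The point of the sketch is only that a per-operation check fires when work is charged to a transaction that did not declare it; moving that work into its own transaction with its own declaration avoids the overrun.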

            We hit this issue in sanity test_51e. With the patch referenced in this ticket, http://review.whamcloud.com/#/c/15924, the issue did not recur even after running the test 50 times.

            We have asked the test team to verify this against any scenarios in which the problem reproduces for them, and we have also started multiple runs of the Intel-specific failures such as racer test_1.

            vinayakh Vinayak (Inactive) added a comment

            Kalpak, did you test http://review.whamcloud.com/15924 to see if it fixes this issue?

            adilger Andreas Dilger added a comment

            2 more instances seen in el7 tests on master:
            https://testing.hpdd.intel.com/test_sets/1960c0a2-60ef-11e5-b495-5254006e85c2
            https://testing.hpdd.intel.com/test_sets/1a18a8ca-60ef-11e5-b495-5254006e85c2
            Reproduces a lot on tests of el7 servers.

            bogl Bob Glossman (Inactive) added a comment

            We have also hit this issue during master testing. It has reproduced multiple times in the last few days.

            I think it would be good to mark this issue as a blocker for 2.8.0.

            kshah Kalpak Shah (Inactive) added a comment

            Another instance seen in el7 client/server testing on master:
            https://testing.hpdd.intel.com/test_sets/4f0cbd82-5d6b-11e5-80c4-5254006e85c2

            From the console log of the MDS:

            14:33:34:[ 7016.394189] LustreError: 4685:0:(osd_internal.h:1090:osd_trans_exec_check()) LBUG
            14:33:34:[ 7016.394773] Pid: 4685, comm: mdt_out00_001
            14:33:34:[ 7016.395093] 
            14:33:34:[ 7016.395093] Call Trace:
            14:33:34:[ 7016.395500]  [<ffffffffa062a7d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
            14:33:34:[ 7016.396023]  [<ffffffffa062ad75>] lbug_with_loc+0x45/0xc0 [libcfs]
            14:33:34:[ 7016.396542]  [<ffffffffa0c08a5e>] osd_it_ea_rec.part.94+0x0/0x36 [osd_ldiskfs]
            14:33:34:[ 7016.397083]  [<ffffffffa0bdc857>] osd_index_ea_delete+0x6d7/0xad0 [osd_ldiskfs]
            14:33:34:[ 7016.397664]  [<ffffffff811ac1be>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
            14:33:34:[ 7016.398235]  [<ffffffffa0a30fb1>] out_obj_index_delete+0x111/0x2f0 [ptlrpc]
            14:33:34:[ 7016.398805]  [<ffffffffa076ae83>] ? lu_context_init+0xd3/0x1f0 [obdclass]
            14:33:34:[ 7016.399351]  [<ffffffffa0a311d5>] out_tx_index_delete_exec+0x25/0x180 [ptlrpc]
            14:33:34:[ 7016.399985]  [<ffffffffa0a2b98e>] out_tx_end+0xde/0x5e0 [ptlrpc]
            14:33:34:[ 7016.400493]  [<ffffffffa0a2f607>] out_handle+0xe77/0x18d0 [ptlrpc]
            14:33:34:[ 7016.401083]  [<ffffffffa097aaa0>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
            14:33:34:[ 7016.401606]  [<ffffffffa0a25723>] tgt_request_handle+0x7f3/0x1190 [ptlrpc]
            14:33:34:[ 7016.402134]  [<ffffffffa09cdf5b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            14:33:34:[ 7016.402763]  [<ffffffffa09cbd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
            14:33:34:[ 7016.403269]  [<ffffffff810a9672>] ? default_wake_function+0x12/0x20
            14:33:34:[ 7016.403758]  [<ffffffff810a08a8>] ? __wake_up_common+0x58/0x90
            14:33:34:[ 7016.404214]  [<ffffffffa09d1700>] ptlrpc_main+0xb70/0x1e90 [ptlrpc]
            14:33:34:[ 7016.404700]  [<ffffffff810ad906>] ? __dequeue_entity+0x26/0x40
            14:33:34:[ 7016.405131]  [<ffffffff810125f6>] ? __switch_to+0x136/0x4a0
            14:33:34:[ 7016.405583]  [<ffffffffa09d0b90>] ? ptlrpc_main+0x0/0x1e90 [ptlrpc]
            14:33:34:[ 7016.406057]  [<ffffffff810973af>] kthread+0xcf/0xe0
            14:33:34:[ 7016.406460]  [<ffffffff810972e0>] ? kthread+0x0/0xe0
            14:33:34:[ 7016.406813]  [<ffffffff81615198>] ret_from_fork+0x58/0x90
            14:33:34:[ 7016.407216]  [<ffffffff810972e0>] ? kthread+0x0/0xe0
            14:33:34:[ 7016.407627] 
            14:33:34:[ 7016.407840] Kernel panic - not syncing: LBUG
            14:33:34:[ 7016.408176] CPU: 1 PID: 4685 Comm: mdt_out00_001 Tainted: GF          O--------------   3.10.0-229.14.1.el7_lustre.g630ab85.x86_64 #1
            14:33:34:[ 7016.408827] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            14:33:34:[ 7016.408827]  ffffffffa0647ecf 0000000001e4f211 ffff88007b4039c0 ffffffff8160533a
            14:33:34:[ 7016.408827]  ffff88007b403a40 ffffffff815febae ffffffff00000008 ffff88007b403a50
            14:33:34:[ 7016.408827]  ffff88007b4039f0 0000000001e4f211 ffffffffa0c0a7d0 0000000000000246
            14:33:34:[ 7016.408827] Call Trace:
            14:33:34:[ 7016.408827]  [<ffffffff8160533a>] dump_stack+0x19/0x1b
            14:33:34:[ 7016.408827]  [<ffffffff815febae>] panic+0xd8/0x1e7
            14:33:34:[ 7016.408827]  [<ffffffffa062addb>] lbug_with_loc+0xab/0xc0 [libcfs]
            14:33:34:[ 7016.408827]  [<ffffffffa0c08a5e>] osd_trans_exec_check.part.91+0x1a/0x1a [osd_ldiskfs]
            14:33:34:[ 7016.408827]  [<ffffffffa0bdc857>] osd_index_ea_delete+0x6d7/0xad0 [osd_ldiskfs]
            14:33:34:[ 7016.408827]  [<ffffffff811ac1be>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
            14:33:34:[ 7016.408827]  [<ffffffffa0a30fb1>] out_obj_index_delete+0x111/0x2f0 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa076ae83>] ? lu_context_init+0xd3/0x1f0 [obdclass]
            14:33:34:[ 7016.408827]  [<ffffffffa0a311d5>] out_tx_index_delete_exec+0x25/0x180 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa0a2b98e>] out_tx_end+0xde/0x5e0 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa0a2f607>] out_handle+0xe77/0x18d0 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa097aaa0>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa0a25723>] tgt_request_handle+0x7f3/0x1190 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa09cdf5b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa09cbd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffff810a9672>] ? default_wake_function+0x12/0x20
            14:33:34:[ 7016.408827]  [<ffffffff810a08a8>] ? __wake_up_common+0x58/0x90
            14:33:34:[ 7016.408827]  [<ffffffffa09d1700>] ptlrpc_main+0xb70/0x1e90 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffff810ad906>] ? __dequeue_entity+0x26/0x40
            14:33:34:[ 7016.408827]  [<ffffffff810125f6>] ? __switch_to+0x136/0x4a0
            14:33:34:[ 7016.408827]  [<ffffffffa09d0b90>] ? ptlrpc_register_service+0xfc0/0xfc0 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffff810973af>] kthread+0xcf/0xe0
            14:33:34:[ 7016.408827]  [<ffffffff810972e0>] ? kthread_create_on_node+0x140/0x140
            14:33:34:[ 7016.408827]  [<ffffffff81615198>] ret_from_fork+0x58/0x90
            14:33:34:[ 7016.408827]  [<ffffffff810972e0>] ? kthread_create_on_node+0x140/0x140
            14:33:34:[ 7016.408827] drm_kms_helper: panic occurred, switching back to text console
            14:33:34:[ 7016.408827] ------------[ cut here ]------------
            14:33:34:[ 7016.408827] kernel BUG at arch/x86/mm/pageattr.c:216!
            14:33:34:[ 7016.408827] invalid opcode: 0000 [#1] SMP 
            14:33:34:[ 7016.408827] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) sha512_generic libcfs(OF) ldiskfs(OF) dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ppdev ib_sa serio_raw pcspkr virtio_balloon i2c_piix4 ib_mad parport_pc parport ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi 8139too virtio_blk cirrus syscopyarea sysfillrect sysimgblt virtio_pci virtio_ring virtio drm_kms_helper 8139cp mii ata_piix ttm drm i2c_core libata floppy
            14:33:34:[ 7016.408827] CPU: 1 PID: 4685 Comm: mdt_out00_001 Tainted: GF          O--------------   3.10.0-229.14.1.el7_lustre.g630ab85.x86_64 #1
            14:33:34:[ 7016.408827] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            14:33:34:[ 7016.408827] task: ffff8800697e71c0 ti: ffff88007b400000 task.ti: ffff88007b400000
            14:33:34:[ 7016.408827] RIP: 0010:[<ffffffff8105c2ef>]  [<ffffffff8105c2ef>] change_page_attr_set_clr+0x4ef/0x500
            14:33:34:[ 7016.408827] RSP: 0018:ffff88007b4031c0  EFLAGS: 00010046
            14:33:34:[ 7016.408827] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000010
            14:33:34:[ 7016.408827] RDX: 0000000000002000 RSI: 0000000000000000 RDI: 0000000080000000
            14:33:34:[ 7016.408827] RBP: ffff88007b403258 R08: 0000000000000004 R09: 000000000006d4cb
            14:33:34:[ 7016.408827] R10: 0000000000003689 R11: ffffffff811902af R12: 0000000000000010
            14:33:34:[ 7016.408827] R13: 0000000000000000 R14: 0000000000000200 R15: 0000000000000005
            14:33:34:[ 7016.408827] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
            14:33:34:[ 7016.408827] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
            14:33:34:[ 7016.408827] CR2: 00007f79ce24d018 CR3: 000000000190e000 CR4: 00000000000006e0
            14:33:34:[ 7016.408827] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            14:33:34:[ 7016.408827] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            14:33:34:[ 7016.408827] Stack:
            14:33:34:[ 7016.408827]  000000045a5a5a5a 0000000000000000 0000000000000000 ffff88006df2b000
            14:33:34:[ 7016.408827]  ffff8800697e71c0 0000000000000000 0000000000000000 0000000000000010
            14:33:34:[ 7016.408827]  0000000000000000 0000000500000001 000000000006d4cb 0000020000000000
            14:33:34:[ 7016.408827] Call Trace:
            14:33:34:[ 7016.408827]  [<ffffffff8105c646>] _set_pages_array+0xe6/0x130
            14:33:34:[ 7016.408827]  [<ffffffff8105c6c3>] set_pages_array_wc+0x13/0x20
            14:33:34:[ 7016.408827]  [<ffffffffa00cf3af>] ttm_set_pages_caching+0x2f/0x70 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffffa00cf4f4>] ttm_alloc_new_pages.isra.7+0xb4/0x180 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffffa00cfe50>] ttm_pool_populate+0x3e0/0x500 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffffa013332e>] cirrus_ttm_tt_populate+0xe/0x10 [cirrus]
            14:33:34:[ 7016.408827]  [<ffffffffa00cc6dd>] ttm_bo_move_memcpy+0x65d/0x6e0 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffff8118fa7e>] ? map_vm_area+0x2e/0x40
            14:33:34:[ 7016.408827]  [<ffffffffa00c82c9>] ? ttm_tt_init+0x69/0xb0 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffffa01332d8>] cirrus_bo_move+0x18/0x20 [cirrus]
            14:33:34:[ 7016.408827]  [<ffffffffa00c9de5>] ttm_bo_handle_move_mem+0x265/0x5b0 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffff81601bf4>] ? __slab_free+0x10e/0x277
            14:33:34:[ 7016.408827]  [<ffffffff8118f273>] ? __free_vmap_area+0xb3/0xf0
            14:33:34:[ 7016.408827]  [<ffffffffa00ca74a>] ? ttm_bo_mem_space+0x10a/0x310 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffffa00cae17>] ttm_bo_validate+0x247/0x260 [ttm]
            14:33:34:[ 7016.408827]  [<ffffffff81059e69>] ? iounmap+0x79/0xa0
            14:33:34:[ 7016.408827]  [<ffffffff81050000>] ? kgdb_arch_late+0x80/0x180
            14:33:34:[ 7016.408827]  [<ffffffffa0133ac2>] cirrus_bo_push_sysram+0x82/0xe0 [cirrus]
            14:33:34:[ 7016.408827]  [<ffffffffa0131c84>] cirrus_crtc_do_set_base.isra.8.constprop.10+0x84/0x430 [cirrus]
            14:33:34:[ 7016.408827]  [<ffffffffa0132479>] cirrus_crtc_mode_set+0x449/0x4d0 [cirrus]
            14:33:34:[ 7016.408827]  [<ffffffffa00e8939>] drm_crtc_helper_set_mode+0x2e9/0x520 [drm_kms_helper]
            14:33:34:[ 7016.408827]  [<ffffffffa00e96bf>] drm_crtc_helper_set_config+0x87f/0xaa0 [drm_kms_helper]
            14:33:34:[ 7016.408827]  [<ffffffffa0088711>] drm_mode_set_config_internal+0x61/0xe0 [drm]
            14:33:34:[ 7016.408827]  [<ffffffffa00f0e83>] restore_fbdev_mode+0xb3/0xe0 [drm_kms_helper]
            14:33:34:[ 7016.408827]  [<ffffffffa00f1045>] drm_fb_helper_force_kernel_mode+0x75/0xb0 [drm_kms_helper]
            14:33:34:[ 7016.408827]  [<ffffffffa00f1d59>] drm_fb_helper_panic+0x29/0x30 [drm_kms_helper]
            14:33:34:[ 7016.408827]  [<ffffffff81610bec>] notifier_call_chain+0x4c/0x70
            14:33:34:[ 7016.408827]  [<ffffffff81610c4a>] atomic_notifier_call_chain+0x1a/0x20
            14:33:34:[ 7016.408827]  [<ffffffff815febdc>] panic+0x106/0x1e7
            14:33:34:[ 7016.408827]  [<ffffffffa062addb>] lbug_with_loc+0xab/0xc0 [libcfs]
            14:33:34:[ 7016.408827]  [<ffffffffa0c08a5e>] osd_trans_exec_check.part.91+0x1a/0x1a [osd_ldiskfs]
            14:33:34:[ 7016.408827]  [<ffffffffa0bdc857>] osd_index_ea_delete+0x6d7/0xad0 [osd_ldiskfs]
            14:33:34:[ 7016.408827]  [<ffffffff811ac1be>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
            14:33:34:[ 7016.408827]  [<ffffffffa0a30fb1>] out_obj_index_delete+0x111/0x2f0 [ptlrpc]
            14:33:34:[ 7016.408827]  [<ffffffffa076ae83>] ? lu_context_init+0xd3/0x1f0 [obdclass]
            14:33:34:[ 7016.408827]  [<ffffffffa0a311d5>] out_tx_index_delete_exec+0x25/0x180 [ptlrpc]
            15:34:29:********** Timeout by autotest system **********
            
            bogl Bob Glossman (Inactive) added a comment
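
When comparing the call traces in this ticket, it helps to reduce each one to its reliable frames. In Linux kernel stack dumps, a frame prefixed with `?` is an address found on the stack that the unwinder could not confirm as part of the call chain. The sketch below filters those out; `reliable_frames` and `TRACE` are hypothetical names, and the regular expression only targets the frame format seen in these logs.

```python
import re

# One frame looks like: [<addr>] [? ]symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

# A few frames copied from the trace in the description of this ticket.
TRACE = """\
 [<ffffffffa05b4e77>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0f30631>] osd_index_ea_delete+0x7b1/0xe10 [osd_ldiskfs]
 [<ffffffffa0999f90>] out_obj_index_delete+0x150/0x370 [ptlrpc]
 [<ffffffffa05c0c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffff8109e66e>] kthread+0x9e/0xc0
"""


def reliable_frames(trace_text):
    """Return (symbol, module) pairs for frames not marked with '?'.

    Frames without a trailing [module] tag are reported as "kernel".
    """
    frames = []
    for match in FRAME_RE.finditer(trace_text):
        unreliable, symbol, module = match.groups()
        if unreliable:
            continue
        frames.append((symbol, module or "kernel"))
    return frames


if __name__ == "__main__":
    for symbol, module in reliable_frames(TRACE):
        print(f"{module:12s} {symbol}")
```

Because the pattern is applied with `finditer` over the whole text, it also works on traces that have been flattened onto a single line.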

            People

              bzzz Alex Zhuravlev
              di.wang Di Wang