Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-818

possible circular locking dependency detected on MDS

    XMLWordPrintable

Details

    • Bug
    • Resolution: Won't Fix
    • Minor
    • None
    • Lustre 2.1.0
    • None
    • RHEL 6.2
      debug kernel 2.6.32-207.1chaos.ch5.x86_64.debug
    • 3
    • 6529

    Description

      The kernel lock validator produced the following warning on a Lustre 2.1 MDS.

      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-207.1chaos.ch5.x86_64.debug #1
      -------------------------------------------------------
      mdt_01/24742 is trying to acquire lock:
       (&sb->s_type->i_mutex_key#14){+.+.+.}, at: [<ffffffff811a2bf0>] do_filp_open+0x2c0/0xd90
      
      but task is already holding lock:
       (&loghandle->lgh_lock){+++++.}, at: [<ffffffffa04aaef2>] llog_cat_current_log.clone.0+0xa2/0x1000 [obdclass]
      
      which lock already depends on the new lock.
      
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&loghandle->lgh_lock){+++++.}:
             [<ffffffff810af29e>] __lock_acquire+0xeee/0x1570
             [<ffffffff810af9c4>] lock_acquire+0xa4/0x120
             [<ffffffff8109b74e>] down_read_nested+0x5e/0xb0
             [<ffffffffa04aae97>] llog_cat_current_log.clone.0+0x47/0x1000 [obdclass]
             [<ffffffffa04ad68a>] llog_cat_add_rec+0x5a/0x840 [obdclass]
             [<ffffffffa04b3fd6>] llog_obd_origin_add+0x56/0x190 [obdclass]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa07660cc>] lov_llog_origin_add+0xcc/0x5f0 [lov]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa09dfc39>] mds_llog_origin_add+0x109/0x310 [mds]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa09e02c6>] mds_llog_add_unlink+0x176/0x520 [mds]
             [<ffffffffa09e09a6>] mds_log_op_unlink+0x196/0x9a0 [mds]
             [<ffffffffa0a012b1>] mdd_unlink_log+0x51/0x110 [mdd]
             [<ffffffffa09f594e>] mdd_object_kill+0x14e/0x1b0 [mdd]
             [<ffffffffa0a0eece>] mdd_finish_unlink+0x20e/0x2c0 [mdd]
             [<ffffffffa0a15683>] mdd_unlink+0x853/0xb00 [mdd]
             [<ffffffffa0abcb46>] cml_unlink+0xb6/0x260 [cmm]
             [<ffffffffa0a5b799>] mdt_reint_unlink+0x689/0xa30 [mdt]
             [<ffffffffa0a5966f>] mdt_reint_rec+0x3f/0x100 [mdt]
             [<ffffffffa0a51b94>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
             [<ffffffffa0a51efc>] mdt_reint+0x4c/0x120 [mdt]
             [<ffffffffa0a459f5>] mdt_handle_common+0x8e5/0x1820 [mdt]
             [<ffffffffa0a46a05>] mdt_regular_handle+0x15/0x20 [mdt]
             [<ffffffffa05f3d9a>] ptlrpc_main+0xbea/0x1930 [ptlrpc]
             [<ffffffff8100c20a>] child_rip+0xa/0x20
      
      -> #2 (&mo->oo_sem#2/3){+++++.}:
             [<ffffffff810af29e>] __lock_acquire+0xeee/0x1570
             [<ffffffff810af9c4>] lock_acquire+0xa4/0x120
             [<ffffffff8109b6a5>] down_write_nested+0x65/0xb0
             [<ffffffffa0ad512b>] osd_object_write_lock+0x5b/0xd0 [osd_ldiskfs]
             [<ffffffffa0a23fb6>] mdd_write_lock+0x26/0x30 [mdd]
             [<ffffffffa0a16127>] mdd_create+0x7f7/0x1db0 [mdd]
             [<ffffffffa0510880>] llo_store_create_index+0x180/0x220 [obdclass]
             [<ffffffffa0addf8f>] osd_oi_init+0x1bf/0x2d0 [osd_ldiskfs]
             [<ffffffffa0ad779e>] osd_prepare+0x7e/0x310 [osd_ldiskfs]
             [<ffffffffa0a1da2a>] mdd_prepare+0x8a/0x790 [mdd]
             [<ffffffffa0ab607d>] cmm_prepare+0x4d/0x110 [cmm]
             [<ffffffffa0a4f0e0>] mdt_device_alloc+0xf40/0x2510 [mdt]
             [<ffffffffa04d599f>] obd_setup+0x1ff/0x330 [obdclass]
             [<ffffffffa04d5cd8>] class_setup+0x208/0xa50 [obdclass]
             [<ffffffffa04dd80c>] class_process_config+0xd6c/0x1fd0 [obdclass]
             [<ffffffffa04dfb28>] class_config_llog_handler+0x938/0x1680 [obdclass]
             [<ffffffffa04a939b>] llog_process_thread+0x94b/0xdf0 [obdclass]
             [<ffffffff8100c20a>] child_rip+0xa/0x20
      
      -> #1 (jbd2_handle){+.+.+.}:
             [<ffffffff810af29e>] __lock_acquire+0xeee/0x1570
             [<ffffffff810af9c4>] lock_acquire+0xa4/0x120
             [<ffffffffa009737c>] start_this_handle+0x45c/0x580 [jbd2]
             [<ffffffffa00976bf>] jbd2_journal_start+0xdf/0x120 [jbd2]
             [<ffffffffa0994318>] ldiskfs_journal_start_sb+0x58/0x90 [ldiskfs]
             [<ffffffffa0985122>] ldiskfs_create+0x62/0x180 [ldiskfs]
             [<ffffffff8119fce4>] vfs_create+0xb4/0xe0
             [<ffffffff811a345f>] do_filp_open+0xb2f/0xd90
             [<ffffffff8118f569>] do_sys_open+0x69/0x140
             [<ffffffff8118f680>] sys_open+0x20/0x30
             [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
      
      -> #0 (&sb->s_type->i_mutex_key#14){+.+.+.}:
             [<ffffffff810af89e>] __lock_acquire+0x14ee/0x1570
             [<ffffffff810af9c4>] lock_acquire+0xa4/0x120
             [<ffffffff8151dbe3>] __mutex_lock_common+0x43/0x400
             [<ffffffff8151e0a8>] mutex_lock_nested+0x48/0x60
             [<ffffffff811a2bf0>] do_filp_open+0x2c0/0xd90
             [<ffffffff811a36dd>] filp_open+0x1d/0x20
             [<ffffffffa09dd420>] mds_obd_create+0x1e0/0xd80 [mds]
             [<ffffffffa04b1a64>] llog_lvfs_create+0x6d4/0x1310 [obdclass]
             [<ffffffffa04ab16a>] llog_cat_current_log.clone.0+0x31a/0x1000 [obdclass]
             [<ffffffffa04ad68a>] llog_cat_add_rec+0x5a/0x840 [obdclass]
             [<ffffffffa04b3fd6>] llog_obd_origin_add+0x56/0x190 [obdclass]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa07660cc>] lov_llog_origin_add+0xcc/0x5f0 [lov]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa09dfc39>] mds_llog_origin_add+0x109/0x310 [mds]
             [<ffffffffa04b41b1>] llog_add+0xa1/0x3c0 [obdclass]
             [<ffffffffa09e02c6>] mds_llog_add_unlink+0x176/0x520 [mds]
             [<ffffffffa09e09a6>] mds_log_op_unlink+0x196/0x9a0 [mds]
             [<ffffffffa0a012b1>] mdd_unlink_log+0x51/0x110 [mdd]
             [<ffffffffa09f594e>] mdd_object_kill+0x14e/0x1b0 [mdd]
             [<ffffffffa0a0eece>] mdd_finish_unlink+0x20e/0x2c0 [mdd]
             [<ffffffffa0a15683>] mdd_unlink+0x853/0xb00 [mdd]
             [<ffffffffa0abcb46>] cml_unlink+0xb6/0x260 [cmm]
             [<ffffffffa0a5b799>] mdt_reint_unlink+0x689/0xa30 [mdt]
             [<ffffffffa0a5966f>] mdt_reint_rec+0x3f/0x100 [mdt]
             [<ffffffffa0a51b94>] mdt_reint_internal+0x6d4/0x9f0 [mdt]
             [<ffffffffa0a51efc>] mdt_reint+0x4c/0x120 [mdt]
             [<ffffffffa0a459f5>] mdt_handle_common+0x8e5/0x1820 [mdt]
             [<ffffffffa0a46a05>] mdt_regular_handle+0x15/0x20 [mdt]
             [<ffffffffa05f3d9a>] ptlrpc_main+0xbea/0x1930 [ptlrpc]
             [<ffffffff8100c20a>] child_rip+0xa/0x20
      
      other info that might help us debug this:
      
      3 locks held by mdt_01/24742:
       #0:  (jbd2_handle){+.+.+.}, at: [<ffffffffa0097325>] start_this_handle+0x405/0x580 [jbd2]
       #1:  (&mo->oo_sem#2/3){+++++.}, at: [<ffffffffa0ad512b>] osd_object_write_lock+0x5b/0xd0 [osd_ldiskfs]
       #2:  (&loghandle->lgh_lock){+++++.}, at: [<ffffffffa04aaef2>] llog_cat_current_log.clone.0+0xa2/0x1000 [obdclass]
      
      stack backtrace:
      Pid: 24742, comm: mdt_01 Tainted: G        W  ----------------   2.6.32-207.1chaos.ch5.x86_64.debug #1
      Call Trace:
       [<ffffffff810aca93>] ? print_circular_bug+0xf3/0x100
       [<ffffffff810af89e>] ? __lock_acquire+0x14ee/0x1570
       [<ffffffff811aa760>] ? __d_lookup+0x0/0x180
       [<ffffffff810af9c4>] ? lock_acquire+0xa4/0x120
       [<ffffffff811a2bf0>] ? do_filp_open+0x2c0/0xd90
       [<ffffffff8151dbe3>] ? __mutex_lock_common+0x43/0x400
       [<ffffffff811a2bf0>] ? do_filp_open+0x2c0/0xd90
       [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
       [<ffffffff811a2bf0>] ? do_filp_open+0x2c0/0xd90
       [<ffffffff810ab351>] ? lockdep_init_map+0x51/0x130
       [<ffffffff8151e0a8>] ? mutex_lock_nested+0x48/0x60
       [<ffffffff811a2bf0>] ? do_filp_open+0x2c0/0xd90
       [<ffffffff8103758c>] ? kvm_clock_read+0x1c/0x20
       [<ffffffff81012c29>] ? sched_clock+0x9/0x10
       [<ffffffff811b1bc9>] ? mntput_no_expire+0x29/0x110
       [<ffffffff81297fa0>] ? sprintf+0x40/0x50
       [<ffffffff811a36dd>] ? filp_open+0x1d/0x20
       [<ffffffffa09dd420>] ? mds_obd_create+0x1e0/0xd80 [mds]
       [<ffffffff810aa27d>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff8109d4af>] ? cpu_clock+0x6f/0x80
       [<ffffffff810ad39d>] ? lock_release_holdtime+0x3d/0x190
       [<ffffffff81175df3>] ? cache_alloc_debugcheck_after+0xf3/0x230
       [<ffffffffa04b1a64>] ? llog_lvfs_create+0x6d4/0x1310 [obdclass]
       [<ffffffffa04aaef2>] ? llog_cat_current_log.clone.0+0xa2/0x1000 [obdclass]
       [<ffffffffa04aaef2>] ? llog_cat_current_log.clone.0+0xa2/0x1000 [obdclass]
       [<ffffffffa04ab16a>] ? llog_cat_current_log.clone.0+0x31a/0x1000 [obdclass]
       [<ffffffff8103758c>] ? kvm_clock_read+0x1c/0x20
       [<ffffffff81012c29>] ? sched_clock+0x9/0x10
       [<ffffffff8109d245>] ? sched_clock_local+0x25/0x90
       [<ffffffffa04ad68a>] ? llog_cat_add_rec+0x5a/0x840 [obdclass]
       [<ffffffffa076625a>] ? lov_llog_origin_add+0x25a/0x5f0 [lov]
       [<ffffffffa04b3fd6>] ? llog_obd_origin_add+0x56/0x190 [obdclass]
       [<ffffffff8109d368>] ? sched_clock_cpu+0xb8/0x110
       [<ffffffff810aa27d>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff8109d4af>] ? cpu_clock+0x6f/0x80
       [<ffffffffa04b41b1>] ? llog_add+0xa1/0x3c0 [obdclass]
       [<ffffffffa07660cc>] ? lov_llog_origin_add+0xcc/0x5f0 [lov]
       [<ffffffff8103758c>] ? kvm_clock_read+0x1c/0x20
       [<ffffffff81012c29>] ? sched_clock+0x9/0x10
       [<ffffffff8109d245>] ? sched_clock_local+0x25/0x90
       [<ffffffffa04b41b1>] ? llog_add+0xa1/0x3c0 [obdclass]
       [<ffffffffa09dfc39>] ? mds_llog_origin_add+0x109/0x310 [mds]
       [<ffffffff8109d368>] ? sched_clock_cpu+0xb8/0x110
       [<ffffffffa04b41b1>] ? llog_add+0xa1/0x3c0 [obdclass]
       [<ffffffffa09e02c6>] ? mds_llog_add_unlink+0x176/0x520 [mds]
       [<ffffffffa09e09a6>] ? mds_log_op_unlink+0x196/0x9a0 [mds]
       [<ffffffffa0a012b1>] ? mdd_unlink_log+0x51/0x110 [mdd]
       [<ffffffffa09fadfb>] ? mdd_attr_get_internal+0x7ab/0xb10 [mdd]
       [<ffffffffa09f594e>] ? mdd_object_kill+0x14e/0x1b0 [mdd]
       [<ffffffffa0a0eece>] ? mdd_finish_unlink+0x20e/0x2c0 [mdd]
       [<ffffffffa09f7259>] ? mdd_attr_check_set_internal_locked+0x69/0x180 [mdd]
       [<ffffffffa0a15683>] ? mdd_unlink+0x853/0xb00 [mdd]
       [<ffffffffa0a12196>] ? mdd_lookup+0x56/0x120 [mdd]
       [<ffffffffa0ad26c6>] ? osd_object_version_get+0x16/0xc0 [osd_ldiskfs]
       [<ffffffffa05bce20>] ? ldlm_completion_ast+0x0/0x740 [ptlrpc]
       [<ffffffffa0a3ccc0>] ? mdt_blocking_ast+0x0/0x320 [mdt]
       [<ffffffffa0abcb46>] ? cml_unlink+0xb6/0x260 [cmm]
       [<ffffffffa0a5a406>] ? mdt_version_save+0x96/0x170 [mdt]
       [<ffffffffa0a5b799>] ? mdt_reint_unlink+0x689/0xa30 [mdt]
       [<ffffffffa0a5882f>] ? mdt_unlink_unpack+0x41f/0x5a0 [mdt]
       [<ffffffffa0a5966f>] ? mdt_reint_rec+0x3f/0x100 [mdt]
       [<ffffffffa05e5b64>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
       [<ffffffffa0a51b94>] ? mdt_reint_internal+0x6d4/0x9f0 [mdt]
       [<ffffffffa0a47806>] ? mdt_reint_opcode+0x96/0x160 [mdt]
       [<ffffffffa0a51efc>] ? mdt_reint+0x4c/0x120 [mdt]
       [<ffffffffa0a459f5>] ? mdt_handle_common+0x8e5/0x1820 [mdt]
       [<ffffffffa05e32c4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
       [<ffffffffa0a46a05>] ? mdt_regular_handle+0x15/0x20 [mdt]
       [<ffffffffa05f3d9a>] ? ptlrpc_main+0xbea/0x1930 [ptlrpc]
       [<ffffffff810adc9d>] ? trace_hardirqs_on_caller+0x14d/0x190
       [<ffffffffa05f31b0>] ? ptlrpc_main+0x0/0x1930 [ptlrpc]
       [<ffffffff8100c20a>] ? child_rip+0xa/0x20
       [<ffffffff8100bb50>] ? restore_args+0x0/0x30
       [<ffffffffa05f31b0>] ? ptlrpc_main+0x0/0x1930 [ptlrpc]
       [<ffffffff8100c200>] ? child_rip+0x0/0x20
      

      Attachments

        Activity

          People

            green Oleg Drokin
            nedbass Ned Bass
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: