Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6075

Interop 2.6.0<->2.7 sanity-scrub test_7: D process in MDS

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.7.0
    • Lustre 2.7.0
    • None
    • server: lustre-master build #2770
      client: 2.6.0
    • 3
    • 16902

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a1aa0448-7e7b-11e4-8b8b-5254006e85c2.

      The sub-test test_7 failed with the following error:

      test failed to respond and timed out
      
      01:03:59:LustreError: 17421:0:(qmt_handler.c:427:qmt_dqacq0()) $$$ Release too much! uuid:lustre-MDT0000-lwp-OST0004_UUID release:262188 granted:0, total:0 qmt:lustre-QMT0000 pool:0-dt id:500 enforced:1 hard:394749 soft:375952 granted:0 time:0 qunit:16384 edquot:0 may_rel:0 revoke:0
      01:03:59:LustreError: 17421:0:(qmt_handler.c:427:qmt_dqacq0()) Skipped 1 previous similar message
      01:03:59:Lustre: *** cfs_fail_loc=191, val=0***
      01:03:59:Lustre: lustre-MDT0000: trigger OI scrub by RPC for [0x200000bd0:0x100:0x0], rc = 0 [2]
      01:03:59:INFO: task jbd2/dm-0-8:17400 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:jbd2/dm-0-8   D 0000000000000001     0 17400      2 0x00000080
      01:03:59: ffff88005b709d20 0000000000000046 ffff880065334090 ffff880065334080
      01:03:59: ffff8800606a7680 ffff8800023168e8 0000000000017997 ffff88006d06d540
      01:03:59: ffff88006d06daf8 ffff88005b709fd8 000000000000fbc8 ffff88006d06daf8
      01:03:59:Call Trace:
      01:03:59: [<ffffffff8109b2ce>] ? prepare_to_wait+0x4e/0x80
      01:03:59: [<ffffffffa03df80f>] jbd2_journal_commit_transaction+0x19f/0x1500 [jbd2]
      01:03:59: [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
      01:03:59: [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03e5a58>] kjournald2+0xb8/0x220 [jbd2]
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03e59a0>] ? kjournald2+0x0/0x220 [jbd2]
      01:03:59: [<ffffffff8109abf6>] kthread+0x96/0xa0
      01:03:59: [<ffffffff8100c20a>] child_rip+0xa/0x20
      01:03:59: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      01:03:59: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      01:03:59:INFO: task osp-syn-0-0:17440 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:osp-syn-0-0   D 0000000000000000     0 17440      2 0x00000080
      01:03:59: ffff8800605c1700 0000000000000046 0000000000000000 ffffffff8128d756
      01:03:59: 0000000000000008 ffff88006a811a80 ffff8800605c2f68 ffff88005e305600
      01:03:59: ffff88006aeb9ab8 ffff8800605c1fd8 000000000000fbc8 ffff88006aeb9ab8
      01:03:59:Call Trace:
      01:03:59: [<ffffffff8128d756>] ? vsnprintf+0x336/0x5e0
      01:03:59: [<ffffffffa03de08a>] start_this_handle+0x25a/0x480 [jbd2]
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03de495>] jbd2_journal_start+0xb5/0x100 [jbd2]
      01:03:59: [<ffffffffa05b55d4>] ? llog_osd_declare_write_rec+0xd4/0x3f0 [obdclass]
      01:03:59: [<ffffffffa04367b6>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
      01:03:59: [<ffffffffa0d0c7cf>] osd_trans_start+0x1df/0x660 [osd_ldiskfs]
      01:03:59: [<ffffffffa05a4b7c>] llog_write+0x22c/0x420 [obdclass]
      01:03:59: [<ffffffffa05a4e3f>] llog_cancel_rec+0xaf/0x690 [obdclass]
      01:03:59: [<ffffffffa05aadf3>] llog_cat_cancel_records+0x133/0x2f0 [obdclass]
      01:03:59: [<ffffffffa10b13f2>] osp_sync_process_committed+0x2b2/0x820 [osp]
      01:03:59: [<ffffffffa10b19d3>] osp_sync_process_queues+0x73/0x700 [osp]
      01:03:59: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05aab28>] llog_cat_process_cb+0x448/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa05aa6e0>] ? llog_cat_process_cb+0x0/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05a8dfd>] llog_cat_process_or_fork+0x1ad/0x300 [obdclass]
      01:03:59: [<ffffffff81061d12>] ? default_wake_function+0x12/0x20
      01:03:59: [<ffffffff810546b9>] ? __wake_up_common+0x59/0x90
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a8f69>] llog_cat_process+0x19/0x20 [obdclass]
      01:03:59: [<ffffffffa10b2673>] osp_sync_thread+0x243/0x7d0 [osp]
      01:03:59: [<ffffffff815296ee>] ? thread_return+0x4e/0x770
      01:03:59: [<ffffffffa10b2430>] ? osp_sync_thread+0x0/0x7d0 [osp]
      01:03:59: [<ffffffff8109abf6>] kthread+0x96/0xa0
      01:03:59: [<ffffffff8100c20a>] child_rip+0xa/0x20
      01:03:59: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      01:03:59: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      01:03:59:INFO: task osp-syn-1-0:17442 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:osp-syn-1-0   D 0000000000000000     0 17442      2 0x00000080
      01:03:59: ffff880057c99700 0000000000000046 0000000000000000 ffffffff8128d756
      01:03:59: 0000000000000008 ffff88006a811b40 ffff88005d7d0f68 ffff88005e305600
      01:03:59: ffff88006cf95098 ffff880057c99fd8 000000000000fbc8 ffff88006cf95098
      01:03:59:Call Trace:
      01:03:59: [<ffffffff8128d756>] ? vsnprintf+0x336/0x5e0
      01:03:59: [<ffffffffa03de08a>] start_this_handle+0x25a/0x480 [jbd2]
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03de495>] jbd2_journal_start+0xb5/0x100 [jbd2]
      01:03:59: [<ffffffffa05b55d4>] ? llog_osd_declare_write_rec+0xd4/0x3f0 [obdclass]
      01:03:59: [<ffffffffa04367b6>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
      01:03:59: [<ffffffffa0d0c7cf>] osd_trans_start+0x1df/0x660 [osd_ldiskfs]
      01:03:59: [<ffffffffa05a4b7c>] llog_write+0x22c/0x420 [obdclass]
      01:03:59: [<ffffffffa05a4e3f>] llog_cancel_rec+0xaf/0x690 [obdclass]
      01:03:59: [<ffffffffa05aadf3>] llog_cat_cancel_records+0x133/0x2f0 [obdclass]
      01:03:59: [<ffffffffa10b13f2>] osp_sync_process_committed+0x2b2/0x820 [osp]
      01:03:59: [<ffffffffa10b19d3>] osp_sync_process_queues+0x73/0x700 [osp]
      01:03:59: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05aab28>] llog_cat_process_cb+0x448/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa05aa6e0>] ? llog_cat_process_cb+0x0/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05a8dfd>] llog_cat_process_or_fork+0x1ad/0x300 [obdclass]
      01:03:59: [<ffffffff81061d12>] ? default_wake_function+0x12/0x20
      01:03:59: [<ffffffff810546b9>] ? __wake_up_common+0x59/0x90
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a8f69>] llog_cat_process+0x19/0x20 [obdclass]
      01:03:59: [<ffffffffa10b2673>] osp_sync_thread+0x243/0x7d0 [osp]
      01:03:59: [<ffffffff815296ee>] ? thread_return+0x4e/0x770
      01:03:59: [<ffffffffa10b2430>] ? osp_sync_thread+0x0/0x7d0 [osp]
      01:03:59: [<ffffffff8109abf6>] kthread+0x96/0xa0
      01:03:59: [<ffffffff8100c20a>] child_rip+0xa/0x20
      01:03:59: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      01:03:59: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      01:03:59:INFO: task osp-syn-2-0:17444 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:osp-syn-2-0   D 0000000000000001     0 17444      2 0x00000080
      01:03:59: ffff880079239700 0000000000000046 0000000000000000 ffffffff8128d756
      01:03:59: 0000000000000008 ffff88006ca29d80 ffff88007923cf68 ffff88005e305600
      01:03:59: ffff880057dbd098 ffff880079239fd8 000000000000fbc8 ffff880057dbd098
      01:03:59:Call Trace:
      01:03:59: [<ffffffff8128d756>] ? vsnprintf+0x336/0x5e0
      01:03:59: [<ffffffffa03de08a>] start_this_handle+0x25a/0x480 [jbd2]
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03de495>] jbd2_journal_start+0xb5/0x100 [jbd2]
      01:03:59: [<ffffffffa05b55d4>] ? llog_osd_declare_write_rec+0xd4/0x3f0 [obdclass]
      01:03:59: [<ffffffffa04367b6>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
      01:03:59: [<ffffffffa0d0c7cf>] osd_trans_start+0x1df/0x660 [osd_ldiskfs]
      01:03:59: [<ffffffffa05a4b7c>] llog_write+0x22c/0x420 [obdclass]
      01:03:59: [<ffffffffa05a4e3f>] llog_cancel_rec+0xaf/0x690 [obdclass]
      01:03:59: [<ffffffffa05aadf3>] llog_cat_cancel_records+0x133/0x2f0 [obdclass]
      01:03:59: [<ffffffffa10b13f2>] osp_sync_process_committed+0x2b2/0x820 [osp]
      01:03:59: [<ffffffffa10b19d3>] osp_sync_process_queues+0x73/0x700 [osp]
      01:03:59: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffff81060aa3>] ? perf_event_task_sched_out+0x33/0x70
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05aab28>] llog_cat_process_cb+0x448/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa05aa6e0>] ? llog_cat_process_cb+0x0/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05a8dfd>] llog_cat_process_or_fork+0x1ad/0x300 [obdclass]
      01:03:59: [<ffffffff81061d12>] ? default_wake_function+0x12/0x20
      01:03:59: [<ffffffff810546b9>] ? __wake_up_common+0x59/0x90
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a8f69>] llog_cat_process+0x19/0x20 [obdclass]
      01:03:59: [<ffffffffa10b2673>] osp_sync_thread+0x243/0x7d0 [osp]
      01:03:59: [<ffffffff815296ee>] ? thread_return+0x4e/0x770
      01:03:59: [<ffffffffa10b2430>] ? osp_sync_thread+0x0/0x7d0 [osp]
      01:03:59: [<ffffffff8109abf6>] kthread+0x96/0xa0
      01:03:59: [<ffffffff8100c20a>] child_rip+0xa/0x20
      01:03:59: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      01:03:59: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      01:03:59:INFO: task osp-syn-3-0:17446 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:osp-syn-3-0   D 0000000000000001     0 17446      2 0x00000080
      01:03:59: ffff8800581b5700 0000000000000046 0000000000000000 ffffffff8128d756
      01:03:59: 0000000000000008 ffff880065334680 ffff8800581b6f68 ffff88005e305600
      01:03:59: ffff880066fde5f8 ffff8800581b5fd8 000000000000fbc8 ffff880066fde5f8
      01:03:59:Call Trace:
      01:03:59: [<ffffffff8128d756>] ? vsnprintf+0x336/0x5e0
      01:03:59: [<ffffffffa03de08a>] start_this_handle+0x25a/0x480 [jbd2]
      01:03:59: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
      01:03:59: [<ffffffffa03de495>] jbd2_journal_start+0xb5/0x100 [jbd2]
      01:03:59: [<ffffffffa05b55d4>] ? llog_osd_declare_write_rec+0xd4/0x3f0 [obdclass]
      01:03:59: [<ffffffffa04367b6>] ldiskfs_journal_start_sb+0x56/0xe0 [ldiskfs]
      01:03:59: [<ffffffffa0d0c7cf>] osd_trans_start+0x1df/0x660 [osd_ldiskfs]
      01:03:59: [<ffffffffa05a4b7c>] llog_write+0x22c/0x420 [obdclass]
      01:03:59: [<ffffffffa05a4e3f>] llog_cancel_rec+0xaf/0x690 [obdclass]
      01:03:59: [<ffffffffa05aadf3>] llog_cat_cancel_records+0x133/0x2f0 [obdclass]
      01:03:59: [<ffffffffa10b13f2>] osp_sync_process_committed+0x2b2/0x820 [osp]
      01:03:59: [<ffffffffa10b19d3>] osp_sync_process_queues+0x73/0x700 [osp]
      01:03:59: [<ffffffff81061d00>] ? default_wake_function+0x0/0x20
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05aab28>] llog_cat_process_cb+0x448/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a5ec9>] llog_process_thread+0xaa9/0xe80 [obdclass]
      01:03:59: [<ffffffffa05aa6e0>] ? llog_cat_process_cb+0x0/0x5e0 [obdclass]
      01:03:59: [<ffffffffa05a7abf>] llog_process_or_fork+0x13f/0x540 [obdclass]
      01:03:59: [<ffffffffa05a8dfd>] llog_cat_process_or_fork+0x1ad/0x300 [obdclass]
      01:03:59: [<ffffffff81061d12>] ? default_wake_function+0x12/0x20
      01:03:59: [<ffffffff810546b9>] ? __wake_up_common+0x59/0x90
      01:03:59: [<ffffffffa10b1960>] ? osp_sync_process_queues+0x0/0x700 [osp]
      01:03:59: [<ffffffffa05a8f69>] llog_cat_process+0x19/0x20 [obdclass]
      01:03:59: [<ffffffffa10b2673>] osp_sync_thread+0x243/0x7d0 [osp]
      01:03:59: [<ffffffff815296ee>] ? thread_return+0x4e/0x770
      01:03:59: [<ffffffffa10b2430>] ? osp_sync_thread+0x0/0x7d0 [osp]
      01:03:59: [<ffffffff8109abf6>] kthread+0x96/0xa0
      01:03:59: [<ffffffff8100c20a>] child_rip+0xa/0x20
      01:03:59: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
      01:03:59: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      01:03:59:INFO: task mdt00_004:18960 blocked for more than 120 seconds.
      01:03:59:      Not tainted 2.6.32-431.29.2.el6_lustre.x86_64 #1
      01:03:59:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      01:03:59:mdt00_004     D 0000000000000001     0 18960      2 0x00000080
      01:03:59: ffff88007942f7b0 0000000000000046 0000000000000000 ffffffff8116f933
      01:03:59: ffff88007942f760 ffffffff8116f933 ffff88007942f7d0 ffffffffa0cb5d0f
      01:03:59: ffff88006c955ab8 ffff88007942ffd8 000000000000fbc8 ffff88006c955ab8
      

      Attachments

        Activity

          People

            wc-triage WC Triage
            maloo Maloo
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: