Description
While verifying patch http://review.whamcloud.com/13433 on the Lustre b2_5 branch in DNE mode, replay-single test 81h hung as follows:
== replay-single test 81h: DNE: unlink remote dir, drop request reply, fail 2 MDTs == 04:25:26 (1421555126)
CMD: shadow-16vm8 lctl set_param fail_loc=0x119
fail_loc=0x119
Failing mds1 on shadow-16vm12
CMD: shadow-16vm12 grep -c /mnt/mds1' ' /proc/mounts
Stopping /mnt/mds1 (opts:) on shadow-16vm12
CMD: shadow-16vm12 umount -d /mnt/mds1
CMD: shadow-16vm12 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Failing mds2 on shadow-16vm8
CMD: shadow-16vm8 grep -c /mnt/mds2' ' /proc/mounts
Stopping /mnt/mds2 (opts:) on shadow-16vm8
CMD: shadow-16vm8 umount -d /mnt/mds2
Dmesg on MDS 2, MDS 3, MDS 4 (shadow-16vm8):
Lustre: DEBUG MARKER: == replay-single test 81h: DNE: unlink remote dir, drop request reply, fail 2 MDTs == 04:25:26 (1421555126)
Lustre: DEBUG MARKER: lctl set_param fail_loc=0x119
Lustre: DEBUG MARKER: grep -c /mnt/mds2' ' /proc/mounts
Lustre: DEBUG MARKER: umount -d /mnt/mds2
INFO: task jbd2/dm-1-8:2042 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.g36cd22b.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-1-8 D 0000000000000001 0 2042 2 0x00000080
 ffff880061e1bd20 0000000000000046 0000000000000000 ffffffff8109afb6
 ffff880079f5d5c0 ffff8800023168e8 0000000000000bd6 ffff8800796b8040
 ffff8800796b85f8 ffff880061e1bfd8 000000000000fbc8 ffff8800796b85f8
Call Trace:
 [<ffffffff8109afb6>] ? autoremove_wake_function+0x16/0x40
 [<ffffffffa03df91f>] jbd2_journal_commit_transaction+0x19f/0x15a0 [jbd2]
 [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
 [<ffffffff81083e1c>] ? lock_timer_base+0x3c/0x70
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa03e5c18>] kjournald2+0xb8/0x220 [jbd2]
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa03e5b60>] ? kjournald2+0x0/0x220 [jbd2]
 [<ffffffff8109abf6>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109ab60>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
INFO: task umount:2642 blocked for more than 120 seconds.
Not tainted 2.6.32-431.29.2.el6_lustre.g36cd22b.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount D 0000000000000000 0 2642 2641 0x00000080
 ffff88006a39f9c8 0000000000000086 0000000000000000 0000000000000018
 ffff88006a39fa98 ffffffffa04946c3 0000000054bb35bf ffff88006a39f9c8
 ffff88007adc3058 ffff88006a39ffd8 000000000000fbc8 ffff88007adc3058
Call Trace:
 [<ffffffffa04946c3>] ? libcfs_debug_vmsg2+0x5d3/0xbd0 [libcfs]
 [<ffffffffa03de18a>] start_this_handle+0x27a/0x4a0 [jbd2]
 [<ffffffff8116eeeb>] ? cache_alloc_refill+0x15b/0x240
 [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa03de5b0>] jbd2_journal_start+0xd0/0x110 [jbd2]
 [<ffffffffa0494d01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa03de603>] jbd2_journal_force_commit+0x13/0x30 [jbd2]
 [<ffffffffa0435327>] ldiskfs_force_commit+0x27/0x40 [ldiskfs]
 [<ffffffffa0c7fd75>] osd_sync+0xf5/0x100 [osd_ldiskfs]
 [<ffffffffa0d2abe5>] mdt_device_sync+0x35/0xd0 [mdt]
 [<ffffffffa0d39ce7>] mdt_iocontrol+0x217/0x870 [mdt]
 [<ffffffffa05e3af6>] class_cleanup+0x836/0xd30 [obdclass]
 [<ffffffffa0494d01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa05b9096>] ? class_name2dev+0x56/0xe0 [obdclass]
 [<ffffffffa05e555a>] class_process_config+0x156a/0x1ad0 [obdclass]
 [<ffffffffa05de6b3>] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
 [<ffffffffa05e5c39>] class_manual_cleanup+0x179/0x6f0 [obdclass]
 [<ffffffffa05b9096>] ? class_name2dev+0x56/0xe0 [obdclass]
 [<ffffffffa062137c>] server_put_super+0x5ec/0xf60 [obdclass]
 [<ffffffff8118b63b>] generic_shutdown_super+0x5b/0xe0
 [<ffffffff8118b726>] kill_anon_super+0x16/0x60
 [<ffffffffa05e7ae6>] lustre_kill_super+0x36/0x60 [obdclass]
 [<ffffffff8118bec7>] deactivate_super+0x57/0x80
 [<ffffffff811ab8cf>] mntput_no_expire+0xbf/0x110
 [<ffffffff811ac41b>] sys_umount+0x7b/0x3a0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Maloo report: https://testing.hpdd.intel.com/test_sets/85944252-9f03-11e4-91b3-5254006e85c2