Lustre / LU-5270

Deadlock of mmap_sem when using jobstats with customized ID

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.6.0, Lustre 2.5.3
    • Lustre 2.6.0
    • b_ieel2_0
    • 3
    • 14706

    Description

      Some processes hang while trying to acquire the mmap_sem lock. The stack dumps follow. We are using jobstats with a customized job ID, which is why cfs_get_environ() is called.

      Jun 21 05:37:57 ca1020 kernel: INFO: task cmahostd:3138 blocked for more than 120 seconds.
      Jun 21 05:37:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:37:57 ca1020 kernel: cmahostd D 000000000000000b 0 3138 1 0x00000080
      Jun 21 05:37:57 ca1020 kernel: ffff881032a5fc50 0000000000000086 ffff881065665ea1 ffff881d48dab00b
      Jun 21 05:37:57 ca1020 kernel: ffff882069662200 ffff881032a5fbd8 ffffffff8119af4a ffff881032a5fd18
      Jun 21 05:37:57 ca1020 kernel: ffff881032b61ab8 ffff881032a5ffd8 000000000000fb88 ffff881032b61ab8
      Jun 21 05:37:57 ca1020 kernel: Call Trace:
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8119af4a>] ? dput+0x9a/0x150
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8118f2b5>] ? path_to_nameidata+0x25/0x60
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81144901>] __access_remote_vm+0x41/0x1f0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81148842>] ? vma_merge+0x1d2/0x3e0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81148d1b>] ? __vm_enough_memory+0x3b/0x190
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81144b0b>] access_process_vm+0x5b/0x80
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff811eb23d>] proc_pid_cmdline+0x6d/0x120
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff811ebfbd>] proc_info_read+0xad/0xf0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81181801>] sys_read+0x51/0x90
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:37:57 ca1020 kernel: INFO: task sge_execd:17808 blocked for more than 120 seconds.
      Jun 21 05:37:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:37:57 ca1020 kernel: sge_execd D 0000000000000007 0 17808 1 0x00000084
      Jun 21 05:37:57 ca1020 kernel: ffff88206006fe08 0000000000000082 0000000000000000 ffff88185f422cf8
      Jun 21 05:37:57 ca1020 kernel: ffff881065a56c00 ffffffff8121cc5f ffff88206006fd98 ffff880d4fec2b40
      Jun 21 05:37:57 ca1020 kernel: ffff8820661f6638 ffff88206006ffd8 000000000000fb88 ffff8820661f6638
      Jun 21 05:37:57 ca1020 kernel: Call Trace:
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8121cc5f>] ? security_inode_permission+0x1f/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fda3>] rwsem_down_write_failed+0x23/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff812833e3>] call_rwsem_down_write_failed+0x13/0x20
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150f2a2>] ? down_write+0x32/0x40
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81139b7c>] sys_mmap_pgoff+0x5c/0x2d0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81010519>] sys_mmap+0x29/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:37:57 ca1020 kernel: INFO: task sge_execd:17811 blocked for more than 120 seconds.
      Jun 21 05:37:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:37:57 ca1020 kernel: sge_execd D 0000000000000008 0 17811 1 0x00000080
      Jun 21 05:37:57 ca1020 kernel: ffff88104a101230 0000000000000082 ffffffff811666bc 0000000000000282
      Jun 21 05:37:57 ca1020 kernel: ffff88107fcb02c0 ffff881069139a00 ffff88107fc214c0 ffff88104a101218
      Jun 21 05:37:57 ca1020 kernel: ffff8810308fd098 ffff88104a101fd8 000000000000fb88 ffff8810308fd098
      Jun 21 05:37:57 ca1020 kernel: Call Trace:
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff811666bc>] ? transfer_objects+0x5c/0x80
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff810573a5>] ? select_idle_sibling+0x95/0x150
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa058d0b7>] cfs_get_environ+0x257/0x6c0 [libcfs]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08bfaa5>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa069396f>] lustre_get_jobid+0x10f/0x380 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08c0646>] lustre_msg_set_jobid+0xb6/0x140 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08e08ec>] ptlrpcd_add_req+0x3c/0x2f0 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06e9d21>] ? cl_req_attr_set+0xd1/0x230 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08bf63c>] ? lustre_msg_get_opc+0x9c/0x110 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0ac37a2>] osc_build_rpc+0xd62/0x1810 [osc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0ade4b7>] osc_io_unplug0+0x1257/0x1f00 [osc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0599bb2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08e8644>] ? get_my_ctx+0x64/0x100 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa08effbd>] ? sptlrpc_import_check_ctx+0x18d/0x320 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0892336>] ? ldlm_resource_putref+0x66/0x280 [ptlrpc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0ae0ee1>] osc_io_unplug+0x11/0x20 [osc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0ae10bd>] osc_queue_sync_pages+0x1cd/0x350 [osc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0ad25c7>] osc_io_submit+0x1c7/0x4e0 [osc]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06e955e>] cl_io_submit_rw+0x6e/0x160 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0b65230>] lov_io_submit+0x2d0/0x4b0 [lov]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06ebc7d>] ? cl_page_list_add+0x5d/0x190 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06e955e>] cl_io_submit_rw+0x6e/0x160 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06ebc10>] cl_io_read_page+0x180/0x190 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0c08f01>] ll_readpage+0x91/0x1a0 [lustre]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8111b223>] filemap_fault+0x313/0x500
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0c38bd4>] vvp_io_fault_start+0x424/0xc50 [lustre]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06e7645>] ? cl_wait+0xb5/0x250 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06e97ea>] cl_io_start+0x6a/0x140 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa06ed974>] cl_io_loop+0xb4/0x1b0 [obdclass]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffffa0c19002>] ll_fault+0x2c2/0x4d0 [lustre]
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff811430b4>] __do_fault+0x54/0x530
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81143687>] handle_pte_fault+0xf7/0xb50
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8105230c>] ? check_preempt_curr+0x7c/0x90
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81434551>] ? sock_aio_read+0x1a1/0x1b0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8114431a>] handle_mm_fault+0x23a/0x310
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff810474c9>] __do_page_fault+0x139/0x480
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff81010bde>] ? copy_user_generic+0xe/0x20
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff8151311e>] do_page_fault+0x3e/0xa0
      Jun 21 05:37:57 ca1020 kernel: [<ffffffff815104d5>] page_fault+0x25/0x30
      Jun 21 05:38:04 ca1020 ntpd[2512]: synchronized to 172.21.69.162, stratum 4
      Jun 21 05:38:05 ca1020 ntpd[2512]: synchronized to 172.21.69.161, stratum 4
      Jun 21 05:39:57 ca1020 kernel: INFO: task cmahostd:3138 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: cmahostd D 000000000000000b 0 3138 1 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff881032a5fc50 0000000000000086 ffff881065665ea1 ffff881d48dab00b
      Jun 21 05:39:57 ca1020 kernel: ffff882069662200 ffff881032a5fbd8 ffffffff8119af4a ffff881032a5fd18
      Jun 21 05:39:57 ca1020 kernel: ffff881032b61ab8 ffff881032a5ffd8 000000000000fb88 ffff881032b61ab8
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8119af4a>] ? dput+0x9a/0x150
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8118f2b5>] ? path_to_nameidata+0x25/0x60
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144901>] __access_remote_vm+0x41/0x1f0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81148842>] ? vma_merge+0x1d2/0x3e0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81148d1b>] ? __vm_enough_memory+0x3b/0x190
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144b0b>] access_process_vm+0x5b/0x80
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811eb23d>] proc_pid_cmdline+0x6d/0x120
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811ebfbd>] proc_info_read+0xad/0xf0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81181801>] sys_read+0x51/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:39:57 ca1020 kernel: INFO: task sge_execd:17808 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: sge_execd D 0000000000000007 0 17808 1 0x00000084
      Jun 21 05:39:57 ca1020 kernel: ffff88206006fe08 0000000000000082 0000000000000000 ffff88185f422cf8
      Jun 21 05:39:57 ca1020 kernel: ffff881065a56c00 ffffffff8121cc5f ffff88206006fd98 ffff880d4fec2b40
      Jun 21 05:39:57 ca1020 kernel: ffff8820661f6638 ffff88206006ffd8 000000000000fb88 ffff8820661f6638
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8121cc5f>] ? security_inode_permission+0x1f/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fda3>] rwsem_down_write_failed+0x23/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833e3>] call_rwsem_down_write_failed+0x13/0x20
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2a2>] ? down_write+0x32/0x40
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81139b7c>] sys_mmap_pgoff+0x5c/0x2d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81010519>] sys_mmap+0x29/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:39:57 ca1020 kernel: INFO: task sge_execd:17809 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: sge_execd D 000000000000000c 0 17809 1 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff88104bad9cf0 0000000000000082 0000000000000000 ffff88104bad9c68
      Jun 21 05:39:57 ca1020 kernel: ffffffff81ed24d0 0000010065af6700 ffff88104bad9f38 ffffffff4bad9c98
      Jun 21 05:39:57 ca1020 kernel: ffff881032985058 ffff88104bad9fd8 000000000000fb88 ffff881032985058
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8104751f>] __do_page_fault+0x18f/0x480
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8108582f>] ? copy_siginfo_to_user+0x14f/0x210
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81010bde>] ? copy_user_generic+0xe/0x20
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81015b5b>] ? check_for_xstate+0x3b/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8151311e>] do_page_fault+0x3e/0xa0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff815104d5>] page_fault+0x25/0x30
      Jun 21 05:39:57 ca1020 kernel: INFO: task sge_execd:17811 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: sge_execd D 0000000000000008 0 17811 1 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff88104a101230 0000000000000082 ffffffff811666bc 0000000000000282
      Jun 21 05:39:57 ca1020 kernel: ffff88107fcb02c0 ffff881069139a00 ffff88107fc214c0 ffff88104a101218
      Jun 21 05:39:57 ca1020 kernel: ffff8810308fd098 ffff88104a101fd8 000000000000fb88 ffff8810308fd098
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811666bc>] ? transfer_objects+0x5c/0x80
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810573a5>] ? select_idle_sibling+0x95/0x150
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa058d0b7>] cfs_get_environ+0x257/0x6c0 [libcfs]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08bfaa5>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa069396f>] lustre_get_jobid+0x10f/0x380 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08c0646>] lustre_msg_set_jobid+0xb6/0x140 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08e08ec>] ptlrpcd_add_req+0x3c/0x2f0 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06e9d21>] ? cl_req_attr_set+0xd1/0x230 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08bf63c>] ? lustre_msg_get_opc+0x9c/0x110 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0ac37a2>] osc_build_rpc+0xd62/0x1810 [osc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0ade4b7>] osc_io_unplug0+0x1257/0x1f00 [osc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0599bb2>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08e8644>] ? get_my_ctx+0x64/0x100 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa08effbd>] ? sptlrpc_import_check_ctx+0x18d/0x320 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0892336>] ? ldlm_resource_putref+0x66/0x280 [ptlrpc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0ae0ee1>] osc_io_unplug+0x11/0x20 [osc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0ae10bd>] osc_queue_sync_pages+0x1cd/0x350 [osc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0ad25c7>] osc_io_submit+0x1c7/0x4e0 [osc]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06e955e>] cl_io_submit_rw+0x6e/0x160 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0b65230>] lov_io_submit+0x2d0/0x4b0 [lov]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06ebc7d>] ? cl_page_list_add+0x5d/0x190 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06e955e>] cl_io_submit_rw+0x6e/0x160 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06ebc10>] cl_io_read_page+0x180/0x190 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0c08f01>] ll_readpage+0x91/0x1a0 [lustre]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8111b223>] filemap_fault+0x313/0x500
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0c38bd4>] vvp_io_fault_start+0x424/0xc50 [lustre]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06e7645>] ? cl_wait+0xb5/0x250 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06e97ea>] cl_io_start+0x6a/0x140 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa06ed974>] cl_io_loop+0xb4/0x1b0 [obdclass]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffffa0c19002>] ll_fault+0x2c2/0x4d0 [lustre]
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811430b4>] __do_fault+0x54/0x530
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81143687>] handle_pte_fault+0xf7/0xb50
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8105230c>] ? check_preempt_curr+0x7c/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81434551>] ? sock_aio_read+0x1a1/0x1b0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8114431a>] handle_mm_fault+0x23a/0x310
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810474c9>] __do_page_fault+0x139/0x480
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81010bde>] ? copy_user_generic+0xe/0x20
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8151311e>] do_page_fault+0x3e/0xa0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff815104d5>] page_fault+0x25/0x30
      Jun 21 05:39:57 ca1020 kernel: INFO: task w:43205 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: w D 000000000000000b 0 43205 43204 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff880104421c50 0000000000000082 ffffffff00000005 000000000003c54d
      Jun 21 05:39:57 ca1020 kernel: ffff882069662200 ffff880104421bd8 ffffffff8119af4a ffff880104421d18
      Jun 21 05:39:57 ca1020 kernel: ffff880d5c8a8638 ffff880104421fd8 000000000000fb88 ffff880d5c8a8638
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8119af4a>] ? dput+0x9a/0x150
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8118f2b5>] ? path_to_nameidata+0x25/0x60
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144901>] __access_remote_vm+0x41/0x1f0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8121cc5f>] ? security_inode_permission+0x1f/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8117e434>] ? nameidata_to_filp+0x54/0x70
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144b0b>] access_process_vm+0x5b/0x80
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811eb23d>] proc_pid_cmdline+0x6d/0x120
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811ebfbd>] proc_info_read+0xad/0xf0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81181801>] sys_read+0x51/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:39:57 ca1020 kernel: INFO: task w:43548 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: w D 0000000000000012 0 43548 43547 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff8818a12f7c50 0000000000000086 0000000000000000 000000000003c54d
      Jun 21 05:39:57 ca1020 kernel: ffff882069662200 ffff8818a12f7bd8 ffffffff8119af4a ffff8818a12f7d18
      Jun 21 05:39:57 ca1020 kernel: ffff881ed73cf098 ffff8818a12f7fd8 000000000000fb88 ffff881ed73cf098
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8119af4a>] ? dput+0x9a/0x150
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8118f2b5>] ? path_to_nameidata+0x25/0x60
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144901>] __access_remote_vm+0x41/0x1f0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8121cc5f>] ? security_inode_permission+0x1f/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8117e434>] ? nameidata_to_filp+0x54/0x70
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144b0b>] access_process_vm+0x5b/0x80
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811eb23d>] proc_pid_cmdline+0x6d/0x120
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811ebfbd>] proc_info_read+0xad/0xf0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81181801>] sys_read+0x51/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Jun 21 05:39:57 ca1020 kernel: INFO: task w:43887 blocked for more than 120 seconds.
      Jun 21 05:39:57 ca1020 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jun 21 05:39:57 ca1020 kernel: w D 0000000000000008 0 43887 43886 0x00000080
      Jun 21 05:39:57 ca1020 kernel: ffff8801d0c43c50 0000000000000082 0000000000000000 000000000003c54d
      Jun 21 05:39:57 ca1020 kernel: ffff882069662200 ffff8801d0c43bd8 ffffffff8119af4a ffff8801d0c43d18
      Jun 21 05:39:57 ca1020 kernel: ffff880f0fa9e638 ffff8801d0c43fd8 000000000000fb88 ffff880f0fa9e638
      Jun 21 05:39:57 ca1020 kernel: Call Trace:
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8119af4a>] ? dput+0x9a/0x150
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fc45>] rwsem_down_failed_common+0x95/0x1d0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8118f2b5>] ? path_to_nameidata+0x25/0x60
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150fdd6>] rwsem_down_read_failed+0x26/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff812833b4>] call_rwsem_down_read_failed+0x14/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8150f2d4>] ? down_read+0x24/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144901>] __access_remote_vm+0x41/0x1f0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8121cc5f>] ? security_inode_permission+0x1f/0x30
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8117e434>] ? nameidata_to_filp+0x54/0x70
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81144b0b>] access_process_vm+0x5b/0x80
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811eb23d>] proc_pid_cmdline+0x6d/0x120
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8116087a>] ? alloc_pages_current+0xaa/0x110
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811ebfbd>] proc_info_read+0xad/0xf0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff81181801>] sys_read+0x51/0x90
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
      Jun 21 05:39:57 ca1020 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
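
      One possible reading of the traces above, offered as an interpretation rather than something the report states: sge_execd:17811 already holds mmap_sem for read inside the page-fault path, sge_execd:17808 then queues a write request from mmap(), and cfs_get_environ() in 17811 asks for the same semaphore for read a second time. If the semaphore is fair, so that new readers wait behind a queued writer, and if both threads share one address space, then nobody can make progress. The toy model below sketches that ordering. FairRwSem and deadlocked() are hypothetical names used only for illustration; this is not kernel or Lustre code.

```python
from collections import deque


class FairRwSem:
    """Toy, non-blocking model of a fair reader/writer semaphore.

    Assumption: a new reader must queue behind any waiter that arrived
    earlier, including a waiting writer.
    """

    def __init__(self):
        self.readers = []       # tasks currently holding the lock for read
        self.writer = None      # task currently holding the lock for write
        self.waiters = deque()  # (task, mode) pairs in arrival order

    def down_read(self, task):
        # A reader gets in only if no writer holds the lock and nobody is queued.
        if self.writer is None and not self.waiters:
            self.readers.append(task)
            return True
        self.waiters.append((task, "read"))
        return False

    def down_write(self, task):
        # A writer needs the lock completely free.
        if self.writer is None and not self.readers and not self.waiters:
            self.writer = task
            return True
        self.waiters.append((task, "write"))
        return False

    def deadlocked(self):
        """True if every current holder is itself stuck in the wait queue."""
        holders = set(self.readers)
        if self.writer is not None:
            holders.add(self.writer)
        blocked = {task for task, _ in self.waiters}
        return bool(holders) and holders <= blocked


sem = FairRwSem()
events = [
    # The page fault takes the semaphore for read.
    ("sge_execd:17811", "read", sem.down_read("sge_execd:17811")),
    # mmap() wants it for write and queues behind the reader.
    ("sge_execd:17808", "write", sem.down_write("sge_execd:17808")),
    # cfs_get_environ() asks for read again, now behind the writer.
    ("sge_execd:17811", "read", sem.down_read("sge_execd:17811")),
    # A later page fault in another thread queues as well.
    ("sge_execd:17809", "read", sem.down_read("sge_execd:17809")),
]

for task, mode, acquired in events:
    print(f"{task:16} {mode:5} {'acquired' if acquired else 'blocked'}")
print("deadlocked:", sem.deadlocked())
```

      In this model only the first read succeeds. The sole holder then waits on itself behind the writer, which matches the pattern of read and write waiters in the log.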

          People

            green Oleg Drokin
            lixi Li Xi (Inactive)