Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-7016

deadlock on reading multiple files from HSM archive

    XMLWordPrintable

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • None
    • Lustre 2.8.0
    • None
    • master
    • 2
    • 9223372036854775807

    Description

      When it reads multiple files from HSM archive using posix copytool, it causes deadlock on the client.

      Aug 18 07:46:58 r38 kernel: INFO: task lhsmtool_posix:16414 blocked for more than 120 seconds.
      Aug 18 07:46:58 r38 kernel:      Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      Aug 18 07:46:58 r38 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Aug 18 07:46:58 r38 kernel: lhsmtool_posi D 0000000000000002     0 16414  14111 0x00000000
      Aug 18 07:46:58 r38 kernel: ffff880f6379bd18 0000000000000082 ffff88102609502a ffff881020055c40
      Aug 18 07:46:58 r38 kernel: ffff880f6379bc88 ffffffff81227e9f ffff880f6379bd68 ffffffff81199045
      Aug 18 07:46:59 r38 kernel: ffff880f67361af8 ffff880f6379bfd8 000000000000fbc8 ffff880f67361af8
      Aug 18 07:46:59 r38 kernel: Call Trace:
      Aug 18 07:46:59 r38 kernel: [<ffffffff81227e9f>] ? security_inode_permission+0x1f/0x30
      Aug 18 07:46:59 r38 kernel: [<ffffffff81199045>] ? __link_path_walk+0x145/0x1000
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
      Aug 18 07:46:59 r38 kernel: [<ffffffff8119ba76>] do_filp_open+0x2d6/0xd20
      Aug 18 07:46:59 r38 kernel: [<ffffffffa0605343>] ? ll_file_release+0x683/0xac0 [lustre]
      Aug 18 07:46:59 r38 kernel: [<ffffffff81153ea1>] ? unlink_anon_vmas+0x71/0xd0
      Aug 18 07:46:59 r38 kernel: [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
      Aug 18 07:46:59 r38 kernel: [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185be9>] do_sys_open+0x69/0x140
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185d00>] sys_open+0x20/0x30
      Aug 18 07:46:59 r38 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Aug 18 07:46:59 r38 kernel: INFO: task lhsmtool_posix:16415 blocked for more than 120 seconds.
      Aug 18 07:46:59 r38 kernel:      Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      Aug 18 07:46:59 r38 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Aug 18 07:46:59 r38 kernel: lhsmtool_posi D 0000000000000004     0 16415  14111 0x00000000
      Aug 18 07:46:59 r38 kernel: ffff880f634c1d18 0000000000000082 0000000000000000 ffff881020055c40
      Aug 18 07:46:59 r38 kernel: ffff880f634c1c88 ffffffff81227e9f ffff880f634c1d68 ffffffff81199045
      Aug 18 07:46:59 r38 kernel: ffff880f64e5a5f8 ffff880f634c1fd8 000000000000fbc8 ffff880f64e5a5f8
      Aug 18 07:46:59 r38 kernel: Call Trace:
      Aug 18 07:46:59 r38 kernel: [<ffffffff81227e9f>] ? security_inode_permission+0x1f/0x30
      Aug 18 07:46:59 r38 kernel: [<ffffffff81199045>] ? __link_path_walk+0x145/0x1000
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
      Aug 18 07:46:59 r38 kernel: [<ffffffff8119ba76>] do_filp_open+0x2d6/0xd20
      Aug 18 07:46:59 r38 kernel: [<ffffffffa0605343>] ? ll_file_release+0x683/0xac0 [lustre]
      Aug 18 07:46:59 r38 kernel: [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
      Aug 18 07:46:59 r38 kernel: [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185be9>] do_sys_open+0x69/0x140
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185d00>] sys_open+0x20/0x30
      Aug 18 07:46:59 r38 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      Aug 18 07:46:59 r38 kernel: INFO: task md5sum:16312 blocked for more than 120 seconds.
      Aug 18 07:46:59 r38 kernel:      Not tainted 2.6.32-431.29.2.el6.x86_64 #1
      Aug 18 07:46:59 r38 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Aug 18 07:46:59 r38 kernel: md5sum        D 0000000000000001     0 16312  15803 0x00000000
      Aug 18 07:46:59 r38 kernel: ffff881fbd611b58 0000000000000082 0000000000000000 0000000000000286
      Aug 18 07:46:59 r38 kernel: ffff881fbd611b68 ffff881020055c40 ffff881fbd611b38 ffffffffa05eb3f0
      Aug 18 07:46:59 r38 kernel: ffff882026a85ab8 ffff881fbd611fd8 000000000000fbc8 ffff882026a85ab8
      Aug 18 07:46:59 r38 kernel: Call Trace:
      Aug 18 07:46:59 r38 kernel: [<ffffffffa05eb3f0>] ? ll_lookup_finish_locks+0x270/0x8d0 [lustre]
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a5be>] __mutex_lock_slowpath+0x13e/0x180
      Aug 18 07:46:59 r38 kernel: [<ffffffff8152a45b>] mutex_lock+0x2b/0x50
      Aug 18 07:46:59 r38 kernel: [<ffffffff811989ab>] do_lookup+0x11b/0x230
      Aug 18 07:46:59 r38 kernel: [<ffffffff811996a4>] __link_path_walk+0x7a4/0x1000
      Aug 18 07:46:59 r38 kernel: [<ffffffff8114a3d7>] ? handle_pte_fault+0xf7/0xb00
      Aug 18 07:46:59 r38 kernel: [<ffffffff8104eeb7>] ? pte_alloc_one+0x37/0x50
      Aug 18 07:46:59 r38 kernel: [<ffffffff8119a1ba>] path_walk+0x6a/0xe0
      Aug 18 07:46:59 r38 kernel: [<ffffffff8119a3cb>] filename_lookup+0x6b/0xc0
      Aug 18 07:46:59 r38 kernel: [<ffffffff81226d56>] ? security_file_alloc+0x16/0x20
      Aug 18 07:46:59 r38 kernel: [<ffffffff8119b8a4>] do_filp_open+0x104/0xd20
      Aug 18 07:46:59 r38 kernel: [<ffffffff8114e820>] ? __vma_link_rb+0x30/0x40
      Aug 18 07:46:59 r38 kernel: [<ffffffff8114e8cb>] ? vma_link+0x9b/0xf0
      Aug 18 07:46:59 r38 kernel: [<ffffffff8128f83a>] ? strncpy_from_user+0x4a/0x90
      Aug 18 07:46:59 r38 kernel: [<ffffffff811a8b82>] ? alloc_fd+0x92/0x160
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185be9>] do_sys_open+0x69/0x140
      Aug 18 07:46:59 r38 kernel: [<ffffffff81185d00>] sys_open+0x20/0x30
      Aug 18 07:46:59 r38 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      

      Attachments

        1. messages
          216 kB
        2. mult_test.sh
          1 kB

        Activity

          People

            bfaccini Bruno Faccini (Inactive)
            ihara Shuichi Ihara (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: