Details

    • Type: Technical task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.10.0
    • Fix Version/s: Lustre 2.10.0

      Description

      This issue was created by maloo for bobijam <bobijam.xu@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/67af86be-2027-11e7-9073-5254006e85c2.

      The sub-test test_244 failed with the following error:

      test failed to respond and timed out
      

      Info required for matching: sanity 244

      sendfile_grouplock.c calls sendfile_copy(sourfile, 0, destfile, 98765), and sendfile_copy() calls llapi_group_lock(fd_out, dest_gid), which reaches lov_io_init() and executes atomic_inc(&lov->lo_active_ios).

      sendfile_copy() then tries to write to the file. The write first checks whether it needs to fetch the layout, and ll_layout_refresh() finds that there is still an active IO (the one marked by ll_get_grouplock()), so the write hangs there. Stack trace of the hung process:

      sendfile_grou S 0000000000000000     0  7394   7321 0x00000080
       ffff88000eb3f618 0000000000000082 ffff88000eb3f5e0 ffff88000eb3f5dc
       00001ce200000000 ffff88003f828400 0000005dce083b5f ffff880003436ac0
       00000000000005ff 0000000100017a1d ffff88002b57fad0 ffff88000eb3ffd8
      Call Trace:
       [<ffffffffa0afa20b>] lov_layout_wait+0x11b/0x220 [lov]
       [<ffffffff810640e0>] ? default_wake_function+0x0/0x20
       [<ffffffffa0afc11e>] lov_conf_set+0x37e/0xa30 [lov]
       [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa059d888>] cl_conf_set+0x58/0x100 [obdclass]
       [<ffffffffa0fa5dd4>] ll_layout_conf+0x84/0x3f0 [lustre]
       [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa0fb0b9d>] ll_layout_refresh+0x96d/0x1710 [lustre]
       [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa0ff7d6f>] vvp_io_init+0x32f/0x450 [lustre]
       [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
       [<ffffffffa05a5148>] cl_io_init0+0x88/0x150 [obdclass]
       [<ffffffffa05a7caa>] cl_io_init+0x4a/0xa0 [obdclass]
       [<ffffffffa05a7dbc>] cl_io_rw_init+0xbc/0x200 [obdclass]
       [<ffffffffa0fa7213>] ll_file_io_generic+0x203/0xaf0 [lustre]
       [<ffffffffa0fa941d>] ll_file_aio_write+0x13d/0x280 [lustre]
       [<ffffffffa0fa969a>] ll_file_write+0x13a/0x270 [lustre]
       [<ffffffff81189ef8>] vfs_write+0xb8/0x1a0
       [<ffffffff811ba76d>] kernel_write+0x3d/0x50
       [<ffffffff811ba7da>] write_pipe_buf+0x5a/0x90
       [<ffffffff811b9342>] splice_from_pipe_feed+0x72/0x120
       [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
       [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
       [<ffffffff811b9d9e>] __splice_from_pipe+0x6e/0x80
       [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
       [<ffffffff811b9e01>] splice_from_pipe+0x51/0x70
       [<ffffffff811b9e3d>] default_file_splice_write+0x1d/0x30
       [<ffffffff811b9fca>] do_splice_from+0xba/0xf0
       [<ffffffff811ba020>] direct_splice_actor+0x20/0x30
       [<ffffffff811ba256>] splice_direct_to_actor+0xc6/0x1c0
       [<ffffffff811ba000>] ? direct_splice_actor+0x0/0x30
       [<ffffffff811ba39d>] do_splice_direct+0x4d/0x60
       [<ffffffff8118a344>] do_sendfile+0x184/0x1e0
       [<ffffffff8118a3d4>] sys_sendfile64+0x34/0xb0
       [<ffffffff810e031e>] ? __audit_syscall_exit+0x25e/0x290
       [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
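
The hang described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and is not a Lustre API, the single counter stands in for `lov->lo_active_ios`, and the model returns an error where the real code sleeps in `lov_layout_wait()`. It follows the sequence given in this report and does not try to reproduce the actual conditions under which Lustre waits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the object that carries lov->lo_active_ios. */
struct toy_obj {
    int active_ios;
};

/* Models taking the group lock: per this report, the IO started by
 * ll_get_grouplock() stays counted as active until the lock is dropped. */
void toy_group_lock(struct toy_obj *o)
{
    o->active_ios++;
}

/* Models releasing the group lock, which ends that long-lived IO. */
void toy_group_unlock(struct toy_obj *o)
{
    assert(o->active_ios > 0);
    o->active_ios--;
}

/* Models the layout refresh at the start of a write: it has to wait
 * while any IO is still counted as active. */
bool toy_layout_refresh_would_wait(const struct toy_obj *o)
{
    return o->active_ios > 0;
}

/* Models a write issued by the thread that may also hold the group lock.
 * Returns 0 if the write can proceed, -1 where the real code would sleep
 * forever, because the only thread able to drop the active IO is the one
 * that is waiting. */
int toy_write(struct toy_obj *o)
{
    if (toy_layout_refresh_would_wait(o))
        return -1;

    o->active_ios++;   /* the write's own short-lived IO */
    /* ... data transfer would happen here ... */
    o->active_ios--;
    return 0;
}
```

In this model, calling `toy_write()` on a fresh object succeeds, calling it after `toy_group_lock()` returns -1, and it succeeds again once `toy_group_unlock()` has run. That ordering mirrors `sendfile_copy()` taking the group lock on the destination file before writing to it.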
      

              People

              • Assignee: bobijam Zhenyu Xu
              • Reporter: maloo Maloo