Lustre / LU-2252

Test failure on test suite sanity, subtest test_118k

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • 3
    • 5381

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/d3fd62bc-22e5-11e2-afb4-52540035b04c.

      The sub-test test_118k failed with the following error:

      test failed to respond and timed out

      The client-2 console log shows processes stuck in the D (uninterruptible sleep) state:

      14:15:41:Lustre: DEBUG MARKER: == sanity test 118k: bio alloc -ENOMEM and IO TERM handling =========== 14:15:38 (1351545338)
      14:15:41:LustreError: 11-0: an error occurred while communicating with 10.10.4.171@tcp. The ost_write operation failed with -5
      14:15:41:LustreError: Skipped 54 previous similar messages
      14:19:33:INFO: task flush-lustre-7:29674 blocked for more than 120 seconds.
      14:19:33:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      14:19:33:flush-lustre- D 0000000000000000     0 29674      2 0x00000080
      14:19:33: ffff880052a71938 0000000000000046 ffff880052a71900 ffff880052a718e0
      14:19:33: ffff880052a71980 ffffffffa0a69b80 000000000000004a ffff880037c21850
      14:19:33: ffff88007b4225f8 ffff880052a71fd8 000000000000fb88 ffff88007b4225f8
      14:19:33:Call Trace:
      14:19:33: [<ffffffff814ffec5>] rwsem_down_failed_common+0x95/0x1d0
      14:19:33: [<ffffffff81500056>] rwsem_down_read_failed+0x26/0x30
      14:19:33: [<ffffffff8127e634>] call_rwsem_down_read_failed+0x14/0x30
      14:19:33: [<ffffffff814ff554>] ? down_read+0x24/0x30
      14:19:33: [<ffffffffa0a26504>] lov_lsm_addref+0x34/0x150 [lov]
      14:19:33: [<ffffffffa0a26a23>] lov_io_init+0x73/0x160 [lov]
      14:19:33: [<ffffffffa0628f98>] cl_io_init0+0x98/0x160 [obdclass]
      14:19:33: [<ffffffffa062be84>] cl_io_init+0x64/0x100 [obdclass]
      14:19:33: [<ffffffffa0aa103e>] cl_sync_file_range+0x11e/0x570 [lustre]
      14:19:33: [<ffffffffa0ac6e1f>] ll_writepages+0x6f/0x1a0 [lustre]
      14:19:33: [<ffffffff81129b11>] do_writepages+0x21/0x40
      14:19:33: [<ffffffff811a513d>] writeback_single_inode+0xdd/0x290
      14:19:33: [<ffffffff811a554e>] writeback_sb_inodes+0xce/0x180
      14:19:33: [<ffffffff811a56ab>] writeback_inodes_wb+0xab/0x1b0
      14:19:33: [<ffffffff811a5a4b>] wb_writeback+0x29b/0x3f0
      14:19:33: [<ffffffff814fd960>] ? thread_return+0x4e/0x76e
      14:19:33: [<ffffffff8107eb42>] ? del_timer_sync+0x22/0x30
      14:19:33: [<ffffffff811a5d39>] wb_do_writeback+0x199/0x240
      14:19:33: [<ffffffff811a5e43>] bdi_writeback_task+0x63/0x1b0
      14:19:33: [<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
      14:19:33: [<ffffffff81138770>] ? bdi_start_fn+0x0/0x100
      14:19:33: [<ffffffff811387f6>] bdi_start_fn+0x86/0x100
      14:19:33: [<ffffffff81138770>] ? bdi_start_fn+0x0/0x100
      14:19:33: [<ffffffff81091d66>] kthread+0x96/0xa0
      14:19:33: [<ffffffff8100c14a>] child_rip+0xa/0x20
      14:19:33: [<ffffffff81091cd0>] ? kthread+0x0/0xa0
      14:19:33: [<ffffffff8100c140>] ? child_rip+0x0/0x20
      14:19:33:INFO: task dd:19006 blocked for more than 120 seconds.
      14:19:33:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      14:19:33:dd            D 0000000000000000     0 19006      1 0x00000080
      14:19:33: ffff880062f3d470 0000000000000082 ffff88007401aca0 ffff880062f3d438
      14:19:33: 0000000000000286 ffff880077a9d020 ffff880074be5600 ffffc900016d1014
      14:19:33: ffff880075bd05f8 ffff880062f3dfd8 000000000000fb88 ffff880075bd05f8
      14:19:33:Call Trace:
      14:19:33: [<ffffffffa0436ace>] ? cfs_mem_cache_free+0xe/0x10 [libcfs]
      14:19:33: [<ffffffff814ffec5>] rwsem_down_failed_common+0x95/0x1d0
      14:19:33: [<ffffffffa044b982>] ? cfs_hash_bd_from_key+0x42/0xe0 [libcfs]
      14:19:33: [<ffffffff81500056>] rwsem_down_read_failed+0x26/0x30
      14:19:33: [<ffffffff8127e634>] call_rwsem_down_read_failed+0x14/0x30
      14:19:33: [<ffffffff814ff554>] ? down_read+0x24/0x30
      14:19:33: [<ffffffffa0a26504>] lov_lsm_addref+0x34/0x150 [lov]
      14:19:33: [<ffffffffa0a26a23>] lov_io_init+0x73/0x160 [lov]
      14:19:33: [<ffffffffa0628f98>] cl_io_init0+0x98/0x160 [obdclass]
      14:19:33: [<ffffffffa062be84>] cl_io_init+0x64/0x100 [obdclass]
      14:19:33: [<ffffffffa09979a4>] osc_lru_shrink+0x4a4/0x8d0 [osc]
      14:19:33: [<ffffffffa0998192>] osc_page_init+0x3c2/0xb60 [osc]
      14:19:33: [<ffffffffa061c2d5>] ? cl_page_slice_add+0x55/0x140 [obdclass]
      14:19:33: [<ffffffffa062094b>] cl_page_find0+0x2ab/0x8c0 [obdclass]
      14:19:33: [<ffffffffa0620f78>] cl_page_find_sub+0x18/0x20 [obdclass]
      14:19:33: [<ffffffffa0a28f6a>] lov_page_init_raid0+0x19a/0x780 [lov]
      14:19:33: [<ffffffffa0a26938>] lov_page_init+0x68/0xe0 [lov]
      14:19:33: [<ffffffffa062094b>] cl_page_find0+0x2ab/0x8c0 [obdclass]
      14:19:33: [<ffffffffa044d292>] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
      14:19:33: [<ffffffff81170345>] ? mem_cgroup_charge_common+0xa5/0xd0
      14:19:33: [<ffffffffa0620f91>] cl_page_find+0x11/0x20 [obdclass]
      14:19:33: [<ffffffffa0ac8504>] ll_cl_init+0x154/0x5b0 [lustre]
      14:19:33: [<ffffffff81136b5e>] ? __inc_zone_page_state+0x2e/0x30
      14:19:33: [<ffffffffa0ac8bb3>] ll_prepare_write+0x53/0x1a0 [lustre]
      14:19:33: [<ffffffffa0ae0b8e>] ll_write_begin+0x7e/0x1a0 [lustre]
      14:19:33: [<ffffffff81114be3>] generic_file_buffered_write+0x123/0x2e0
      14:19:33: [<ffffffff810724c7>] ? current_fs_time+0x27/0x30
      14:19:33: [<ffffffff81116580>] __generic_file_aio_write+0x250/0x480
      14:19:33: [<ffffffffa06224eb>] ? cl_lock_trace0+0x11b/0x130 [obdclass]
      14:19:33: [<ffffffff8111681f>] generic_file_aio_write+0x6f/0xe0
      14:19:33: [<ffffffffa0af440c>] vvp_io_write_start+0x9c/0x240 [lustre]
      14:19:33: [<ffffffffa0628d1a>] cl_io_start+0x6a/0x140 [obdclass]
      14:19:33: [<ffffffffa062d634>] cl_io_loop+0xb4/0x1b0 [obdclass]
      14:19:33: [<ffffffffa0aa06cb>] ll_file_io_generic+0x42b/0x550 [lustre]
      14:19:33: [<ffffffffa0aa15cc>] ll_file_aio_write+0x13c/0x2c0 [lustre]
      14:19:33: [<ffffffffa0aa18b9>] ll_file_write+0x169/0x2a0 [lustre]
      14:19:33: [<ffffffff8117b198>] vfs_write+0xb8/0x1a0
      14:19:33: [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
      14:19:33: [<ffffffff8117bbb1>] sys_write+0x51/0x90
      14:19:33: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
      14:19:33:INFO: task dd:19018 blocked for more than 120 seconds.
      14:19:33:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      14:19:33:dd            D 0000000000000000     0 19018      1 0x00000080
      14:19:33: ffff880056d1d470 0000000000000086 ffff88007401ae80 ffff880056d1d438
      14:19:33: 0000000000000286 ffff880077a9d920 ffff880074be5600 ffffc900011eb014
      14:19:33: ffff88007ad965f8 ffff880056d1dfd8 000000000000fb88 ffff88007ad965f8
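
      When triaging a console log like the one above, it can help to pull out each hung task and the first kernel-module frame it is blocked in. The following is a minimal sketch, not part of the Lustre test tooling: the function names `summarize_hung_tasks` and `first_module_frame` are hypothetical, and the regular expressions assume the message format shown in this log. Frames prefixed with `?` are skipped because the kernel marks them as unreliable.

```python
import re

# Hung-task header, e.g.
# "14:19:33:INFO: task dd:19006 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# Stack frame, e.g. "[<ffffffffa0a26504>] lov_lsm_addref+0x34/0x150 [lov]".
# Group 1 is the optional "?" marker, group 2 the symbol, group 3 the module.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def summarize_hung_tasks(log_text):
    """Return one dict per hung task with its name, pid and reliable frames."""
    tasks = []
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            tasks.append(
                {"name": header.group(1), "pid": int(header.group(2)), "frames": []}
            )
            continue
        frame = FRAME_RE.search(line)
        # Skip frames seen before any header and frames marked "?" (unreliable).
        if frame and tasks and not frame.group(1):
            tasks[-1]["frames"].append((frame.group(2), frame.group(3)))
    return tasks


def first_module_frame(task):
    """Return the innermost (function, module) frame that names a module, or None."""
    for func, module in task["frames"]:
        if module:
            return func, module
    return None
```

      Applied to the two complete traces in this log (flush-lustre-7:29674 and dd:19006), the first module frame in both cases is `lov_lsm_addref` in `lov`, reached through `rwsem_down_read_failed`, so both tasks are waiting to take the same kind of read lock.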
      

            People

              jay Jinshan Xiong (Inactive)
              maloo Maloo