Lustre / LU-5121

replay-ost-single test_0b: OOM on the OST

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • 3
    • 14130

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/90389636-e54f-11e3-bb3a-52540035b04c.

      The sub-test test_0b failed with the following error:

      18:54:10:LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.1.5.230@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      18:54:10:LustreError: Skipped 7 previous similar messages
      18:54:10:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
      18:54:10:LNet: 29343:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
      18:54:10:LNet: 29343:0:(debug.c:218:libcfs_debug_str2mask()) Skipped 3 previous similar messages
      18:54:10:Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1 2>/dev/null
      18:54:10:Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 7 clients reconnect
      18:54:10:ll_ost00_003 invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
      18:54:10:ll_ost00_003 cpuset=/ mems_allowed=0
      18:54:10:Pid: 26086, comm: ll_ost00_003 Not tainted 2.6.32-431.17.1.el6_lustre.g2e529c5.x86_64 #1
      18:54:10:Call Trace:
      18:54:10: [<ffffffff810d0211>] ? cpuset_print_task_mems_allowed+0x91/0xb0
      18:54:10: [<ffffffff811225c0>] ? dump_header+0x90/0x1b0
      18:54:10: [<ffffffff8122781c>] ? security_real_capable_noaudit+0x3c/0x70
      18:54:10: [<ffffffff81122a42>] ? oom_kill_process+0x82/0x2a0
      18:54:10: [<ffffffff8112293e>] ? select_bad_process+0x9e/0x120
      18:54:10: [<ffffffff81122e80>] ? out_of_memory+0x220/0x3c0
      18:54:10: [<ffffffff8112f79f>] ? __alloc_pages_nodemask+0x89f/0x8d0
      18:54:10: [<ffffffff8116e082>] ? kmem_getpages+0x62/0x170
      18:54:10: [<ffffffff8116ec9a>] ? fallback_alloc+0x1ba/0x270
      18:54:10: [<ffffffff8116e6ef>] ? cache_grow+0x2cf/0x320
      18:54:10: [<ffffffff8116ea19>] ? ____cache_alloc_node+0x99/0x160
      18:54:10: [<ffffffff8124bc3c>] ? crypto_create_tfm+0x3c/0xe0
      18:54:10: [<ffffffff8116f7e9>] ? __kmalloc+0x189/0x220
      18:54:10: [<ffffffff8124bc3c>] ? crypto_create_tfm+0x3c/0xe0
      18:54:10: [<ffffffff812525d8>] ? crypto_init_shash_ops+0x68/0x100
      18:54:10: [<ffffffff8124bd4a>] ? __crypto_alloc_tfm+0x6a/0x130
      18:54:10: [<ffffffff8124c5ba>] ? crypto_alloc_base+0x5a/0xb0
      18:54:10: [<ffffffff810554f8>] ? resched_task+0x68/0x80
      18:54:10: [<ffffffffa048d2ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]
      18:54:10: [<ffffffffa048d5da>] ? cfs_crypto_hash_digest+0x6a/0xf0 [libcfs]
      18:54:10: [<ffffffff8116f86c>] ? __kmalloc+0x20c/0x220
      18:54:10: [<ffffffffa082bd73>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]
      18:54:10: [<ffffffffa0865a81>] ? null_authorize+0xa1/0x100 [ptlrpc]
      18:54:10: [<ffffffffa0854c56>] ? sptlrpc_svc_wrap_reply+0x56/0x1c0 [ptlrpc]
      18:54:10: [<ffffffffa08241ec>] ? ptlrpc_send_reply+0x1fc/0x7f0 [ptlrpc]
      18:54:10: [<ffffffffa083b675>] ? ptlrpc_at_check_timed+0xc05/0x1360 [ptlrpc]
      18:54:10: [<ffffffffa0832c09>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      18:54:10: [<ffffffffa083cf68>] ? ptlrpc_main+0x1198/0x1980 [ptlrpc]
      18:54:10: [<ffffffffa083bdd0>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
      18:54:10: [<ffffffff8109ab56>] ? kthread+0x96/0xa0
      18:54:10: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      18:54:10: [<ffffffff8109aac0>] ? kthread+0x0/0xa0
      18:54:10: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      18:54:10:Mem-Info:
      test_0b returned 1

      Info required for matching: replay-ost-single 0b
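
      For triage, the call chain in the trace above can be extracted mechanically. The following is a minimal Python sketch and not part of the original report: `parse_frames`, `modules_in_order`, and `SAMPLE` are hypothetical names, and the regular expression assumes the `[<addr>] ? func+0xoff/0xlen [module]` frame layout seen in this log. Frames with no bracketed module are labeled "kernel" here as an assumption.

```python
import re

# Assumed frame layout, taken from the log in this report, e.g.:
#   18:54:10: [<ffffffffa048d2ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (function, module) pairs for every stack frame line found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # No "[module]" suffix means the symbol is in the core kernel image.
            frames.append((match.group("func"), match.group("module") or "kernel"))
    return frames


def modules_in_order(frames):
    """List each module once, in the order it first appears in the trace."""
    seen = []
    for _, module in frames:
        if module not in seen:
            seen.append(module)
    return seen


# A few frames copied verbatim from the trace in this report.
SAMPLE = """\
18:54:10:Call Trace:
18:54:10: [<ffffffff8116f7e9>] ? __kmalloc+0x189/0x220
18:54:10: [<ffffffff8124c5ba>] ? crypto_alloc_base+0x5a/0xb0
18:54:10: [<ffffffffa048d2ca>] ? cfs_crypto_hash_alloc+0x7a/0x290 [libcfs]
18:54:10: [<ffffffffa082bd73>] ? lustre_msg_calc_cksum+0xd3/0x130 [ptlrpc]
18:54:10: [<ffffffffa08241ec>] ? ptlrpc_send_reply+0x1fc/0x7f0 [ptlrpc]
"""

if __name__ == "__main__":
    for func, module in parse_frames(SAMPLE):
        print(f"{module:8s} {func}")
```

      Read innermost frame first, the frames suggest the OOM killer was invoked from a slab allocation made while allocating a crypto hash transform for a reply checksum (`ptlrpc_send_reply` through `lustre_msg_calc_cksum` to `cfs_crypto_hash_alloc`). The `?` prefix on each frame means the kernel unwinder did not consider it fully reliable, so treat the chain as indicative only.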

            People

              Assignee: wc-triage (WC Triage)
              Reporter: maloo (Maloo)