Lustre / LU-3702

Failure on test suite parallel-scale test_iorssf: client out of memory

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.5.0
    • None
    • server and client: lustre-master build #1592
      client is running SLES11 SP2
    • 3
    • 9551

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/16f0f346-fd6e-11e2-9fdb-52540035b04c.

      The sub-test test_iorssf failed with the following error:

      ior failed! 1

      test log

      ERROR in aiori-POSIX.c (line 362): cannot get status of written file.
      ERROR: Cannot allocate memory
      

      client dmesg:

      [71266.413667] LNetError: 29131:0:(lib-lnet.h:457:lnet_md_alloc()) LNET: out of memory at /var/lib/jenkins/workspace/lustre-master/arch/x86_64/build_type/client/distro/sles11sp2/ib_stack/inkernel/BUILD/BUILD/lustre-2.4.53/lnet/include/lnet/lib-lnet.h:457 (tried to alloc '(md)' = 4208)
      [71266.413673] LNetError: 29131:0:(lib-lnet.h:457:lnet_md_alloc()) Skipped 4 previous similar messages
      [71266.413676] LNetError: 29131:0:(lib-lnet.h:457:lnet_md_alloc()) LNET: 10747055 total bytes allocated by lnet
      [71266.413679] LNetError: 29131:0:(lib-lnet.h:457:lnet_md_alloc()) Skipped 4 previous similar messages
      [71266.413684] LustreError: 29131:0:(niobuf.c:376:ptlrpc_register_bulk()) lustre-OST0003-osc-ffff880065e08000: LNetMDAttach failed x1442416780175324/0: rc = -12
      [71266.413687] LustreError: 29131:0:(niobuf.c:376:ptlrpc_register_bulk()) Skipped 4 previous similar messages
      [71266.466449] The following is only an harmless informational message.
      [71266.466452] Unless you get a _continuous_flood_ of these messages it means
      [71266.466454] everything is working fine. Allocations from irqs cannot be
      [71266.466455] perfectly reliable and the kernel is designed to handle that.
      [71266.466457] ptlrpcd_1: page allocation failure: order:1, mode:0x40
      [71266.466460] Pid: 29131, comm: ptlrpcd_1 Tainted: G           N  3.0.80-0.7-default #1
      [71266.466462] Call Trace:
      [71266.466476]  [<ffffffff810048b5>] dump_trace+0x75/0x310
      [71266.466483]  [<ffffffff81444163>] dump_stack+0x69/0x6f
      [71266.466488]  [<ffffffff810f7882>] warn_alloc_failed+0x102/0x1a0
      [71266.466493]  [<ffffffff810f9369>] __alloc_pages_slowpath+0x559/0x7f0
      [71266.466496]  [<ffffffff810f97e9>] __alloc_pages_nodemask+0x1e9/0x200
      [71266.466501]  [<ffffffff8113ad16>] kmem_getpages+0x56/0x170
      [71266.466505]  [<ffffffff8113bb5b>] fallback_alloc+0x19b/0x270
      [71266.466508]  [<ffffffff8113c1f4>] __kmalloc+0x284/0x330
      [71266.466527]  [<ffffffffa0515063>] LNetMDAttach+0x163/0x5b0 [lnet]
      [71266.466593]  [<ffffffffa07f1848>] ptlrpc_register_bulk+0x258/0x9e0 [ptlrpc]
      [71266.466654]  [<ffffffffa07f2a73>] ptl_send_rpc+0x173/0xc30 [ptlrpc]
      [71266.466704]  [<ffffffffa07e8990>] ptlrpc_send_new_req+0x4a0/0x870 [ptlrpc]
      [71266.466751]  [<ffffffffa07eb958>] ptlrpc_check_set+0x408/0x1af0 [ptlrpc]
      [71266.466800]  [<ffffffffa081770b>] ptlrpcd_check+0x52b/0x550 [ptlrpc]
      [71266.466862]  [<ffffffffa0817ba7>] ptlrpcd+0x197/0x3a0 [ptlrpc]
      [71266.466896]  [<ffffffff8107b666>] kthread+0x96/0xa0
      [71266.466901]  [<ffffffff8144ff44>] kernel_thread_helper+0x4/0x10
      [71266.466904] Mem-Info:
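
As a reading aid for the log above, the following is a minimal sketch (not part of the original report) that decodes the `rc = -12` return code and checks why a 4208-byte allocation is consistent with the `order:1` page allocation failure. The helper names `decode_rc` and `order_for` are hypothetical, and the 4 KiB page size is an assumption about the x86_64 client.

```python
import errno
import os
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages on the x86_64 client


def decode_rc(rc):
    """Map a negative kernel-style return code to (errno name, message)."""
    return errno.errorcode[-rc], os.strerror(-rc)


def order_for(nbytes, page_size=PAGE_SIZE):
    """Smallest page order such that 2**order contiguous pages hold nbytes."""
    order = 0
    while (page_size << order) < nbytes:
        order += 1
    return order


# Excerpts copied from the client dmesg in this report.
bulk_line = "LNetMDAttach failed x1442416780175324/0: rc = -12"
alloc_line = "(tried to alloc '(md)' = 4208)"

rc = int(re.search(r"rc = (-?\d+)", bulk_line).group(1))
size = int(re.search(r"= (\d+)\)", alloc_line).group(1))

name, message = decode_rc(rc)
print(f"rc={rc} -> {name} ({message})")
print(f"{size} bytes needs at least an order-{order_for(size)} allocation")
```

Under these assumptions, the failed `(md)` allocation is slightly larger than one page, so it cannot be served from a single 4 KiB page and needs two contiguous pages, which matches the `order:1` failure reported for `ptlrpcd_1`.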
      

            People

              keith Keith Mannthey (Inactive)
              maloo Maloo
