Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3091

Interop 2.1.4<->2.4 failure on test suite parallel-scale test_iorssf: page allocation failure

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • Lustre 2.4.0
    • None
    • None
    • client: 2.1.4
      server: lustre-master build #1346
    • 3
    • 7508

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/81256280-972d-11e2-9ec7-52540035b04c.

      The sub-test test_iorssf failed with the following error:

      ior failed! 1

      OST console shows:

      07:21:27:Lustre: DEBUG MARKER: == parallel-scale test iorssf: iorssf == 07:21:19 (1364394079)
      07:22:19:ll_ost_io00_061: page allocation failure. order:4, mode:0x50
      07:22:19:Pid: 1070, comm: ll_ost_io00_061 Not tainted 2.6.32-279.19.1.el6_lustre.x86_64 #1
      07:22:19:Call Trace:
      07:22:19: [<ffffffff811231ff>] ? __alloc_pages_nodemask+0x77f/0x940
      07:22:19: [<ffffffff8115d1a2>] ? kmem_getpages+0x62/0x170
      07:22:19: [<ffffffff8115ddba>] ? fallback_alloc+0x1ba/0x270
      07:22:19: [<ffffffff8115d80f>] ? cache_grow+0x2cf/0x320
      07:22:19: [<ffffffff8115db39>] ? ____cache_alloc_node+0x99/0x160
      07:22:19: [<ffffffffa04d8b60>] ? cfs_alloc+0x30/0x60 [libcfs]
      07:22:19: [<ffffffff8115e909>] ? __kmalloc+0x189/0x220
      07:22:19: [<ffffffffa04d8b60>] ? cfs_alloc+0x30/0x60 [libcfs]
      07:22:19: [<ffffffffa0dda4a8>] ? ost_io_thread_init+0x48/0x300 [ost]
      07:22:19: [<ffffffffa07f4c20>] ? ptlrpc_main+0xa0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff810097cc>] ? __switch_to+0x1ac/0x320
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      07:22:19:Mem-Info:
      07:22:19:Node 0 DMA per-cpu:
      07:22:19:CPU    0: hi:    0, btch:   1 usd:   0
      07:22:19:Node 0 DMA32 per-cpu:
      07:22:19:CPU    0: hi:  186, btch:  31 usd:  12
      07:22:19:active_anon:2134 inactive_anon:2382 isolated_anon:0
      07:22:19: active_file:193268 inactive_file:197527 isolated_file:0
      07:22:19: unevictable:0 dirty:8482 writeback:3 unstable:0
      07:22:19: free:20137 slab_reclaimable:14879 slab_unreclaimable:14017
      07:22:19: mapped:855 shmem:48 pagetables:772 bounce:0
      07:22:19:Node 0 DMA free:8348kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:3568kB inactive_file:3076kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15324kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:164kB slab_unreclaimable:520kB kernel_stack:56kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      07:22:19:lowmem_reserve[]: 0 2003 2003 2003
      07:22:19:Node 0 DMA32 free:71580kB min:44720kB low:55900kB high:67080kB active_anon:8536kB inactive_anon:9528kB active_file:769504kB inactive_file:787032kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2052064kB mlocked:0kB dirty:33928kB writeback:12kB mapped:3420kB shmem:192kB slab_reclaimable:59352kB slab_unreclaimable:56120kB kernel_stack:2648kB pagetables:3088kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      07:22:19:lowmem_reserve[]: 0 0 0 0
      07:22:19:Node 0 DMA: 35*4kB 32*8kB 13*16kB 12*32kB 5*64kB 1*128kB 3*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8348kB
      07:22:19:Node 0 DMA32: 13138*4kB 301*8kB 97*16kB 29*32kB 145*64kB 7*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 71456kB
      07:22:19:390833 total pagecache pages
      07:22:19:0 pages in swap cache
      07:22:19:Swap cache stats: add 0, delete 0, find 0/0
      07:22:19:Free swap  = 4128760kB
      07:22:19:Total swap = 4128760kB
      07:22:19:524284 pages RAM
      07:22:19:43608 pages reserved
      07:22:19:303898 pages shared
      07:22:19:154329 pages non-shared
      07:22:19:ll_ost_io00_071: page allocation failure. order:4, mode:0x50
      07:22:19:Pid: 1080, comm: ll_ost_io00_071 Not tainted 2.6.32-279.19.1.el6_lustre.x86_64 #1
      07:22:19:Call Trace:
      07:22:19: [<ffffffff811231ff>] ? __alloc_pages_nodemask+0x77f/0x940
      07:22:19: [<ffffffff8115d1a2>] ? kmem_getpages+0x62/0x170
      07:22:19: [<ffffffff8115ddba>] ? fallback_alloc+0x1ba/0x270
      07:22:19: [<ffffffff8115d80f>] ? cache_grow+0x2cf/0x320
      07:22:19: [<ffffffff8115db39>] ? ____cache_alloc_node+0x99/0x160
      07:22:19: [<ffffffffa04d8b60>] ? cfs_alloc+0x30/0x60 [libcfs]
      07:22:19: [<ffffffff8115e909>] ? __kmalloc+0x189/0x220
      07:22:19: [<ffffffffa04d8b60>] ? cfs_alloc+0x30/0x60 [libcfs]
      07:22:19: [<ffffffffa0dda4a8>] ? ost_io_thread_init+0x48/0x300 [ost]
      07:22:19: [<ffffffffa07f4c20>] ? ptlrpc_main+0xa0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff810097cc>] ? __switch_to+0x1ac/0x320
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffffa07f4b80>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      07:22:19: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      07:22:19:Mem-Info:
      07:22:19:Node 0 DMA per-cpu:
      07:22:19:CPU    0: hi:    0, btch:   1 usd:   0
      07:22:19:Node 0 DMA32 per-cpu:
      07:22:19:CPU    0: hi:  186, btch:  31 usd:  27
      07:22:19:active_anon:2135 inactive_anon:2382 isolated_anon:0
      07:22:19: active_file:193269 inactive_file:199242 isolated_file:0
      07:22:19: unevictable:0 dirty:12575 writeback:0 unstable:0
      07:22:19: free:18210 slab_reclaimable:14840 slab_unreclaimable:14230
      07:22:19: mapped:858 shmem:48 pagetables:772 bounce:0
      07:22:19:Node 0 DMA free:8348kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:3568kB inactive_file:3076kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15324kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:164kB slab_unreclaimable:520kB kernel_stack:56kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      07:22:19:lowmem_reserve[]: 0 2003 2003 2003
      07:22:19:Node 0 DMA32 free:64492kB min:44720kB low:55900kB high:67080kB active_anon:8540kB inactive_anon:9528kB active_file:769508kB inactive_file:793892kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2052064kB mlocked:0kB dirty:50300kB writeback:0kB mapped:3432kB shmem:192kB slab_reclaimable:59196kB slab_unreclaimable:56400kB kernel_stack:2784kB pagetables:3088kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      07:22:19:lowmem_reserve[]: 0 0 0 0
      07:22:19:Node 0 DMA: 35*4kB 32*8kB 13*16kB 12*32kB 5*64kB 1*128kB 3*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8348kB
      07:22:19:Node 0 DMA32: 10838*4kB 416*8kB 220*16kB 99*32kB 48*64kB 20*128kB 4*256kB 2*512kB 1*1024kB 1*2048kB 0*4096kB = 64120kB
      07:22:19:392558 total pagecache pages
      07:22:19:0 pages in swap cache
      07:22:19:Swap cache stats: add 0, delete 0, find 0/0
      07:22:19:Free swap  = 4128760kB
      07:22:19:Total swap = 4128760kB
      07:22:19:524284 pages RAM
      07:22:19:43608 pages reserved
      07:22:19:304954 pages shared
      07:22:19:154920 pages non-shared
      07:24:41:Lustre: 32084:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1364394272/real 1364394272]  req@ffff88003ecee400 x1430655931444276/t0(0) o105->lustre-OST0003@10.10.4.221@tcp:15/16 lens 344/224 e 0 to 1 dl 1364394279 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      07:24:41:Lustre: 32084:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
      07:24:41:LustreError: 138-a: lustre-OST0003: A client on nid 10.10.4.221@tcp was evicted due to a lock completion callback time out: rc -107
      07:24:41:LustreError: 138-a: lustre-OST0000: A client on nid 10.10.4.221@tcp was evicted due to a lock completion callback time out: rc -107
      07:24:41:LustreError: 1071:0:(ldlm_lib.c:2709:target_bulk_io()) @@@ Eviction on bulk GET  req@ffff88005afc7400 x1430655942465911/t0(0) o4->ec135bcb-dc82-1842-3f19-cd20f72141be@10.10.4.221@tcp:0/0 lens 456/448 e 0 to 0 dl 1364394313 ref 1 fl Interpret:/0/0 rc 0/0
      07:24:41:LustreError: 32035:0:(ldlm_lib.c:2709:target_bulk_io()) @@@ Eviction on bulk GET  req@ffff880049b9c000 x1430655942465913/t0(0) o4->ec135bcb-dc82-1842-3f19-cd20f72141be@10.10.4.221@tcp:0/0 lens 456/448 e 0 to 0 dl 1364394313 ref 1 fl Interpret:/0/0 rc 0/0
      07:24:41:Lustre: lustre-OST0003: Bulk IO write error with ec135bcb-dc82-1842-3f19-cd20f72141be (at 10.10.4.221@tcp), client will retry: rc -107
      07:24:41:Lustre: lustre-OST0000: Bulk IO write error with ec135bcb-dc82-1842-3f19-cd20f72141be (at 10.10.4.221@tcp), client will retry: rc -107
      07:24:41:LustreError: 32031:0:(ldlm_lib.c:2709:target_bulk_io()) @@@ Eviction on bulk GET  req@ffff880066427c00 x1430655942465919/t0(0) o4->ec135bcb-dc82-1842-3f19-cd20f72141be@10.10.4.221@tcp:0/0 lens 456/448 e 0 to 0 dl 1364394314 ref 1 fl Interpret:/0/0 rc 0/0
      07:24:41:LustreError: 32031:0:(ldlm_lib.c:2709:target_bulk_io()) Skipped 1 previous similar message
      07:24:41:Lustre: lustre-OST0003: Bulk IO write error with ec135bcb-dc82-1842-3f19-cd20f72141be (at 10.10.4.221@tcp), client will retry: rc -107
      07:24:41:Lustre: Skipped 1 previous similar message
      

      Attachments

        Issue Links

          Activity

            People

              wc-triage WC Triage
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: