Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-12241

recovery-random-scale test fail_client_mds fails with ‘ptlrpcd_00_00: page allocation stalls’

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.12.1
    • 3
    • 9223372036854775807

    Description

      recovery-random-scale test_fail_client_mds fails with 'test_fail_client_mds returned 4'. So far, we are only seeing this for SLES12 SP4 failover testing.

      Looking the suite_log for a recent failure with logs at https://testing.whamcloud.com/test_sets/b92c01d0-6692-11e9-8bb1-52540065bddc , we see there is a problem with one of the clients

      Client load failed on node trevis-35vm3, rc=1
      2019-04-23 18:40:10 Terminating clients loads ...
      Duration:               86400
      Server failover period: 1200 seconds
      Exited after:           2703 seconds
      Number of failovers before exit:
      mds1 failed over 3 times
      Status: FAIL: rc=4
       

      Looking at the client (vm3) console log, we see

      [   98.357170] Lustre: DEBUG MARKER: ==== Checking the clients loads AFTER failed client reintegrated -- failure NOT OK
      [   98.717424] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Number of failovers:
      [   98.717424] mds1 failed over 3 times                and counting...
      [   98.924613] Lustre: DEBUG MARKER: Number of failovers:
      [  704.539916] ptlrpcd_00_00: page allocation stalls for 10832ms, order:2, mode:0x1604040(GFP_NOFS|__GFP_COMP|__GFP_NOTRACK), nodemask=(null)
      [  704.542403] ptlrpcd_00_00 cpuset=/ mems_allowed=0
      [  704.543320] CPU: 0 PID: 1542 Comm: ptlrpcd_00_00 Tainted: G           OE      4.12.14-95.13-default #1 SLE12-SP4
      [  704.545126] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  704.546204] Call Trace:
      [  704.546821]  dump_stack+0x5a/0x75
      [  704.547537]  warn_alloc+0xf0/0x190
      [  704.548281]  __alloc_pages_slowpath+0x3a7/0xa0d
      [  704.549166]  __alloc_pages_nodemask+0x1e9/0x210
      [  704.550050]  cache_grow_begin+0x85/0x560
      [  704.550848]  fallback_alloc+0x167/0x1f0
      [  704.551905]  ? ptlrpc_new_bulk+0x442/0x850 [ptlrpc]
      [  704.552848]  __kmalloc+0x1b4/0x210
      [  704.553588]  ptlrpc_new_bulk+0x442/0x850 [ptlrpc]
      [  704.554536]  ptlrpc_prep_bulk_imp+0x58/0x190 [ptlrpc]
      [  704.555599]  osc_brw_prep_request+0x2de/0x1360 [osc]
      [  704.556757]  ? class_handle2object+0x90/0x170 [obdclass]
      [  704.557776]  ? __ldlm_handle2lock+0x36/0x3a0 [ptlrpc]
      [  704.558748]  ? osc_req_attr_set+0x10b/0x5b0 [osc]
      [  704.559662]  osc_build_rpc+0x4e7/0x1030 [osc]
      [  704.560552]  osc_send_write_rpc+0x3f2/0x9a0 [osc]
      [  704.561488]  ? __wake_up_common_lock+0x77/0x90
      [  704.562481]  ? vvp_page_completion_write+0xff/0x350 [lustre]
      [  704.563617]  ? cfs_hash_bd_from_key+0x31/0xa0 [libcfs]
      [  704.564597]  osc_io_unplug0+0x381/0x12b0 [osc]
      [  704.565466]  ? osc_extent_finish+0x657/0xba0 [osc]
      [  704.566400]  ? osc_brw_fini_request+0x72f/0x1790 [osc]
      [  704.567403]  ? at_measured+0x18f/0x310 [ptlrpc]
      [  704.568303]  ? __switch_to_asm+0x40/0x70
      [  704.569081]  brw_interpret+0x2e5/0xe20 [osc]
      [  704.569936]  ? finish_task_switch+0x76/0x260
      [  704.570788]  ptlrpc_check_set.part.25+0x459/0x1da0 [ptlrpc]
      [  704.571860]  ptlrpcd_check+0x3c1/0x560 [ptlrpc]
      [  704.572760]  ? call_timer_fn+0x140/0x140
      [  704.573558]  ptlrpcd+0x23d/0x4a0 [ptlrpc]
      [  704.574360]  ? wake_up_q+0x70/0x70
      [  704.575066]  kthread+0xff/0x140
      [  704.575748]  ? ptlrpcd_check+0x560/0x560 [ptlrpc]
      [  704.576672]  ? __kthread_parkme+0x70/0x70
      [  704.577461]  ret_from_fork+0x35/0x40
      [  704.578228] Mem-Info:
      [  704.578761] active_anon:0 inactive_anon:32 isolated_anon:0
      [  704.578761]  active_file:88761 inactive_file:339637 isolated_file:320
      [  704.578761]  unevictable:20 dirty:2961 writeback:64814 unstable:0
      [  704.578761]  slab_reclaimable:3423 slab_unreclaimable:16689
      [  704.578761]  mapped:6882 shmem:0 pagetables:945 bounce:0
      [  704.578761]  free:12999 free_pcp:302 free_cma:0
      [  704.584496] Node 0 active_anon:0kB inactive_anon:128kB active_file:355044kB inactive_file:1358632kB unevictable:80kB isolated(anon):0kB isolated(file):1152kB mapped:27528kB dirty:11844kB writeback:259256kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
      [  704.589280] Node 0 DMA free:7660kB min:380kB low:472kB high:564kB active_anon:0kB inactive_anon:0kB active_file:1776kB inactive_file:6164kB unevictable:0kB writepending:1776kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:4kB slab_unreclaimable:248kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      [  704.594509] lowmem_reserve[]: 0 1822 1822 1822 1822
      [  704.595458] Node 0 DMA32 free:44336kB min:44672kB low:55840kB high:67008kB active_anon:180kB inactive_anon:160kB active_file:353268kB inactive_file:1352868kB unevictable:80kB writepending:269200kB present:2080744kB managed:1885860kB mlocked:80kB slab_reclaimable:13688kB slab_unreclaimable:66508kB kernel_stack:2160kB pagetables:3780kB bounce:0kB free_pcp:1208kB local_pcp:756kB free_cma:0kB
      [  704.601265] lowmem_reserve[]: 0 0 0 0 0
      [  704.602025] Node 0 DMA: 3*4kB (UME) 2*8kB (UE) 1*16kB (E) 2*32kB (UE) 4*64kB (UME) 5*128kB (UME) 2*256kB (UE) 2*512kB (UE) 1*1024kB (E) 2*2048kB (ME) 0*4096kB = 7660kB
      [  704.604659] Node 0 DMA32: 256*4kB (ME) 64*8kB (UME) 261*16kB (UME) 452*32kB (UME) 238*64kB (UM) 70*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44368kB
      [  704.607241] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      [  704.608828] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      [  704.610360] 76961 total pagecache pages
      [  704.611146] 38 pages in swap cache
      [  704.611847] Swap cache stats: add 12242, delete 12204, find 1904/2998
      [  704.613042] Free swap  = 14298108kB
      [  704.613748] Total swap = 14338044kB
      [  704.614463] 524184 pages RAM
      [  704.615061] 0 pages HighMem/MovableOnly
      [  704.615852] 48742 pages reserved
      [  704.616515] 0 pages hwpoisoned
      [  706.881305] kthreadd: page allocation stalls for 10420ms, order:2, mode:0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=(null)
      [  706.891587] kthreadd cpuset=/ mems_allowed=0
      [  706.892439] CPU: 0 PID: 2 Comm: kthreadd Tainted: G           OE      4.12.14-95.13-default #1 SLE12-SP4
      [  706.894135] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  706.895203] Call Trace:
      [  706.895737]  dump_stack+0x5a/0x75
      [  706.896415]  warn_alloc+0xf0/0x190
      [  706.897122]  __alloc_pages_slowpath+0x3a7/0xa0d
      [  706.897989]  __alloc_pages_nodemask+0x1e9/0x210
      [  706.898862]  cache_grow_begin+0x85/0x560
      [  706.899632]  fallback_alloc+0x167/0x1f0
      [  706.900386]  kmem_cache_alloc_node+0x84/0x1f0
      [  706.901229]  ? __switch_to_asm+0x40/0x70
      [  706.902006]  copy_process.part.36+0xef/0x1d00
      [  706.902864]  ? __switch_to_asm+0x40/0x70
      [  706.903642]  ? __switch_to_asm+0x34/0x70
      [  706.904414]  ? __switch_to_asm+0x40/0x70
      [  706.905182]  ? __switch_to_asm+0x34/0x70
      [  706.905940]  ? __switch_to_asm+0x40/0x70
      [  706.906712]  ? __switch_to_asm+0x34/0x70
      [  706.907483]  ? __switch_to_asm+0x40/0x70
      [  706.908284]  ? __switch_to_asm+0x34/0x70
      [  706.909055]  ? __switch_to_asm+0x40/0x70
      [  706.909833]  ? __switch_to_asm+0x34/0x70
      [  706.910604]  ? __switch_to_asm+0x40/0x70
      [  706.911375]  ? __switch_to_asm+0x34/0x70
      [  706.912141]  ? __switch_to_asm+0x40/0x70
      [  706.912910]  ? __switch_to_asm+0x40/0x70
      [  706.913692]  ? __kthread_parkme+0x70/0x70
      [  706.914485]  ? __switch_to+0x7c/0x4a0
      [  706.915215]  _do_fork+0xdd/0x360
      [  706.915872]  kernel_thread+0x25/0x30
      [  706.916621]  kthreadd+0x2d4/0x320
      [  706.917301]  ? kthread_create_on_cpu+0x80/0x80
      [  706.918178]  ret_from_fork+0x35/0x40
      [  706.918898] Mem-Info:
      [  706.919416] active_anon:4 inactive_anon:14 isolated_anon:0
      [  706.919416]  active_file:88761 inactive_file:339626 isolated_file:352
      [  706.919416]  unevictable:20 dirty:2961 writeback:64785 unstable:0
      [  706.919416]  slab_reclaimable:3417 slab_unreclaimable:16684
      [  706.919416]  mapped:6882 shmem:0 pagetables:945 bounce:0
      [  706.919416]  free:13003 free_pcp:330 free_cma:0
      [  706.925059] Node 0 active_anon:16kB inactive_anon:56kB active_file:355044kB inactive_file:1358504kB unevictable:80kB isolated(anon):0kB isolated(file):1280kB mapped:27528kB dirty:11844kB writeback:259140kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
      [  706.929835] Node 0 DMA free:7660kB min:380kB low:472kB high:564kB active_anon:0kB inactive_anon:0kB active_file:1776kB inactive_file:6168kB unevictable:0kB writepending:1776kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:4kB slab_unreclaimable:248kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      [  706.935049] lowmem_reserve[]: 0 1822 1822 1822 1822
      [  706.936001] Node 0 DMA32 free:44352kB min:44672kB low:55840kB high:67008kB active_anon:0kB inactive_anon:0kB active_file:353268kB inactive_file:1352364kB unevictable:80kB writepending:269208kB present:2080744kB managed:1885860kB mlocked:80kB slab_reclaimable:13664kB slab_unreclaimable:66488kB kernel_stack:2160kB pagetables:3780kB bounce:0kB free_pcp:1320kB local_pcp:756kB free_cma:0kB
      [  706.941791] lowmem_reserve[]: 0 0 0 0 0
      [  706.942572] Node 0 DMA: 3*4kB (UME) 2*8kB (UE) 1*16kB (E) 2*32kB (UE) 4*64kB (UME) 5*128kB (UME) 2*256kB (UE) 2*512kB (UE) 1*1024kB (E) 2*2048kB (ME) 0*4096kB = 7660kB
      [  706.945182] Node 0 DMA32: 256*4kB (ME) 66*8kB (UME) 261*16kB (UME) 451*32kB (UME) 238*64kB (UM) 70*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44352kB
      [  706.947715] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      [  706.949283] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      [  706.950803] 76940 total pagecache pages
      [  706.951566] 17 pages in swap cache
      [  706.952253] Swap cache stats: add 12242, delete 12225, find 1904/2999
      [  706.953429] Free swap  = 14298108kB
      [  706.954132] Total swap = 14338044kB
      [  706.954838] 524184 pages RAM
      [  706.955444] 0 pages HighMem/MovableOnly
      [  706.956199] 48742 pages reserved
      [  706.956855] 0 pages hwpoisoned
      [  707.797746] sssd_pam: page allocation stalls for 11044ms, order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
      [  707.799917] sssd_pam cpuset=/ mems_allowed=0
      [  707.800766] CPU: 1 PID: 548 Comm: sssd_pam Tainted: G           OE      4.12.14-95.13-default #1 SLE12-SP4
      [  707.802495] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  707.803563] Call Trace:
      [  707.804110]  dump_stack+0x5a/0x75
      [  707.804778]  warn_alloc+0xf0/0x190
      [  707.805476]  __alloc_pages_slowpath+0x3a7/0xa0d
      [  707.806354]  ? __switch_to_asm+0x34/0x70
      [  707.807126]  ? __switch_to_asm+0x40/0x70
      [  707.807894]  ? __switch_to_asm+0x34/0x70
      [  707.808663]  ? __switch_to_asm+0x34/0x70
      [  707.809438]  ? __switch_to_asm+0x40/0x70
      [  707.810210]  __alloc_pages_nodemask+0x1e9/0x210
      [  707.811088]  alloc_pages_vma+0x92/0x200
      [  707.811865]  __read_swap_cache_async+0x140/0x210
      [  707.812767]  read_swap_cache_async+0x14/0x30
      [  707.813611]  swapin_readahead+0x107/0x1f0
      [  707.814405]  do_swap_page+0x2b8/0x8b0
      [  707.815149]  __handle_mm_fault+0x783/0xef0
      [  707.815954]  handle_mm_fault+0xc4/0x1d0
      [  707.816725]  __do_page_fault+0x1f3/0x4c0
      [  707.817535]  trace_do_page_fault+0x40/0x120
      [  707.818367]  ? async_page_fault+0x2f/0x50
      [  707.819169]  async_page_fault+0x45/0x50
      [  707.819937] RIP: 0000:0x7ffd2ad725d8
      [  707.820663] RSP: 2328:000055cbf1d95940 EFLAGS: 000f4237
      [  707.841147] dd: page allocation stalls for 11092ms, order:0, mode:0x142004a(GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=(null)
      [  707.844535] dd cpuset=/ mems_allowed=0
      [  707.845295] CPU: 1 PID: 2087 Comm: dd Tainted: G           OE      4.12.14-95.13-default #1 SLE12-SP4
      [  707.846957] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  707.848047] Call Trace:
      [  707.848578]  dump_stack+0x5a/0x75
      [  707.849257]  warn_alloc+0xf0/0x190
      [  707.849949]  __alloc_pages_slowpath+0x3a7/0xa0d
      [  707.850826]  __alloc_pages_nodemask+0x1e9/0x210
      [  707.851701]  alloc_pages_current+0x70/0xf0
      [  707.852526]  pagecache_get_page+0x4d/0x250
      [  707.853476]  ll_write_begin+0xdf/0xd20 [lustre]
      [  707.854373]  generic_perform_write+0xb2/0x190
      [  707.855240]  ? file_update_time+0x3a/0xf0
      [  707.856030]  __generic_file_write_iter+0x184/0x1c0
      [  707.857019]  ? lov_object_maxbytes+0x31/0x40 [lov]
      [  707.857958]  vvp_io_write_start+0x46a/0x760 [lustre]
      [  707.859068]  ? cl_lock_request+0x62/0x1d0 [obdclass]
      [  707.860035]  cl_io_start+0x58/0x110 [obdclass]
      [  707.860941]  cl_io_loop+0xb6/0x190 [obdclass]
      [  707.861815]  ll_file_io_generic+0x5c9/0xb50 [lustre]
      [  707.862808]  ll_file_write_iter+0xe5/0x410 [lustre]
      [  707.863759]  __vfs_write+0xdc/0x150
      [  707.864481]  vfs_write+0xad/0x1a0
      [  707.865166]  SyS_write+0x42/0x90
      [  707.865841]  do_syscall_64+0x74/0x150
      [  707.866600]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      [  707.867567] RIP: 0033:0x7f2e8e44e3d0
      [  707.868291] RSP: 002b:00007ffee5ec7348 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [  707.869673] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2e8e44e3d0
      [  707.871022] RDX: 0000000000001000 RSI: 0000000000b39000 RDI: 0000000000000001
      [  707.872341] RBP: 0000000000001000 R08: 0000000000000003 R09: 00007f2e8e70d698
      [  707.873633] R10: 0000000010a8b550 R11: 0000000000000246 R12: 0000000000b39000
      [  707.874925] R13: 0000000000000000 R14: 0000000000b39000 R15: 00007ffee5ec7abb
      [  707.909648] master: page allocation stalls for 11148ms, order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
      [  707.911669] master cpuset=/ mems_allowed=0
      [  707.912483] CPU: 1 PID: 1447 Comm: master Tainted: G           OE      4.12.14-95.13-default #1 SLE12-SP4
      [  707.914201] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  707.915276] Call Trace:
      [  707.915811]  dump_stack+0x5a/0x75
      [  707.916488]  warn_alloc+0xf0/0x190
      [  707.917185]  __alloc_pages_slowpath+0x3a7/0xa0d
      …
      

      Logs for another failure are at
      https://testing.whamcloud.com/test_sets/2edf47d2-6267-11e9-8bb1-52540065bddc

      Attachments

        Issue Links

          Activity

            [LU-12241] recovery-random-scale test fail_client_mds fails with ‘ptlrpcd_00_00: page allocation stalls’
            mdiep Minh Diep added a comment -

            this is hit on non Suse too

            mdiep Minh Diep added a comment - this is hit on non Suse too

            This looks like it is an allocation failure at:

            struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned nfrags, unsigned max_brw,
                                                     enum ptlrpc_bulk_op_type type,
                                                     unsigned portal,
                                                     const struct ptlrpc_bulk_frag_ops *ops)
            {
                    struct ptlrpc_bulk_desc *desc;
            
                    OBD_ALLOC_PTR(desc);
            

            so it is only allocating 280 bytes. The order-2 allocation is likely because the size-512 slab is 16KB for efficiency (assuming some extra slab debugging is enabled). That implies that this node is totally out of memory.

            My thought is that this is another case of the client pagecache is not responsive to being freed under pressure. The memory dump shows inactive_file:339637, so the Lustre page cache is consuming 1326MB of 2048MB RAM on the client, about 65%. At this point, the remote command to check the status of the client load could not be run on the client and it considered the test a failure, even though the client may not actually have seen a failure.

            adilger Andreas Dilger added a comment - This looks like it is an allocation failure at: struct ptlrpc_bulk_desc *ptlrpc_new_bulk(unsigned nfrags, unsigned max_brw, enum ptlrpc_bulk_op_type type, unsigned portal, const struct ptlrpc_bulk_frag_ops *ops) { struct ptlrpc_bulk_desc *desc; OBD_ALLOC_PTR(desc); so it is only allocating 280 bytes. The order-2 allocation is likely because the size-512 slab is 16KB for efficiency (assuming some extra slab debugging is enabled). That implies that this node is totally out of memory. My thought is that this is another case of the client pagecache is not responsive to being freed under pressure. The memory dump shows inactive_ file:339637 , so the Lustre page cache is consuming 1326MB of 2048MB RAM on the client, about 65%. At this point, the remote command to check the status of the client load could not be run on the client and it considered the test a failure, even though the client may not actually have seen a failure.

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated: