Lustre / LU-12067

recovery-mds-scale test failover_mds crashes with OOM

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.7
    • Component/s: None
    • Environment: SLES client failover testing
    • Severity: 3

    Description

      recovery-mds-scale test_failover_mds crashes with OOM for SLES client failover testing.

      Looking at the suite_log for the failed test session, https://testing.whamcloud.com/test_sets/87ea5270-429d-11e9-a256-52540065bddc, we can see that one MDS failover takes place and that the client loads are then checked after the failover. The last entries in the suite_log are

      Started lustre-MDT0000
      ==== Checking the clients loads AFTER failover -- failure NOT OK
      01:14:58 (1552122898) waiting for trevis-34vm7 network 5 secs ...
      01:14:58 (1552122898) network interface is UP
      CMD: trevis-34vm7 rc=0;
      			val=\$(/usr/sbin/lctl get_param -n catastrophe 2>&1);
      			if [[ \$? -eq 0 && \$val -ne 0 ]]; then
      				echo \$(hostname -s): \$val;
      				rc=\$val;
      			fi;
      			exit \$rc
      CMD: trevis-34vm7 ps auxwww | grep -v grep | grep -q run_dd.sh
      01:14:58 (1552122898) waiting for trevis-34vm8 network 5 secs ...
      01:14:58 (1552122898) network interface is UP
      CMD: trevis-34vm8 rc=0;
      			val=\$(/usr/sbin/lctl get_param -n catastrophe 2>&1);
      			if [[ \$? -eq 0 && \$val -ne 0 ]]; then
      				echo \$(hostname -s): \$val;
      				rc=\$val;
      			fi;
      			exit \$rc
      CMD: trevis-34vm8 ps auxwww | grep -v grep | grep -q run_tar.sh
      mds1 has failed over 1 times, and counting...
      sleeping 1125 seconds... 
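
      The per-client health check quoted above can be read as the following sketch (the function name is my own; the real logic lives in Lustre's test scripts): each client is asked for `lctl get_param -n catastrophe`, and any non-zero value is reported and propagated as the return code.

```shell
# Sketch of the quoted per-client catastrophe check (function name is mine;
# the actual implementation is in Lustre's test framework). A non-zero
# "catastrophe" value means the client hit an LBUG and must be flagged.
catastrophe_rc() {
    # $1: output of "lctl get_param -n catastrophe" captured on the client
    local val=$1 rc=0
    # Only flag the node when the value is numeric and non-zero, mirroring
    # the "[[ $? -eq 0 && $val -ne 0 ]]" test in the suite_log above.
    if [ -n "$val" ] && [ "$val" -ne 0 ] 2>/dev/null; then
        echo "$(hostname -s): $val"
        rc=$val
    fi
    return "$rc"
}
```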
      

      Looking at the kernel crash, we see

      [ 1358.821727] jbd2/vda1-8 invoked oom-killer: gfp_mask=0x1420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE), nodemask=0, order=0, oom_score_adj=0
      [ 1358.821760] jbd2/vda1-8 cpuset=/ mems_allowed=0
      [ 1358.821775] CPU: 0 PID: 273 Comm: jbd2/vda1-8 Tainted: G           OE   N  4.4.162-94.69-default #1
      [ 1358.821775] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1358.821782]  0000000000000000 ffffffff8132cdc0 ffff88003733fb88 0000000000000000
      [ 1358.821784]  ffffffff8120b20e 0000000000000000 0000000000000000 0000000000000000
      [ 1358.821785]  0000000000000000 ffffffff810a1fb7 ffffffff81e9aa20 0000000000000000
      [ 1358.821786] Call Trace:
      [ 1358.821886]  [<ffffffff81019b09>] dump_trace+0x59/0x340
      [ 1358.821894]  [<ffffffff81019eda>] show_stack_log_lvl+0xea/0x170
      [ 1358.821897]  [<ffffffff8101acb1>] show_stack+0x21/0x40
      [ 1358.821921]  [<ffffffff8132cdc0>] dump_stack+0x5c/0x7c
      [ 1358.821953]  [<ffffffff8120b20e>] dump_header+0x82/0x215
      [ 1358.821981]  [<ffffffff81199d39>] check_panic_on_oom+0x29/0x50
      [ 1358.821993]  [<ffffffff81199eda>] out_of_memory+0x17a/0x4a0
      [ 1358.822000]  [<ffffffff8119e849>] __alloc_pages_nodemask+0xa19/0xb70
      [ 1358.822019]  [<ffffffff811e6caf>] alloc_pages_current+0x7f/0x100
      [ 1358.822036]  [<ffffffff81196dfd>] pagecache_get_page+0x4d/0x1c0
      [ 1358.822046]  [<ffffffff812443ce>] __getblk_slow+0xce/0x2e0
      [ 1358.822106]  [<ffffffffa01bda15>] jbd2_journal_get_descriptor_buffer+0x35/0x90 [jbd2]
      [ 1358.822127]  [<ffffffffa01b689d>] jbd2_journal_commit_transaction+0x8ed/0x1970 [jbd2]
      [ 1358.822136]  [<ffffffffa01bb3b2>] kjournald2+0xb2/0x260 [jbd2]
      [ 1358.822150]  [<ffffffff810a0e29>] kthread+0xc9/0xe0
      [ 1358.822190]  [<ffffffff8161e1f5>] ret_from_fork+0x55/0x80
      [ 1358.825553] DWARF2 unwinder stuck at ret_from_fork+0x55/0x80
      [ 1358.825554] 
      [ 1358.825558] Leftover inexact backtrace:
                     
      [ 1358.825573]  [<ffffffff810a0d60>] ? kthread_park+0x50/0x50
      [ 1358.825583] Mem-Info:
      [ 1358.825592] active_anon:1696 inactive_anon:1761 isolated_anon:0
                      active_file:219719 inactive_file:219874 isolated_file:0
                      unevictable:20 dirty:0 writeback:0 unstable:0
                      slab_reclaimable:2700 slab_unreclaimable:17655
                      mapped:5110 shmem:2179 pagetables:987 bounce:0
                      free:4229 free_pcp:48 free_cma:0
      [ 1358.825612] Node 0 DMA free:7480kB min:376kB low:468kB high:560kB active_anon:0kB inactive_anon:100kB active_file:3640kB inactive_file:3680kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:172kB shmem:100kB slab_reclaimable:32kB slab_unreclaimable:592kB kernel_stack:48kB pagetables:60kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2866048 all_unreclaimable? yes
      [ 1358.825613] lowmem_reserve[]: 0 1843 1843 1843 1843
      [ 1358.825622] Node 0 DMA32 free:9436kB min:44676kB low:55844kB high:67012kB active_anon:6784kB inactive_anon:6944kB active_file:875236kB inactive_file:875816kB unevictable:80kB isolated(anon):0kB isolated(file):0kB present:2080744kB managed:1900752kB mlocked:80kB dirty:0kB writeback:0kB mapped:20268kB shmem:8616kB slab_reclaimable:10768kB slab_unreclaimable:70028kB kernel_stack:2656kB pagetables:3888kB unstable:0kB bounce:0kB free_pcp:192kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:17166780 all_unreclaimable? yes
      [ 1358.825624] lowmem_reserve[]: 0 0 0 0 0
      [ 1358.825639] Node 0 DMA: 2*4kB (E) 4*8kB (ME) 1*16kB (U) 2*32kB (UM) 3*64kB (UME) 4*128kB (UME) 2*256kB (UE) 2*512kB (ME) 1*1024kB (E) 2*2048kB (ME) 0*4096kB = 7480kB
      [ 1358.825645] Node 0 DMA32: 481*4kB (UME) 239*8kB (UME) 78*16kB (ME) 26*32kB (ME) 11*64kB (ME) 22*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9436kB
      [ 1358.825659] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
      [ 1358.825676] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
      [ 1358.825677] 7018 total pagecache pages
      [ 1358.825678] 27 pages in swap cache
      [ 1358.825682] Swap cache stats: add 6097, delete 6070, find 51/94
      [ 1358.825683] Free swap  = 14314056kB
      [ 1358.825683] Total swap = 14338044kB
      [ 1358.825684] 524184 pages RAM
      [ 1358.825684] 0 pages HighMem/MovableOnly
      [ 1358.825685] 45020 pages reserved
      [ 1358.825685] 0 pages hwpoisoned
      [ 1358.825685] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
      [ 1358.825896] [  349]     0   349    10933     1003      22       3     1114             0 systemd-journal
      [ 1358.825902] [  413]   495   413    13124      890      29       4      111             0 rpcbind
      [ 1358.825913] [  415]     0   415     9267      660      20       3      214         -1000 systemd-udevd
      [ 1358.825919] [  462]     0   462     4814      455      14       3       38             0 irqbalance
      [ 1358.825921] [  464]     0   464    29706     1245      59       4      197             0 sssd
      [ 1358.825934] [  476]   499   476    13452      785      28       3      150          -900 dbus-daemon
      [ 1358.825943] [  535]     0   535     7447     1016      19       3      261             0 wickedd-dhcp6
      [ 1358.825945] [  554]     0   554    36530     1435      70       3      264             0 sssd_be
      [ 1358.825953] [  563]     0   563     7448     1054      20       3      265             0 wickedd-dhcp4
      [ 1358.825962] [  564]     0   564     7448     1018      20       3      261             0 wickedd-auto4
      [ 1358.825970] [  565]     0   565    84317      749      37       4      259             0 rsyslogd
      [ 1358.825972] [  571]     0   571    31711     1112      65       3      175             0 sssd_nss
      [ 1358.825974] [  572]     0   572    26059     1058      55       3      169             0 sssd_pam
      [ 1358.825980] [  573]     0   573    24977     1041      51       3      161             0 sssd_ssh
      [ 1358.826054] [  761]     0   761     7480     1030      18       3      299             0 wickedd
      [ 1358.826060] [  764]     0   764     7455      974      21       3      276             0 wickedd-nanny
      [ 1358.826069] [ 1418]     0  1418     2141      455       9       3       26             0 xinetd
      [ 1358.826077] [ 1464]     0  1464    16586     1551      37       3      181         -1000 sshd
      [ 1358.826079] [ 1477]    74  1477     8408      842      18       3      131             0 ntpd
      [ 1358.826094] [ 1490]    74  1490     9461      497      21       3      150             0 ntpd
      [ 1358.826099] [ 1511]   493  1511    55352      609      21       3      149             0 munged
      [ 1358.826107] [ 1527]     0  1527     1664      365       8       3       29             0 agetty
      [ 1358.826115] [ 1529]     0  1529     1664      407       9       3       29             0 agetty
      [ 1358.826117] [ 1536]     0  1536   147212     1080      60       3      346             0 automount
      [ 1358.826125] [ 1611]     0  1611     5513      494      16       3       65             0 systemd-logind
      [ 1358.826127] [ 1828]     0  1828     8861      812      20       3       98             0 master
      [ 1358.826130] [ 1853]    51  1853    12439     1000      24       3      106             0 pickup
      [ 1358.826135] [ 1854]    51  1854    12529     1309      25       3      173             0 qmgr
      [ 1358.826143] [ 1883]     0  1883     5197      531      17       3      144             0 cron
      [ 1358.826250] [15652]     0 15652    17465      844      35       4        0             0 in.mrshd
      [ 1358.826258] [15653]     0 15653     2894      653      11       3        0             0 bash
      [ 1358.826266] [15658]     0 15658     2894      492      11       3        0             0 bash
      [ 1358.826274] [15659]     0 15659     3034      755      10       3        0             0 run_dd.sh
      [ 1358.826313] [16412]    51 16412    16918     1924      32       3        0             0 smtp
      [ 1358.826314] [16422]     0 16422     1062      198       8       3        0             0 dd
      [ 1358.826316] [16425]    51 16425    12447     1130      23       3        0             0 bounce
      [ 1358.826322] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
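
      A quick back-of-the-envelope on the Mem-Info numbers above (4 KiB pages) shows where the memory went: nearly all usable RAM is sitting in the page cache, yet both zones report all_unreclaimable? yes, i.e. the kernel could not free any of it while jbd2 was allocating with GFP_NOFS.

```shell
# Arithmetic on the figures reported in the Mem-Info block above:
active_file=219719          # pages, from "active_file:" in Mem-Info
inactive_file=219874        # pages, from "inactive_file:" in Mem-Info
total_pages=524184          # "524184 pages RAM"
reserved_pages=45020        # "45020 pages reserved"
cache_mib=$(( (active_file + inactive_file) * 4 / 1024 ))
usable_mib=$(( (total_pages - reserved_pages) * 4 / 1024 ))
echo "page cache: ${cache_mib} MiB of ${usable_mib} MiB usable"
```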
      

      We have seen several OOM kernel crashes for SLES in past testing; see, for example, LU-10319 and LU-9601.
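
      Note that the panic itself, "system-wide panic_on_oom is enabled", means these nodes are configured to panic on any OOM event instead of letting the OOM killer pick a single victim. That behavior is controlled by the standard vm.panic_on_oom sysctl:

```shell
# vm.panic_on_oom is the kernel knob behind "system-wide panic_on_oom is
# enabled": 0 = OOM-kill the worst task, non-zero = panic the whole node
# (as happened in these crashes).
cat /proc/sys/vm/panic_on_oom
# For debugging, the panic could be relaxed (needs root):
#   sysctl -w vm.panic_on_oom=0
```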

          Activity

            We have seen many kernel crashes due to OOM in recovery-mds-scale test_failover_mds. Here's another one, https://testing.whamcloud.com/test_sets/50fbe26e-ea6e-11e9-be86-52540065bddc, with the crash info:

            [ 1520.960032] LustreError: 13429:0:(client.c:2020:ptlrpc_check_set()) @@@ bulk transfer failed  req@ffff88002e249b40 x1646846100197008/t4294968624(4294968624) o4->lustre-OST0003-osc-ffff88007b6d0800@10.9.6.23@tcp:6/4 lens 488/416 e 0 to 1 dl 1570555498 ref 3 fl Bulk:ReX/4/0 rc 0/0
            [ 1520.960038] LustreError: 13429:0:(osc_request.c:1924:osc_brw_redo_request()) @@@ redo for recoverable error -5  req@ffff88002e249b40 x1646846100197008/t4294968624(4294968624) o4->lustre-OST0003-osc-ffff88007b6d0800@10.9.6.23@tcp:6/4 lens 488/416 e 0 to 1 dl 1570555498 ref 3 fl Interpret:ReX/4/0 rc -5/0
            [ 1525.630849] irqbalance invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=0, order=0, oom_score_adj=0
            [ 1525.630862] irqbalance cpuset=/ mems_allowed=0
            [ 1525.630869] CPU: 0 PID: 506 Comm: irqbalance Tainted: G           OE   N  4.4.180-94.97-default #1
            [ 1525.630869] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [ 1525.630872]  0000000000000000 ffffffff813303b0 ffff88007a18fd68 0000000000000000
            [ 1525.630873]  ffffffff8120d66e 0000000000000000 0000000000000000 0000000000000000
            [ 1525.630875]  0000000000000000 ffffffff810a2ad7 ffffffff81e9aae0 0000000000000000
            [ 1525.630875] Call Trace:
            [ 1525.630985]  [<ffffffff81019b39>] dump_trace+0x59/0x340
            [ 1525.630993]  [<ffffffff81019f0a>] show_stack_log_lvl+0xea/0x170
            [ 1525.630995]  [<ffffffff8101ace1>] show_stack+0x21/0x40
            [ 1525.631011]  [<ffffffff813303b0>] dump_stack+0x5c/0x7c
            [ 1525.631044]  [<ffffffff8120d66e>] dump_header+0x82/0x215
            [ 1525.631070]  [<ffffffff8119bb79>] check_panic_on_oom+0x29/0x50
            [ 1525.631082]  [<ffffffff8119bd1a>] out_of_memory+0x17a/0x4a0
            [ 1525.631087]  [<ffffffff811a0758>] __alloc_pages_nodemask+0xaf8/0xb70
            [ 1525.631098]  [<ffffffff811eff5d>] kmem_getpages+0x4d/0xf0
            [ 1525.631108]  [<ffffffff811f1b6b>] fallback_alloc+0x19b/0x240
            [ 1525.631110]  [<ffffffff811f33d0>] kmem_cache_alloc+0x240/0x470
            [ 1525.631124]  [<ffffffff812207ec>] getname_flags+0x4c/0x1f0
            [ 1525.631131]  [<ffffffff8121114e>] do_sys_open+0xfe/0x200
            [ 1525.631157]  [<ffffffff81627361>] entry_SYSCALL_64_fastpath+0x20/0xe9
            [ 1525.633784] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x20/0xe9
            [ 1525.633784] 
            [ 1525.633785] Leftover inexact backtrace:
                           
            [ 1525.633808] Mem-Info:
            [ 1525.633816] active_anon:15 inactive_anon:0 isolated_anon:0
                            active_file:161804 inactive_file:264038 isolated_file:64
                            unevictable:20 dirty:0 writeback:46780 unstable:0
                            slab_reclaimable:2735 slab_unreclaimable:26502
                            mapped:985 shmem:1 pagetables:960 bounce:0
                            free:10229 free_pcp:23 free_cma:0
            [ 1525.633825] Node 0 DMA free:7568kB min:376kB low:468kB high:560kB active_anon:0kB inactive_anon:8kB active_file:2144kB inactive_file:5256kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:500kB mapped:176kB shmem:4kB slab_reclaimable:52kB slab_unreclaimable:412kB kernel_stack:16kB pagetables:16kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:44808 all_unreclaimable? yes
            [ 1525.633827] lowmem_reserve[]: 0 1843 1843 1843 1843
            [ 1525.633835] Node 0 DMA32 free:33348kB min:44676kB low:55844kB high:67012kB active_anon:60kB inactive_anon:0kB active_file:645072kB inactive_file:1050812kB unevictable:80kB isolated(anon):0kB isolated(file):384kB present:2080744kB managed:1900776kB mlocked:80kB dirty:0kB writeback:186620kB mapped:3764kB shmem:0kB slab_reclaimable:10888kB slab_unreclaimable:105596kB kernel_stack:2608kB pagetables:3824kB unstable:0kB bounce:0kB free_pcp:92kB local_pcp:68kB free_cma:0kB writeback_tmp:0kB pages_scanned:10890568 all_unreclaimable? yes
            [ 1525.633836] lowmem_reserve[]: 0 0 0 0 0
            [ 1525.633843] Node 0 DMA: 12*4kB (UE) 8*8kB (UE) 8*16kB (UME) 1*32kB (M) 4*64kB (UM) 1*128kB (E) 1*256kB (E) 3*512kB (UME) 1*1024kB (E) 2*2048kB (ME) 0*4096kB = 7568kB
            [ 1525.633849] Node 0 DMA32: 1231*4kB (UME) 259*8kB (ME) 79*16kB (UME) 66*32kB (UME) 31*64kB (UME) 46*128kB (UM) 25*256kB (UM) 17*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 33348kB
            [ 1525.633867] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
            [ 1525.633884] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
            [ 1525.633884] 49223 total pagecache pages
            [ 1525.633886] 1 pages in swap cache
            [ 1525.633886] Swap cache stats: add 9520, delete 9519, find 197/379
            [ 1525.633887] Free swap  = 14301256kB
            [ 1525.633887] Total swap = 14338044kB
            [ 1525.633888] 524184 pages RAM
            [ 1525.633888] 0 pages HighMem/MovableOnly
            [ 1525.633888] 45014 pages reserved
            [ 1525.633888] 0 pages hwpoisoned
            [ 1525.633889] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
            [ 1525.634095] [  358]     0   358    10933      321      24       3     1114             0 systemd-journal
            [ 1525.634104] [  395]     0   395     9229      325      20       3      174         -1000 systemd-udevd
            [ 1525.634105] [  397]   495   397    13126      311      31       3      117             0 rpcbind
            [ 1525.634117] [  467]   499   467    12922      268      27       3      150          -900 dbus-daemon
            [ 1525.634129] [  484]     0   484    10883      336      26       3      290             0 wickedd-dhcp4
            [ 1525.634131] [  504]     0   504    29175      327      59       3      236             0 sssd
            [ 1525.634137] [  505]     0   505    10882      305      25       3      286             0 wickedd-auto4
            [ 1525.634138] [  506]     0   506     4815      317      15       3       58             0 irqbalance
            [ 1525.634150] [  507]     0   507    10883      351      26       3      287             0 wickedd-dhcp6
            [ 1525.634155] [  547]     0   547    36535      404      69       4      302             0 sssd_be
            [ 1525.634161] [  567]     0   567    83783      339      37       3      269             0 rsyslogd
            [ 1525.634163] [  582]     0   582    31181      375      62       3      207             0 sssd_nss
            [ 1525.634168] [  583]     0   583    25530      245      53       3      202             0 sssd_pam
            [ 1525.634170] [  584]     0   584    24446      254      52       3      201             0 sssd_ssh
            [ 1525.634267] [  768]     0   768    10913      360      27       3      331             0 wickedd
            [ 1525.634275] [  774]     0   774    10889      290      27       3      292             0 wickedd-nanny
            [ 1525.634283] [ 1430]     0  1430     2142      296      10       3       40             0 xinetd
            [ 1525.634295] [ 1469]     0  1469    16594      318      35       3      180         -1000 sshd
            [ 1525.634297] [ 1472]    74  1472     8412      342      17       3      152             0 ntpd
            [ 1525.634304] [ 1482]    74  1482     9465      306      18       3      153             0 ntpd
            [ 1525.634313] [ 1512]   490  1512    55357        0      21       3      264             0 munged
            [ 1525.634325] [ 1530]     0  1530     1665      286       8       3       29             0 agetty
            [ 1525.634330] [ 1532]     0  1532   147216      292      62       3      374             0 automount
            [ 1525.634338] [ 1533]     0  1533     1665      295       9       3       30             0 agetty
            [ 1525.634340] [ 1601]     0  1601     5516      339      17       3       79             0 systemd-logind
            [ 1525.634342] [ 1857]     0  1857     8864      332      20       3      121             0 master
            [ 1525.634344] [ 1869]    51  1869    12442      289      24       3      129             0 pickup
            [ 1525.634352] [ 1870]    51  1870    12539      290      26       3      166             0 qmgr
            [ 1525.634354] [ 1915]     0  1915     5202      313      15       3      167             0 cron
            [ 1525.634417] [15802]     0 15802    17469      305      35       3      175             0 in.mrshd
            [ 1525.634425] [15803]     0 15803     2895      361      10       3       78             0 bash
            [ 1525.634433] [15808]     0 15808     2895      278      10       3       79             0 bash
            [ 1525.634435] [15809]     0 15809     3035      365      12       3      214             0 run_dd.sh
            [ 1525.634449] [16592]     0 16592     1063      203       7       3       21             0 dd
            [ 1525.634450] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
            
            jamesanunez James Nunez (Inactive) added a comment
            jamesanunez James Nunez (Inactive) added a comment - - edited

            Moved comment to LU-11410


            People

              Assignee: wc-triage WC Triage
              Reporter: jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 3