Lustre / LU-7372

replay-dual test_26: test failed to respond and timed out

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Affects Versions: Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5, Lustre 2.12.4
    • Environment: Server/Client: master, build # 3225, RHEL 6.7

    Description

      This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/1e79d2a6-7d21-11e5-a254-5254006e85c2.

      The sub-test test_26 failed with the following error:

      test failed to respond and timed out
      

      Client dmesg:

      Lustre: DEBUG MARKER: test_26 fail mds1 1 times
      LustreError: 980:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1445937610, 300s ago), entering recovery for MGS@10.2.4.140@tcp ns: MGC10.2.4.140@tcp lock: ffff88007bdd82c0/0x956ab2c8047544d6 lrc: 4/1,0 mode: --/CR res: [0x65727473756c:0x2:0x0].0x0 rrc: 1 type: PLN flags: 0x1000000000000 nid: local remote: 0x223a79061b204538 expref: -99 pid: 980 timeout: 0 lvb_type: 0
      Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1445937910/real 1445937910]  req@ffff880028347980 x1516173751413108/t0(0) o250->MGC10.2.4.140@tcp@10.2.4.140@tcp:26/25 lens 520/544 e 0 to 1 dl 1445937916 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) Skipped 67 previous similar messages
      
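The client-side timeout above is self-describing: the ldlm line records both the enqueue time (a Unix timestamp) and the age of the lock when the wait expired. When triaging many such logs, a small parser for these lines saves manual epoch conversion. A minimal sketch — the helper name and regex are mine, not part of any Lustre tooling:

```python
import re
from datetime import datetime, timezone

# Hypothetical helper: pull the enqueue timestamp and lock age out of an
# ldlm_expired_completion_wait() console line.
LOCK_TIMEOUT_RE = re.compile(r"lock timed out \(enqueued at (\d+), (\d+)s ago\)")

def parse_lock_timeout(line):
    """Return (enqueue_time_utc, age_seconds), or None if the line doesn't match."""
    m = LOCK_TIMEOUT_RE.search(line)
    if m is None:
        return None
    enqueued, age = int(m.group(1)), int(m.group(2))
    return datetime.fromtimestamp(enqueued, tz=timezone.utc), age

line = ("LustreError: 980:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) "
        "### lock timed out (enqueued at 1445937610, 300s ago), entering recovery")
when, age = parse_lock_timeout(line)
print(when.isoformat(), age)  # 2015-10-27T09:20:10+00:00 300
```

The decoded enqueue time lines up with the 09:2x:xx MDS console timestamps below.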

      MDS console:

      09:22:17:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED   req@ffff88004d92c980 x1516158358024328/t0(0) o101->lustre-MDT0000-lwp-MDT0000@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      09:25:19:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 6 previous similar messages
      09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) lustre-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-5
      09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) Skipped 1 previous similar message
      09:25:19:INFO: task umount:24629 blocked for more than 120 seconds.
      09:25:19:      Not tainted 2.6.32-573.7.1.el6_lustre.x86_64 #1
      09:25:19:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      09:25:19:umount        D 0000000000000000     0 24629  24628 0x00000080
      09:25:19: ffff880059e2bb48 0000000000000086 0000000000000000 00000000000708b7
      09:25:20: 0000603500000000 000000ac00000000 00001c1fd9b9c014 ffff880059e2bb98
      09:25:20: ffff880059e2bb58 0000000101d3458a ffff880076ee3ad8 ffff880059e2bfd8
      09:25:20:Call Trace:
      09:25:20: [<ffffffff8153a756>] __mutex_lock_slowpath+0x96/0x210
      09:25:20: [<ffffffff8153a27b>] mutex_lock+0x2b/0x50
      09:25:20: [<ffffffffa02cb30d>] mgc_process_config+0x1dd/0x1210 [mgc]
      09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      09:25:20: [<ffffffffa07fe28d>] obd_process_config.clone.0+0x8d/0x2e0 [obdclass]
      09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      09:25:20: [<ffffffffa08024c2>] lustre_end_log+0x262/0x6a0 [obdclass]
      09:25:20: [<ffffffffa082efb1>] server_put_super+0x911/0xed0 [obdclass]
      09:25:20: [<ffffffff811b0116>] ? invalidate_inodes+0xf6/0x190
      09:25:20: [<ffffffff8119437b>] generic_shutdown_super+0x5b/0xe0
      09:25:20: [<ffffffff81194466>] kill_anon_super+0x16/0x60
      09:25:20: [<ffffffffa07fa096>] lustre_kill_super+0x36/0x60 [obdclass]
      09:25:20: [<ffffffff81194c07>] deactivate_super+0x57/0x80
      09:25:20: [<ffffffff811b4a7f>] mntput_no_expire+0xbf/0x110
      09:25:20: [<ffffffff811b55cb>] sys_umount+0x7b/0x3a0
      09:25:20: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
      
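The MDS console output is the kernel's hung-task watchdog report: umount has been blocked in uninterruptible sleep (state D) for over 120 seconds, stuck in __mutex_lock_slowpath under mgc_process_config while tearing down the MGC. When scanning batches of console logs for this signature, the `INFO: task <name>:<pid> blocked for more than <N> seconds` header is the easiest hook. A minimal sketch (the helper name is mine):

```python
import re

# Hypothetical helper: find hung-task watchdog reports in a console log.
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

def hung_tasks(log):
    """Return a list of (task_name, pid, seconds_blocked) tuples."""
    return [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in HUNG_RE.finditer(log)]

console = "09:25:19:INFO: task umount:24629 blocked for more than 120 seconds.\n"
print(hung_tasks(console))  # [('umount', 24629, 120)]
```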

      Info required for matching: replay-dual test_26

      Attachments

        1. 1453855057.tgz
          24.15 MB
        2. log-7372
          65 kB

        Issue Links

          Activity


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43982/
            Subject: LU-7372 tests: re-enable replay-dual test_26
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 2da8f7adbe4a0c3eeecf8fda44fb6a4e4f9a16dd


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43978/
            Subject: LU-7372 tests: re-enable replay-dual test_26
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 8cd0ad8c3f7755a9ff41da297a5130a6857fae5c


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43977/
            Subject: LU-7372 tests: skip replay-dual test_24/25
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 13e11cf70cc8102d006a681276094517c22e4a47


            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43982
            Subject: LU-7372 tests: re-enable replay-dual test_26
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0f509199a25db416759c3bbcce85c6b79d623585


            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43980
            Subject: LU-7372 tests: re-enable replay-dual test_24
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 87211c8150c48a1a7876ac52cd2e30b34814eaa3


            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43979
            Subject: LU-7372 tests: re-enable replay-dual test_25
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: b188061abcc6e73ea52e99b18797bd74e01e6d75


            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43978
            Subject: LU-7372 tests: re-enable replay-dual test_26
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 8934a96b91ce014a7fe73689fd2d293f436cd716


            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43977
            Subject: LU-7372 tests: skip replay-dual test_24/25
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 1398315916128263f37a5b53b0d1a9286c5b3574


            John L. Hammond (jhammond@whamcloud.com) merged in patch https://review.whamcloud.com/33052/
            Subject: LU-7372 tests: stop running replay-dual test 26
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set:
            Commit: c5427cbab935259c54957b2ff50e7736f240cd08


            James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33052
            Subject: LU-7372 tests: stop running replay-dual test 26
            Project: fs/lustre-release
            Branch: b2_10
            Current Patch Set: 1
            Commit: debba6d3ae60a2448c5a59f46b751b605c2ee69c


            Looking at the MDS console logs, the following test sessions show essentially the same stack trace as described in this ticket, and their kernel crash logs show the oom-killer firing.

            For https://testing.whamcloud.com/test_sets/a498de80-9ade-11e8-8ee3-52540065bddc, the kernel crash log shows that tar invokes the oom-killer:

            [60604.459486] tar invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
            [60604.460400] tar cpuset=/ mems_allowed=0
            [60604.460823] CPU: 0 PID: 16324 Comm: tar Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
            [60604.461874] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [60604.462439] Call Trace:
            [60604.462724]  [<ffffffff81b0e84e>] dump_stack+0x19/0x1b
            [60604.463220]  [<ffffffff81b0a1d0>] dump_header+0x90/0x229
            [60604.463737]  [<ffffffff81b1badf>] ? notifier_call_chain+0x4f/0x70
            [60604.464338]  [<ffffffff814c17b8>] ? __blocking_notifier_call_chain+0x58/0x70
            [60604.465018]  [<ffffffff8159826e>] check_panic_on_oom+0x2e/0x60
            [60604.465589]  [<ffffffff8159868b>] out_of_memory+0x23b/0x4f0
            [60604.466124]  [<ffffffff8159f224>] __alloc_pages_nodemask+0xaa4/0xbb0
            [60604.466735]  [<ffffffff815ec525>] alloc_pages_vma+0xb5/0x200
            [60604.467279]  [<ffffffff815dae45>] __read_swap_cache_async+0x115/0x190
            [60604.467886]  [<ffffffff815daee6>] read_swap_cache_async+0x26/0x60
            [60604.468472]  [<ffffffff815dafc8>] swapin_readahead+0xa8/0x110
            [60604.469034]  [<ffffffff815c5f37>] handle_pte_fault+0x777/0xc30
            [60604.469601]  [<ffffffff815c7c3d>] handle_mm_fault+0x39d/0x9b0
            [60604.470163]  [<ffffffff81525092>] ? from_kgid_munged+0x12/0x20
            [60604.470717]  [<ffffffff81b1b557>] __do_page_fault+0x197/0x4f0
            [60604.471260]  [<ffffffff81b1b996>] trace_do_page_fault+0x56/0x150
            [60604.471829]  [<ffffffff81b1af22>] do_async_page_fault+0x22/0xf0
            [60604.472405]  [<ffffffff81b17788>] async_page_fault+0x28/0x30
            [60604.472938] Mem-Info:
            [60604.473179] active_anon:0 inactive_anon:6 isolated_anon:0
             active_file:274 inactive_file:308 isolated_file:0
             unevictable:0 dirty:38 writeback:0 unstable:0
             slab_reclaimable:3259 slab_unreclaimable:38342
             mapped:181 shmem:12 pagetables:1345 bounce:0
             free:12914 free_pcp:0 free_cma:0
            [60604.476120] Node 0 DMA free:7020kB min:416kB low:520kB high:624kB active_anon:4kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:12kB writeback:0kB mapped:4kB shmem:48kB slab_reclaimable:132kB slab_unreclaimable:724kB kernel_stack:16kB pagetables:124kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1494 all_unreclaimable? yes
            [60604.480017] lowmem_reserve[]: 0 1660 1660 1660
            [60604.480573] Node 0 DMA32 free:44636kB min:44636kB low:55792kB high:66952kB active_anon:0kB inactive_anon:24kB active_file:1092kB inactive_file:1232kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080744kB managed:1704004kB mlocked:0kB dirty:140kB writeback:0kB mapped:720kB shmem:0kB slab_reclaimable:12904kB slab_unreclaimable:152644kB kernel_stack:2848kB pagetables:5256kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3668 all_unreclaimable? yes
            [60604.484717] lowmem_reserve[]: 0 0 0 0
            [60604.485180] Node 0 DMA: 9*4kB (UM) 5*8kB (UM) 10*16kB (M) 9*32kB (UM) 8*64kB (UM) 5*128kB (UM) 5*256kB (UM) 4*512kB (UM) 0*1024kB 1*2048kB (M) 0*4096kB = 7052kB
            [60604.486948] Node 0 DMA32: 2654*4kB (EM) 2385*8kB (EM) 941*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44752kB
            [60604.488485] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
            [60604.489308] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
            [60604.490109] 303 total pagecache pages
            [60604.490472] 6 pages in swap cache
            [60604.490790] Swap cache stats: add 86869, delete 86863, find 18036/30348
            [60604.491421] Free swap  = 3521276kB
            [60604.491749] Total swap = 3671036kB
            [60604.492082] 524184 pages RAM
            [60604.492368] 0 pages HighMem/MovableOnly
            [60604.492812] 94206 pages reserved
            [60604.493137] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
            [60604.493889] [  333]     0   333     9868        0      22      108             0 systemd-journal
            [60604.494776] [  359]     0   359    29149        0      26       79             0 lvmetad
            [60604.495542] [  366]     0   366    11101        1      23      147         -1000 systemd-udevd
            [60604.496357] [  453]     0   453    13877        0      26      119         -1000 auditd
            [60604.497124] [  480]     0   480     6627        1      19       95             0 systemd-logind
            [60604.497929] [  481]   999   481   134608        0      60     2165             0 polkitd
            [60604.498689] [  482]     0   482     5381        0      15       59             0 irqbalance
            [60604.499486] [  483]    81   483    14590        1      34      213          -900 dbus-daemon
            [60604.500283] [  484]    32   484    17305        0      38      160             0 rpcbind
            [60604.501042] [  485]     0   485    48770        0      36      126             0 gssproxy
            [60604.501815] [  486]     0   486   137505        0      87      654             0 NetworkManager
            [60604.502639] [  501]   998   501    30087        0      29      123             0 chronyd
            [60604.503403] [  533]     0   533    26849        1      53      499             0 dhclient
            [60604.504171] [  895]     0   895   143453       42      98     2797             0 tuned
            [60604.504929] [  896]     0   896    28203        1      55      257         -1000 sshd
            [60604.505686] [  900]     0   900    74575        8      73      916             0 rsyslogd
            [60604.506456] [  906]     0   906     6791        1      18       62             0 xinetd
            [60604.507223] [  913]    29   913    10605        0      24      209             0 rpc.statd
            [60604.508002] [  917]   997   917    56469        0      23      285             0 munged
            [60604.508760] [  992]     0   992    31570        1      19      155             0 crond
            [60604.509518] [  993]     0   993     6476        0      19       52             0 atd
            [60604.510256] [  996]     0   996   167981        0      69      580             0 automount
            [60604.511040] [ 1007]     0  1007    27522        1      12       33             0 agetty
            [60604.511804] [ 1009]     0  1009    27522        1       9       32             0 agetty
            [60604.512563] [ 1181]     0  1181    22408        0      43      265             0 master
            [60604.513318] [ 1200]    89  1200    22451        0      46      253             0 qmgr
            [60604.514075] [10963]     0 10963    39169        0      78      351             0 sshd
            [60604.514809] [10965]     0 10965    28296        1      14       58             0 run_test.sh
            [60604.515601] [11233]     0 11233    29536        1      16      790             0 bash
            [60604.516346] [21585]     0 21585    29536        0      13      790             0 bash
            [60604.517086] [21586]     0 21586    26988        0      10       27             0 tee
            [60604.517808] [21767]     0 21767    29573        1      14      840             0 bash
            [60604.518551] [31867]    89 31867    22434        0      45      251             0 pickup
            [60604.519302] [11943]     0 11943    29607        1      14      863             0 bash
            [60604.520033] [11944]     0 11944    26988        0      10       28             0 tee
            [60604.520775] [12397]     0 12397    29607        0      14      863             0 bash
            [60604.521528] [12398]     0 12398    29607        0      14      863             0 bash
            [60604.522264] [12911]     0 12911    24022        0      22       84             0 pdsh
            [60604.523013] [12912]     0 12912    29228        0      14       42             0 sed
            [60604.523754] [15184]     0 15184    29474        1      13      732             0 rundbench
            [60604.524547] [15195]     0 15195     1618       26       9       52             0 dbench
            [60604.525302] [15196]     0 15196     1620       61       9       62             0 dbench
            [60604.526052] [16324]     0 16324    30920       53      15       89             0 tar
            [60604.526795] [16325]     0 16325    30852       57      17       81             0 tar
            [60604.527548] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
            
            [60604.528460] CPU: 0 PID: 16324 Comm: tar Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
            [60604.529473] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [60604.530011] Call Trace:
            [60604.530259]  [<ffffffff81b0e84e>] dump_stack+0x19/0x1b
            [60604.530746]  [<ffffffff81b08b50>] panic+0xe8/0x21f
            [60604.531196]  [<ffffffff81598295>] check_panic_on_oom+0x55/0x60
            [60604.531747]  [<ffffffff8159868b>] out_of_memory+0x23b/0x4f0
            [60604.532268]  [<ffffffff8159f224>] __alloc_pages_nodemask+0xaa4/0xbb0
            [60604.532865]  [<ffffffff815ec525>] alloc_pages_vma+0xb5/0x200
            [60604.533400]  [<ffffffff815dae45>] __read_swap_cache_async+0x115/0x190
            [60604.533994]  [<ffffffff815daee6>] read_swap_cache_async+0x26/0x60
            [60604.534565]  [<ffffffff815dafc8>] swapin_readahead+0xa8/0x110
            [60604.535096]  [<ffffffff815c5f37>] handle_pte_fault+0x777/0xc30
            [60604.535643]  [<ffffffff815c7c3d>] handle_mm_fault+0x39d/0x9b0
            [60604.536180]  [<ffffffff81525092>] ? from_kgid_munged+0x12/0x20
            [60604.536733]  [<ffffffff81b1b557>] __do_page_fault+0x197/0x4f0
            [60604.537268]  [<ffffffff81b1b996>] trace_do_page_fault+0x56/0x150
            [60604.537828]  [<ffffffff81b1af22>] do_async_page_fault+0x22/0xf0
            [60604.538383]  [<ffffffff81b17788>] async_page_fault+0x28/0x30
            
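The panic itself is expected behavior here: the test VMs run with the vm.panic_on_oom sysctl enabled, so the first OOM kill escalates to a kernel panic and a crash dump. The Mem-Info zone lines show why the allocation failed: DMA32 free has dropped to its min watermark with all_unreclaimable set. A minimal sketch of reading those zone lines (the kernel's real watermark check is per-allocation-order and accounts for lowmem_reserve; this only compares the totals printed in the report):

```python
import re

# Sketch: compare a zone's reported free kB against its "min" watermark in a
# Mem-Info dump.  Helper name and dict layout are mine.
ZONE_RE = re.compile(
    r"Node (\d+) (\S+) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB")

def zone_pressure(line):
    """Return zone name, free/min kB, and whether free is at or below min."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    free_kb, min_kb = int(m.group(3)), int(m.group(4))
    return {"zone": m.group(2), "free_kb": free_kb, "min_kb": min_kb,
            "below_min": free_kb <= min_kb}

# DMA32 line from the second crash below: free already under min.
dma32 = ("Node 0 DMA32 free:44528kB min:44636kB low:55792kB high:66952kB "
         "active_anon:8kB inactive_anon:72kB")
print(zone_pressure(dma32))
```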

            The kernel crash log for https://testing.whamcloud.com/test_sets/0bb1397a-9bb9-11e8-8ee3-52540065bddc also shows tar invoking the oom-killer, but with a few additional *_newfstat calls in the stack:

            [44806.676691] Lustre: DEBUG MARKER: test_26 fail mds1 1 times
            [44998.330738] tar invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
            [44998.332386] tar cpuset=/ mems_allowed=0
            [44998.333090] CPU: 0 PID: 28253 Comm: tar Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
            [44998.335039] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [44998.335669] Call Trace:
            [44998.335982]  [<ffffffffabd0e84e>] dump_stack+0x19/0x1b
            [44998.336497]  [<ffffffffabd0a1d0>] dump_header+0x90/0x229
            [44998.337027]  [<ffffffffabd1badf>] ? notifier_call_chain+0x4f/0x70
            [44998.337645]  [<ffffffffab6c17b8>] ? __blocking_notifier_call_chain+0x58/0x70
            [44998.338336]  [<ffffffffab79826e>] check_panic_on_oom+0x2e/0x60
            [44998.338902]  [<ffffffffab79868b>] out_of_memory+0x23b/0x4f0
            [44998.339448]  [<ffffffffab79f224>] __alloc_pages_nodemask+0xaa4/0xbb0
            [44998.340074]  [<ffffffffab7ec525>] alloc_pages_vma+0xb5/0x200
            [44998.340639]  [<ffffffffab7dae45>] __read_swap_cache_async+0x115/0x190
            [44998.341254]  [<ffffffffab7daee6>] read_swap_cache_async+0x26/0x60
            [44998.341850]  [<ffffffffab7dafc8>] swapin_readahead+0xa8/0x110
            [44998.342416]  [<ffffffffab7c5f37>] handle_pte_fault+0x777/0xc30
            [44998.342995]  [<ffffffffab7c7c3d>] handle_mm_fault+0x39d/0x9b0
            [44998.343566]  [<ffffffffabd1b557>] __do_page_fault+0x197/0x4f0
            [44998.344126]  [<ffffffffabd1b996>] trace_do_page_fault+0x56/0x150
            [44998.344722]  [<ffffffffabd1af22>] do_async_page_fault+0x22/0xf0
            [44998.345287]  [<ffffffffabd17788>] async_page_fault+0x28/0x30
            [44998.345888]  [<ffffffffab959730>] ? copy_user_generic_string+0x30/0x40
            [44998.346533]  [<ffffffffab82142f>] ? cp_new_stat+0x14f/0x180
            [44998.347077]  [<ffffffffab8215b4>] SYSC_newfstat+0x34/0x60
            [44998.347607]  [<ffffffffab82179e>] SyS_newfstat+0xe/0x10
            [44998.348120]  [<ffffffffabd20795>] system_call_fastpath+0x1c/0x21
            [44998.348710]  [<ffffffffabd206e1>] ? system_call_after_swapgs+0xae/0x146
            [44998.349352] Mem-Info:
            [44998.349595] active_anon:2 inactive_anon:18 isolated_anon:0
             active_file:15 inactive_file:1090 isolated_file:0
             unevictable:0 dirty:0 writeback:30 unstable:0
             slab_reclaimable:3679 slab_unreclaimable:38446
             mapped:55 shmem:0 pagetables:1385 bounce:0
             free:12871 free_pcp:108 free_cma:0
            [44998.352588] Node 0 DMA free:6956kB min:416kB low:520kB high:624kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:72kB slab_unreclaimable:684kB kernel_stack:32kB pagetables:24kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
            [44998.356534] lowmem_reserve[]: 0 1660 1660 1660
            [44998.357078] Node 0 DMA32 free:44528kB min:44636kB low:55792kB high:66952kB active_anon:8kB inactive_anon:72kB active_file:60kB inactive_file:4360kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080744kB managed:1704004kB mlocked:0kB dirty:0kB writeback:120kB mapped:220kB shmem:0kB slab_reclaimable:14644kB slab_unreclaimable:153100kB kernel_stack:2832kB pagetables:5516kB unstable:0kB bounce:0kB free_pcp:432kB local_pcp:4kB free_cma:0kB writeback_tmp:0kB pages_scanned:1441 all_unreclaimable? yes
            [44998.361327] lowmem_reserve[]: 0 0 0 0
            [44998.361788] Node 0 DMA: 17*4kB (U) 11*8kB (U) 5*16kB (U) 3*32kB (U) 0*64kB 2*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 3*2048kB (M) 0*4096kB = 6988kB
            [44998.363459] Node 0 DMA32: 1871*4kB (EM) 2399*8kB (UEM) 1117*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44548kB
            [44998.365051] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
            [44998.365894] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
            [44998.366710] 1163 total pagecache pages
            [44998.367087] 0 pages in swap cache
            [44998.367423] Swap cache stats: add 95573, delete 95573, find 17225/30172
            [44998.368063] Free swap  = 3521020kB
            [44998.368403] Total swap = 3671036kB
            [44998.368744] 524184 pages RAM
            [44998.369026] 0 pages HighMem/MovableOnly
            [44998.369409] 94206 pages reserved
            [44998.369738] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
            [44998.370539] [  337]     0   337    10055        1      24       98             0 systemd-journal
            [44998.371399] [  359]     0   359    29149        0      26       79             0 lvmetad
            [44998.372180] [  364]     0   364    11100        1      24      146         -1000 systemd-udevd
            [44998.373025] [  455]     0   455    13877        0      27      119         -1000 auditd
            [44998.373804] [  481]     0   481     5381        0      15       59             0 irqbalance
            [44998.374612] [  484]    32   484    17305        0      36      139             0 rpcbind
            [44998.375402] [  486]    81   486    14554        1      33      176          -900 dbus-daemon
            [44998.376211] [  489]     0   489    48770        0      37      126             0 gssproxy
            [44998.377005] [  492]   998   492    30087        0      29      124             0 chronyd
            [44998.377795] [  499]     0   499   137506        0      87     1128             0 NetworkManager
            [44998.378649] [  500]   999   500   134608        0      61     1403             0 polkitd
            [44998.379433] [  501]     0   501     6594        1      17       77             0 systemd-logind
            [44998.380252] [  546]     0   546    26849        1      53      498             0 dhclient
            [44998.381077] [  900]     0   900    28203        1      60      257         -1000 sshd
            [44998.381846] [  902]     0   902   143453        0      96     3303             0 tuned
            [44998.382623] [  909]     0   909    74575        0      75      908             0 rsyslogd
            [44998.383418] [  913]    29   913    10605        0      25      209             0 rpc.statd
            [44998.384215] [  914]     0   914     6791        1      17       63             0 xinetd
            [44998.385003] [  918]   997   918    56469        0      21      274             0 munged
            [44998.385798] [  983]     0   983     6476        0      19       52             0 atd
            [44998.386544] [  985]     0   985    31570        0      21      155             0 crond
            [44998.387319] [  993]     0   993   167982        0      69      547             0 automount
            [44998.388108] [  997]     0   997    27522        1       9       32             0 agetty
            [44998.388892] [ 1001]     0  1001    27522        1      12       32             0 agetty
            [44998.389690] [ 1320]     0  1320    22408        0      42      259             0 master
            [44998.390467] [ 1346]    89  1346    22451        0      45      254             0 qmgr
            [44998.391223] [10968]     0 10968    39169        0      77      365             0 sshd
            [44998.391985] [10970]     0 10970    28296        1      13       58             0 run_test.sh
            [44998.392810] [11240]     0 11240    29470        1      16      733             0 bash
            [44998.393561] [18492]    89 18492    22434        0      46      252             0 pickup
            [44998.394334] [ 1201]     0  1201    29470        0      13      733             0 bash
            [44998.395082] [ 1202]     0  1202    26988        0       9       27             0 tee
            [44998.395840] [ 1396]     0  1396    29538        1      12      784             0 bash
            [44998.396621] [23895]     0 23895    29538        1      12      806             0 bash
            [44998.397382] [23896]     0 23896    26988        0      10       28             0 tee
            [44998.398123] [24348]     0 24348    29538        0      12      806             0 bash
            [44998.398885] [24349]     0 24349    29538        0      12      806             0 bash
            [44998.399652] [24854]     0 24854    24023        0      21       87             0 pdsh
            [44998.400403] [24855]     0 24855    29228        0      13       42             0 sed
            [44998.401140] [27114]     0 27114    29438        1      14      675             0 rundbench
            [44998.401941] [27125]     0 27125     1618        0       9       44             0 dbench
            [44998.402713] [27126]     0 27126     1620        1       9       56             0 dbench
            [44998.403486] [28252]     0 28252    30920        0      17       95             0 tar
            [44998.404231] [28253]     0 28253    30852        0      16       77             0 tar
            [44998.404986] [28255]     0 28255    40840        6      37      209             0 crond
            [44998.405759] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
            
            [44998.406685] CPU: 0 PID: 28253 Comm: tar Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
            [44998.407727] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [44998.408277] Call Trace:
            [44998.408522]  [<ffffffffabd0e84e>] dump_stack+0x19/0x1b
            [44998.409021]  [<ffffffffabd08b50>] panic+0xe8/0x21f
            [44998.409491]  [<ffffffffab798295>] check_panic_on_oom+0x55/0x60
            [44998.410058]  [<ffffffffab79868b>] out_of_memory+0x23b/0x4f0
            [44998.410588]  [<ffffffffab79f224>] __alloc_pages_nodemask+0xaa4/0xbb0
            [44998.411198]  [<ffffffffab7ec525>] alloc_pages_vma+0xb5/0x200
            [44998.411747]  [<ffffffffab7dae45>] __read_swap_cache_async+0x115/0x190
            [44998.412363]  [<ffffffffab7daee6>] read_swap_cache_async+0x26/0x60
            [44998.412951]  [<ffffffffab7dafc8>] swapin_readahead+0xa8/0x110
            [44998.413498]  [<ffffffffab7c5f37>] handle_pte_fault+0x777/0xc30
            [44998.414062]  [<ffffffffab7c7c3d>] handle_mm_fault+0x39d/0x9b0
            [44998.414610]  [<ffffffffabd1b557>] __do_page_fault+0x197/0x4f0
            [44998.415172]  [<ffffffffabd1b996>] trace_do_page_fault+0x56/0x150
            [44998.415747]  [<ffffffffabd1af22>] do_async_page_fault+0x22/0xf0
            [44998.416313]  [<ffffffffabd17788>] async_page_fault+0x28/0x30
            [44998.416860]  [<ffffffffab959730>] ? copy_user_generic_string+0x30/0x40
            [44998.417479]  [<ffffffffab82142f>] ? cp_new_stat+0x14f/0x180
            [44998.418024]  [<ffffffffab8215b4>] SYSC_newfstat+0x34/0x60
            [44998.418542]  [<ffffffffab82179e>] SyS_newfstat+0xe/0x10
            [44998.419056]  [<ffffffffabd20795>] system_call_fastpath+0x1c/0x21
            [44998.419651]  [<ffffffffabd206e1>] ? system_call_after_swapgs+0xae/0x146
            
jamesanunez James Nunez (Inactive) added a comment -

Looking at the MDS console logs, the following test sessions have essentially the same stack trace as what is described in this ticket. Looking at the kernel crash logs shows the oom-killer. For https://testing.whamcloud.com/test_sets/a498de80-9ade-11e8-8ee3-52540065bddc , the kernel crash log shows that tar invokes the oom-killer:

[60604.459486] tar invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[60604.460400] tar cpuset=/ mems_allowed=0
[60604.460823] CPU: 0 PID: 16324 Comm: tar Kdump: loaded Tainted: G OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[60604.461874] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[60604.462439] Call Trace:
[60604.462724]  [<ffffffff81b0e84e>] dump_stack+0x19/0x1b
[60604.463220]  [<ffffffff81b0a1d0>] dump_header+0x90/0x229
[60604.463737]  [<ffffffff81b1badf>] ? notifier_call_chain+0x4f/0x70
[60604.464338]  [<ffffffff814c17b8>] ? __blocking_notifier_call_chain+0x58/0x70
[60604.465018]  [<ffffffff8159826e>] check_panic_on_oom+0x2e/0x60
[60604.465589]  [<ffffffff8159868b>] out_of_memory+0x23b/0x4f0
[60604.466124]  [<ffffffff8159f224>] __alloc_pages_nodemask+0xaa4/0xbb0
[60604.466735]  [<ffffffff815ec525>] alloc_pages_vma+0xb5/0x200
[60604.467279]  [<ffffffff815dae45>] __read_swap_cache_async+0x115/0x190
[60604.467886]  [<ffffffff815daee6>] read_swap_cache_async+0x26/0x60
[60604.468472]  [<ffffffff815dafc8>] swapin_readahead+0xa8/0x110
[60604.469034]  [<ffffffff815c5f37>] handle_pte_fault+0x777/0xc30
[60604.469601]  [<ffffffff815c7c3d>] handle_mm_fault+0x39d/0x9b0
[60604.470163]  [<ffffffff81525092>] ? from_kgid_munged+0x12/0x20
[60604.470717]  [<ffffffff81b1b557>] __do_page_fault+0x197/0x4f0
[60604.471260]  [<ffffffff81b1b996>] trace_do_page_fault+0x56/0x150
[60604.471829]  [<ffffffff81b1af22>] do_async_page_fault+0x22/0xf0
[60604.472405]  [<ffffffff81b17788>] async_page_fault+0x28/0x30
[60604.472938] Mem-Info:
[60604.473179] active_anon:0 inactive_anon:6 isolated_anon:0 active_file:274 inactive_file:308 isolated_file:0 unevictable:0 dirty:38 writeback:0 unstable:0 slab_reclaimable:3259 slab_unreclaimable:38342 mapped:181 shmem:12 pagetables:1345 bounce:0 free:12914 free_pcp:0 free_cma:0
[60604.476120] Node 0 DMA free:7020kB min:416kB low:520kB high:624kB active_anon:4kB inactive_anon:0kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:12kB writeback:0kB mapped:4kB shmem:48kB slab_reclaimable:132kB slab_unreclaimable:724kB kernel_stack:16kB pagetables:124kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1494 all_unreclaimable? yes
[60604.480017] lowmem_reserve[]: 0 1660 1660 1660
[60604.480573] Node 0 DMA32 free:44636kB min:44636kB low:55792kB high:66952kB active_anon:0kB inactive_anon:24kB active_file:1092kB inactive_file:1232kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080744kB managed:1704004kB mlocked:0kB dirty:140kB writeback:0kB mapped:720kB shmem:0kB slab_reclaimable:12904kB slab_unreclaimable:152644kB kernel_stack:2848kB pagetables:5256kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3668 all_unreclaimable? yes
[60604.484717] lowmem_reserve[]: 0 0 0 0
[60604.485180] Node 0 DMA: 9*4kB (UM) 5*8kB (UM) 10*16kB (M) 9*32kB (UM) 8*64kB (UM) 5*128kB (UM) 5*256kB (UM) 4*512kB (UM) 0*1024kB 1*2048kB (M) 0*4096kB = 7052kB
[60604.486948] Node 0 DMA32: 2654*4kB (EM) 2385*8kB (EM) 941*16kB (UM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44752kB
[60604.488485] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[60604.489308] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[60604.490109] 303 total pagecache pages
[60604.490472] 6 pages in swap cache
[60604.490790] Swap cache stats: add 86869, delete 86863, find 18036/30348
[60604.491421] Free swap = 3521276kB
[60604.491749] Total swap = 3671036kB
[60604.492082] 524184 pages RAM
[60604.492368] 0 pages HighMem/MovableOnly
[60604.492812] 94206 pages reserved
[60604.493137] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[60604.493889] [  333]     0   333     9868        0      22      108             0 systemd-journal
[60604.494776] [  359]     0   359    29149        0      26       79             0 lvmetad
[60604.495542] [  366]     0   366    11101        1      23      147         -1000 systemd-udevd
[60604.496357] [  453]     0   453    13877        0      26      119         -1000 auditd
[60604.497124] [  480]     0   480     6627        1      19       95             0 systemd-logind
[60604.497929] [  481]   999   481   134608        0      60     2165             0 polkitd
[60604.498689] [  482]     0   482     5381        0      15       59             0 irqbalance
[60604.499486] [  483]    81   483    14590        1      34      213          -900 dbus-daemon
[60604.500283] [  484]    32   484    17305        0      38      160             0 rpcbind
[60604.501042] [  485]     0   485    48770        0      36      126             0 gssproxy
[60604.501815] [  486]     0   486   137505        0      87      654             0 NetworkManager
[60604.502639] [  501]   998   501    30087        0      29      123             0 chronyd
[60604.503403] [  533]     0   533    26849        1      53      499             0 dhclient
[60604.504171] [  895]     0   895   143453       42      98     2797             0 tuned
[60604.504929] [  896]     0   896    28203        1      55      257         -1000 sshd
[60604.505686] [  900]     0   900    74575        8      73      916             0 rsyslogd
[60604.506456] [  906]     0   906     6791        1      18       62             0 xinetd
[60604.507223] [  913]    29   913    10605        0      24      209             0 rpc.statd
[60604.508002] [  917]   997   917    56469        0      23      285             0 munged
[60604.508760] [  992]     0   992    31570        1      19      155             0 crond
[60604.509518] [  993]     0   993     6476        0      19       52             0 atd
[60604.510256] [  996]     0   996   167981        0      69      580             0 automount
[60604.511040] [ 1007]     0  1007    27522        1      12       33             0 agetty
[60604.511804] [ 1009]     0  1009    27522        1       9       32             0 agetty
[60604.512563] [ 1181]     0  1181    22408        0      43      265             0 master
[60604.513318] [ 1200]    89  1200    22451        0      46      253             0 qmgr
[60604.514075] [10963]     0 10963    39169        0      78      351             0 sshd
[60604.514809] [10965]     0 10965    28296        1      14       58             0 run_test.sh
[60604.515601] [11233]     0 11233    29536        1      16      790             0 bash
[60604.516346] [21585]     0 21585    29536        0      13      790             0 bash
[60604.517086] [21586]     0 21586    26988        0      10       27             0 tee
[60604.517808] [21767]     0 21767    29573        1      14      840             0 bash
[60604.518551] [31867]    89 31867    22434        0      45      251             0 pickup
[60604.519302] [11943]     0 11943    29607        1      14      863             0 bash
[60604.520033] [11944]     0 11944    26988        0      10       28             0 tee
[60604.520775] [12397]     0 12397    29607        0      14      863             0 bash
[60604.521528] [12398]     0 12398    29607        0      14      863             0 bash
[60604.522264] [12911]     0 12911    24022        0      22       84             0 pdsh
[60604.523013] [12912]     0 12912    29228        0      14       42             0 sed
[60604.523754] [15184]     0 15184    29474        1      13      732             0 rundbench
[60604.524547] [15195]     0 15195     1618       26       9       52             0 dbench
[60604.525302] [15196]     0 15196     1620       61       9       62             0 dbench
[60604.526052] [16324]     0 16324    30920       53      15       89             0 tar
[60604.526795] [16325]     0 16325    30852       57      17       81             0 tar
[60604.527548] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
[60604.528460] CPU: 0 PID: 16324 Comm: tar Kdump: loaded Tainted: G OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[60604.529473] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[60604.530011] Call Trace:
[60604.530259]  [<ffffffff81b0e84e>] dump_stack+0x19/0x1b
[60604.530746]  [<ffffffff81b08b50>] panic+0xe8/0x21f
[60604.531196]  [<ffffffff81598295>] check_panic_on_oom+0x55/0x60
[60604.531747]  [<ffffffff8159868b>] out_of_memory+0x23b/0x4f0
[60604.532268]  [<ffffffff8159f224>] __alloc_pages_nodemask+0xaa4/0xbb0
[60604.532865]  [<ffffffff815ec525>] alloc_pages_vma+0xb5/0x200
[60604.533400]  [<ffffffff815dae45>] __read_swap_cache_async+0x115/0x190
[60604.533994]  [<ffffffff815daee6>] read_swap_cache_async+0x26/0x60
[60604.534565]  [<ffffffff815dafc8>] swapin_readahead+0xa8/0x110
[60604.535096]  [<ffffffff815c5f37>] handle_pte_fault+0x777/0xc30
[60604.535643]  [<ffffffff815c7c3d>] handle_mm_fault+0x39d/0x9b0
[60604.536180]  [<ffffffff81525092>] ? from_kgid_munged+0x12/0x20
[60604.536733]  [<ffffffff81b1b557>] __do_page_fault+0x197/0x4f0
[60604.537268]  [<ffffffff81b1b996>] trace_do_page_fault+0x56/0x150
[60604.537828]  [<ffffffff81b1af22>] do_async_page_fault+0x22/0xf0
[60604.538383]  [<ffffffff81b17788>] async_page_fault+0x28/0x30

The kernel crash for https://testing.whamcloud.com/test_sets/0bb1397a-9bb9-11e8-8ee3-52540065bddc also has tar invoking the oom-killer, but we have a few different *_newfstat calls in the kernel crash:

[44806.676691] Lustre: DEBUG MARKER: test_26 fail mds1 1 times
[44998.330738] tar invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[44998.332386] tar cpuset=/ mems_allowed=0
[44998.333090] CPU: 0 PID: 28253 Comm: tar Kdump: loaded Tainted: G W OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[44998.335039] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[44998.335669] Call Trace:
[44998.335982]  [<ffffffffabd0e84e>] dump_stack+0x19/0x1b
[44998.336497]  [<ffffffffabd0a1d0>] dump_header+0x90/0x229
[44998.337027]  [<ffffffffabd1badf>] ? notifier_call_chain+0x4f/0x70
[44998.337645]  [<ffffffffab6c17b8>] ? __blocking_notifier_call_chain+0x58/0x70
[44998.338336]  [<ffffffffab79826e>] check_panic_on_oom+0x2e/0x60
[44998.338902]  [<ffffffffab79868b>] out_of_memory+0x23b/0x4f0
[44998.339448]  [<ffffffffab79f224>] __alloc_pages_nodemask+0xaa4/0xbb0
[44998.340074]  [<ffffffffab7ec525>] alloc_pages_vma+0xb5/0x200
[44998.340639]  [<ffffffffab7dae45>] __read_swap_cache_async+0x115/0x190
[44998.341254]  [<ffffffffab7daee6>] read_swap_cache_async+0x26/0x60
[44998.341850]  [<ffffffffab7dafc8>] swapin_readahead+0xa8/0x110
[44998.342416]  [<ffffffffab7c5f37>] handle_pte_fault+0x777/0xc30
[44998.342995]  [<ffffffffab7c7c3d>] handle_mm_fault+0x39d/0x9b0
[44998.343566]  [<ffffffffabd1b557>] __do_page_fault+0x197/0x4f0
[44998.344126]  [<ffffffffabd1b996>] trace_do_page_fault+0x56/0x150
[44998.344722]  [<ffffffffabd1af22>] do_async_page_fault+0x22/0xf0
[44998.345287]  [<ffffffffabd17788>] async_page_fault+0x28/0x30
[44998.345888]  [<ffffffffab959730>] ? copy_user_generic_string+0x30/0x40
[44998.346533]  [<ffffffffab82142f>] ? cp_new_stat+0x14f/0x180
[44998.347077]  [<ffffffffab8215b4>] SYSC_newfstat+0x34/0x60
[44998.347607]  [<ffffffffab82179e>] SyS_newfstat+0xe/0x10
[44998.348120]  [<ffffffffabd20795>] system_call_fastpath+0x1c/0x21
[44998.348710]  [<ffffffffabd206e1>] ? system_call_after_swapgs+0xae/0x146
[44998.349352] Mem-Info:
[44998.349595] active_anon:2 inactive_anon:18 isolated_anon:0 active_file:15 inactive_file:1090 isolated_file:0 unevictable:0 dirty:0 writeback:30 unstable:0 slab_reclaimable:3679 slab_unreclaimable:38446 mapped:55 shmem:0 pagetables:1385 bounce:0 free:12871 free_pcp:108 free_cma:0
[44998.352588] Node 0 DMA free:6956kB min:416kB low:520kB high:624kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:72kB slab_unreclaimable:684kB kernel_stack:32kB pagetables:24kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[44998.356534] lowmem_reserve[]: 0 1660 1660 1660
[44998.357078] Node 0 DMA32 free:44528kB min:44636kB low:55792kB high:66952kB active_anon:8kB inactive_anon:72kB active_file:60kB inactive_file:4360kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080744kB managed:1704004kB mlocked:0kB dirty:0kB writeback:120kB mapped:220kB shmem:0kB slab_reclaimable:14644kB slab_unreclaimable:153100kB kernel_stack:2832kB pagetables:5516kB unstable:0kB bounce:0kB free_pcp:432kB local_pcp:4kB free_cma:0kB writeback_tmp:0kB pages_scanned:1441 all_unreclaimable? yes
[44998.361327] lowmem_reserve[]: 0 0 0 0
[44998.361788] Node 0 DMA: 17*4kB (U) 11*8kB (U) 5*16kB (U) 3*32kB (U) 0*64kB 2*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 3*2048kB (M) 0*4096kB = 6988kB
[44998.363459] Node 0 DMA32: 1871*4kB (EM) 2399*8kB (UEM) 1117*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44548kB
[44998.365051] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[44998.365894] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[44998.366710] 1163 total pagecache pages
[44998.367087] 0 pages in swap cache
[44998.367423] Swap cache stats: add 95573, delete 95573, find 17225/30172
[44998.368063] Free swap = 3521020kB
[44998.368403] Total swap = 3671036kB
[44998.368744] 524184 pages RAM
[44998.369026] 0 pages HighMem/MovableOnly
[44998.369409] 94206 pages reserved
[44998.369738] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[44998.370539] [  337]     0   337    10055        1      24       98             0 systemd-journal
[44998.371399] [  359]     0   359    29149        0      26       79             0 lvmetad
[44998.372180] [  364]     0   364    11100        1      24      146         -1000 systemd-udevd
[44998.373025] [  455]     0   455    13877        0      27      119         -1000 auditd
[44998.373804] [  481]     0   481     5381        0      15       59             0 irqbalance
[44998.374612] [  484]    32   484    17305        0      36      139             0 rpcbind
[44998.375402] [  486]    81   486    14554        1      33      176          -900 dbus-daemon
[44998.376211] [  489]     0   489    48770        0      37      126             0 gssproxy
[44998.377005] [  492]   998   492    30087        0      29      124             0 chronyd
[44998.377795] [  499]     0   499   137506        0      87     1128             0 NetworkManager
[44998.378649] [  500]   999   500   134608        0      61     1403             0 polkitd
[44998.379433] [  501]     0   501     6594        1      17       77             0 systemd-logind
[44998.380252] [  546]     0   546    26849        1      53      498             0 dhclient
[44998.381077] [  900]     0   900    28203        1      60      257         -1000 sshd
[44998.381846] [  902]     0   902   143453        0      96     3303             0 tuned
[44998.382623] [  909]     0   909    74575        0      75      908             0 rsyslogd
[44998.383418] [  913]    29   913    10605        0      25      209             0 rpc.statd
[44998.384215] [  914]     0   914     6791        1      17       63             0 xinetd
[44998.385003] [  918]   997   918    56469        0      21      274             0 munged
[44998.385798] [  983]     0   983     6476        0      19       52             0 atd
[44998.386544] [  985]     0   985    31570        0      21      155             0 crond
[44998.387319] [  993]     0   993   167982        0      69      547             0 automount
[44998.388108] [  997]     0   997    27522        1       9       32             0 agetty
[44998.388892] [ 1001]     0  1001    27522        1      12       32             0 agetty
[44998.389690] [ 1320]     0  1320    22408        0      42      259             0 master
[44998.390467] [ 1346]    89  1346    22451        0      45      254             0 qmgr
[44998.391223] [10968]     0 10968    39169        0      77      365             0 sshd
[44998.391985] [10970]     0 10970    28296        1      13       58             0 run_test.sh
[44998.392810] [11240]     0 11240    29470        1      16      733             0 bash
[44998.393561] [18492]    89 18492    22434        0      46      252             0 pickup
[44998.394334] [ 1201]     0  1201    29470        0      13      733             0 bash
[44998.395082] [ 1202]     0  1202    26988        0       9       27             0 tee
[44998.395840] [ 1396]     0  1396    29538        1      12      784             0 bash
[44998.396621] [23895]     0 23895    29538        1      12      806             0 bash
[44998.397382] [23896]     0 23896    26988        0      10       28             0 tee
[44998.398123] [24348]     0 24348    29538        0      12      806             0 bash
[44998.398885] [24349]     0 24349    29538        0      12      806             0 bash
[44998.399652] [24854]     0 24854    24023        0      21       87             0 pdsh
[44998.400403] [24855]     0 24855    29228        0      13       42             0 sed
[44998.401140] [27114]     0 27114    29438        1      14      675             0 rundbench
[44998.401941] [27125]     0 27125     1618        0       9       44             0 dbench
[44998.402713] [27126]     0 27126     1620        1       9       56             0 dbench
[44998.403486] [28252]     0 28252    30920        0      17       95             0 tar
[44998.404231] [28253]     0 28253    30852        0      16       77             0 tar
[44998.404986] [28255]     0 28255    40840        6      37      209             0 crond
[44998.405759] Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
[44998.406685] CPU: 0 PID: 28253 Comm: tar Kdump: loaded Tainted: G W OE ------------ 3.10.0-862.9.1.el7.x86_64 #1
[44998.407727] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[44998.408277] Call Trace:
[44998.408522]  [<ffffffffabd0e84e>] dump_stack+0x19/0x1b
[44998.409021]  [<ffffffffabd08b50>] panic+0xe8/0x21f
[44998.409491]  [<ffffffffab798295>] check_panic_on_oom+0x55/0x60
[44998.410058]  [<ffffffffab79868b>] out_of_memory+0x23b/0x4f0
[44998.410588]  [<ffffffffab79f224>] __alloc_pages_nodemask+0xaa4/0xbb0
[44998.411198]  [<ffffffffab7ec525>] alloc_pages_vma+0xb5/0x200
[44998.411747]  [<ffffffffab7dae45>] __read_swap_cache_async+0x115/0x190
[44998.412363]  [<ffffffffab7daee6>] read_swap_cache_async+0x26/0x60
[44998.412951]  [<ffffffffab7dafc8>] swapin_readahead+0xa8/0x110
[44998.413498]  [<ffffffffab7c5f37>] handle_pte_fault+0x777/0xc30
[44998.414062]  [<ffffffffab7c7c3d>] handle_mm_fault+0x39d/0x9b0
[44998.414610]  [<ffffffffabd1b557>] __do_page_fault+0x197/0x4f0
[44998.415172]  [<ffffffffabd1b996>] trace_do_page_fault+0x56/0x150
[44998.415747]  [<ffffffffabd1af22>] do_async_page_fault+0x22/0xf0
[44998.416313]  [<ffffffffabd17788>] async_page_fault+0x28/0x30
[44998.416860]  [<ffffffffab959730>] ? copy_user_generic_string+0x30/0x40
[44998.417479]  [<ffffffffab82142f>] ? cp_new_stat+0x14f/0x180
[44998.418024]  [<ffffffffab8215b4>] SYSC_newfstat+0x34/0x60
[44998.418542]  [<ffffffffab82179e>] SyS_newfstat+0xe/0x10
[44998.419056]  [<ffffffffabd20795>] system_call_fastpath+0x1c/0x21
[44998.419651]  [<ffffffffabd206e1>] ? system_call_after_swapgs+0xae/0x146
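A note for reading the `[ pid ] uid tgid total_vm rss ...` tables above: the kernel prints `total_vm`, `rss`, and `swapents` in pages, not kilobytes, so the numbers look deceptively small. A minimal conversion sketch (assuming 4 KiB pages on x86_64; the example values are copied from the tar entry, PID 16324, in the first crash log):

```python
PAGE_KIB = 4  # 4096-byte pages on x86_64, as used by these test nodes

def pages_to_kib(pages):
    """Convert a page count from the oom-killer task table to KiB."""
    return pages * PAGE_KIB

# tar, PID 16324, from the first crash log above:
total_vm_pages = 30920  # virtual size
rss_pages = 53          # resident set
swap_pages = 89         # swap entries

print("total_vm =", pages_to_kib(total_vm_pages), "KiB")  # 123680 KiB, ~120 MiB
print("rss      =", pages_to_kib(rss_pages), "KiB")       # 212 KiB
print("swap     =", pages_to_kib(swap_pages), "KiB")      # 356 KiB
```

The conversion shows that no single task here is a runaway consumer; the node panics on the first OOM event simply because, per the panic message, `panic_on_oom` is enabled system-wide.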

              People

                Assignee: bobijam Zhenyu Xu
                Reporter: maloo Maloo
