Details
Bug
Resolution: Unresolved
Minor
Description
The filesystem consists of a single 300TB OST and is 85% full.
e2freefrag cannot finish on this device because it is killed by the OOM killer, as shown in the log below.
# rpm -qa | grep e2fsprogs
e2fsprogs-1.46.2.wc3-0.el7.x86_64
e2fsprogs-libs-1.46.2.wc3-0.el7.x86_64
e2fsprogs-devel-1.46.2.wc3-0.el7.x86_64
# df -t lustre
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda 313826717284 262586960268 48064149188 85% /lustre/ost0000
# e2freefrag /dev/sda
Jan 12 12:13:52 es7990e1-vm1 kernel: Call Trace:
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf835a9>] dump_stack+0x19/0x1b
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf7e648>] dump_header+0x90/0x229
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac906492>] ? ktime_get_ts64+0x52/0xf0
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac95db1f>] ? delayacct_end+0x8f/0xb0
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac9c204d>] oom_kill_process+0x2cd/0x490
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac9c1a3d>] ? oom_unkillable_task+0xcd/0x120
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac9c273a>] out_of_memory+0x31a/0x500
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac9c9354>] __alloc_pages_nodemask+0xad4/0xbe0
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffaca1c739>] alloc_pages_vma+0xa9/0x200
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffac9f6337>] handle_mm_fault+0xcb7/0xfb0
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf90653>] __do_page_fault+0x213/0x500
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf90a26>] trace_do_page_fault+0x56/0x150
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf8ffa2>] do_async_page_fault+0x22/0xf0
Jan 12 12:13:52 es7990e1-vm1 kernel: [<ffffffffacf8c7a8>] async_page_fault+0x28/0x30
Jan 12 12:13:52 es7990e1-vm1 kernel: Mem-Info:
Jan 12 12:13:52 es7990e1-vm1 kernel: active_anon:32965316 inactive_anon:953248 isolated_anon:0#012 active_file:19299 inactive_file:18944 isolated_file:0#012 unevictable:0 dirty:0 writeback:2 unstable:0#012 slab_reclaimable:122379 slab_unreclaimable:74262#012 mapped:8092 shmem:8083 pagetables:69898 bounce:0#012 free:2249818 free_pcp:3985 free_cma:0
Jan 12 12:13:52 es7990e1-vm1 kernel: Node 0 DMA free:15892kB min:868kB low:1084kB high:1300kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 12 12:13:52 es7990e1-vm1 kernel: lowmem_reserve[]: 0 943 150076 150076
Jan 12 12:13:52 es7990e1-vm1 kernel: Node 0 DMA32 free:648832kB min:52748kB low:65932kB high:79120kB active_anon:59992kB inactive_anon:47604kB active_file:1524kB inactive_file:1020kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080608kB managed:966464kB mlocked:0kB dirty:0kB writeback:0kB mapped:252kB shmem:252kB slab_reclaimable:6616kB slab_unreclaimable:5092kB kernel_stack:384kB pagetables:236kB unstable:0kB bounce:0kB free_pcp:7136kB local_pcp:264kB free_cma:0kB writeback_tmp:0kB pages_scanned:5844902 all_unreclaimable? yes
Jan 12 12:13:52 es7990e1-vm1 kernel: lowmem_reserve[]: 0 0 149132 149132
Jan 12 12:13:52 es7990e1-vm1 kernel: Node 0 Normal free:8334548kB min:8334988kB low:10418732kB high:12502480kB active_anon:131801272kB inactive_anon:3765388kB active_file:75672kB inactive_file:74756kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:155189248kB managed:152711608kB mlocked:0kB dirty:0kB writeback:8kB mapped:32116kB shmem:32080kB slab_reclaimable:482900kB slab_unreclaimable:291940kB kernel_stack:19024kB pagetables:279356kB unstable:0kB bounce:0kB free_pcp:8804kB local_pcp:264kB free_cma:0kB writeback_tmp:0kB pages_scanned:261380 all_unreclaimable? yes
Jan 12 12:13:52 es7990e1-vm1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 12 12:13:52 es7990e1-vm1 kernel: Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Jan 12 12:13:53 es7990e1-vm1 kernel: Node 0 DMA32: 247*4kB (UE) 268*8kB (UEM) 209*16kB (UE) 161*32kB (UE) 83*64kB (UEM) 10*128kB (UEM) 94*256kB (UEM) 88*512kB (UE) 9*1024kB (EM) 39*2048kB (UEM) 115*4096kB (UEM) = 647468kB
Jan 12 12:13:53 es7990e1-vm1 kernel: Node 0 Normal: 8198*4kB (UEM) 8367*8kB (UEM) 6746*16kB (UE) 4778*32kB (UEM) 3271*64kB (UEM) 1641*128kB (UE) 694*256kB (UEM) 274*512kB (UEM) 146*1024kB (UEM) 110*2048kB (UE) 1675*4096kB (U) = 8333488kB
Jan 12 12:13:53 es7990e1-vm1 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jan 12 12:13:53 es7990e1-vm1 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 12 12:13:53 es7990e1-vm1 kernel: 46685 total pagecache pages
Jan 12 12:13:53 es7990e1-vm1 kernel: 227 pages in swap cache
Jan 12 12:13:53 es7990e1-vm1 kernel: Swap cache stats: add 2739967, delete 2739738, find 67504/71273
Jan 12 12:13:53 es7990e1-vm1 kernel: Free swap = 0kB
Jan 12 12:13:53 es7990e1-vm1 kernel: Total swap = 5472252kB
Jan 12 12:13:53 es7990e1-vm1 kernel: 39321462 pages RAM
Jan 12 12:13:53 es7990e1-vm1 kernel: 0 pages HighMem/MovableOnly
Jan 12 12:13:53 es7990e1-vm1 kernel: 897967 pages reserved
Jan 12 12:13:53 es7990e1-vm1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 807] 0 807 35383 9157 70 31 0 systemd-journal
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 838] 0 838 11412 5 23 134 -1000 systemd-udevd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 973] 0 973 152045 0 38 208 0 lvmetad
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1737] 0 1737 13883 22 27 89 -1000 auditd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1766] 0 1766 6596 20 18 54 0 systemd-logind
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1770] 0 1770 5444 50 16 65 0 irqbalance
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1775] 0 1775 22652 0 47 224 0 rngd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1790] 998 1790 2145 7 10 30 0 lsmd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1800] 32 1800 17314 14 38 129 0 rpcbind
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1806] 81 1806 15046 51 34 82 -900 dbus-daemon
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1919] 0 1919 50357 11 39 123 0 gssproxy
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 1970] 0 1970 13220 1 32 205 0 smartd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 2899] 0 2899 25736 1 48 517 0 dhclient
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3184] 0 3184 6261 0 17 58 0 xinetd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3193] 0 3193 76950 5874 80 348 0 rsyslogd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3198] 0 3198 28235 1 57 257 -1000 sshd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3235] 0 3235 6477 0 18 53 0 atd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3247] 0 3247 57127 0 40 188 0 sharpd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3277] 0 3277 27551 1 10 33 0 agetty
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3279] 0 3279 27551 1 13 33 0 agetty
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3472] 0 3472 31596 20 20 135 0 crond
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 3473] 38 3473 6954 1 17 150 0 ntpd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 6542] 999 6542 153604 0 63 883 0 polkitd
Jan 12 12:13:53 es7990e1-vm1 kernel: [ 6555] 29 6555 16802 0 38 285 0 rpc.statd
Jan 12 12:13:53 es7990e1-vm1 kernel: [13484] 0 13484 39204 40 77 303 0 sshd
Jan 12 12:13:53 es7990e1-vm1 kernel: [13486] 0 13486 29271 1 13 506 0 bash
Jan 12 12:13:53 es7990e1-vm1 kernel: [13770] 0 13770 32001 1 18 144 0 screen
Jan 12 12:13:53 es7990e1-vm1 kernel: [13771] 0 13771 29273 1 14 489 0 bash
Jan 12 12:13:53 es7990e1-vm1 kernel: [13950] 0 13950 35405515 33909205 68922 1355850 0 e2freefrag
Jan 12 12:13:53 es7990e1-vm1 kernel: [13951] 0 13951 27013 0 9 25 0 tee
Jan 12 12:13:53 es7990e1-vm1 kernel: [13984] 0 13984 40796 372 36 77 0 top
Jan 12 12:13:53 es7990e1-vm1 kernel: Out of memory: Kill process 13950 (e2freefrag) score 861 or sacrifice child
Jan 12 12:13:53 es7990e1-vm1 kernel: Killed process 13950 (e2freefrag), UID 0, total-vm:141622060kB, anon-rss:135636820kB, file-rss:0kB, shmem-rss:0kB
Jan 12 12:13:53 es7990e1-vm1 kernel: systemd-journal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jan 12 12:13:53 es7990e1-vm1 kernel: systemd-journal cpuset=/ mems_allowed=0
Jan 12 12:13:53 es7990e1-vm1 kernel: CPU: 4 PID: 807 Comm: systemd-journal Kdump: loaded Tainted: G OE ------------ T 3.10.0-1160.31.1.el7_lustre.ddn15.x86_64 #
Setting vm.min_free_kbytes = 8388608 did not help in this case.
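For reference, the page counts in the OOM process table can be cross-checked against the kB figures in the "Killed process" line. The following is only a rough sketch: it assumes a 4 KiB page size (the usual x86_64 default), and the variable names are illustrative. All numbers are copied from the log above.

```python
# Assumption: 4 KiB pages, the usual default on x86_64.
PAGE_KB = 4

def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count from the OOM report into kB."""
    return pages * page_kb

# Values copied from the OOM report for PID 13950 (e2freefrag).
e2freefrag_total_vm_pages = 35405515
e2freefrag_rss_pages = 33909205
# "39321462 pages RAM" from the same report.
ram_pages = 39321462

total_vm_kb = pages_to_kb(e2freefrag_total_vm_pages)
rss_kb = pages_to_kb(e2freefrag_rss_pages)
ram_kb = pages_to_kb(ram_pages)

# Share of physical RAM held by e2freefrag when it was killed.
rss_share = rss_kb / ram_kb

print(f"total-vm: {total_vm_kb} kB")
print(f"rss:      {rss_kb} kB")
print(f"RAM:      {ram_kb} kB")
print(f"rss/RAM:  {rss_share:.1%}")
```

Under that assumption the computed total-vm and rss values match the total-vm:141622060kB and anon-rss:135636820kB reported by the kernel, so e2freefrag alone accounted for most of the node's memory when it was killed.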