AskOverflow.Dev


Questions tagged [oom] (server)

dtw
Asked: 2021-04-10 04:24:41 +0800 CST

mariadb oom-killer on CentOS 8 in an EC2 t2.micro instance

  • 0

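I seem to have a memory problem on a t2.micro instance (1 GB) running nginx, mariadb, php, and WordPress.

I can see that mariadb.service is being killed frequently (I used grep -e kill /var/log/messages; sample output below). As you can see, mysqld is killing mariadb (isn't that suicide?).

I have tried various tweaks to mariadb, but I think this is more of an overall system problem.

I cannot isolate when the "crash" happens. I can happily log in to the WordPress admin and work away, and then when I switch to a new area (prompting a database call), the site hangs. SSH connections hang at the same time.

Is the t2.micro simply not powerful enough?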

Apr  9 11:22:32 ip-172-31-20-68 systemd[1]: mariadb.service: Main process exited, code=killed, status=9/KILL
Apr  9 11:23:25 ip-172-31-20-68 kernel: mysqld invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr  9 11:23:25 ip-172-31-20-68 kernel: oom_kill_process.cold.28+0xb/0x10
Apr  9 11:23:25 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mariadb.service,task=mysqld,pid=6383,uid=27
Apr  9 11:23:25 ip-172-31-20-68 systemd[1]: mariadb.service: Main process exited, code=killed, status=9/KILL
Apr  9 11:27:04 ip-172-31-20-68 kernel: tuned invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr  9 11:27:04 ip-172-31-20-68 kernel: oom_kill_process.cold.28+0xb/0x10
Apr  9 11:27:04 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mariadb.service,task=mysqld,pid=6483,uid=27
Apr  9 11:27:51 ip-172-31-20-68 NetworkManager[928]: <info>  [1617967671.1410] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Apr  9 11:27:51 ip-172-31-20-68 NetworkManager[928]: <info>  [1617967671.1411] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Apr  9 12:04:48 ip-172-31-20-68 kernel: mysqld invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr  9 12:04:48 ip-172-31-20-68 kernel: oom_kill_process.cold.28+0xb/0x10
Apr  9 12:04:48 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mariadb.service,task=mysqld,pid=1746,uid=27
Apr  9 12:04:48 ip-172-31-20-68 systemd[1]: mariadb.service: Main process exited, code=killed, status=9/KILL
Apr  9 12:06:04 ip-172-31-20-68 kernel: tuned invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr  9 12:06:04 ip-172-31-20-68 kernel: oom_kill_process.cold.28+0xb/0x10
Apr  9 12:06:04 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-3.scope,task=dnf,pid=1982,uid=0
Apr  9 12:08:30 ip-172-31-20-68 kernel: php-fpm invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Apr  9 12:08:31 ip-172-31-20-68 kernel: oom_kill_process.cold.28+0xb/0x10
Apr  9 12:08:31 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mariadb.service,task=mysqld,pid=2126,uid=27
Apr  9 12:08:31 ip-172-31-20-68 systemd[1]: mariadb.service: Main process exited, code=killed, status=9/KILL
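
The "suicide" reading depends on which log line names which process. In these kernel messages, the `invoked oom-killer` line names the process whose allocation triggered the OOM killer, while the `task=` field of the `oom-kill:` line names the process that was chosen to be killed. The following is only a sketch for pairing the two; the helper name `oom_events` and the two regular expressions are made up for illustration and assume the message layout shown above.

```python
import re

# Assumed layout, taken from the log excerpt above.
INVOKED = re.compile(r"kernel: (\S+) invoked oom-killer")
VICTIM = re.compile(r"oom-kill:.*task=([^,]+),pid=(\d+)")

def oom_events(lines):
    """Pair each 'invoked oom-killer' line with the next 'oom-kill:' line.

    Returns a list of (invoker, victim, victim_pid) tuples.
    """
    events = []
    invoker = None
    for line in lines:
        match = INVOKED.search(line)
        if match:
            invoker = match.group(1)
            continue
        match = VICTIM.search(line)
        if match:
            events.append((invoker, match.group(1), int(match.group(2))))
            invoker = None
    return events

if __name__ == "__main__":
    sample = [
        "Apr  9 12:08:30 ip-172-31-20-68 kernel: php-fpm invoked oom-killer: "
        "gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0",
        "Apr  9 12:08:31 ip-172-31-20-68 kernel: oom-kill:constraint=CONSTRAINT_NONE,"
        "nodemask=(null),cpuset=/,mems_allowed=0,global_oom,"
        "task_memcg=/system.slice/mariadb.service,task=mysqld,pid=2126,uid=27",
    ]
    for invoker, victim, pid in oom_events(sample):
        print(f"{invoker} triggered the OOM killer; {victim} (pid {pid}) was killed")
```

Run over the excerpt above, this would show that the triggering process varies (mysqld, tuned, php-fpm) while the victim is usually mysqld.
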
amazon-ec2 mariadb oom
  • 1 answer
  • 231 Views
foss4me
Asked: 2020-10-22 15:44:14 +0800 CST

fio 3.23 core dumps when benchmarking many small files

  • 6

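I was asked to provide fio benchmark results for this test data set: 1048576x1MiB. The overall size is therefore 1 TiB. The set contains 2^20 files of 1 MiB each. The server runs CentOS Linux release 7.8.2003 (Core). It has plenty of memory: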

[root@tbn-6 src]# free -g
              total        used        free      shared  buff/cache   available
Mem:            376           8         365           0           2         365
Swap:             3           2           1

It is actually not a physical server. Instead, it is a Docker container with the following CPU:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                48
On-line CPU(s) list:   0-47
Thread(s) per core:    2
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
[...]

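Why Docker? We are working on a project that evaluates whether containers are an appropriate replacement for physical servers. Back to the fio issue.

I remember having trouble with fio before when dealing with data sets that contain many small files. So I did the following checks: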

[root@tbn-6 src]# ulimit -Hn
8388608
[root@tbn-6 src]# ulimit -Sn
8388608
[root@tbn-6 src]# cat /proc/sys/kernel/shmmax
18446744073692774399

Everything looks fine to me. I also compiled fio 3.23, the latest version at the time of writing, with GCC 9.

[root@tbn-6 src]# fio --version
fio-3.23

Here is the job file:

[root@tbn-6 src]# cat testfio.ini 
[writetest]
thread=1
blocksize=2m
rw=randwrite
direct=1
buffered=0
ioengine=psync
gtod_reduce=1
numjobs=12
iodepth=1
runtime=180
group_reporting=1
percentage_random=90
opendir=./1048576x1MiB

Note: of the settings above, the following can be taken out:

[...]
gtod_reduce=1
[...]
runtime=180
group_reporting=1
[...]

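The rest must stay. In our view, when running fio the job file should be set up to mimic the application's interaction with storage as closely as possible, even knowing that fio != the application.

My first run went like this: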

[root@tbn-6 src]# fio testfio.ini
smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
smalloc: size = 368, alloc_size = 388, blocks = 13
smalloc: pool 0, free/total blocks 1/524320
smalloc: pool 1, free/total blocks 8/524320
smalloc: pool 2, free/total blocks 10/524320
smalloc: pool 3, free/total blocks 10/524320
smalloc: pool 4, free/total blocks 10/524320
smalloc: pool 5, free/total blocks 10/524320
smalloc: pool 6, free/total blocks 10/524320
smalloc: pool 7, free/total blocks 10/524320
fio: filesetup.c:1613: alloc_new_file: Assertion `0' failed.
Aborted (core dumped)

OK, time to use --alloc-size:

[root@tbn-6 src]# fio --alloc-size=776 testfio.ini
smalloc: OOM. Consider using --alloc-size to increase the shared memory available.
smalloc: size = 368, alloc_size = 388, blocks = 13
smalloc: pool 0, free/total blocks 1/524320
smalloc: pool 1, free/total blocks 8/524320
smalloc: pool 2, free/total blocks 10/524320
smalloc: pool 3, free/total blocks 10/524320
smalloc: pool 4, free/total blocks 10/524320
smalloc: pool 5, free/total blocks 10/524320
smalloc: pool 6, free/total blocks 10/524320
smalloc: pool 7, free/total blocks 10/524320
smalloc: pool 8, free/total blocks 8/524288
smalloc: pool 9, free/total blocks 8/524288
smalloc: pool 10, free/total blocks 8/524288
smalloc: pool 11, free/total blocks 8/524288
smalloc: pool 12, free/total blocks 8/524288
smalloc: pool 13, free/total blocks 8/524288
smalloc: pool 14, free/total blocks 8/524288
smalloc: pool 15, free/total blocks 8/524288
fio: filesetup.c:1613: alloc_new_file: Assertion `0' failed.
Aborted (core dumped)

Back to square one :(

I must be missing something. Any help is much appreciated.
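
A rough sanity check of the numbers in the smalloc output above, as a sketch only. It assumes that every file costs the same 13 blocks as the allocation that failed and that nothing else competes for the pools; both are assumptions read off the printed output, not taken from fio's documentation. The helper names are hypothetical.

```python
# Figures copied from the smalloc output; everything else is an assumption.
BLOCKS_PER_FILE = 13   # "blocks = 13" reported for the failed allocation
FILES = 2 ** 20        # 1048576 files in the data set

def total_blocks(pools):
    """Sum the capacity of a list of (pool_count, blocks_per_pool) pairs."""
    return sum(count * blocks for count, blocks in pools)

def files_that_fit(pools, blocks_per_file=BLOCKS_PER_FILE):
    """How many per-file allocations fit if nothing else uses the pools."""
    return total_blocks(pools) // blocks_per_file

# Pool layouts as printed by the two runs.
first_run = [(8, 524320)]
second_run = [(8, 524320), (8, 524288)]

if __name__ == "__main__":
    print("blocks needed for all files:", FILES * BLOCKS_PER_FILE)
    print("first run, blocks available:", total_blocks(first_run))
    print("first run, files that fit:", files_that_fit(first_run))
    print("second run, blocks available:", total_blocks(second_run))
    print("second run, files that fit:", files_that_fit(second_run))
```

Under those assumptions, neither pool layout has room for all 1048576 files, which would be consistent with both runs ending in the same assertion.
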

memory fio crash oom
  • 1 answer
  • 429 Views
3voC
Asked: 2019-08-13 02:13:03 +0800 CST

OOM despite plenty of free swap

  • 0

During some ML training, htop showed all of the RAM (16 GB) in use but only 2 GB of swap (out of 16 GB) in use when the OOM occurred.

dmesg shows:

[pon sie 12 11:53:44 2019] Purging GPU memory, 0 bytes freed, 131072 bytes still pinned.
[pon sie 12 11:53:44 2019] 73728 and 0 bytes still available in the bound and unbound GPU page lists.
[pon sie 12 11:53:44 2019] iscsid invoked oom-killer: gfp_mask=0x24200ca, order=0, oom_score_adj=0
[pon sie 12 11:53:44 2019] iscsid cpuset=/ mems_allowed=0
[pon sie 12 11:53:44 2019] CPU: 1 PID: 1306 Comm: iscsid Tainted: P           OE   4.4.0-116-generic #140-Ubuntu
[pon sie 12 11:53:44 2019] Hardware name: MSI MS-7996/H110M PRO-VD (MS-7996), BIOS 2.E0 08/11/2017
[pon sie 12 11:53:44 2019]  0000000000000286 faad9ce5c4d517dc ffff8804677a79d8 ffffffff813ffc13
[pon sie 12 11:53:44 2019]  ffff8804677a7b90 ffff880464551e00 ffff8804677a7a48 ffffffff8121012e
[pon sie 12 11:53:44 2019]  0000000000000015 0000000000000000 ffff88046464f300 ffff880463138000
[pon sie 12 11:53:44 2019] Call Trace:
[pon sie 12 11:53:44 2019]  [<ffffffff813ffc13>] dump_stack+0x63/0x90
[pon sie 12 11:53:44 2019]  [<ffffffff8121012e>] dump_header+0x5a/0x1c5
[pon sie 12 11:53:44 2019]  [<ffffffff81397c44>] ? apparmor_capable+0xc4/0x1b0
[pon sie 12 11:53:44 2019]  [<ffffffff811968f2>] oom_kill_process+0x202/0x3c0
[pon sie 12 11:53:44 2019]  [<ffffffff81196d19>] out_of_memory+0x219/0x460
[pon sie 12 11:53:44 2019]  [<ffffffff8119cd45>] __alloc_pages_slowpath.constprop.88+0x965/0xb00
[pon sie 12 11:53:44 2019]  [<ffffffff8119d168>] __alloc_pages_nodemask+0x288/0x2a0
[pon sie 12 11:53:44 2019]  [<ffffffff811e84ed>] alloc_pages_vma+0xad/0x250
[pon sie 12 11:53:44 2019]  [<ffffffff811d8ece>] __read_swap_cache_async+0xee/0x140
[pon sie 12 11:53:44 2019]  [<ffffffff811d8f46>] read_swap_cache_async+0x26/0x60
[pon sie 12 11:53:44 2019]  [<ffffffff811d9085>] swapin_readahead+0x105/0x1b0
[pon sie 12 11:53:44 2019]  [<ffffffff811c60f0>] handle_mm_fault+0x1320/0x1820
[pon sie 12 11:53:44 2019]  [<ffffffff810f340c>] ? hrtimer_nanosleep+0xdc/0x210
[pon sie 12 11:53:44 2019]  [<ffffffff8106c747>] __do_page_fault+0x197/0x400
[pon sie 12 11:53:44 2019]  [<ffffffff8106c9d2>] do_page_fault+0x22/0x30
[pon sie 12 11:53:44 2019]  [<ffffffff818519d8>] page_fault+0x28/0x30
[pon sie 12 11:53:44 2019] Mem-Info:
[pon sie 12 11:53:44 2019] active_anon:3530449 inactive_anon:422028 isolated_anon:384
                            active_file:179 inactive_file:100 isolated_file:0
                            unevictable:913 dirty:0 writeback:0 unstable:0
                            slab_reclaimable:9555 slab_unreclaimable:17443
                            mapped:666425 shmem:1068600 pagetables:11052 bounce:0
                            free:35287 free_pcp:0 free_cma:0
[pon sie 12 11:53:44 2019] Node 0 DMA free:15888kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[pon sie 12 11:53:44 2019] lowmem_reserve[]: 0 1811 15874 15874 15874
[pon sie 12 11:53:44 2019] Node 0 DMA32 free:63400kB min:7704kB low:9628kB high:11556kB active_anon:1349712kB inactive_anon:449972kB active_file:164kB inactive_file:172kB unevictable:904kB isolated(anon):768kB isolated(file):0kB present:1987100kB managed:1906332kB mlocked:904kB dirty:0kB writeback:0kB mapped:53540kB shmem:495788kB slab_reclaimable:3864kB slab_unreclaimable:7812kB kernel_stack:400kB pagetables:7000kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:11349276 all_unreclaimable? yes
[pon sie 12 11:53:44 2019] lowmem_reserve[]: 0 0 14062 14062 14062
[pon sie 12 11:53:44 2019] Node 0 Normal free:61860kB min:59812kB low:74764kB high:89716kB active_anon:12772084kB inactive_anon:1238140kB active_file:552kB inactive_file:228kB unevictable:2748kB isolated(anon):768kB isolated(file):0kB present:14663680kB managed:14400224kB mlocked:2748kB dirty:0kB writeback:0kB mapped:2612160kB shmem:3778612kB slab_reclaimable:34356kB slab_unreclaimable:61960kB kernel_stack:3952kB pagetables:37208kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:89423048 all_unreclaimable? yes
[pon sie 12 11:53:44 2019] lowmem_reserve[]: 0 0 0 0 0
[pon sie 12 11:53:44 2019] Node 0 DMA: 2*4kB (U) 3*8kB (U) 3*16kB (U) 0*32kB 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
[pon sie 12 11:53:44 2019] Node 0 DMA32: 147*4kB (UME) 54*8kB (UME) 458*16kB (UME) 139*32kB (UME) 106*64kB (UME) 51*128kB (UME) 23*256kB (UE) 17*512kB (UE) 3*1024kB (UME) 0*2048kB 5*4096kB (M) = 64252kB
[pon sie 12 11:53:44 2019] Node 0 Normal: 267*4kB (UME) 183*8kB (UME) 230*16kB (UMEH) 215*32kB (UMEH) 86*64kB (UEH) 86*128kB (UMEH) 70*256kB (UMEH) 26*512kB (UMEH) 1*1024kB (U) 0*2048kB 0*4096kB = 61860kB
[pon sie 12 11:53:44 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[pon sie 12 11:53:44 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[pon sie 12 11:53:44 2019] 1083770 total pagecache pages
[pon sie 12 11:53:44 2019] 14283 pages in swap cache
[pon sie 12 11:53:44 2019] Swap cache stats: add 81268022, delete 81253739, find 37239561/47307273
[pon sie 12 11:53:44 2019] Free swap  = 14632936kB
[pon sie 12 11:53:44 2019] Total swap = 16662524kB
[pon sie 12 11:53:44 2019] 4166692 pages RAM
[pon sie 12 11:53:44 2019] 0 pages HighMem/MovableOnly
[pon sie 12 11:53:44 2019] 86081 pages reserved
[pon sie 12 11:53:44 2019] 0 pages cma reserved
[pon sie 12 11:53:44 2019] 0 pages hwpoisoned
[pon sie 12 11:53:44 2019] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[pon sie 12 11:53:44 2019] [  449]     0   449    42126      113      18       3       75             0 lvmetad
[pon sie 12 11:53:44 2019] [  461]     0   461    11311      172      22       3      361         -1000 systemd-udevd
[pon sie 12 11:53:44 2019] [  917]     0   917   151045      242      28       3      730             0 lxcfs
[pon sie 12 11:53:44 2019] [  924]     0   924     7557      409      19       3       80             0 cron
[pon sie 12 11:53:44 2019] [  929]     0   929     1099      288       7       3       36             0 acpid
[pon sie 12 11:53:44 2019] [  944]   112   944    11227      404      25       3      117             0 avahi-daemon
[pon sie 12 11:53:44 2019] [  946]     0   946     6511      375      18       3       77             0 atd
[pon sie 12 11:53:44 2019] [  951]     0   951     6320      398      17       4      227             0 smartd
[pon sie 12 11:53:44 2019] [  954]   104   954    64098      283      28       3      283             0 rsyslogd
[pon sie 12 11:53:44 2019] [  956]   107   956    10792      329      25       3      191          -900 dbus-daemon
[pon sie 12 11:53:44 2019] [  977]   112   977    11196        0      24       3       81             0 avahi-daemon
[pon sie 12 11:53:44 2019] [  982]     0   982   112713      309      73       4      700             0 NetworkManager
[pon sie 12 11:53:44 2019] [  983]     0   983     8863      347      20       3      241             0 openvpn
[pon sie 12 11:53:44 2019] [  986]     0   986    69321      328      37       3      234             0 accounts-daemon
[pon sie 12 11:53:44 2019] [ 1002]     0  1002     3343      226      11       3       41             0 mdadm
[pon sie 12 11:53:44 2019] [ 1011]     0  1011    69277      371      39       4      188             0 polkitd
[pon sie 12 11:53:44 2019] [ 1087]     0  1087     4030      321      13       3      215             0 dhclient
[pon sie 12 11:53:44 2019] [ 1275]     0  1275    16377      341      37       3      185         -1000 sshd
[pon sie 12 11:53:44 2019] [ 1284]     0  1284   211630        0      86       6     9081          -500 dockerd
[pon sie 12 11:53:44 2019] [ 1306]     0  1306     1305      405       9       3       51             0 iscsid
[pon sie 12 11:53:44 2019] [ 1310]     0  1310     1430      876       9       3        0           -17 iscsid
[pon sie 12 11:53:44 2019] [ 1409]     0  1409     4868      367      14       3       73             0 irqbalance
[pon sie 12 11:53:44 2019] [ 1418]     0  1418     4289      313      14       3       39             0 agetty
[pon sie 12 11:53:44 2019] [ 1457]     0  1457   168379        0      65       6     5264          -500 docker-containe
[pon sie 12 11:53:44 2019] [ 2193]     0  2193    10705      145      14       5      154          -500 docker-proxy
[pon sie 12 11:53:44 2019] [ 2229]     0  2229    66420      186      21       5      660          -500 docker-proxy
[pon sie 12 11:53:44 2019] [ 2264]     0  2264    29138      123      16       5      167          -500 docker-proxy
[pon sie 12 11:53:44 2019] [ 2299]     0  2299     1876        0       8       5      347          -999 docker-containe
[pon sie 12 11:53:44 2019] [ 2391]   100  2391      396        0       7       3       29             0 entrypoint.sh
[pon sie 12 11:53:44 2019] [ 2638]   100  2638    12125        0      30       3     7058             0 tor
[pon sie 12 11:53:44 2019] [30130]     0 30130     7160      153      20       3       89             0 systemd-logind
[pon sie 12 11:53:44 2019] [30475]     0 30475    10959      278      23       3     1630             0 systemd-journal
[pon sie 12 11:53:44 2019] [31698]  1001 31698    11319      336      25       3      216             0 systemd
[pon sie 12 11:53:44 2019] [31701]  1001 31701    53717        0      39       3     1509             0 (sd-pam)
[pon sie 12 11:53:44 2019] [31955]   100 31955    24023      294      17       3       72             0 systemd-timesyn
[pon sie 12 11:53:44 2019] [31981]     0 31981    23731      413      49       4      236             0 sshd
[pon sie 12 11:53:44 2019] [32013]  1001 32013    23731      266      48       4      259             0 sshd
[pon sie 12 11:53:44 2019] [32014]  1001 32014     6235      500      17       3      599             0 bash
[pon sie 12 11:53:44 2019] [32194]     0 32194    23731      419      49       3      236             0 sshd
[pon sie 12 11:53:44 2019] [32287]  1001 32287    23788      197      48       3      244             0 sshd
[pon sie 12 11:53:44 2019] [32288]  1001 32288     6229      420      17       3      659             0 bash
[pon sie 12 11:53:44 2019] [32738]  1001 32738 11646105  3547862    9767      28   454793             0 training.py
[pon sie 12 11:53:44 2019] [32754]     0 32754     4286      267      14       3       45             0 nvidia-persiste
[pon sie 12 11:53:44 2019] [  343]     0   343    23731      437      51       3      236             0 sshd
[pon sie 12 11:53:44 2019] [  445]  1001   445    23731      101      49       3      206             0 sshd
[pon sie 12 11:53:44 2019] [  451]  1001   451     6188      480      17       3      514             0 bash
[pon sie 12 11:53:44 2019] [  473]  1001   473     6847      408      18       3        2             0 htop
[pon sie 12 11:53:44 2019] Out of memory: Kill process 32738 (training.py) score 486 or sacrifice child
[pon sie 12 11:53:44 2019] Killed process 32738 (training.py) total-vm:46584420kB, anon-rss:11534960kB, file-rss:2656488kB

Why isn't the remaining swap being used?
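
To make the dump above easier to read, the page counters in the Mem-Info block can be converted to GiB and the swap figures subtracted. This is only a sketch: it assumes 4 KiB pages (which matches "4166692 pages RAM" on a 16 GB machine), and the helper names are made up for illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages

def pages_to_gib(pages, page_kib=PAGE_KIB):
    """Convert a page count from the Mem-Info block to GiB."""
    return pages * page_kib / (1024 * 1024)

def swap_used_kib(total_kib, free_kib):
    """Swap in use, from the 'Total swap' and 'Free swap' lines."""
    return total_kib - free_kib

# Values copied from the dmesg output above.
counters = {
    "active_anon": 3530449,
    "inactive_anon": 422028,
    "shmem": 1068600,
    "mapped": 666425,
    "free": 35287,
}

if __name__ == "__main__":
    for name, pages in counters.items():
        print(f"{name:14s} {pages_to_gib(pages):6.2f} GiB")
    print("swap used:", swap_used_kib(16662524, 14632936), "kB")
```
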

oom
  • 1 answer
  • 288 Views
Sudh33ra
Asked: 2017-03-17 03:28:21 +0800 CST

Ansible throws "ERROR! A worker was found in a dead state"

  • 0

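When I run a playbook that simply copies a directory from one place to another, Ansible throws the

ERROR! A worker was found in a dead state

error. After some googling, it looks like this is caused by the oom-killer killing the Ansible process (but I am not sure that is the case). My memory is: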

              total        used        free      shared  buff/cache   available
Mem:            991         372         448           1         170         467
Swap:           511         365         146

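I don't know how to fix it. I should mention that when I first ran the playbook I had only RAM, and it could not run because it ran out of memory. After that, I added swap. I am not sure whether it is relevant, but note that it is a swap file, not a separate partition.

I watched memory while the playbook ran, and I saw that free swap drops quickly once that task starts. The error message is thrown when it reaches 0.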
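
The observation above could be recorded rather than watched by hand. As a sketch, assuming the usual `Key:   value kB` layout of /proc/meminfo on Linux, the helpers below (hypothetical names) extract the swap figures from that file's text; the numbers in the example string are invented for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values

def swap_free_fraction(values):
    """Fraction of swap still free; 0.0 if no swap is configured."""
    total = values.get("SwapTotal", 0)
    return values.get("SwapFree", 0) / total if total else 0.0

if __name__ == "__main__":
    # Invented sample; on a real system read open("/proc/meminfo").read().
    sample = "MemTotal:  1014784 kB\nSwapTotal:  523260 kB\nSwapFree:  149504 kB\n"
    info = parse_meminfo(sample)
    print(f"swap free: {info['SwapFree']} kB "
          f"({swap_free_fraction(info):.0%} of {info['SwapTotal']} kB)")
```

Calling this in a loop with a short sleep while the playbook runs would give a timestamped record of how fast swap drains.
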


I am running the following playbook.

---
- hosts: localhost
  become: true
  become_method: sudo
  become_user: root

  vars:
    portals:
      - mysite
    contentPath: "/var/www/"
    backupPath: "/home/dataFiles/backups/"

  tasks:

    - name: backup content
      copy:
        src: "{{ contentPath }}/{{ item }}"
        dest: "{{backupPath}}/{{ item }}/{{ ansible_date_time.date }}/"
      with_items:
        - "{{ portals }}"
...

The error I gave above is the only information I get from Ansible. Even running the playbook in verbose mode does not provide anything extra.

memory automation swap ansible oom
  • 2 answers
  • 11108 Views
Patrick
Asked: 2017-01-31 23:23:45 +0800 CST

Linux process killed even though enough memory is available

  • 9

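I am investigating why two of our processes were killed by the Linux OOM killer, even though there seemed to be enough RAM and plenty of free swap both times.

As I interpret it, following this answer, the first memory request asked for 2^2 = 4 pages (16 KB) of memory (the order flag) and wanted it from the "Normal" zone.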

Jan 27 04:26:14 kernel: [639964.652706] java invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0

If I parse the output correctly, there is more than enough space:

Node 0 Normal free:178144kB min:55068kB low:68832kB high:82600kB 

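A few minutes later the same request occurred a second time, and again there seemed to be enough free space.

So why was the OOM killer triggered? Am I parsing the information incorrectly?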

  • The system is Ubuntu 14.04 with a 4.4.0-59 x64 kernel.
  • The vm.overcommit_memory setting is "0" (heuristic), which may not be optimal.
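
One way to cross-check the reading above is to look at the per-order free list that the kernel prints for each zone (the `Node 0 Normal: 32943*4kB ...` line in the dump). An order-2 request needs a contiguous 16 kB block, so only blocks of that size or larger count. The following is a sketch with hypothetical helper names; it assumes 4 kB pages and the `count*sizekB (TYPES)` layout shown in the dump.

```python
import re

# Matches entries such as "32943*4kB (UMEH)" or "0*256kB".
BUDDY_ENTRY = re.compile(r"(\d+)\*(\d+)kB(?: \(([A-Z]+)\))?")

def parse_buddy_line(line):
    """Return (count, block_kb, types) for every entry on a free-list line."""
    return [(int(count), int(size), types)
            for count, size, types in BUDDY_ENTRY.findall(line)]

def free_kb_at_or_above(entries, order, page_kb=4):
    """Free memory held in blocks large enough for a request of this order."""
    min_block_kb = page_kb << order
    return sum(count * size for count, size, _ in entries if size >= min_block_kb)

# Copied from the first dump.
NORMAL = ("Node 0 Normal: 32943*4kB (UMEH) 5702*8kB (UMEH) 19*16kB (H) "
          "13*32kB (H) 9*64kB (H) 2*128kB (H) 0*256kB 0*512kB 0*1024kB "
          "0*2048kB 0*4096kB = 178940kB")

if __name__ == "__main__":
    entries = parse_buddy_line(NORMAL)
    print("free in any block:", free_kb_at_or_above(entries, 0), "kB")
    print("free in blocks >= 16 kB:", free_kb_at_or_above(entries, 2), "kB")
```

In this dump nearly all of the free memory in the Normal zone sits in 4 kB and 8 kB blocks, and the few larger blocks carry only the (H) tag, so the zone-wide `free:` figure may overstate what an order-2 request can actually use.
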

Instance one:

Jan 27 04:26:14 kernel: [639964.652706] java invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Jan 27 04:26:14 kernel: [639964.652711] java cpuset=/ mems_allowed=0
Jan 27 04:26:14 kernel: [639964.652716] CPU: 5 PID: 2152 Comm: java Not tainted 4.4.0-59-generic #80~14.04.1-Ubuntu
Jan 27 04:26:14 kernel: [639964.652717] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 06/22/2012
Jan 27 04:26:14 kernel: [639964.652719]  0000000000000000 ffff88041a963b38 ffffffff813dbd6c ffff88041a963cf0
Jan 27 04:26:14 kernel: [639964.652721]  0000000000000000 ffff88041a963bc8 ffffffff811fafc6 0000000000000000
Jan 27 04:26:14 kernel: [639964.652722]  0000000000000000 0000000000000000 ffff88042a6d1b88 0000000000000015
Jan 27 04:26:14 kernel: [639964.652724] Call Trace:
Jan 27 04:26:14 kernel: [639964.652731]  [<ffffffff813dbd6c>] dump_stack+0x63/0x87
Jan 27 04:26:14 kernel: [639964.652736]  [<ffffffff811fafc6>] dump_header+0x5b/0x1d5
Jan 27 04:26:14 kernel: [639964.652741]  [<ffffffff813766f1>] ? apparmor_capable+0xd1/0x180
Jan 27 04:26:14 kernel: [639964.652746]  [<ffffffff81188b35>] oom_kill_process+0x205/0x3d0
Jan 27 04:26:14 kernel: [639964.652747]  [<ffffffff8118916b>] out_of_memory+0x40b/0x460
Jan 27 04:26:14 kernel: [639964.652749]  [<ffffffff811fba7f>] __alloc_pages_slowpath.constprop.87+0x742/0x7ad
Jan 27 04:26:14 kernel: [639964.652752]  [<ffffffff8118e167>] __alloc_pages_nodemask+0x237/0x240
Jan 27 04:26:14 kernel: [639964.652754]  [<ffffffff8118e32d>] alloc_kmem_pages_node+0x4d/0xd0
Jan 27 04:26:14 kernel: [639964.652758]  [<ffffffff8107c125>] copy_process+0x185/0x1ce0
Jan 27 04:26:14 kernel: [639964.652763]  [<ffffffff810fd0b4>] ? do_futex+0xf4/0x520
Jan 27 04:26:14 kernel: [639964.652766]  [<ffffffff810a71c9>] ? resched_curr+0xa9/0xd0
Jan 27 04:26:14 kernel: [639964.652768]  [<ffffffff8107de1a>] _do_fork+0x8a/0x310
Jan 27 04:26:14 kernel: [639964.652769]  [<ffffffff8107e149>] SyS_clone+0x19/0x20
Jan 27 04:26:14 kernel: [639964.652775]  [<ffffffff81802c76>] entry_SYSCALL_64_fastpath+0x16/0x75
Jan 27 04:26:14 kernel: [639964.652776] Mem-Info:
Jan 27 04:26:14 kernel: [639964.652780] active_anon:1596719 inactive_anon:281182 isolated_anon:0
Jan 27 04:26:14 kernel: [639964.652780]  active_file:953586 inactive_file:952370 isolated_file:0
Jan 27 04:26:14 kernel: [639964.652780]  unevictable:0 dirty:7358 writeback:0 unstable:0
Jan 27 04:26:14 kernel: [639964.652780]  slab_reclaimable:217903 slab_unreclaimable:12162
Jan 27 04:26:14 kernel: [639964.652780]  mapped:40068 shmem:34861 pagetables:8261 bounce:0
Jan 27 04:26:14 kernel: [639964.652780]  free:71705 free_pcp:0 free_cma:0
Jan 27 04:26:14 kernel: [639964.652783] Node 0 DMA free:15892kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 27 04:26:14 kernel: [639964.652787] lowmem_reserve[]: 0 2951 16005 16005 16005
Jan 27 04:26:14 kernel: [639964.652789] Node 0 DMA32 free:92784kB min:12448kB low:15560kB high:18672kB active_anon:1094416kB inactive_anon:368444kB active_file:579188kB inactive_file:561504kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:3048784kB mlocked:0kB dirty:1188kB writeback:0kB mapped:32604kB shmem:27372kB slab_reclaimable:336288kB slab_unreclaimable:7196kB kernel_stack:1520kB pagetables:3964kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 27 04:26:14 kernel: [639964.652793] lowmem_reserve[]: 0 0 13054 13054 13054
Jan 27 04:26:14 kernel: [639964.652795] Node 0 Normal free:178144kB min:55068kB low:68832kB high:82600kB active_anon:5292460kB inactive_anon:756284kB active_file:3235156kB inactive_file:3247976kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13367448kB mlocked:0kB dirty:28244kB writeback:0kB mapped:127668kB shmem:112072kB slab_reclaimable:535324kB slab_unreclaimable:41436kB kernel_stack:3968kB pagetables:29080kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
Jan 27 04:26:14 kernel: [639964.652798] lowmem_reserve[]: 0 0 0 0 0
Jan 27 04:26:14 kernel: [639964.652800] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Jan 27 04:26:14 kernel: [639964.652807] Node 0 DMA32: 18127*4kB (UME) 2601*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 93316kB
Jan 27 04:26:14 kernel: [639964.652814] Node 0 Normal: 32943*4kB (UMEH) 5702*8kB (UMEH) 19*16kB (H) 13*32kB (H) 9*64kB (H) 2*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 178940kB
Jan 27 04:26:14 kernel: [639964.652820] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 27 04:26:14 kernel: [639964.652821] 1949078 total pagecache pages
Jan 27 04:26:14 kernel: [639964.652822] 8225 pages in swap cache
Jan 27 04:26:14 kernel: [639964.652824] Swap cache stats: add 1131771, delete 1123546, find 7366438/7540102
Jan 27 04:26:14 kernel: [639964.652824] Free swap  = 4080988kB
Jan 27 04:26:14 kernel: [639964.652825] Total swap = 4194300kB
Jan 27 04:26:14 kernel: [639964.652826] 4194174 pages RAM
Jan 27 04:26:14 kernel: [639964.652826] 0 pages HighMem/MovableOnly
Jan 27 04:26:14 kernel: [639964.652827] 86139 pages reserved
Jan 27 04:26:14 kernel: [639964.652828] 0 pages cma reserved
Jan 27 04:26:14 kernel: [639964.652828] 0 pages hwpoisoned
Jan 27 04:26:14 kernel: [639964.652829] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 27 04:26:14 kernel: [639964.652834] [  424]     0   424     4909      388      14       3       68             0 upstart-udev-br
Jan 27 04:26:14 kernel: [639964.652836] [  439]     0   439    13075      456      29       3      322         -1000 systemd-udevd
Jan 27 04:26:14 kernel: [639964.652839] [  724]     0   724     3816      226      13       3       53             0 upstart-socket-
Jan 27 04:26:14 kernel: [639964.652840] [  813]     0   813     5856      449      16       3       57             0 rpcbind
Jan 27 04:26:14 kernel: [639964.652842] [  865]   108   865     5386      456      16       3      113             0 rpc.statd
Jan 27 04:26:14 kernel: [639964.652844] [ 1034]     0  1034     3820      281      12       3       35             0 upstart-file-br
Jan 27 04:26:14 kernel: [639964.652846] [ 1041]   102  1041     9817      366      23       3       50             0 dbus-daemon
Jan 27 04:26:14 kernel: [639964.652847] [ 1045]   101  1045    65018     1203      31       3      384             0 rsyslogd
Jan 27 04:26:14 kernel: [639964.652849] [ 1056]     0  1056    10870      525      26       4       49             0 systemd-logind
Jan 27 04:26:14 kernel: [639964.652851] [ 1063]     0  1063     5870        0      16       3       53             0 rpc.idmapd
Jan 27 04:26:14 kernel: [639964.652852] [ 1153]     0  1153     2558      371       9       3      517             0 dhclient
Jan 27 04:26:14 kernel: [639964.652854] [ 1374]     0  1374     3955      401      13       3       40             0 getty
Jan 27 04:26:14 kernel: [639964.652855] [ 1377]     0  1377     3955      406      13       3       38             0 getty
Jan 27 04:26:14 kernel: [639964.652857] [ 1383]     0  1383     3955      406      13       3       39             0 getty
Jan 27 04:26:14 kernel: [639964.652858] [ 1384]     0  1384     3955      418      13       3       37             0 getty
Jan 27 04:26:14 kernel: [639964.652859] [ 1386]     0  1386     3955      418      12       3       38             0 getty
Jan 27 04:26:14 kernel: [639964.652861] [ 1403]     0  1403    15346      735      34       3      142         -1000 sshd
Jan 27 04:26:14 kernel: [639964.652863] [ 1436]     0  1436     4825      408      13       3       28             0 irqbalance
Jan 27 04:26:14 kernel: [639964.652864] [ 1440]     0  1440     1093      379       8       3       35             0 acpid
Jan 27 04:26:14 kernel: [639964.652866] [ 1442]     0  1442     4785      176      14       3       38             0 atd
Jan 27 04:26:14 kernel: [639964.652867] [ 1443]     0  1443     5914      466      17       3       43             0 cron
Jan 27 04:26:14 kernel: [639964.652869] [ 1464]   105  1464    61957     3600      59       3      273          -900 postgres
Jan 27 04:26:14 kernel: [639964.652870] [ 1561]   107  1561     7864      657      21       3      113             0 ntpd
Jan 27 04:26:14 kernel: [639964.652872] [ 1762]   105  1762    62036    35419     117       3      264             0 postgres
Jan 27 04:26:14 kernel: [639964.652873] [ 1763]   105  1763    61957    35051     117       3      266             0 postgres
Jan 27 04:26:14 kernel: [639964.652875] [ 1764]   105  1764    61957     1773      52       3      306             0 postgres
Jan 27 04:26:14 kernel: [639964.652877] [ 1765]   105  1765    62166     4999     116       3      374             0 postgres
Jan 27 04:26:14 kernel: [639964.652878] [ 1766]   105  1766    25910      617      48       3      274             0 postgres
Jan 27 04:26:14 kernel: [639964.652880] [ 1834]  1004  1834  2002886   692615    1549      10    12707             0 java
Jan 27 04:26:14 kernel: [639964.652881] [ 1921]   106  1921     5835      452      16       3      112             0 nrpe
Jan 27 04:26:14 kernel: [639964.652883] [ 1943]     0  1943   175986      420      41       4       50             0 nscd
Jan 27 04:26:14 kernel: [639964.652884] [ 1978]   109  1978   111112      309      48       4      213             0 nslcd
Jan 27 04:26:14 kernel: [639964.652886] [ 2007]     8  2007     3172      326      11       3       52             0 nullmailer-send
Jan 27 04:26:14 kernel: [639964.652887] [ 2092]     0  2092    34005     1947      70       3     3067             0 /usr/bin/monito
Jan 27 04:26:14 kernel: [639964.652889] [ 2110]     0  2110     1901      367       9       3       25             0 getty
Jan 27 04:26:14 kernel: [639964.652891] [ 2146] 65534  2146    34005     1101      67       3     3810             0 monitorix-httpd
Jan 27 04:26:14 kernel: [639964.652893] [24525]   105 24525  1826264  1151331    3568      10      299             0 postgres
Jan 27 04:26:14 kernel: [639964.652895] [20380]   105 20380    62511    36514     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652897] [21273]   105 21273    62532    36508     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652898] [22133]   105 22133    62610    36827     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652900] [22135]   105 22135    62541    34994     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652901] [22428]     0 22428    15436      739      35       4       11             0 cron
Jan 27 04:26:14 kernel: [639964.652903] [22429]     0 22429    15489      749      35       4       12             0 cron
Jan 27 04:26:14 kernel: [639964.652904] [22442]     0 22442     1112      198       8       3        0             0 sh
Jan 27 04:26:14 kernel: [639964.652906] [22443]  1004 22443     1112      191       8       3        0             0 sh
Jan 27 04:26:14 kernel: [639964.652908] [22444]  1004 22444     3102      748      11       3        0             0 syncDaily.sh
Jan 27 04:26:14 kernel: [639964.652909] [22445]     0 22445     1112      420       8       3        0             0 cron-apt
Jan 27 04:26:14 kernel: [639964.652911] [22465]  1004 22465    55074    10532     113       3        0             0 rsync
Jan 27 04:26:14 kernel: [639964.652912] [22466]     0 22466     1087      171       8       3        0             0 sleep
Jan 27 04:26:14 kernel: [639964.652914] [22467]  1004 22467    29820     9151      62       3        0             0 rsync
Jan 27 04:26:14 kernel: [639964.652915] [22468]  1004 22468    61770     7168     125       3        0             0 rsync
Jan 27 04:26:14 kernel: [639964.652917] [22990]   105 22990    62490    35099     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652919] [23138]   105 23138    62491    35578     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652920] [23139]   105 23139    62690    36657     121       3      236             0 postgres
Jan 27 04:26:14 kernel: [639964.652922] [23140]   105 23140    62455    32973     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652923] [23631]   105 23631    62518    34978     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652925] [23635]   105 23635    62506    35193     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652927] [23636]   105 23636    62455    30085     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652928] [23637]   105 23637    62470    33106     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652930] [23639]   105 23639    62511    34295     120       3      237             0 postgres
Jan 27 04:26:14 kernel: [639964.652940] Out of memory: Kill process 24525 (postgres) score 224 or sacrifice child
Jan 27 04:26:14 kernel: [639964.652975] Killed process 24525 (postgres) total-vm:7305056kB, anon-rss:4460244kB, file-rss:145080kB

Instance 2

Jan 27 04:34:36 kernel: [640466.131656] java invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Jan 27 04:34:36 kernel: [640466.131660] java cpuset=/ mems_allowed=0
Jan 27 04:34:36 kernel: [640466.131665] CPU: 7 PID: 2152 Comm: java Not tainted 4.4.0-59-generic #80~14.04.1-Ubuntu
Jan 27 04:34:36 kernel: [640466.131666] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 06/22/2012
Jan 27 04:34:36 kernel: [640466.131668]  0000000000000000 ffff88041a963b38 ffffffff813dbd6c ffff88041a963cf0
Jan 27 04:34:36 kernel: [640466.131670]  0000000000000000 ffff88041a963bc8 ffffffff811fafc6 0000000000000000
Jan 27 04:34:36 kernel: [640466.131671]  0000000000000000 0000000000000000 ffff88042a6d1b88 0000000000000015
Jan 27 04:34:36 kernel: [640466.131673] Call Trace:
Jan 27 04:34:36 kernel: [640466.131698]  [<ffffffff813dbd6c>] dump_stack+0x63/0x87
Jan 27 04:34:36 kernel: [640466.131712]  [<ffffffff811fafc6>] dump_header+0x5b/0x1d5
Jan 27 04:34:36 kernel: [640466.131721]  [<ffffffff813766f1>] ? apparmor_capable+0xd1/0x180
Jan 27 04:34:36 kernel: [640466.131728]  [<ffffffff81188b35>] oom_kill_process+0x205/0x3d0
Jan 27 04:34:36 kernel: [640466.131730]  [<ffffffff8118916b>] out_of_memory+0x40b/0x460
Jan 27 04:34:36 kernel: [640466.131732]  [<ffffffff811fba7f>] __alloc_pages_slowpath.constprop.87+0x742/0x7ad
Jan 27 04:34:36 kernel: [640466.131734]  [<ffffffff8118e167>] __alloc_pages_nodemask+0x237/0x240
Jan 27 04:34:36 kernel: [640466.131736]  [<ffffffff8118e32d>] alloc_kmem_pages_node+0x4d/0xd0
Jan 27 04:34:36 kernel: [640466.131745]  [<ffffffff8107c125>] copy_process+0x185/0x1ce0
Jan 27 04:34:36 kernel: [640466.131755]  [<ffffffff810fd0b4>] ? do_futex+0xf4/0x520
Jan 27 04:34:36 kernel: [640466.131761]  [<ffffffff810a71c9>] ? resched_curr+0xa9/0xd0
Jan 27 04:34:36 kernel: [640466.131763]  [<ffffffff8107de1a>] _do_fork+0x8a/0x310
Jan 27 04:34:36 kernel: [640466.131765]  [<ffffffff8107e149>] SyS_clone+0x19/0x20
Jan 27 04:34:36 kernel: [640466.131779]  [<ffffffff81802c76>] entry_SYSCALL_64_fastpath+0x16/0x75
Jan 27 04:34:36 kernel: [640466.131781] Mem-Info:
Jan 27 04:34:36 kernel: [640466.131784] active_anon:463046 inactive_anon:339934 isolated_anon:0
Jan 27 04:34:36 kernel: [640466.131784]  active_file:1074992 inactive_file:1398029 isolated_file:0
Jan 27 04:34:36 kernel: [640466.131784]  unevictable:0 dirty:1307 writeback:0 unstable:0
Jan 27 04:34:36 kernel: [640466.131784]  slab_reclaimable:626085 slab_unreclaimable:26239
Jan 27 04:34:36 kernel: [640466.131784]  mapped:40618 shmem:35429 pagetables:4038 bounce:0
Jan 27 04:34:36 kernel: [640466.131784]  free:161367 free_pcp:0 free_cma:0
Jan 27 04:34:36 kernel: [640466.131788] Node 0 DMA free:15892kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 27 04:34:36 kernel: [640466.131792] lowmem_reserve[]: 0 2951 16005 16005 16005
Jan 27 04:34:36 kernel: [640466.131794] Node 0 DMA32 free:112056kB min:12448kB low:15560kB high:18672kB active_anon:465908kB inactive_anon:478436kB active_file:620808kB inactive_file:963844kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:3048784kB mlocked:0kB dirty:844kB writeback:0kB mapped:11132kB shmem:5644kB slab_reclaimable:390764kB slab_unreclaimable:8488kB kernel_stack:1408kB pagetables:2304kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 27 04:34:36 kernel: [640466.131798] lowmem_reserve[]: 0 0 13054 13054 13054
Jan 27 04:34:36 kernel: [640466.131800] Node 0 Normal free:517520kB min:55068kB low:68832kB high:82600kB active_anon:1386276kB inactive_anon:881300kB active_file:3679160kB inactive_file:4628272kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13367448kB mlocked:0kB dirty:4384kB writeback:0kB mapped:151340kB shmem:136072kB slab_reclaimable:2113576kB slab_unreclaimable:96452kB kernel_stack:3904kB pagetables:13848kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 27 04:34:36 kernel: [640466.131803] lowmem_reserve[]: 0 0 0 0 0
Jan 27 04:34:36 kernel: [640466.131805] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15892kB
Jan 27 04:34:36 kernel: [640466.131812] Node 0 DMA32: 20157*4kB (UME) 4165*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 113948kB
Jan 27 04:34:36 kernel: [640466.131817] Node 0 Normal: 119665*4kB (UMEH) 4706*8kB (UMEH) 12*16kB (H) 13*32kB (H) 10*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 518068kB
Jan 27 04:34:36 kernel: [640466.131824] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 27 04:34:36 kernel: [640466.131825] 2516698 total pagecache pages
Jan 27 04:34:36 kernel: [640466.131826] 8199 pages in swap cache
Jan 27 04:34:36 kernel: [640466.131828] Swap cache stats: add 1131970, delete 1123771, find 7374629/7548428
Jan 27 04:34:36 kernel: [640466.131828] Free swap  = 4085700kB
Jan 27 04:34:36 kernel: [640466.131829] Total swap = 4194300kB
Jan 27 04:34:36 kernel: [640466.131830] 4194174 pages RAM
Jan 27 04:34:36 kernel: [640466.131830] 0 pages HighMem/MovableOnly
Jan 27 04:34:36 kernel: [640466.131831] 86139 pages reserved
Jan 27 04:34:36 kernel: [640466.131832] 0 pages cma reserved
Jan 27 04:34:36 kernel: [640466.131832] 0 pages hwpoisoned
Jan 27 04:34:36 kernel: [640466.131833] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 27 04:34:36 kernel: [640466.131838] [  424]     0   424     4909      388      14       3       68             0 upstart-udev-br
Jan 27 04:34:36 kernel: [640466.131841] [  439]     0   439    13075      456      29       3      322         -1000 systemd-udevd
Jan 27 04:34:36 kernel: [640466.131843] [  724]     0   724     3816      226      13       3       53             0 upstart-socket-
Jan 27 04:34:36 kernel: [640466.131845] [  813]     0   813     5856      449      16       3       57             0 rpcbind
Jan 27 04:34:36 kernel: [640466.131846] [  865]   108   865     5386      456      16       3      113             0 rpc.statd
Jan 27 04:34:36 kernel: [640466.131848] [ 1034]     0  1034     3820      281      12       3       35             0 upstart-file-br
Jan 27 04:34:36 kernel: [640466.131850] [ 1041]   102  1041     9817      366      23       3       50             0 dbus-daemon
Jan 27 04:34:36 kernel: [640466.131852] [ 1045]   101  1045    65018     1255      31       3      362             0 rsyslogd
Jan 27 04:34:36 kernel: [640466.131854] [ 1056]     0  1056    10870      525      26       4       49             0 systemd-logind
Jan 27 04:34:36 kernel: [640466.131855] [ 1063]     0  1063     5870        0      16       3       53             0 rpc.idmapd
Jan 27 04:34:36 kernel: [640466.131857] [ 1153]     0  1153     2558      371       9       3      517             0 dhclient
Jan 27 04:34:36 kernel: [640466.131858] [ 1374]     0  1374     3955      401      13       3       40             0 getty
Jan 27 04:34:36 kernel: [640466.131860] [ 1377]     0  1377     3955      406      13       3       38             0 getty
Jan 27 04:34:36 kernel: [640466.131861] [ 1383]     0  1383     3955      406      13       3       39             0 getty
Jan 27 04:34:36 kernel: [640466.131863] [ 1384]     0  1384     3955      418      13       3       37             0 getty
Jan 27 04:34:36 kernel: [640466.131864] [ 1386]     0  1386     3955      418      12       3       38             0 getty
Jan 27 04:34:36 kernel: [640466.131866] [ 1403]     0  1403    15346      735      34       3      142         -1000 sshd
Jan 27 04:34:36 kernel: [640466.131868] [ 1436]     0  1436     4825      408      13       3       28             0 irqbalance
Jan 27 04:34:36 kernel: [640466.131869] [ 1440]     0  1440     1093      379       8       3       35             0 acpid
Jan 27 04:34:36 kernel: [640466.131871] [ 1442]     0  1442     4785      176      14       3       38             0 atd
Jan 27 04:34:36 kernel: [640466.131872] [ 1443]     0  1443     5914      466      17       3       43             0 cron
Jan 27 04:34:36 kernel: [640466.131874] [ 1464]   105  1464    61957     4409      59       3      254          -900 postgres
Jan 27 04:34:36 kernel: [640466.131876] [ 1561]   107  1561     7864      657      21       3      113             0 ntpd
Jan 27 04:34:36 kernel: [640466.131877] [ 1834]  1004  1834  2002886   692883    1549      10    12598             0 java
Jan 27 04:34:36 kernel: [640466.131879] [ 1921]   106  1921     5835      452      16       3      112             0 nrpe
Jan 27 04:34:36 kernel: [640466.131880] [ 1943]     0  1943   175986      420      41       4       50             0 nscd
Jan 27 04:34:36 kernel: [640466.131882] [ 1978]   109  1978   111112      309      48       4      213             0 nslcd
Jan 27 04:34:36 kernel: [640466.131883] [ 2007]     8  2007     3172      326      11       3       52             0 nullmailer-send
Jan 27 04:34:36 kernel: [640466.131885] [ 2092]     0  2092    34005     1947      70       3     3067             0 /usr/bin/monito
Jan 27 04:34:36 kernel: [640466.131887] [ 2110]     0  2110     1901      367       9       3       25             0 getty
Jan 27 04:34:36 kernel: [640466.131888] [ 2146] 65534  2146    34005     1101      67       3     3810             0 monitorix-httpd
Jan 27 04:34:36 kernel: [640466.131891] [22428]     0 22428    15436      739      35       4       11             0 cron
Jan 27 04:34:36 kernel: [640466.131892] [22429]     0 22429    15489      749      35       4       12             0 cron
Jan 27 04:34:36 kernel: [640466.131894] [22442]     0 22442     1112      198       8       3        0             0 sh
Jan 27 04:34:36 kernel: [640466.131895] [22443]  1004 22443     1112      191       8       3        0             0 sh
Jan 27 04:34:36 kernel: [640466.131897] [22444]  1004 22444     3102      748      11       3        0             0 syncDaily.sh
Jan 27 04:34:36 kernel: [640466.131899] [22445]     0 22445     1112      420       8       3        0             0 cron-apt
Jan 27 04:34:36 kernel: [640466.131900] [22465]  1004 22465    54754    27012     113       3        0             0 rsync
Jan 27 04:34:36 kernel: [640466.131902] [22466]     0 22466     1087      171       8       3        0             0 sleep
Jan 27 04:34:36 kernel: [640466.131903] [22467]  1004 22467    34234    26953      72       3        0             0 rsync
Jan 27 04:34:36 kernel: [640466.131905] [22468]  1004 22468    62154    15613     126       3        0             0 rsync
Jan 27 04:34:36 kernel: [640466.131907] [24170]   105 24170    61990    34251     117       3      237             0 postgres
Jan 27 04:34:36 kernel: [640466.131908] [24171]   105 24171    61957    33191     115       3      238             0 postgres
Jan 27 04:34:36 kernel: [640466.131910] [24172]   105 24172    61957     1190      52       3      238             0 postgres
Jan 27 04:34:36 kernel: [640466.131911] [24173]   105 24173    62166     1333      54       3      239             0 postgres
Jan 27 04:34:36 kernel: [640466.131913] [24174]   105 24174    25876      642      48       3      242             0 postgres
Jan 27 04:34:36 kernel: [640466.131915] [24175]   105 24175    62464    35199     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131916] [24203]   105 24203    62467    22296     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131918] [24266]   105 24266    62475    36452     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131920] [24317]   105 24317    62424    17702     119       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131921] [24318]   105 24318    62449    24858     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131923] [24320]   105 24320    62485    24779     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131925] [24321]   105 24321    62449    27595     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131926] [24452]   105 24452    62484    16118     120       3      232             0 postgres
Jan 27 04:34:36 kernel: [640466.131928] Out of memory: Kill process 1834 (java) score 137 or sacrifice child
Jan 27 04:34:36 kernel: [640466.132070] Killed process 1834 (java) total-vm:8011544kB, anon-rss:2763340kB, file-rss:8192kB
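
A task dump this long is easier to read once it is totalled per process name. The following is a minimal sketch, not part of the original post, that parses the `[ pid ] uid tgid total_vm rss ...` rows and sums the `rss` column. It assumes 4 kB pages (the usual x86-64 size); `rss_by_name` and `SAMPLE` are illustrative names, and the sample rows are copied from the log above.

```python
import re
from collections import defaultdict

PAGE_KB = 4  # assumption: 4 kB pages, the usual size on x86-64

# One row of the OOM task dump:
# [pid] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+\d+\s+\d+\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)

def rss_by_name(log_text):
    """Sum resident memory (kB) per process name across all task-dump rows."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        m = ROW.search(line)  # timestamps, headers and stack frames do not match
        if m:
            totals[m.group("name")] += int(m.group("rss")) * PAGE_KB
    return dict(totals)

# Three rows copied from the log above.
SAMPLE = """\
Jan 27 04:34:36 kernel: [640466.131877] [ 1834]  1004  1834  2002886   692883    1549      10    12598             0 java
Jan 27 04:34:36 kernel: [640466.131907] [24170]   105 24170    61990    34251     117       3      237             0 postgres
Jan 27 04:34:36 kernel: [640466.131908] [24171]   105 24171    61957    33191     115       3      238             0 postgres
"""

for name, kb in sorted(rss_by_name(SAMPLE).items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {kb:>10d} kB")
```

As a cross-check, the java row gives 692883 pages x 4 kB = 2771532 kB, which equals anon-rss + file-rss (2763340 kB + 8192 kB) on the "Killed process" line. Per-name sums for postgres should be read with care, since pages shared between backends are counted once per process.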
linux oom
  • 1 answer
  • 7497 Views
Mike Conigliaro
Asked: 2016-07-26 08:23:11 +0800 CST

Mysterious memory leak. What is using ~10GB of memory on this system?

  • 10

After running for about 18 hours, this system is using about 10GB of memory, which triggers the OOM killer when we run our daily tasks:

# free -h
             total       used       free     shared    buffers     cached
Mem:           14G       9.4G       5.3G       400K        27M        59M
-/+ buffers/cache:       9.3G       5.4G
Swap:           0B         0B         0B

# cat /proc/meminfo
MemTotal:       15400928 kB
MemFree:         5567028 kB
Buffers:           28464 kB
Cached:            60816 kB
SwapCached:            0 kB
Active:           321464 kB
Inactive:          59156 kB
Active(anon):     291464 kB
Inactive(anon):      316 kB
Active(file):      30000 kB
Inactive(file):    58840 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                40 kB
Writeback:             0 kB
AnonPages:        291380 kB
Mapped:            14356 kB
Shmem:               400 kB
Slab:             364596 kB
SReclaimable:      18856 kB
SUnreclaim:       345740 kB
KernelStack:        1832 kB
PageTables:         3720 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     7700464 kB
Committed_AS:     313224 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       35976 kB
VmallocChunk:   34359678732 kB
HardwareCorrupted:     0 kB
AnonHugePages:    231424 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:     9598976 kB
DirectMap2M:     6260736 kB
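
The figures above can be tallied to see how much of the used memory /proc/meminfo actually itemises. This is a rough sketch, not part of the original post: it subtracts the main itemised categories from MemTotal, and the choice of categories (`known` below) is an assumption about which fields do not overlap. Shmem is left out because it is already included in Cached. The values are copied from the output above.

```python
import re

# Excerpt of the /proc/meminfo output quoted above (values in kB).
MEMINFO = """\
MemTotal:       15400928 kB
MemFree:         5567028 kB
Buffers:           28464 kB
Cached:            60816 kB
AnonPages:        291380 kB
Slab:             364596 kB
KernelStack:        1832 kB
PageTables:         3720 kB
"""

def parse_meminfo(text):
    """Return a dict mapping each field name to its numeric value."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\S+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def unaccounted_kb(fields):
    """MemTotal minus the categories that meminfo itemises."""
    # Assumption: these fields do not overlap one another.
    known = ("MemFree", "Buffers", "Cached", "AnonPages",
             "Slab", "KernelStack", "PageTables")
    return fields["MemTotal"] - sum(fields[k] for k in known)

info = parse_meminfo(MEMINFO)
gap = unaccounted_kb(info)
print(f"unaccounted: {gap} kB ({gap / 1024**2:.1f} GiB)")
# prints: unaccounted: 9083092 kB (8.7 GiB)
```

On these numbers roughly 8.7 GiB is used but not attributed to userspace, page cache, slab, kernel stacks or page tables, which is the gap the question is asking about.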

However, no process appears to be using a significant amount of memory:

# top -o %MEM -n 1
top - 15:07:00 up 18:28,  1 user,  load average: 0.00, 0.01, 0.05
Tasks: 155 total,   1 running, 154 sleeping,   0 stopped,   0 zombie
%Cpu(s): 23.7 us,  4.8 sy,  0.0 ni, 71.4 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:  15400928 total,  9838560 used,  5562368 free,    29764 buffers
KiB Swap:        0 total,        0 used,        0 free.    62760 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1333 root      20   0 5763204 274132   5352 S   0.0  1.8   7:00.19 java
 1466 newrelic  20   0  251484   4884   2056 S   0.0  0.0   0:56.41 nrsysmond
16804 root      20   0  105636   4212   3224 S   0.0  0.0   0:00.00 sshd
16876 root      20   0   21420   3908   1764 S   0.0  0.0   0:00.03 bash
16858 ubuntu    20   0   21456   3828   1684 S   0.0  0.0   0:00.05 bash
  770 root      20   0   10216   2868    576 S   0.0  0.0   0:00.02 dhclient
    1 root      20   0   33700   2216    624 S   0.0  0.0   0:35.50 init
16875 root      20   0   63664   2084   1612 S   0.0  0.0   0:00.00 sudo
16857 ubuntu    20   0  105636   1860    880 S   0.0  0.0   0:00.01 sshd
16920 root      20   0   23688   1528   1064 R   0.0  0.0   0:00.00 top
16803 postfix   20   0   27400   1492   1216 S   0.0  0.0   0:00.00 pickup
  976 root      20   0   43444   1100    748 S   0.0  0.0   0:00.00     systemd-logind
  572 root      20   0   51480   1048    308 S   0.0  0.0   0:00.53     systemd-udevd
 1840 ntp       20   0   31448   1044    448 S   0.0  0.0   0:02.94 ntpd
  990 syslog    20   0  255836    924     76 S   0.0  0.0   0:00.13 rsyslogd
 1167 root      20   0   61372    828    148 S   0.0  0.0   0:00.00 sshd
  945 message+  20   0   39212    788    416 S   0.0  0.0   0:00.12 dbus-daemon
 1323 root      20   0   20692    676      0 S   0.0  0.0   0:40.92 wrapper
 1230 root      20   0   19320    588    244 S   0.0  0.0   0:04.57 irqbalance
 1538 root      20   0   25336    500    188 S   0.0  0.0   0:00.18 master
  567 root      20   0   19604    480     96 S   0.0  0.0   0:00.34     upstart-udev-br
 1175 root      20   0   23648    404    156 S   0.0  0.0   0:00.08 cron
 1005 root      20   0   15272    348     88 S   0.0  0.0   0:00.08     upstart-file-br

The temporary and shared-memory filesystems are essentially empty:

# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            7.4G   12K  7.4G   1% /dev
tmpfs           1.5G  384K  1.5G   1% /run
/dev/xvda1      9.8G  6.7G  2.7G  72% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            7.4G     0  7.4G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/xvda15     104M  4.7M   99M   5% /boot/efi
/dev/xvdb        64G  1.1G   60G   2% /mnt

smem says the memory is being used by the kernel:

# smem -tw
Area                           Used      Cache   Noncache
firmware/hardware                 0          0          0
kernel image                      0          0          0
kernel dynamic memory       9525544      92468    9433076
userspace memory             311064      15648     295416
free memory                 5564320    5564320          0
----------------------------------------------------------
                           15400928    5672436    9728492

But slabtop doesn't help (it accounts for only about 350MB):

# slabtop -o -s c
 Active / Total Objects (% used)    : 2915263 / 2937006 (99.3%)
 Active / Total Slabs (% used)      : 60745 / 60745 (100.0%)
 Active / Total Caches (% used)     : 68 / 103 (66.0%)
 Active / Total Size (% used)       : 356086.71K / 360884.30K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.12K / 14.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
2226784 2226784 100%    0.07K  39764       56    159056K Acpi-ParseExt
273408 272598  99%    0.25K   8544       32     68352K kmalloc-256
  8568   8560  99%    4.00K   1071        8     34272K kmalloc-4096
 52320  52320 100%    0.50K   1635       32     26160K kmalloc-512
  1988   1975  99%    8.00K    497        4     15904K kmalloc-8192
 58044  53370  91%    0.19K   2764       21     11056K kmalloc-192
150016 141356  94%    0.06K   2344       64      9376K kmalloc-64
  5016   3504  69%    0.96K    152       33      4864K ext4_inode_cache
  7280   6834  93%    0.57K    260       28      4160K inode_cache
 20265  20067  99%    0.19K    965       21      3860K dentry
  1760   1721  97%    2.00K    110       16      3520K kmalloc-2048
 19800  19800 100%    0.11K    550       36      2200K sysfs_dir_cache
  2112   1966  93%    1.00K     66       32      2112K kmalloc-1024
   305    260  85%    6.00K     61        5      1952K task_struct
 14616  14242  97%    0.09K    348       42      1392K kmalloc-96
  2125   2092  98%    0.63K     85       25      1360K proc_inode_cache
  2324   2324 100%    0.55K     83       28      1328K radix_tree_node
  9828   9828 100%    0.10K    252       39      1008K buffer_head
  1400   1400 100%    0.62K     56       25       896K sock_inode_cache
    54     39  72%   12.00K     27        2       864K nvidia_stack_cache
   975    975 100%    0.81K     25       39       800K task_xstate
   690    515  74%    1.06K     23       30       736K signal_cache

So far, the only way I have been able to resolve this is to reboot. Where is the 10GB of memory hiding?

linux ubuntu memory-usage memory-leak oom
  • 2 answers
  • 5103 Views
Nick Bull
Asked: 2016-05-20 02:39:15 +0800 CST

AWS WordPress site - OOM killing Apache

  • 2

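This question was originally asked on StackOverflow.com, and I have copied it to the more appropriate ServerFault.com site. The original question, which I have voted to close, can be found here.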

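I host a small, low-traffic WordPress blog on AWS. Unfortunately, about once a week the site becomes unavailable, and attempts to reach it hang until the connection times out. During these outages I also cannot SSH into the server until it is restarted. The site always recovers immediately after the instance is rebooted.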


The last lines of the AWS EC2 log (the 30,000-character limit prevents posting all of it):

[1079988.125918] Out of memory: Kill process 32620 (httpd) score 16 or sacrifice child

[1079988.130913] Killed process 32620 (httpd) total-vm:510348kB, anon-rss:32328kB, file-rss:0kB

[1079996.872570] httpd invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0

[1079996.887776] httpd cpuset=/ mems_allowed=0

[1079996.892671] CPU: 0 PID: 374 Comm: httpd Tainted: G            E   4.4.5-15.26.amzn1.x86_64 #1

[1079996.896664] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015

[1079996.896664]  0000000000000000 ffff88002f66bb58 ffffffff812c88ef ffff88002f66bd40

[1079996.896664]  0000000000000000 ffff88002f66bbe8 ffffffff811cdc8d 0000000000000001

[1079996.896664]  ffffffff810b2c61 0000000000000000 0000000000000010 ffffffff817d07f9

[1079996.896664] Call Trace:

[1079996.896664]  [<ffffffff812c88ef>] dump_stack+0x63/0x84

[1079996.896664]  [<ffffffff811cdc8d>] dump_header+0x5e/0x1d8

[1079996.896664]  [<ffffffff810b2c61>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20

[1079996.896664]  [<ffffffff811634d5>] oom_kill_process+0x205/0x3d0

[1079996.896664]  [<ffffffff81163b35>] out_of_memory+0x435/0x480

[1079996.896664]  [<ffffffff81168bee>] __alloc_pages_nodemask+0x91e/0xa60

[1079996.896664]  [<ffffffff811ae5e5>] alloc_pages_vma+0xb5/0x210

[1079996.896664]  [<ffffffff8118e296>] handle_mm_fault+0x13a6/0x1870

[1079996.896664]  [<ffffffff810b2c61>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20

[1079996.896664]  [<ffffffff810922c2>] ? finish_task_switch+0x72/0x1d0

[1079996.896664]  [<ffffffff8105ea23>] __do_page_fault+0x183/0x3f0

[1079996.896664]  [<ffffffff8105ecb2>] do_page_fault+0x22/0x30

[1079996.896664]  [<ffffffff814de858>] page_fault+0x28/0x30

[1079996.958270] Mem-Info:

[1079996.959270] active_anon:457775 inactive_anon:13 isolated_anon:0

[1079996.959270]  active_file:16 inactive_file:106 isolated_file:0

[1079996.959270]  unevictable:0 dirty:0 writeback:0 unstable:0

[1079996.959270]  slab_reclaimable:2369 slab_unreclaimable:10892

[1079996.959270]  mapped:24 shmem:35 pagetables:33000 bounce:0

[1079996.959270]  free:3686 free_pcp:68 free_cma:0

[1079996.973173] Node 0 DMA free:7976kB min:40kB low:48kB high:60kB active_anon:6348kB inactive_anon:0kB active_file:4kB inactive_file:24kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:72kB slab_unreclaimable:268kB kernel_stack:0kB pagetables:1036kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:168 all_unreclaimable? yes

[1079996.991168] lowmem_reserve[]: 0 1984 1984 1984

[1079996.993371] Node 0 DMA32 free:6768kB min:5584kB low:6980kB high:8376kB active_anon:1824752kB inactive_anon:52kB active_file:60kB inactive_file:400kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2080768kB managed:2035632kB mlocked:0kB dirty:0kB writeback:0kB mapped:88kB shmem:140kB slab_reclaimable:9404kB slab_unreclaimable:43300kB kernel_stack:3920kB pagetables:130964kB unstable:0kB bounce:0kB free_pcp:272kB local_pcp:272kB free_cma:0kB writeback_tmp:0kB pages_scanned:2808 all_unreclaimable? yes

[1079997.012134] lowmem_reserve[]: 0 0 0 0

[1079997.014077] Node 0 DMA: 30*4kB (ME) 71*8kB (UM) 26*16kB (UME) 3*32kB (ME) 0*64kB 1*128kB (E) 2*256kB (ME) 2*512kB (ME) 1*1024kB (E) 2*2048kB (UM) 0*4096kB = 7984kB

[1079997.021833] Node 0 DMA32: 478*4kB (UME) 55*8kB (UM) 210*16kB (UM) 31*32kB (M) 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6768kB

[1079997.028501] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

[1079997.032184] 160 total pagecache pages

[1079997.033858] 0 pages in swap cache

[1079997.035455] Swap cache stats: add 0, delete 0, find 0/0

[1079997.037793] Free swap  = 0kB

[1079997.039117] Total swap = 0kB

[1079997.040511] 524189 pages RAM

[1079997.041780] 0 pages HighMem/MovableOnly

[1079997.043474] 11305 pages reserved

[1079997.044872] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name

[1079997.048548] [ 1524]     0  1524     2866      229      12       3        0         -1000 udevd

[1079997.052366] [ 1647]     0  1647     2865      213      11       3        0         -1000 udevd

[1079997.056081] [ 1648]     0  1648     2865      210      11       3        0         -1000 udevd

[1079997.059637] [ 2045]     0  2045     2340      123      10       3        0             0 dhclient

[1079997.063283] [ 2097]     0  2097    28018      106      25       3        0         -1000 auditd

[1079997.067016] [ 2118]     0  2118    61865      375      25       3        0             0 rsyslogd

[1079997.070879] [ 2140]     0  2140     1095       37       8       3        0             0 rngd

[1079997.074484] [ 2158]    32  2158     8823       97      22       3        0             0 rpcbind

[1079997.078205] [ 2179]    29  2179     9965      201      24       3        0             0 rpc.statd

[1079997.081954] [ 2210]    81  2210     5448       61      15       3        0             0 dbus-daemon

[1079997.085861] [ 2339]     0  2339    19458      203      40       3        0         -1000 sshd

[1079997.089500] [ 2361]    38  2361     7322      140      18       3        0             0 ntpd

[1079997.093148] [ 2654]     0  2654    22245      463      45       3        0             0 sendmail

[1079997.096960] [ 2663]    51  2663    20109      369      40       4        0             0 sendmail

[1079997.100976] [ 2678]     0  2678   102872     1914     200       3        0             0 httpd

[1079997.104736] [ 2689]     0  2689    29880      146      15       3        0             0 crond

[1079997.108453] [ 2703]     0  2703     4267       41      13       3        0             0 atd

[1079997.112100] [ 2733]     0  2733     1615       31       8       3        0             0 agetty

[1079997.115895] [ 2735]     0  2735     1078       24       8       3        0             0 mingetty

[1079997.119835] [ 2738]     0  2738     1078       24       8       3        0             0 mingetty

[1079997.123661] [ 2740]     0  2740     1078       24       7       3        0             0 mingetty

[1079997.127430] [ 2742]     0  2742     1078       24       8       3        0             0 mingetty

[1079997.131420] [ 2744]     0  2744     1078       23       8       3        0             0 mingetty

[1079997.135291] [ 2746]     0  2746     1078       24       8       3        0             0 mingetty

[1079997.139070] [32214]    48 32214   126502     7229     209       3        0             0 httpd

[1079997.142642] [32297]    48 32297   125281     6111     207       3        0             0 httpd

[1079997.146286] [32310]    48 32310   125754     6561     208       3        0             0 httpd

[1079997.150023] [32346]    48 32346   125990     6820     208       3        0             0 httpd

[1079997.153576] [32365]    48 32365   127220     7837     210       3        0             0 httpd

[1079997.157241] [32369]    48 32369   125991     6820     208       3        0             0 httpd

[1079997.160923] [32395]    48 32395   125282     6129     207       3        0             0 httpd

[1079997.164515] [32498]    48 32498   125998     6758     208       3        0             0 httpd

[1079997.168490] [32515]    48 32515   125281     6111     207       3        0             0 httpd

[1079997.172046] [32575]    48 32575   125732     6475     208       3        0             0 httpd

[1079997.175868] [32578]    48 32578   125683     6508     207       3        0             0 httpd

[1079997.179602] [32584]    48 32584   125731     6561     208       3        0             0 httpd

[1079997.183372] [32598]    48 32598   125683     6508     207       3        0             0 httpd

[1079997.187140] [32603]    48 32603   125731     6561     208       3        0             0 httpd

[1079997.190715] [32608]    48 32608   125683     6508     207       3        0             0 httpd

[1079997.194454] [32612]    48 32612   126566     7496     209       3        0             0 httpd

[1079997.198263] [32625]    48 32625   125524     6305     207       3        0             0 httpd

[1079997.201863] [32635]    48 32635   125731     6561     208       3        0             0 httpd

[1079997.205449] [32649]    48 32649   125990     6748     208       3        0             0 httpd

[1079997.209316] [32650]    48 32650   107262     6519     206       3        0             0 httpd

[1079997.213170] [32652]    48 32652   127395     7886     211       3        0             0 httpd

[1079997.216906] [32658]    48 32658   125729     6527     208       3        0             0 httpd

[1079997.220644] [32661]    48 32661   108843     7773     210       3        0             0 httpd

[1079997.224418] [32666]    48 32666   125282     6111     207       3        0             0 httpd

[1079997.228078] [32669]    48 32669   126892     7484     210       3        0             0 httpd

[1079997.231832] [32675]    48 32675   125706     6520     207       3        0             0 httpd

[1079997.235511] [32676]    48 32676   125683     6508     207       3        0             0 httpd

[1079997.239227] [32681]    48 32681   125684     6508     207       3        0             0 httpd

[1079997.243019] [32691]    48 32691   125993     6747     208       3        0             0 httpd

[1079997.246836] [32696]    48 32696   125683     6507     207       3        0             0 httpd

[1079997.250732] [32698]    48 32698   126371     6954     209       3        0             0 httpd

[1079997.254287] [32700]    48 32700   107752     6734     207       3        0             0 httpd

[1079997.258018] [32703]    48 32703   106347     5553     205       3        0             0 httpd

[1079997.261647] [32714]    48 32714   125684     6508     207       3        0             0 httpd

[1079997.265446] [32723]    48 32723   125731     6561     208       3        0             0 httpd

[1079997.269066] [  307]    48   307   125685     6506     207       3        0             0 httpd

[1079997.272679] [  314]    48   314   108660     7632     209       3        0             0 httpd

[1079997.276452] [  319]    48   319   107752     6723     207       3        0             0 httpd

[1079997.280364] [  320]    48   320   107240     6427     206       3        0             0 httpd

[1079997.284027] [  323]    48   323   107888     6912     208       3        0             0 httpd

[1079997.287595] [  324]    48   324   107122     6264     206       3        0             0 httpd

[1079997.291243] [  325]    48   325   122406     7539     209       3        0             0 httpd

[1079997.294995] [  326]    48   326   108274     7130     208       3        0             0 httpd

[1079997.298756] [  327]    48   327   124978     7732     210       3        0             0 httpd

[1079997.302330] [  329]    48   329   108274     7267     208       3        0             0 httpd

[1079997.306007] [  333]    48   333   106928     6052     206       3        0             0 httpd

[1079997.309818] [  334]    48   334   105747     4938     203       3        0             0 httpd

[1079997.313613] [  338]    48   338   108539     7521     209       3        0             0 httpd

[1079997.317252] [  339]    48   339   107816     6838     208       3        0             0 httpd

[1079997.320966] [  340]    48   340   108361     7341     209       3        0             0 httpd

[1079997.324587] [  342]    48   342   107944     6986     208       3        0             0 httpd

[1079997.328421] [  343]    48   343   103541     2605     199       3        0             0 httpd

[1079997.331943] [  344]    48   344   106288     5479     205       3        0             0 httpd

[1079997.335650] [  346]    48   346   106493     5678     205       3        0             0 httpd

[1079997.339365] [  347]    48   347   108326     7310     209       3        0             0 httpd

[1079997.343168] [  349]    48   349   106412     5579     205       3        0             0 httpd

[1079997.346819] [  350]    48   350   106612     5832     205       3        0             0 httpd

[1079997.350423] [  351]    48   351   107440     6484     207       3        0             0 httpd

[1079997.354093] [  352]    48   352   106984     6133     206       3        0             0 httpd

[1079997.357822] [  354]    48   354   107122     6265     206       3        0             0 httpd

[1079997.361467] [  355]    48   355   103541     2604     199       3        0             0 httpd

[1079997.365074] [  358]    48   358   103541     2605     199       3        0             0 httpd

[1079997.368805] [  359]    48   359   106475     5660     205       3        0             0 httpd

[1079997.372852] [  360]    48   360   104710     3796     201       3        0             0 httpd

[1079997.376601] [  370]    48   370   103538     2622     199       3        0             0 httpd

[1079997.380201] [  372]    48   372   103538     2622     199       3        0             0 httpd

[1079997.383966] [  373]    48   373   107752     6733     207       3        0             0 httpd

[1079997.387998] [  374]    48   374   108539     7520     209       3        0             0 httpd

[1079997.391913] [  375]    48   375   103541     2604     199       3        0             0 httpd

[1079997.395537] [  376]    48   376   103538     2626     199       3        0             0 httpd

[1079997.399174] [  378]    48   378   106467     5646     205       3        0             0 httpd

[1079997.402981] [  381]    48   381   107952     6967     208       3        0             0 httpd

[1079997.406805] [  383]    48   383   108338     7328     209       3        0             0 httpd

[1079997.410374] [  385]    48   385   105068     4224     202       3        0             0 httpd

[1079997.414010] [  388]    48   388   104866     4047     202       3        0             0 httpd

[1079997.417701] [  392]    48   392   105881     5036     204       3        0             0 httpd

[1079997.421384] [  394]    48   394   106347     5540     205       3        0             0 httpd

[1079997.425365] [  396]    48   396   103538     2626     199       3        0             0 httpd

[1079997.429044] [  407]    48   407   103838     2961     200       3        0             0 httpd

[1079997.432577] [  412]    48   412   103542     2605     199       3        0             0 httpd

[1079997.436201] [  413]    48   413   105079     4262     202       3        0             0 httpd

[1079997.439868] [  416]    48   416   103541     2605     199       3        0             0 httpd

[1079997.443501] [  422]    48   422   103541     2604     199       3        0             0 httpd

[1079997.447093] [  424]    48   424   103540     2637     199       3        0             0 httpd

[1079997.450727] [  426]    48   426   103542     2604     199       3        0             0 httpd

[1079997.454398] [  427]    48   427   103541     2604     199       3        0             0 httpd

[1079997.458434] [  428]    48   428   103538     2622     199       3        0             0 httpd

[1079997.463763] [  429]    48   429   103541     2604     199       3        0             0 httpd

[1079997.467592] [  437]    48   437   103538     2627     199       3        0             0 httpd

[1079997.471172] [  438]    48   438   104324     3456     201       3        0             0 httpd

[1079997.474751] [  439]    48   439   103541     2604     199       3        0             0 httpd

[1079997.478432] [  440]    48   440   103545     2622     199       3        0             0 httpd

[1079997.482044] [  441]    48   441   103539     2622     199       3        0             0 httpd

[1079997.485780] [  442]    48   442   106474     5667     205       3        0             0 httpd

[1079997.489572] [  445]    48   445   103547     2620     199       3        0             0 httpd

[1079997.493165] [  448]    48   448   103541     2604     199       3        0             0 httpd

[1079997.496719] [  449]    48   449   104500     3638     201       3        0             0 httpd

[1079997.500255] [  450]    48   450   106208     5364     204       3        0             0 httpd

[1079997.503768] [  468]    48   468   103613     2733     199       3        0             0 httpd

[1079997.507327] [  469]    48   469   103538     2622     199       3        0             0 httpd

[1079997.511042] [  470]    48   470   103541     2604     199       3        0             0 httpd

[1079997.514670] [  471]    48   471   103541     2604     199       3        0             0 httpd

[1079997.518298] [  472]    48   472   103542     2605     199       3        0             0 httpd

[1079997.522039] [  480]    48   480   104314     3471     201       3        0             0 httpd

[1079997.525663] [  481]    48   481   103610     2670     199       3        0             0 httpd

[1079997.529538] [  482]    48   482   103541     2604     199       3        0             0 httpd

[1079997.533280] [  483]    48   483   103542     2604     199       3        0             0 httpd

[1079997.536999] [  484]    48   484   103538     2622     199       3        0             0 httpd

[1079997.540651] [  485]    48   485   103538     2622     199       3        0             0 httpd

[1079997.544204] [  486]    48   486   104094     3224     200       3        0             0 httpd

[1079997.547808] [  488]    48   488   119730     2929     201       3        0             0 httpd

[1079997.551370] [  489]    48   489   103538     2622     199       3        0             0 httpd

[1079997.554962] [  490]    48   490   104808     3968     202       3        0             0 httpd

[1079997.558875] [  491]    48   491   103538     2622     199       3        0             0 httpd

[1079997.562490] [  494]    48   494   104059     3199     200       3        0             0 httpd

[1079997.566339] [  495]    48   495   104143     3232     200       3        0             0 httpd

[1079997.570045] [  501]    48   501   103564     2562     199       3        0             0 httpd

[1079997.573871] [  502]    48   502   104707     3771     201       3        0             0 httpd

[1079997.577369] [  503]    48   503   103538     2621     199       3        0             0 httpd

[1079997.580897] [  504]    48   504   103541     2609     199       3        0             0 httpd

[1079997.584364] [  505]    48   505   103539     2621     199       3        0             0 httpd

[1079997.588037] [  506]    48   506   103545     2621     199       3        0             0 httpd

[1079997.591946] [  507]    48   507   103541     2604     199       3        0             0 httpd

[1079997.595820] [  508]    48   508   103538     2622     199       3        0             0 httpd

[1079997.599555] [  509]    48   509   103541     2604     199       3        0             0 httpd

[1079997.603102] [  510]    48   510   103475     2562     199       3        0             0 httpd

[1079997.609123] [  511]    48   511   103539     2622     199       3        0             0 httpd

[1079997.613009] [  512]    48   512   103281     2335     199       3        0             0 httpd

[1079997.616695] [  513]    48   513   103475     2562     199       3        0             0 httpd

[1079997.620272] [  514]    48   514   103541     2604     199       3        0             0 httpd

[1079997.623949] [  515]    48   515   103475     2562     199       3        0             0 httpd

[1079997.627601] [  516]    48   516   103066     2051     198       3        0             0 httpd

[1079997.631257] [  517]    48   517   103285     2307     199       3        0             0 httpd

[1079997.634944] [  518]    48   518   103539     2622     199       3        0             0 httpd

[1079997.638875] [  519]    48   519   103484     2517     199       3        0             0 httpd

[1079997.642467] [  520]    48   520   103541     2601     199       3        0             0 httpd

[1079997.646281] [  521]    48   521   103564     2562     199       3        0             0 httpd

[1079997.650000] [  522]    48   522   103254     2242     199       3        0             0 httpd

[1079997.654227] [  523]    48   523   103541     2601     199       3        0             0 httpd

[1079997.657861] [  524]    48   524   102968     2025     198       3        0             0 httpd

[1079997.662169] [  525]    48   525   103433     2411     199       3        0             0 httpd

[1079997.666436] [  526]    48   526   102968     2050     198       3        0             0 httpd

[1079997.670950] [  527]    48   527   103254     2252     199       3        0             0 httpd

[1079997.676349] [  528]    48   528   103281     2350     199       3        0             0 httpd

[1079997.680095] [  529]    48   529   103538     2561     199       3        0             0 httpd

[1079997.683947] [  530]    48   530   103173     2126     198       3        0             0 httpd

[1079997.687608] [  531]    48   531   103228     2262     199       3        0             0 httpd

[1079997.691466] [  532]    48   532   103475     2515     199       3        0             0 httpd

[1079997.694988] [  533]    48   533   103173     2119     198       3        0             0 httpd

[1079997.698754] [  534]    48   534   102975     1973     198       3        0             0 httpd

[1079997.702324] [  535]    48   535   103434     2409     199       3        0             0 httpd

[1079997.705935] [  536]     0   536   102872     1915     192       3        0             0 httpd

[1079997.709734] [  537]    48   537   102968     2032     198       3        0             0 httpd

[1079997.713367] [  538]    48   538   102873     1926     198       3        0             0 httpd

[1079997.717141] [  539]    48   539   102872     1922     196       3        0             0 httpd

[1079997.720839] [  540]    48   540   102975     1973     198       3        0             0 httpd

[1079997.724457] [  541]     0   541   102872     1915     192       3        0             0 httpd

[1079997.728106] [  542]    48   542   102872     1915     193       3        0             0 httpd

[1079997.731629] [  543]    48   543   102872     1915     193       3        0             0 httpd

[1079997.735163] [  544]    48   544   102872     1915     193       3        0             0 httpd

[1079997.738927] Out of memory: Kill process 32652 (httpd) score 15 or sacrifice child

[1079997.742215] Killed process 32652 (httpd) total-vm:509580kB, anon-rss:31536kB, file-rss:8kB

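Given the logs, I do believe this is caused by the server running low on memory, but upgrading to a larger instance did not prevent the problem. The site also has extremely low traffic (< 50 users/week), so it makes no sense that I would need a bigger server than the t1-medium it currently runs on.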
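
As a cross-check on the log above, the per-process table can be totalled to see where resident memory went. The following is a minimal sketch, not a definitive parser. It assumes the third and fourth numeric columns after the bracketed PID are total_vm and rss counted in 4 KiB pages (the header row is not shown in this excerpt). That assumption is consistent with the "Killed process 32652" line, where 127395 pages x 4 kB = 509580 kB. The function names are illustrative.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages, typical on x86_64

# Matches "[ pid ] uid tgid total_vm rss ... name"; the leading dmesg
# timestamp "[1079997.213170]" is skipped because it contains a dot.
ROW = re.compile(r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s.*?(\S+)\s*$")


def parse_row(line):
    """Return pid, name and sizes in kB for one table row, or None."""
    m = ROW.search(line)
    if m is None:
        return None
    pid, _uid, _tgid, total_vm, rss = (int(g) for g in m.groups()[:5])
    return {
        "pid": pid,
        "name": m.group(6),
        "total_vm_kb": total_vm * PAGE_KB,
        "rss_kb": rss * PAGE_KB,
    }


def rss_by_name(lines):
    """Sum resident memory (kB) per process name over all table rows."""
    totals = {}
    for line in lines:
        row = parse_row(line)
        if row is not None:
            totals[row["name"]] = totals.get(row["name"], 0) + row["rss_kb"]
    return totals


sample = [
    "[1079997.213170] [32652]    48 32652   127395     7886     211       3        0             0 httpd",
    "[1079997.705935] [  536]     0   536   102872     1915     192       3        0             0 httpd",
    "[1079997.738927] Out of memory: Kill process 32652 (httpd) score 15 or sacrifice child",
]

if __name__ == "__main__":
    print(parse_row(sample[0]))
    print(rss_by_name(sample))
```

Feeding every httpd row of the dump to rss_by_name gives the combined resident set of the Apache workers, which can then be compared with the total RAM reported by free -m.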


Software versions:

PHP (php --version):

PHP 5.6.21 (cli) (built: May  2 2016 23:27:53) 
Copyright (c) 1997-2016 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2016 Zend Technologies

MySQL (mysql --version):

mysql  Ver 14.14 Distrib 5.5.46, for Linux (x86_64) using readline 5.1

Apache (apachectl -V):

Server version: Apache/2.4.18 (Amazon)
Server built:   Mar  7 2016 22:32:11
Server's Module Magic Number: 20120211:52
Server loaded:  APR 1.5.1, APR-UTIL 1.4.1
Compiled using: APR 1.5.1, APR-UTIL 1.4.1
Architecture:   64-bit
Server MPM:     prefork
  threaded:     no
    forked:     yes (variable process count)
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/etc/httpd"
 -D SUEXEC_BIN="/usr/sbin/suexec"
 -D DEFAULT_PIDLOG="/var/run/httpd/httpd.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="conf/mime.types"
 -D SERVER_CONFIG_FILE="conf/httpd.conf"

WordPress (grep wp_version "$WPDIR/wp-includes/version.php"):

 * @global string $wp_version
$wp_version = '4.4.3';

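Output of free -m:

Note that I have rebooted the server and the error is not currently affecting it. The output below is therefore from after the reboot, not from when the error occurred.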

             total       used       free     shared    buffers     cached
Mem:          2003        741       1261          0         21        469
-/+ buffers/cache:        250       1752
Swap:            0          0          0
mysql php wordpress apache-2.4 oom
  • 3 answers
  • 1503 Views
fiction
Asked: 2016-05-05 14:35:17 +0800 CST

How can an OOM situation occur on Linux (the heuristics behind the OOM killer)?

  • 2

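I understand the concept of virtual memory. With demand paging (and depending on vm.overcommit_memory), you can allocate more memory than there is available RAM. Nothing really happens until you "touch" a page; at that point, I guess, a page fault occurs and physical memory is assigned to the page frame. But this somehow implies that if the system is tight on memory, it would simply page out the least recently used pages and keep working.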
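
For reference, the three values of vm.overcommit_memory mentioned above can be summarised in a small sketch. The mode descriptions and the commit-limit formula follow the kernel's overcommit accounting documentation as I understand it (huge pages are ignored here); the function names are illustrative, and nothing below touches the running system unless read_mode is called.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default): obviously excessive requests are refused",
    1: "always overcommit: allocation requests are never refused",
    2: "strict accounting: total commit is capped at swap + RAM * ratio / 100",
}


def describe_overcommit(mode):
    """Map the content of /proc/sys/vm/overcommit_memory to a description."""
    return OVERCOMMIT_MODES.get(int(mode), "unknown mode")


def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Commit limit in mode 2, ignoring huge pages.

    overcommit_ratio is vm.overcommit_ratio; its default is 50.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def read_mode(path="/proc/sys/vm/overcommit_memory"):
    """Read the current mode on a Linux host (not called automatically)."""
    with open(path) as fh:
        return int(fh.read().strip())


if __name__ == "__main__":
    print(describe_overcommit("0"))
    # Hypothetical machine: 4 GiB RAM, 2 GiB swap, default ratio.
    print(commit_limit_kb(4 * 1024 * 1024, 2 * 1024 * 1024))
```

In modes 0 and 1 an allocation can succeed even though the pages cannot all be backed later, which is why memory can run out at page-fault time rather than at allocation time.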

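How could it ever be necessary to kill a process? Can it happen because too much memory is mlock()-ed? Is the OOM killer invoked after too much thrashing? Or, to put it differently: what exactly is the heuristic that triggers the OOM killer?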

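I read that you can run "echo 1 > memory.oom_control" or "echo -17 > /proc/PID/oom_adj" to disable it. What are the implications? The machine might become completely unresponsive for some time. But if the offending process somehow detects that it is not making progress, it could also temporarily stop consuming memory (that fast), and eventually everything should start working again. Or am I mistaken?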
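
As a side note on the "-17" value: /proc/PID/oom_adj is the legacy interface, and newer kernels expose /proc/PID/oom_score_adj with a range of -1000 to 1000. The sketch below shows how, to my understanding, the kernel's compatibility code scales one onto the other; treat the constants and the scaling rule as assumptions to verify against the kernel version in use. The function name is illustrative.

```python
OOM_DISABLE = -17         # legacy oom_adj value that exempts a process
OOM_ADJUST_MAX = 15       # largest legacy oom_adj value
OOM_SCORE_ADJ_MAX = 1000  # largest oom_score_adj value


def oom_adj_to_score_adj(oom_adj):
    """Scale a legacy oom_adj value onto the oom_score_adj range."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj must be between -17 and 15")
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    # int() truncates toward zero, like C integer division.
    return int(oom_adj * OOM_SCORE_ADJ_MAX / -OOM_DISABLE)


if __name__ == "__main__":
    for value in (-17, 0, 15):
        print(value, "->", oom_adj_to_score_adj(value))
```

Under this mapping, writing -17 to oom_adj corresponds to an oom_score_adj of -1000, the value that exempts the process from being selected.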

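In my scenario there is only one process (with a huge in-memory cache). Of course that data is not persistent, but I would still rather not restart the process (and reload the data).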

linux oom
  • 1 answer
  • 993 Views
Marki
Asked: 2014-11-14 14:10:19 +0800 CST

OOM despite high page cache, unused swap, etc.

  • 1

Can you help me diagnose this OOM?

My remarks are interspersed.

<4>[598175.284914] cifsd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0

What? The low nibble of the GFP mask (0xa) means a free page in highmem was requested. This is a 64-bit system, so there is no highmem zone.
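
To make the mask easier to read, here is a small decoding sketch. The bit values are an assumption taken from my recollection of include/linux/gfp.h in 3.0-era kernels; they changed in later releases, so check the header for the kernel actually running. Only a subset of flags is listed, and any bits not in the table are returned separately.

```python
# Assumed 3.0-era flag bits (subset); verify against the kernel's gfp.h.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x20000: "__GFP_HARDWALL",
}


def decode_gfp(mask):
    """Return (known flag names in ascending bit order, leftover bits)."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known = sum(bit for bit in GFP_FLAGS if mask & bit)
    return names, mask & ~known


if __name__ == "__main__":
    names, unknown = decode_gfp(0x200DA)
    print(" | ".join(names), "unknown bits:", hex(unknown))
```

Under these assumed values, 0x200da decodes to the combination usually named GFP_HIGHUSER_MOVABLE, the ordinary mask for user pages. A highmem flag permits the allocator to use highmem rather than requiring it, so on a 64-bit system such requests are served from the normal zones.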

<6>[598175.284919] cifsd cpuset=/ mems_allowed=0
<4>[598175.284921] Pid: 5529, comm: cifsd Tainted: G           E X 3.0.101-0.35-default #1
<4>[598175.284923] Call Trace:
<4>[598175.284934]  [<ffffffff81004935>] dump_trace+0x75/0x310
<4>[598175.284941]  [<ffffffff8145f2f3>] dump_stack+0x69/0x6f
<4>[598175.284947]  [<ffffffff810fc53e>] dump_header+0x8e/0x110
<4>[598175.284950]  [<ffffffff810fc8e6>] oom_kill_process+0xa6/0x350
<4>[598175.284954]  [<ffffffff810fce25>] out_of_memory+0x295/0x2f0
<4>[598175.284957]  [<ffffffff8110287e>] __alloc_pages_slowpath+0x78e/0x7d0
<4>[598175.284960]  [<ffffffff81102aa9>] __alloc_pages_nodemask+0x1e9/0x200
<4>[598175.284965]  [<ffffffff8113de60>] alloc_pages_vma+0xd0/0x1c0
<4>[598175.284969]  [<ffffffff81130bcd>] read_swap_cache_async+0x10d/0x160
<4>[598175.284972]  [<ffffffff81130c94>] swapin_readahead+0x74/0xd0
<4>[598175.284975]  [<ffffffff81120bfa>] do_swap_page+0xea/0x5f0
<4>[598175.284978]  [<ffffffff81121c21>] handle_pte_fault+0x1e1/0x230
<4>[598175.284982]  [<ffffffff81465bcd>] do_page_fault+0x1fd/0x4c0
<4>[598175.284985]  [<ffffffff814627e5>] page_fault+0x25/0x30
<4>[598175.285002]  [<00007f65a0891078>] 0x7f65a0891077

OK, it wanted to fetch something from swap but failed because there is apparently no physical memory left. Let's go on...

<4>[598175.285003] Mem-Info:
<4>[598175.285004] Node 0 DMA per-cpu:
<4>[598175.285006] CPU    0: hi:    0, btch:   1 usd:   0
<4>[598175.285007] CPU    1: hi:    0, btch:   1 usd:   0
<4>[598175.285008] Node 0 DMA32 per-cpu:
<4>[598175.285010] CPU    0: hi:  186, btch:  31 usd:   9
<4>[598175.285011] CPU    1: hi:  186, btch:  31 usd:   7
<4>[598175.285012] Node 0 Normal per-cpu:
<4>[598175.285013] CPU    0: hi:  186, btch:  31 usd:  35
<4>[598175.285014] CPU    1: hi:  186, btch:  31 usd:  31
<4>[598175.285017] active_anon:218 inactive_anon:91 isolated_anon:0
<4>[598175.285018]  active_file:187788 inactive_file:451982 isolated_file:896
<4>[598175.285018]  unevictable:0 dirty:0 writeback:69 unstable:0
<4>[598175.285019]  free:21841 slab_reclaimable:8417 slab_unreclaimable:132175
<4>[598175.285020]  mapped:8168 shmem:4 pagetables:2639 bounce:0

Here we see that slightly more than 3G is in use, although I cannot tell exactly what the individual entries mean.
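
The counters in the Mem-Info summary above are page counts, so converting them to kB makes them easier to compare. This is a minimal sketch assuming 4 KiB pages; it only handles the page-count summary lines, not the per-zone lines that already carry kB units. Function names are illustrative.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages

PAIR = re.compile(r"(\w+):(\d+)")


def parse_counters(lines):
    """Collect name:value pairs (values in pages) from Mem-Info lines."""
    counters = {}
    for line in lines:
        for name, value in PAIR.findall(line):
            counters[name] = int(value)
    return counters


def pages_to_kb(pages):
    """Convert a page count to kB."""
    return pages * PAGE_KB


sample = [
    "<4>[598175.285017] active_anon:218 inactive_anon:91 isolated_anon:0",
    "<4>[598175.285018]  active_file:187788 inactive_file:451982 isolated_file:896",
    "<4>[598175.285019]  free:21841 slab_reclaimable:8417 slab_unreclaimable:132175",
]

if __name__ == "__main__":
    c = parse_counters(sample)
    file_pages = c["active_file"] + c["inactive_file"]
    print("free kB:", pages_to_kb(c["free"]))                  # 87364
    print("file-backed kB:", pages_to_kb(file_pages))          # 2559080
    print("unreclaimable slab kB:", pages_to_kb(c["slab_unreclaimable"]))  # 528700
    print("anonymous kB:", pages_to_kb(c["active_anon"] + c["inactive_anon"]))
```

The free figure of 87364 kB equals the sum of the three per-zone free values in the log (15880 + 54600 + 16884), which supports the 4 KiB page assumption.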

<4>[598175.285021] Node 0 DMA free:15880kB min:256kB low:320kB high:384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
<4>[598175.285027] lowmem_reserve[]: 0 3000 4010 4010
<4>[598175.285029] Node 0 DMA32 free:54600kB min:50368kB low:62960kB high:75552kB active_anon:860kB inactive_anon:308kB active_file:600716kB inactive_file:1576184kB unevictable:0kB isolated(anon):0kB isolated(file):3328kB present:3072160kB mlocked:0kB dirty:0kB writeback:248kB mapped:26800kB shmem:16kB slab_reclaimable:23552kB slab_unreclaimable:412540kB kernel_stack:752kB pagetables:2412kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4169324 all_unreclaimable? yes
<4>[598175.285036] lowmem_reserve[]: 0 0 1010 1010
<4>[598175.285038] Node 0 Normal free:16884kB min:16956kB low:21192kB high:25432kB active_anon:12kB inactive_anon:56kB active_file:150436kB inactive_file:231744kB unevictable:0kB isolated(anon):0kB isolated(file):384kB present:1034240kB mlocked:0kB dirty:0kB writeback:28kB mapped:5872kB shmem:0kB slab_reclaimable:10116kB slab_unreclaimable:116160kB kernel_stack:2848kB pagetables:8144kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:688103 all_unreclaimable? yes
<4>[598175.285044] lowmem_reserve[]: 0 0 0 0
<4>[598175.285046] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
<4>[598175.285051] Node 0 DMA32: 12620*4kB 3*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 54600kB
<4>[598175.285056] Node 0 Normal: 3195*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 16884kB

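There seems to be a lot of fragmentation. But since an order-0 page (4k) was requested (in highmem!?), and plenty of those are available, that shouldn't matter, should it?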
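
The free-list lines above can be checked mechanically. The sketch below parses one "count*sizekB" line, confirms that the blocks add up to the total printed by the kernel, and reports how much of the free memory sits in the smallest (4 kB, order-0) blocks. Function names are illustrative.

```python
import re

BLOCK = re.compile(r"(\d+)\*(\d+)kB")


def parse_free_list(line):
    """Return {block_size_kb: count} for one per-zone free-list line."""
    return {int(size): int(count) for count, size in BLOCK.findall(line)}


def total_free_kb(blocks):
    """Total free memory in kB represented by the block counts."""
    return sum(size * count for size, count in blocks.items())


def order0_share(blocks):
    """Fraction of free memory held in 4 kB (order-0) blocks."""
    total = total_free_kb(blocks)
    return blocks.get(4, 0) * 4 / total if total else 0.0


dma32 = ("<4>[598175.285051] Node 0 DMA32: 12620*4kB 3*8kB 0*16kB 0*32kB "
         "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 54600kB")

if __name__ == "__main__":
    blocks = parse_free_list(dma32)
    print("total kB:", total_free_kb(blocks))  # 54600, matching the log
    print("order-0 share: %.2f" % order0_share(blocks))
```

For the DMA32 zone nearly all free memory is in single 4 kB blocks, which is what fragmentation looks like in this output; an order-0 request can still be satisfied from such blocks.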

<4>[598175.285061] 375504 total pagecache pages

That is more than 1G of page cache. Shouldn't the kernel shrink the cache before raising an OOM?

<4>[598175.285062] 268 pages in swap cache
<4>[598175.285064] Swap cache stats: add 1266107, delete 1265839, find 3666696/3838636
<4>[598175.285065] Free swap  = 4641856kB
<4>[598175.285066] Total swap = 5244924kB

Swap is barely used. Shouldn't the kernel swap before raising an OOM?

<4>[598175.285066] 1030522 pages RAM

This is a machine with 4G of RAM, by the way.

Oh, and FWIW, here is the process list:

<6>[598175.285067] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
<6>[598175.285071] [  485]     0   485     4223       62   0     -17         -1000 udevd
<6>[598175.285073] [ 1434]     0  1434     1003       65   1       0             0 acpid
<6>[598175.285075] [ 1449]   100  1449     8585      112   0       0             0 dbus-daemon
<6>[598175.285077] [ 1475]     0  1475    36450      428   1       0             0 mono
<6>[598175.285079] [ 1772]     0  1772    21365      298   1       0             0 vmtoolsd
<6>[598175.285081] [ 1838]   101  1838    12322      180   0       0             0 hald
<6>[598175.285083] [ 1842]     0  1842    41067      187   1       0             0 console-kit-dae
<6>[598175.285085] [ 1843]     0  1843     4510       56   1       0             0 hald-runner
<6>[598175.285087] [ 1961]     0  1961     8691       17   0       0             0 hald-addon-inpu
<6>[598175.285107] [ 1984]     0  1984     8691       75   1       0             0 hald-addon-stor
<6>[598175.285109] [ 1992]   101  1992     9130        7   1       0             0 hald-addon-acpi
<6>[598175.285111] [ 1993]     0  1993     8691       77   0       0             0 hald-addon-stor
<6>[598175.285113] [ 2562]     0  2562    47184       78   1       0             0 httpstkd
<6>[598175.285115] [ 2581]     0  2581     5881      221   1       0             0 syslog-ng
<6>[598175.285117] [ 2584]     0  2584     1070       63   1       0             0 klogd
<6>[598175.285119] [ 2598]     0  2598    23796      104   1     -17         -1000 auditd
<6>[598175.285121] [ 2600]     0  2600    19995       87   1       0             0 audispd
<6>[598175.285123] [ 2621]     0  2621     2093       58   0       0             0 haveged
<6>[598175.285125] [ 2641]     0  2641     4728       81   1       0             0 rpcbind
<6>[598175.285127] [ 2680]     0  2680    77513      657   0       0             0 nsrexecd
<6>[598175.285129] [ 2753]     0  2753     4222       52   0     -17         -1000 udevd
<6>[598175.285131] [ 2832]     0  2832     2160       75   0       0             0 irqbalance
<6>[598175.285133] [ 2863]     0  2863     6778       53   1       0             0 mcelog
<6>[598175.285135] [ 3163]     0  3163    35027      170   1       0             0 gmond
<6>[598175.285137] [ 3177] 65534  3177    56670      185   0       0             0 gmetad
<6>[598175.285139] [ 3213]     0  3213    24991      107   1       0             0 sfcbd
<6>[598175.285141] [ 3214]     0  3214    16795        0   1       0             0 sfcbd
<6>[598175.285143] [ 3221]     0  3221    20445       78   1       0             0 sfcbd
<6>[598175.285145] [ 3222]     0  3222    41992      117   1       0             0 sfcbd
<6>[598175.285147] [ 3239]     0  3239    16092       58   1       0             0 pure-ftpd
<6>[598175.285149] [ 3240]     2  3240     6284       82   0       0             0 slpd
<6>[598175.285151] [ 3290]     0  3290    12855      120   0     -17         -1000 sshd
<6>[598175.285153] [ 3316]    74  3316     8070      152   0       0             0 ntpd
<6>[598175.285154] [ 3333]     0  3333    17945       90   1       0             0 cupsd
<6>[598175.285156] [ 3393]     0  3393    19365       31   1       0             0 sfcbd
<6>[598175.285158] [ 3395]     0  3395    21475      109   0       0             0 sfcbd
<6>[598175.285160] [ 3400]     0  3400    38331      129   1       0             0 sfcbd
<6>[598175.285162] [ 3479]     0  3479    38357      125   0       0             0 sfcbd
<6>[598175.285164] [ 3719]     0  3655   220311     2005   0       0             0 ndsd
<6>[598175.285166] [ 3893]    30  3893   177915      910   0       0             0 java
<6>[598175.285168] [ 3910]     0  3910    14968       97   1       0             0 nscd
<6>[598175.285170] [ 3961]     0  3961    47276      332   0       0             0 namcd
<6>[598175.285172] [ 4073]     0  4073    10998      104   0       0             0 master
<6>[598175.285174] [ 4099]    51  4099    14190      229   0       0             0 qmgr
<6>[598175.285176] [ 4135]     0  4135    33370       99   1       0             0 httpd2-prefork
<6>[598175.285178] [ 4136]    30  4136    35518       85   1       0             0 httpd2-prefork
<6>[598175.285180] [ 4137]    30  4137    35523      266   0       0             0 httpd2-prefork
<6>[598175.285182] [ 4138]    30  4138    35523      111   0       0             0 httpd2-prefork
<6>[598175.285184] [ 4139]    30  4139    35523      137   0       0             0 httpd2-prefork
<6>[598175.285186] [ 4140]    30  4140    35523      299   0       0             0 httpd2-prefork
<6>[598175.285188] [ 4168]     0  4168     5751       86   0       0             0 cron
<6>[598175.285190] [ 4349]     0  4349    43028      120   0       0             0 ndpapp
<6>[598175.285194] [ 4548]     0  4548    17722       33   0       0             0 adminusd
<6>[598175.285196] [ 4577]     0  4577    17136       26   1       0             0 jstcpd
<6>[598175.285198] [ 4580]     0  4580    12511        0   1       0             0 jstcpd
<6>[598175.285200] [ 4601]     0  4601    10976       42   1       0             0 vlrpc
<6>[598175.285202] [ 4621]     0  4621     4222       54   1     -17         -1000 udevd
<6>[598175.285204] [ 4672]     0  4672    21525       70   0       0             0 volmnd
<6>[598175.285206] [ 4693]     0  4693    48377      195   0       0             0 ncp2nss
<6>[598175.285208] [ 4942]    81  4942    40049       32   0       0             0 novell-xregd
<6>[598175.285210] [ 5195]     0  5195    90312      479   0       0             0 cifsd
<6>[598175.285212] [ 5240]     0  5240     9586        9   1       0             0 smdrd
<6>[598175.285214] [ 5279]     0  5279    55127      172   0       0             0 novfsd
<6>[598175.285216] [ 5327]   104  5327     9431       72   0       0             0 nrpe
<6>[598175.285218] [ 5337]     0  5337     3177       78   0       0             0 mingetty
<6>[598175.285219] [ 5338]     0  5338     3177       78   1       0             0 mingetty
<6>[598175.285221] [ 5339]     0  5339     3177       78   0       0             0 mingetty
<6>[598175.285223] [ 5340]     0  5340     3177       78   1       0             0 mingetty
<6>[598175.285225] [ 5341]     0  5341     3177       78   0       0             0 mingetty
<6>[598175.285227] [ 5342]     0  5342     3177       78   1       0             0 mingetty
<6>[598175.285229] [ 5520]     0  5520    67658       99   0       0             0 cifsd
<6>[598175.285231] [25139]     0 25139    17698      836   0       0             0 snmpd
<6>[598175.285233] [ 4842]    51  4842    14147      511   0       0             0 pickup
<6>[598175.285235] [ 7917]     0  7917    21027     2460   1       0             0 savepnpc
<3>[598175.285237] Out of memory: Kill process 3719 (ndsd) score 19 or sacrifice child
<3>[598175.285239] Killed process 3719 (ndsd) total-vm:881244kB, anon-rss:0kB, file-rss:8020kB

So it found almost 1G of memory to free, yet it kept going and many more processes followed...

Something is wrong here.

This is a 3.0.x kernel, by the way.

oom
  • 1 answer
  • 558 Views
James Kingsbery
Asked: 2014-02-01 07:45:52 +0800 CST

How is the kernel OOM score calculated?

  • 14

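Searching on Google, I could not find anything that explains how the score in /proc/<pid>/oom_score is calculated. Why is this score used instead of simply the total memory used?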

oom
  • 1 answer
  • 19766 Views
