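Tuning MongoDB memory settings

Since version 3.2, WiredTiger has been MongoDB's default storage engine. MongoDB uses both the WiredTiger internal cache and the filesystem cache. Since version 3.4, the default size of the WiredTiger internal cache is the larger of 50% of (RAM - 1 GB) or 256 MB.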
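The default sizing rule above can be written as a small function. This is only a sketch of the arithmetic; the function name is made up for illustration, and sizes are expressed in GB (so 256 MB is 0.25).

```javascript
// Default WiredTiger cache size in GB for a host with ramGB of memory:
// the larger of 50% of (RAM - 1 GB) and 256 MB.
// The function name is hypothetical, not a MongoDB API.
function defaultWiredTigerCacheGB(ramGB) {
  const halfOfRamMinusOne = 0.5 * (ramGB - 1);
  const floorGB = 0.25; // 256 MB expressed in GB
  return Math.max(halfOfRamMinusOne, floorGB);
}

console.log(defaultWiredTigerCacheGB(16)); // 7.5
console.log(defaultWiredTigerCacheGB(48)); // 23.5
console.log(defaultWiredTigerCacheGB(1));  // 0.25 (the 256 MB floor applies)
```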

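You need to analyze whether the default memory settings should be tuned. A good rule of thumb is that the WiredTiger cache should be large enough to hold the application's entire working set.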

To view the WiredTiger cache statistics:

db.serverStatus().wiredTiger.cache
{
application threads page read from disk to cache count : 9,
application threads page read from disk to cache time (usecs) : 17555,
application threads page write from cache to disk count : 1820,
application threads page write from cache to disk time (usecs) : 1052322,
bytes allocated for updates : 20043,
bytes belonging to page images in the cache : 46742,
bytes belonging to the history store table in the cache : 173,
bytes currently in the cache : 73044,
bytes dirty in the cache cumulative : 38638327,
bytes not belonging to page images in the cache : 26302,
bytes read into cache : 43280,
bytes written from cache : 20517382,
cache overflow score : 0,
checkpoint blocked page eviction : 0,
eviction calls to get a page : 5973,
eviction calls to get a page found queue empty : 4973,
eviction calls to get a page found queue empty after locking : 20,
eviction currently operating in aggressive mode : 0,
eviction empty score : 0,
eviction passes of a file : 0,
eviction server candidate queue empty when topping up : 0,
eviction server candidate queue not empty when topping up : 0,
eviction server evicting pages : 0,
eviction server slept, because we did not make progress with eviction : 735,
eviction server unable to reach eviction goal : 0,
eviction server waiting for a leaf page : 2,
eviction state : 64,
eviction walk target pages histogram - 0-9 : 0,
eviction walk target pages histogram - 10-31 : 0,
eviction walk target pages histogram - 128 and higher : 0,
eviction walk target pages histogram - 32-63 : 0,
eviction walk target pages histogram - 64-128 : 0,
eviction walk target strategy both clean and dirty pages : 0,
eviction walk target strategy only clean pages : 0,
eviction walk target strategy only dirty pages : 0,
eviction walks abandoned : 0,
eviction walks gave up because they restarted their walk twice : 0,
eviction walks gave up because they saw too many pages and found no candidates : 0,
eviction walks gave up because they saw too many pages and found too few candidates : 0,
eviction walks reached end of tree : 0,
eviction walks started from root of tree : 0,
eviction walks started from saved location in tree : 0,
eviction worker thread active : 4,
eviction worker thread created : 0,
eviction worker thread evicting pages : 902,
eviction worker thread removed : 0,
eviction worker thread stable number : 0,
files with active eviction walks : 0,
files with new eviction walks started : 0,
force re-tuning of eviction workers once in a while : 0,
forced eviction - history store pages failed to evict while session has history store cursor open : 0,
forced eviction - history store pages selected while session has history store cursor open : 0,
forced eviction - history store pages successfully evicted while session has history store cursor open : 0,
forced eviction - pages evicted that were clean count : 0,
forced eviction - pages evicted that were clean time (usecs) : 0,
forced eviction - pages evicted that were dirty count : 0,
forced eviction - pages evicted that were dirty time (usecs) : 0,
forced eviction - pages selected because of too many deleted items count : 0,
forced eviction - pages selected count : 0,
forced eviction - pages selected unable to be evicted count : 0,
forced eviction - pages selected unable to be evicted time : 0,
forced eviction - session returned rollback error while force evicting due to being oldest : 0,
hazard pointer blocked page eviction : 0,
hazard pointer check calls : 902,
hazard pointer check entries walked : 25,
hazard pointer maximum array length : 1,
history store key truncation calls that returned restart : 0,
history store key truncation due to mixed timestamps : 0,
history store key truncation due to the key being removed from the data page : 0,
history store score : 0,
history store table insert calls : 0,
history store table insert calls that returned restart : 0,
history store table max on-disk size : 0,
history store table on-disk size : 0,
history store table out-of-order resolved updates that lose their durable timestamp : 0,
history store table out-of-order updates that were fixed up by moving existing records : 0,
history store table out-of-order updates that were fixed up during insertion : 0,
history store table reads : 0,
history store table reads missed : 0,
history store table reads requiring squashed modifies : 0,
history store table remove calls due to key truncation : 0,
history store table writes requiring squashed modifies : 0,
in-memory page passed criteria to be split : 0,
in-memory page splits : 0,
internal pages evicted : 0,
internal pages queued for eviction : 0,
internal pages seen by eviction walk : 0,
internal pages seen by eviction walk that are already queued : 0,
internal pages split during eviction : 0,
leaf pages split during eviction : 0,
maximum bytes configured : 8053063680,
maximum page size at eviction : 376,
modified pages evicted : 902,
modified pages evicted by application threads : 0,
operations timed out waiting for space in cache : 0,
overflow pages read into cache : 0,
page split during eviction deepened the tree : 0,
page written requiring history store records : 0,
pages currently held in the cache : 24,
pages evicted by application threads : 0,
pages queued for eviction : 0,
pages queued for eviction post lru sorting : 0,
pages queued for urgent eviction : 902,
pages queued for urgent eviction during walk : 0,
pages read into cache : 20,
pages read into cache after truncate : 902,
pages read into cache after truncate in prepare state : 0,
pages requested from the cache : 33134,
pages seen by eviction walk : 0,
pages seen by eviction walk that are already queued : 0,
pages selected for eviction unable to be evicted : 0,
pages selected for eviction unable to be evicted as the parent page has overflow items : 0,
pages selected for eviction unable to be evicted because of active children on an internal page : 0,
pages selected for eviction unable to be evicted because of failure in reconciliation : 0,
pages walked for eviction : 0,
pages written from cache : 1822,
pages written requiring in-memory restoration : 0,
percentage overhead : 8,
tracked bytes belonging to internal pages in the cache : 5136,
tracked bytes belonging to leaf pages in the cache : 67908,
tracked dirty bytes in the cache : 493,
tracked dirty pages in the cache : 1,
unmodified pages evicted : 0
}

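There are many WiredTiger cache statistics here, but we should focus on the following:

1. wiredTiger.cache.maximum bytes configured

The maximum size configured for the cache.

2. wiredTiger.cache.bytes currently in the cache

The size of the data currently in the cache. This is typically about 80% of the configured WiredTiger cache size, plus any dirty data that has not yet been written to disk. It should not exceed the maximum configured for WiredTiger. A value close to or above that maximum indicates that it is time to scale out.

3. wiredTiger.cache.tracked dirty bytes in the cache

The size of the dirty data in the WiredTiger cache. It should stay below 5% of the configured cache size, and it is another indicator of whether you need to scale out. Once it exceeds 5%, more cached data may be evicted, and in some cases cached data is evicted just so that client writes can succeed.

4. wiredTiger.cache.pages read into cache

The number of pages read into the cache. Sampling this counter over time shows how much data is read into the cache per second.

A high rate indicates a read-heavy application. If it stays high, adding memory may improve the performance of your system.

5. wiredTiger.cache.pages written from cache

The number of pages written from the cache to disk. This value is especially high when a checkpoint occurs. If it keeps growing, your checkpoints will take longer and longer.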
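The two page counters are cumulative, so a per-second figure comes from the difference between two snapshots. A minimal sketch, assuming two copies of db.serverStatus().wiredTiger.cache taken a known number of seconds apart; the function name and the 10-second interval are illustrative choices.

```javascript
// Per-second page rates between two snapshots of
// db.serverStatus().wiredTiger.cache. The function name is hypothetical.
function cachePageRates(before, after, elapsedSeconds) {
  // Number() also covers shells that return 64-bit counters as objects.
  const perSecond = (key) =>
    (Number(after[key]) - Number(before[key])) / elapsedSeconds;
  return {
    pagesReadPerSec: perSecond("pages read into cache"),
    pagesWrittenPerSec: perSecond("pages written from cache"),
  };
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  const before = db.serverStatus().wiredTiger.cache;
  sleep(10 * 1000); // shell helper; the argument is in milliseconds
  const after = db.serverStatus().wiredTiger.cache;
  printjson(cachePageRates(before, after, 10));
}
```

A longer interval smooths out checkpoint spikes in the write rate; a shorter one makes them visible.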

 

Analyzing these statistics tells us whether we need to add memory.
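The first three statistics can be turned into two ratios against the configured maximum. The sketch below assumes the 5% dirty threshold described above; the 95% cutoff for "close to the maximum" is an arbitrary value chosen for illustration, and the function name is hypothetical.

```javascript
// Summarize cache pressure from db.serverStatus().wiredTiger.cache.
// The function name and the 0.95 "nearly full" cutoff are assumptions;
// the 0.05 dirty threshold is the 5% rule from the text.
function cachePressure(cache) {
  const max = Number(cache["maximum bytes configured"]);
  const used = Number(cache["bytes currently in the cache"]);
  const dirty = Number(cache["tracked dirty bytes in the cache"]);
  const usedRatio = used / max;
  const dirtyRatio = dirty / max;
  return {
    usedRatio,
    dirtyRatio,
    cacheNearlyFull: usedRatio >= 0.95,
    tooMuchDirty: dirtyRatio > 0.05,
  };
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(cachePressure(db.serverStatus().wiredTiger.cache));
}
```

If either flag is true for a sustained period, the cache is probably too small for the workload.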

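In addition, we should look at the usage of WiredTiger's concurrent read and write tickets. If the number of tickets in use keeps growing and approaches the number of CPU cores, the CPU is likely approaching saturation.

You can use PMM to check read and write ticket usage.

You can also use the following command: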

db.serverStatus().wiredTiger.concurrentTransactions
{
write : {
out : 0,
available : 128,
totalTickets : 128
},
read : {
out : 1,
available : 127,
totalTickets : 128
}
}
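To compare ticket usage with the core count, the output above can be reduced to a few numbers. A minimal sketch: the function name is hypothetical, and the number of cores is passed in by the caller (in the shell it is read here from db.hostInfo().system.numCores).

```javascript
// Summarize read/write ticket usage from
// db.serverStatus().wiredTiger.concurrentTransactions.
// The function name is hypothetical; `cores` is supplied by the caller.
function ticketUsage(concurrentTransactions, cores) {
  const summarize = (t) => ({
    inUse: t.out,
    usedRatio: t.out / t.totalTickets,
    atOrAboveCores: t.out >= cores,
  });
  return {
    read: summarize(concurrentTransactions.read),
    write: summarize(concurrentTransactions.write),
  };
}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  const tickets = db.serverStatus().wiredTiger.concurrentTransactions;
  const cores = db.hostInfo().system.numCores;
  printjson(ticketUsage(tickets, cores));
}
```

A single reading says little; it is the trend of inUse over time that matters.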

 

In our example (illustrative values, not the ones in the sample output above):

·wiredTiger.cache.maximum bytes configured = 8053063680, i.e. 7.5 GB

·wiredTiger.cache.bytes currently in the cache = 8053063680, i.e. 7.5 GB

·wiredTiger.cache.tracked dirty bytes in the cache = 4294967296, i.e. 4 GB

·wiredTiger.cache.pages read into cache = ~ 512 MB

·wiredTiger.cache.pages written from cache = ~ 1 GB


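We are using the default WiredTiger cache size, so we know the system has 16 GB of RAM: 0.5 * (16 - 1) = 7.5 GB. Based on the analysis above, the cache is full (bytes currently in the cache equals the configured maximum) and the dirty data (4 GB) is far above 5% of the cache. From what we know about our (hypothetical) application, the working set is 16 GB, so we want a cache larger than that. Since the working set will only keep growing, we can resize the server's RAM from 16 GB to 48 GB to leave room for growth. If we keep the default settings, this raises the WiredTiger cache to 0.5 * (48 - 1) = 23.5 GB, leaving 24.5 GB of RAM for the operating system and its filesystem cache.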

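If we want a larger WiredTiger cache than the default, we need to set storage.wiredTiger.engineConfig.cacheSizeGB to the value we want. For example, suppose we want to give 30 GB to the WiredTiger cache, to really avoid reading from disk in the near term, leaving 18 GB for the operating system and its filesystem cache. We would add the following to the mongod.conf file: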

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 30

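Whether we rely on the default or set a specific value, we need to restart the mongod process for it to recognize the added memory and apply the setting.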

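Also note that, unlike many other database systems, where the database cache is typically sized at 80-90% of system memory, MongoDB's sweet spot is between 50% and 70%. This is because MongoDB uses the WiredTiger cache only for uncompressed pages, while the operating system caches the compressed pages that are written to the database files. By leaving free memory to the operating system, we increase the chance of getting a page from the OS cache instead of having to read it from disk.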
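A quick way to check a planned cacheSizeGB against this 50-70% range is to compute its share of total RAM. The function name below is made up for illustration; the numbers are the 30 GB cache on a 48 GB host from the example.

```javascript
// Share of RAM given to the WiredTiger cache, and what is left for the
// operating system and its filesystem cache. Hypothetical helper name.
function cacheShare(cacheGB, ramGB) {
  const share = cacheGB / ramGB;
  return {
    share,
    leftForOsGB: ramGB - cacheGB,
    inSweetSpot: share >= 0.5 && share <= 0.7, // the 50-70% range from the text
  };
}

console.log(cacheShare(30, 48)); // share 0.625, 18 GB left for the OS, in range
console.log(cacheShare(43, 48)); // roughly 90% of RAM, outside the range
```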