glibc's malloc holds a lock while it is working in the heap. If there is more than one thread and malloc finds the lock in use, it avoids waiting on the lock by creating a new 'arena' to hold a separate heap. My understanding is that a process with multiple threads that are all active users of malloc will eventually have an arena per thread. If you limit the number of arenas, threads may be delayed waiting on locks.
But this needs performance testing. My experience was with C++, not with a JVM. I would be interested to know if setting MALLOC_ARENA_MAX=2 makes a difference to performance.
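One way to try the comparison suggested above is to launch the same command twice, once with MALLOC_ARENA_MAX set and once without, and compare wall-clock time. The following is only a minimal sketch: the helper name run_with_arena_cap is made up for illustration, and the child command is a stand-in that echoes the variable; a real test would start the JVM under a representative query load instead.

```python
import os
import subprocess
import sys
import time


def run_with_arena_cap(cmd, arena_max=None):
    """Run cmd with MALLOC_ARENA_MAX set (or unset) in the child only.

    Hypothetical helper for illustration. Returns (stdout, elapsed_seconds).
    """
    env = dict(os.environ)
    if arena_max is None:
        # Make sure the baseline run really has no arena cap.
        env.pop("MALLOC_ARENA_MAX", None)
    else:
        env["MALLOC_ARENA_MAX"] = str(arena_max)
    start = time.perf_counter()
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return proc.stdout, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in child: prints the value it inherited, or None if unset.
    child = [sys.executable, "-c",
             "import os; print(os.environ.get('MALLOC_ARENA_MAX'))"]
    for cap in (None, 2):
        out, elapsed = run_with_arena_cap(child, cap)
        print(f"MALLOC_ARENA_MAX={out.strip()} took {elapsed:.3f}s")
```

Setting the variable only in the child's environment keeps the experiment from affecting the shell or other services on the same machine.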
I haven't looked at reproducing this locally, but since it seems like
there haven't been any new ideas, I decided to share this in case it
helps.
I noticed in Travis CI [1] they are adding the environment variable
MALLOC_ARENA_MAX=2, so I googled what that configuration did. To my
surprise, I came across a Stack Overflow post [2] about how glibc could
actually be the cause and report memory differently. I then found a
Hadoop issue, HADOOP-7154 [3], about setting this as well to reduce
virtual memory usage. I found some more cases where this has helped as
well: [4], [5], and [6].
[1] https://docs.travis-ci.com/user/build-environment-updates/2017-09-06/#Added
[2] https://stackoverflow.com/questions/10575342/what-would-cause-a-java-process-to-greatly-exceed-the-xmx-or-xss-limit
[3] https://issues.apache.org/jira/browse/HADOOP-7154?focusedCommentId=14505792
[4] https://github.com/cloudfoundry/java-buildpack/issues/320
[5] https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
[6] https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
Kevin Risden
On Thu, Aug 24, 2017 at 10:19 AM, Markus Jelsma wrote:
Hello Bernd,
According to the man page, I should get a list of stuff in shared
memory if I invoke it with just a PID. That shows a list of libraries
that together account for about 25 MB of shared memory usage. According
to ps and top, the JVM uses 2800 MB shared memory (not virtual), which
leaves 2775 MB unaccounted for. Any ideas? Can anyone else reproduce it
on a freshly restarted node?
Thanks,
Markus
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18901 markus    20   0 14,778g 4,965g 2,987g S 891,1 31,7  20:21.63 java
0x000055b9a17f1000      6K /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
0x00007fdf1d314000    182K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libsunec.so
0x00007fdf1e548000     38K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libmanagement.so
0x00007fdf1e78e000     94K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnet.so
0x00007fdf1e9a6000     75K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnio.so
0x00007fdf5cd6e000     34K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libzip.so
0x00007fdf5cf77000     46K /lib/x86_64-linux-gnu/libnss_files-2.24.so
0x00007fdf5d189000     46K /lib/x86_64-linux-gnu/libnss_nis-2.24.so
0x00007fdf5d395000     90K /lib/x86_64-linux-gnu/libnsl-2.24.so
0x00007fdf5d5ae000     34K /lib/x86_64-linux-gnu/libnss_compat-2.24.so
0x00007fdf5d7b7000    187K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libjava.so
0x00007fdf5d9e6000     70K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libverify.so
0x00007fdf5dbf8000     30K /lib/x86_64-linux-gnu/librt-2.24.so
0x00007fdf5de00000     90K /lib/x86_64-linux-gnu/libgcc_s.so.1
0x00007fdf5e017000   1063K /lib/x86_64-linux-gnu/libm-2.24.so
0x00007fdf5e320000   1553K /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
0x00007fdf5e6a8000  15936K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
0x00007fdf5f5ed000    139K /lib/x86_64-linux-gnu/libpthread-2.24.so
0x00007fdf5f80b000     14K /lib/x86_64-linux-gnu/libdl-2.24.so
0x00007fdf5fa0f000    110K /lib/x86_64-linux-gnu/libz.so.1.2.11
0x00007fdf5fc2b000   1813K /lib/x86_64-linux-gnu/libc-2.24.so
0x00007fdf5fff2000     58K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/jli/libjli.so
0x00007fdf60201000    158K /lib/x86_64-linux-gnu/ld-2.24.so
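The total for a listing like the one above can be checked mechanically rather than by eye. The sketch below sums every "address sizeK" pair it finds; the function name total_kib is made up for illustration, and the parser assumes only the address/size layout shown here, not any particular tool's documented output format. Applied to the full listing above, the sizes add up to 21866K, a little over 21 MB.

```python
import re

# Matches an address followed by a size in K, e.g. "0x00007fdf5fc2b000 1813K".
# Assumption: sizes are always whole numbers with a "K" suffix, as in the listing.
_ENTRY = re.compile(r"0x[0-9a-fA-F]+\s+(\d+)K\b")


def total_kib(listing):
    """Sum the sizes (in K) of all mapped objects found in the listing text."""
    return sum(int(size) for size in _ENTRY.findall(listing))


if __name__ == "__main__":
    # Three entries copied from the listing above.
    sample = """\
0x00007fdf5e320000 1553K /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
0x00007fdf5e6a8000 15936K /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
0x00007fdf5fc2b000 1813K /lib/x86_64-linux-gnu/libc-2.24.so
"""
    total = total_kib(sample)
    print(f"{total}K = {total / 1024:.1f} MB")  # 19302K for these three entries
```

Because the pattern only looks for the address and size, it also works when the path has wrapped onto the following line.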
-----Original message-----
Sent: Thursday 24th August 2017 15:39
Subject: Re: Solr uses lots of shared memory!
Just an idea, how about taking a dump with jmap and using
MemoryAnalyzerTool to see what is going on?
Regards
Bernd
Hello Shalin,
Yes, the main search index has DocValues on just a few fields; they
are used for faceting and function queries. We started using
DocValues when 6.0 was released. Most fields are content fields for
many languages. I don't think it is going to be DocValues, because the
max shared memory consumption is reduced by searching on fields of fewer
languages, and by disabling highlighting, neither of which uses DocValues.
But I tried the option regardless, also because I didn't know
about it. But it appears the option does exactly nothing. The first is
without any configuration for preload, the second is with preload=true,
the third is with preload=false:
14220 markus    20   0 14,675g 1,508g  62800 S   1,0  9,6   0:36.98 java
14803 markus    20   0 14,674g 1,537g  63248 S   0,0  9,8   0:34.50 java
15324 markus    20   0 14,674g 1,409g  63152 S   0,0  9,0   0:35.50 java
<directoryFactory name="DirectoryFactory"
    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
  <bool name="preload">false</bool>
</directoryFactory>
NRTCachingDirectoryFactory implies MMapDirectory right?
Thanks,
Markus
-----Original message-----
Sent: Thursday 24th August 2017 5:51
Subject: Re: Solr uses lots of shared memory!
Very interesting. Do you have many DocValue fields? Have you always
had them, i.e. did you see this problem before you turned on
DocValues?
The DocValue fields are in a separate file and they will be memory
mapped on demand. One thing you can experiment with is to use
preload=true option on the MMapDirectoryFactory which will mmap all
index files on startup [1]. Once you do this, and if you still notice
shared memory leakage then it may be a genuine memory leak that we
[1] http://lucene.apache.org/solr/guide/6_6/datadir-and-directoryfactory-in-solrconfig.html#DataDirandDirectoryFactoryinSolrConfig-SpecifyingtheDirectoryFactoryForYourIndex
On Wed, Aug 23, 2017 at 7:02 PM, Markus Jelsma wrote:
I do not think it is a reporting problem: watching top after a
restart of some Solr instances, it dropped back to `normal`,
around 350 MB, which I think is high too, but anyway.
shared memory consumption to about 1500 MB now. I don't understand why
shared memory usage should/would increase slowly over time. It makes
little sense to me and I cannot remember Solr doing this in the past
ten years.
But it seems to correlate with index size on disk. These main text
search nodes have an index of around 16 GB and up to 3 GB of shared memory
after a few days. The logs nodes have up to 800 MB index size and 320 MB of
shared memory. The low latency nodes have four different cores that
make up just over 100 MB index size; their shared memory consumption is just
22 MB, which seems more reasonable for the case of shared memory.
queries to it. My freshly restarted local node used 68 MB shared memory
at startup. Two minutes and 25.000 queries later it was already 2748
MB! At first there is a very sharp increase to 2000 MB, then it takes
almost two minutes more to increase to 2748 MB. I can decrease the maximum
shared memory usage to 1200 MB if I query (via edismax) only on fields of
one language instead of 25 or so. I can decrease it further as well if I
disable highlighting (HUH?) but still query on all fields.
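To see which kind of resident memory is growing while queries run, one option is to sample /proc/<pid>/status. The sketch below assumes a Linux kernel recent enough (4.5 or later, as far as I know) to report RssAnon, RssFile and RssShmem there, and it assumes that top's SHR column corresponds to RssFile plus RssShmem. The function names and all sample values are made up for illustration.

```python
def rss_breakdown(status_text):
    """Extract resident-memory fields (in kB) from /proc/<pid>/status text."""
    wanted = {"VmRSS", "RssAnon", "RssFile", "RssShmem"}
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "RssFile:\t 3132000 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields


def read_rss_breakdown(pid):
    """Read the breakdown for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return rss_breakdown(status_file.read())


if __name__ == "__main__":
    # Invented sample values, not measurements from this thread.
    sample = (
        "Name:\tjava\n"
        "VmRSS:\t 5206180 kB\n"
        "RssAnon:\t 2074000 kB\n"
        "RssFile:\t 3132000 kB\n"
        "RssShmem:\t     180 kB\n"
    )
    fields = rss_breakdown(sample)
    shared_kb = fields["RssFile"] + fields["RssShmem"]
    print(f"file-backed + shmem: {shared_kb} kB of {fields['VmRSS']} kB resident")
```

Calling read_rss_breakdown with the Solr PID every few seconds during a query run would show whether the growth is in file-backed pages, such as memory-mapped files, or in shmem.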
* We have tried patching Java's ByteBuffer [1] because it seemed
to fit the problem; it does not fix it.
* We have also removed all our custom plugins, so it has become
a vanilla Solr 6.6 with just our stripped-down schema and solrconfig;
that does not fix it either.
Why does it slowly increase over time?
Why does it appear to correlate to index size?
Is anyone else seeing this on their 6.6 cloud production or
local machines?
Thanks,
Markus
[1]: http://www.evanjones.ca/java-bytebuffer-leak.html
-----Original message-----
Sent: Tuesday 22nd August 2017 17:32
Subject: Re: Solr uses lots of shared memory!
I have never seen this before: one of our collections, all
nodes eating tons of shared memory!
2511:46 java
shared memory. Virtual is equal to RSS and index size on disk. For two
other collections, the nodes use shared memory as expected, in the MB
range.
How can Solr, this collection, use so much shared memory? Why?
I've seen this on my own servers at work, and when I add up a
subset of
more
memory than I even have in the server.
I suspect there is something odd going on in how Java reports
memory
Java's memory
correctly. I
do not know if the change came about because of a Solr upgrade,
because
three were
I noticed
https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0
41.45GB of
memory is
machine only
usage of
situation
deduct that
11GB of SHR from the RES value, then all the numbers work.
The screenshot was almost 3 years ago, so I do not know what
machine it
size was.
I think it was about 6GB -- the difference between RES and SHR.
I have
The
noticeable
that the
OS) are
Sorry for being brief. Alternate email is rickleir at yahoo dot com