Eric Bus
2013-10-22 08:00:18 UTC
Hi,
I've been running a 3-node SolrCloud setup on SOLR 4.4 for some time. The cloud hosts about 40 small collections that receive updates once a day. The collections use different shard and replication configurations (varying from 2 shards without replication to 2 shards with 3 replicas).
After Tomcat has been running for a couple of weeks, I notice that the number of open files increases dramatically. Most of those files are deleted tlog files that SOLR keeps open:
***@node1:/ # lsof -np 16810 | grep deleted | wc -l
36345
Those files are no longer on disk, but SOLR still has a handle open. My disk use is going through the roof. 6GB is currently 'in use' by deleted but still open files. When I restart Tomcat, the space is freed and it starts all over again. All of my nodes experience this behavior.
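To see where the space held by deleted-but-open files is going, the check can be broken down per directory. The following is only a minimal sketch, assuming Linux with a readable /proc/<pid>/fd (same user as Tomcat, or root). The function names `summarize_deleted`, `open_file_entries` and `report` are made up for illustration and are not part of SOLR or lsof.

```python
import os
from collections import Counter

# On Linux, the symlink target of a deleted-but-open file ends with this suffix.
DELETED_SUFFIX = " (deleted)"


def summarize_deleted(entries):
    """Summarize deleted-but-open files.

    entries: iterable of (link_target, size_in_bytes) pairs.
    Returns (count, total_bytes, per_directory_counter), deleted targets only.
    """
    count = 0
    total = 0
    per_dir = Counter()
    for target, size in entries:
        if not target.endswith(DELETED_SUFFIX):
            continue  # still on disk, or a socket/pipe
        path = target[: -len(DELETED_SUFFIX)]
        count += 1
        total += size
        per_dir[os.path.dirname(path)] += 1
    return count, total, per_dir


def open_file_entries(pid):
    """Yield (link_target, size) for every open descriptor of a process."""
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        fd_path = os.path.join(fd_dir, name)
        try:
            target = os.readlink(fd_path)
            # stat() follows the descriptor, so it reports the size of the
            # open file even after it has been unlinked.
            size = os.stat(fd_path).st_size
        except OSError:
            continue  # descriptor was closed while we were scanning
        yield target, size


def report(pid, top=10):
    """Print the totals and the directories holding the most deleted files."""
    count, total, per_dir = summarize_deleted(open_file_entries(pid))
    print(f"{count} deleted files still open, {total / 1024**3:.2f} GiB held")
    for directory, n in per_dir.most_common(top):
        print(f"{n:8d}  {directory}")
```

Calling `report(16810)` with the Tomcat PID from the lsof command above would show whether the handles are spread over all collections or concentrated in a few tlog directories.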
At first I thought it had something to do with a lack of commits. But it happens on all my collections, even the ones with fast autoCommit:
<autoCommit>
<maxDocs>5000</maxDocs>
<maxTime>120000</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
My update process always triggers a commit or rollback and updates are showing up correctly.
I read something about SOLR having TCP connections in CLOSE_WAIT. The only CLOSE_WAIT connections I see are between the nodes, and there are only about 10 of them. Those connections can't be causing 36k open files, right?
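The CLOSE_WAIT count can be cross-checked without netstat. This is a rough sketch, assuming Linux, where /proc/net/tcp and /proc/net/tcp6 list one socket per line and the fourth column holds the state in hex (08 is CLOSE_WAIT). It counts sockets system-wide, not only those owned by Tomcat, and the function names are illustrative.

```python
# Hex state code used by the Linux kernel for CLOSE_WAIT in /proc/net/tcp*.
CLOSE_WAIT = "08"


def count_close_wait(lines):
    """Count sockets in CLOSE_WAIT in lines formatted like /proc/net/tcp."""
    n = 0
    for line in lines:
        fields = line.split()
        # Data rows start with an index such as "12:"; the header row does not.
        if len(fields) > 3 and fields[0].endswith(":") and fields[3] == CLOSE_WAIT:
            n += 1
    return n


def system_close_wait():
    """Count CLOSE_WAIT sockets across IPv4 and IPv6 on this machine."""
    total = 0
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as fh:
                total += count_close_wait(fh)
        except OSError:
            pass  # table not available (e.g. IPv6 disabled)
    return total
```

Each socket occupies one file descriptor, so a count of around 10 cannot account for tens of thousands of open handles.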
Any suggestions/tips? At the moment, I have to restart my leader every couple of weeks and that's not really something I would like to do :)
Best regards,
Eric Bus