Discussion:
solr 4.4 splitshard query
ashoknix
2018-12-05 12:14:22 UTC
Hi,

I have a legacy app which runs on Solr 4.4. It is a 4-node SolrCloud
cluster with 3 ZooKeeper nodes.

curl -v 'http://localhost:8980/solr/admin/collections?action=SPLITSHARD&collection=billdocs&shard=shard1&async=2000'

<lst name="responseHeader"><int name="status">500</int><int name="QTime">300009</int></lst>
<lst name="error"><str name="msg">splitshard the collection time out:300s</str>
<str name="trace">org.apache.solr.common.SolrException: splitshard the collection time out:300s
at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
at org.apache.solr.handler.admin.CollectionsHandler.handleSplitShardAction(CollectionsHandler.java:322)
at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:136)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)

I have a few questions:

1. The index size is currently around 40GB.
2. Right now it has a single shard, and we observe high query times.
3. Does SPLITSHARD help with query times here, since the docs get
distributed?

Please advise.

Thanks,
Ash



Kelly, Frank
2018-12-05 20:41:01 UTC
Whenever I hit a problem with SPLITSHARD, it's usually because I run out of disk, as you're effectively doubling the disk space used by the shard.
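
As a rough illustration of that disk-space point, here is a minimal Python sketch of a pre-flight check before attempting a split. It is not part of Solr: the function names, the `headroom` multiplier, and the idea of passing the index directory path are all assumptions made for this example. It only checks that free space covers at least one more copy of the shard's index.

```python
import shutil


def has_room_for_split(index_bytes, free_bytes, headroom=1.0):
    """Return True if free space covers another copy of the index.

    headroom is a hypothetical safety multiplier (1.0 means "exactly
    one more copy"); raise it if you want extra margin.
    """
    return free_bytes >= index_bytes * headroom


def path_has_room_for_split(index_dir, index_bytes, headroom=1.0):
    """Same check, reading free space from the filesystem holding index_dir."""
    free_bytes = shutil.disk_usage(index_dir).free
    return has_room_for_split(index_bytes, free_bytes, headroom)


GIB = 1024 ** 3
# A 40GB index with 50GB free passes; with 30GB free it does not.
print(has_room_for_split(40 * GIB, 50 * GIB))
print(has_room_for_split(40 * GIB, 30 * GIB))
```

Running a check like this on the node that hosts the shard leader is one way to rule out the disk problem before looking at anything else.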

However for large indexes (and 40GB is pretty large) take a look at https://issues.apache.org/jira/browse/SOLR-5324
If that's the problem one possible workaround is to reduce the number of replicas before splitting the shard - although that will likely increase your query times even more.

-Frank

Shawn Heisey
2018-12-10 17:40:51 UTC
Post by ashoknix
curl -v 'http://localhost:8980/solr/admin/collections?action=SPLITSHARD&collection=billdocs&shard=shard1&async=2000'
<lst name="responseHeader"><int name="status">500</int><int name="QTime">300009</int></lst>
<lst name="error"><str name="msg">splitshard the collection time out:300s</str>
<str name="trace">org.apache.solr.common.SolrException: splitshard the collection time out:300s
<snip>
Post by ashoknix
1. The index size is currently around 40GB.
2. Right now it has a single shard, and we observe high query times.
3. Does SPLITSHARD help with query times here, since the docs get
distributed?
You're trying to make the call async.  This is a good idea... but async
capability for the collections API was added in Solr 4.8.

https://issues.apache.org/jira/browse/SOLR-5477

This means that in version 4.4, any Collections API action that takes
longer than your Collections API timeout is going to return this error.
Your timeout appears to be 300 seconds. I do not know whether the
SPLITSHARD operation will continue to run on the server in this situation.
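
To illustrate the version point, here is a minimal Python sketch that builds the SPLITSHARD URL and only adds the async parameter when the target version is 4.8 or later. The helper names and the version-string parsing are assumptions for this example, not anything Solr provides; the URL shape and parameters are taken from the curl command in the original post.

```python
from urllib.parse import urlencode

# Async Collections API calls arrived in 4.8 (SOLR-5477).
ASYNC_MIN_VERSION = (4, 8)


def parse_version(text):
    """Turn a version string such as '4.4' or '7.5.0' into (major, minor)."""
    return tuple(int(part) for part in text.split(".")[:2])


def splitshard_url(base_url, collection, shard, solr_version, request_id=None):
    """Build a SPLITSHARD URL, adding async only where it is supported."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    if request_id is not None and parse_version(solr_version) >= ASYNC_MIN_VERSION:
        params["async"] = request_id
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# On 4.4 the request id is left out, so the call stays synchronous.
print(splitshard_url("http://localhost:8980/solr", "billdocs", "shard1", "4.4", "2000"))
# On 4.8 or later the same call carries async=2000.
print(splitshard_url("http://localhost:8980/solr", "billdocs", "shard1", "4.8", "2000"))
```

On 4.4 this changes nothing about the timeout itself; it only makes explicit that the request is synchronous there.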

Once you have successfully split your index, the following will apply: 
Increasing the shard count will increase the amount of work that Solr
must do to execute a query.  If your query rate is very low and your
system has idle CPUs, then the query might complete faster.  If your
query rate is high or you do not have idle CPUs, then splitting shards
will make your queries take longer.

Because the latest version of Solr is 7.5.0, I would not recommend
running any 4.x version.  There is zero possibility of bugs in 4.x
getting developer attention.  Bugs in 6.6.x MIGHT get attention, but
mostly only bugs in the current major release will be addressed.

Thanks,
Shawn
