Discussion:
no servers hosting shard
patrick conant
2014-01-07 16:57:54 UTC
In our Solr instance we have two shards, each running on two servers. The
server that was the leader for one of the shards ran into a problem, and
since we restarted the service, Solr has not elected a leader for that
shard.

The stack traces from the logs are below, and the 'Cloud Dump' from the
Solr console is attached. We're running Solr 4.4.0. Any guidance on how
to recover from this? Restarting or redeploying the service doesn't seem
to make any difference.

Thanks,
Pat.


2014-01-07 00:00:10,754 [http-8080-62] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: no servers hosting shard:
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)

2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: No registered leader was found, collection:customerOrderSearch slice:shard1
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:487)
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:470)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:223)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)
patrick conant
2014-01-07 17:20:42 UTC
After a full bounce of Tomcat, I'm now getting a new exception (below). I
can browse the ZooKeeper config in the Solr admin UI, and can confirm that
there's a node for '/collections/customerOrderSearch/leaders/shard2', but
no node for '/collections/customerOrderSearch/leaders/shard1'. Any ideas
or guidance on how to recover would still be appreciated. We've restarted
all three ZooKeeper instances and both Solr instances, but that hasn't made
any appreciable difference.

--p.




2014-01-07 10:06:14,980 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.cloud.ZooKeeperException:
    at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:309)
    at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:556)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:365)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:864)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:773)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:723)
    at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:286)
    ... 11 more
Caused by: org.apache.solr.common.SolrException: Could not get leader props
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:911)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:875)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:839)
    ... 14 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:889)
    ... 16 more
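
For anyone who wants to script the check described in this post (shard2 has a leader, shard1 does not), here is a minimal Python sketch. It assumes you have already fetched and parsed the cluster state into a dict, and that it follows the Solr 4.x clusterstate.json layout in which the elected replica carries "leader": "true". The function name, host names, and sample data are hypothetical and only illustrate the shape of the problem.

```python
from typing import Dict, List


def shards_without_leader(cluster_state: Dict, collection: str) -> List[str]:
    """Return the shards of `collection` in which no replica is flagged as leader."""
    shards = cluster_state.get(collection, {}).get("shards", {})
    leaderless = []
    for shard_name, shard in sorted(shards.items()):
        replicas = shard.get("replicas", {}).values()
        # Assumption: the elected replica is marked with the string "true".
        if not any(replica.get("leader") == "true" for replica in replicas):
            leaderless.append(shard_name)
    return leaderless


# Hypothetical cluster state shaped like the situation in this thread.
example_state = {
    "customerOrderSearch": {
        "shards": {
            "shard1": {"replicas": {
                "core_node1": {"state": "down", "node_name": "hostA:8080_solr"},
                "core_node2": {"state": "down", "node_name": "hostB:8080_solr"},
            }},
            "shard2": {"replicas": {
                "core_node3": {"state": "active", "node_name": "hostA:8080_solr",
                               "leader": "true"},
                "core_node4": {"state": "active", "node_name": "hostB:8080_solr"},
            }},
        }
    }
}

print(shards_without_leader(example_state, "customerOrderSearch"))
```

A non-empty result means at least one shard cannot accept updates, which matches the "No registered leader was found" error above.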
patrick conant
2014-01-07 17:40:40 UTC
We found a way to recover. This sequence allowed everything to start up
successfully.

- Stop all Solr instances.
- Stop all ZooKeeper instances.
- Start all ZooKeeper instances.
- Start the Solr instances one at a time.

Restarting the first Solr instance took several minutes, but the subsequent
instances started up much more quickly.

Cheers,
Pat.
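
The recovery order above can be written down as a small Python sketch so it is repeatable. This is only an illustration under assumptions: the host names are hypothetical, and the `execute` and `wait_until_up` callbacks stand in for whatever you actually use to stop and start services and to confirm that a Solr node has finished loading.

```python
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, str, str]  # (action, service, host)


def restart_plan(solr_hosts: Sequence[str], zk_hosts: Sequence[str]) -> List[Step]:
    """Order the steps: stop Solr, stop ZooKeeper, start ZooKeeper, start Solr."""
    plan: List[Step] = []
    plan += [("stop", "solr", host) for host in solr_hosts]
    plan += [("stop", "zookeeper", host) for host in zk_hosts]
    plan += [("start", "zookeeper", host) for host in zk_hosts]
    plan += [("start", "solr", host) for host in solr_hosts]
    return plan


def run_plan(plan: Sequence[Step],
             execute: Callable[[str, str, str], None],
             wait_until_up: Callable[[str], None]) -> None:
    """Run each step in order, pausing after every Solr start."""
    for action, service, host in plan:
        execute(action, service, host)
        # Solr nodes are started one at a time: wait for each before the next.
        if action == "start" and service == "solr":
            wait_until_up(host)


# Hypothetical hosts: two Solr nodes and three ZooKeeper nodes, as in this thread.
plan = restart_plan(["solr1", "solr2"], ["zk1", "zk2", "zk3"])

# Dry run: record what would happen instead of touching any service.
log: List[str] = []
run_plan(plan,
         execute=lambda action, service, host: log.append(f"{action} {service} on {host}"),
         wait_until_up=lambda host: log.append(f"wait for solr on {host}"))
print("\n".join(log))
```

Keeping the plan separate from the runner makes it possible to review the order in a dry run before pointing the callbacks at real machines.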