Discussion:
SolrJ does not use HTTP proxy anymore in 7.5.0 after update from 6.6.5
Andreas Hubold
2018-10-01 12:54:07 UTC
Hi,

SolrJ 6.6.5 used org.apache.http.impl.client.SystemDefaultHttpClient
under the hood, which took system properties for HTTP proxy config into
account (http.proxyHost and http.proxyPort).

The deprecated SystemDefaultHttpClient class was replaced as part of
SOLR-4509. And with Solr 7.5.0 I'm now unable to use an HTTP proxy with
SolrJ at all (not using Solr Cloud here). SolrJ 7.5 uses
org.apache.http.impl.client.HttpClientBuilder#create to create an
HttpClient, but it does not call #useSystemProperties on the builder.
Because of that, the proxy configuration from system properties is ignored.
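
For readers unfamiliar with these properties, here is a minimal sketch using only the JDK (no SolrJ or Apache HttpClient involved) that shows how http.proxyHost and http.proxyPort are picked up by the JVM's default java.net.ProxySelector. To my understanding this is also the selector that HttpClientBuilder consults once #useSystemProperties has been called, but treat that as an assumption and check the HttpClient source. The class name, host names, and port below are made up for illustration.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

class ProxyPropertiesDemo {

    // Sets the two proxy system properties, asks the JVM's default
    // ProxySelector which proxies apply to the target URL, and then
    // restores the previous property values.
    static List<Proxy> proxiesFor(String proxyHost, String proxyPort, String targetUrl) {
        String oldHost = System.getProperty("http.proxyHost");
        String oldPort = System.getProperty("http.proxyPort");
        System.setProperty("http.proxyHost", proxyHost);
        System.setProperty("http.proxyPort", proxyPort);
        try {
            return ProxySelector.getDefault().select(URI.create(targetUrl));
        } finally {
            restore("http.proxyHost", oldHost);
            restore("http.proxyPort", oldPort);
        }
    }

    private static void restore(String key, String value) {
        if (value == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, value);
        }
    }

    public static void main(String[] args) {
        // Hypothetical proxy and Solr hosts, used only for illustration.
        List<Proxy> proxies =
                proxiesFor("proxy.example.com", "3128", "http://solr.example.com:8983/solr");
        for (Proxy proxy : proxies) {
            System.out.println(proxy.type() + " -> " + proxy.address());
        }
    }
}
```

In practice the properties are usually passed on the command line (-Dhttp.proxyHost=... -Dhttp.proxyPort=...) rather than set in code.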

Is there some other way to configure an HTTP proxy, e.g. with
HttpSolrClient.Builder? I don't want to create an Apache HttpClient
instance myself, but rather use the builder from SolrJ (HttpSolrClient.Builder).

Thanks in advance,
Andreas
ahubold
2018-10-10 13:14:42 UTC
I've now created https://issues.apache.org/jira/browse/SOLR-12848 for this
problem.



Shawn Heisey
2018-10-10 14:31:22 UTC
Post by Andreas Hubold
Is there some other way to configure an HTTP proxy, e.g. with
HttpSolrClient.Builder? I don't want to create an Apache HttpClient
instance myself, but rather use the builder from SolrJ (HttpSolrClient.Builder).
Unless you want to wait for a fix for SOLR-12848, you have two options:

1) Use a SolrJ client from 6.6.x, before the fix for SOLR-4509.  If
you're using HttpSolrClient rather than CloudSolrClient, a SolrJ major
version that's different than your Solr major version won't be a big
problem.  Large version discrepancies can be very problematic with the
Cloud client.

2) Create a custom HttpClient instance with the configuration you want
and use that to build your SolrClient instances.  If you're using the
Solr client in a multi-threaded manner, you'll want to be sure that the
HttpClient is configured to allow enough concurrent connections -- it
defaults to two per route.

I do think this particular problem is something we should fix.  But that
doesn't help you in the short term.  It could take several weeks (or
maybe longer) for a fix from us to arrive in your hands, unless you're
willing to compile from source.

Thanks,
Shawn
Andreas Hubold
2018-10-10 14:50:05 UTC
Thank you, Shawn. I'm now using a custom HttpClient that I create in a
similar manner as SolrJ, and it works quite well.

Of course, a fix in a future release would be great, so that we can
remove the workaround eventually.

Thanks,
Andreas
Michael Joyner
2018-10-12 15:36:49 UTC
Would you supply the snippet for the custom HttpClient to get it to
honor/use proxy?

Thanks!
Andreas Hubold
2018-10-15 13:01:39 UTC
Hi Michael,

Sure. The important call is HttpClientBuilder#useSystemProperties, which
is also what Shawn added in his patch to
https://issues.apache.org/jira/browse/SOLR-12848

For my workaround, I just followed the code from the method
org.apache.solr.client.solrj.impl.HttpClientUtil#createClient(org.apache.solr.common.params.SolrParams),
took the statements that could be relevant for my setup (which is just
a test setup, to be honest), simplified the code a bit, and added the
#useSystemProperties call. You should read the original code if you
want to make sure that you don't forget some important setting. But this is
what works for me:

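  import java.util.concurrent.TimeUnit;

  import org.apache.http.client.config.RequestConfig;
  import org.apache.http.config.Registry;
  import org.apache.http.conn.socket.ConnectionSocketFactory;
  import org.apache.http.impl.client.CloseableHttpClient;
  import org.apache.http.impl.client.HttpClientBuilder;
  import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
  import org.apache.solr.client.solrj.impl.HttpClientUtil;
  import org.apache.solr.client.solrj.impl.SolrHttpRequestRetryHandler;

  private static CloseableHttpClient createClient() {
    // Code derived from org.apache.solr.client.solrj.impl.HttpClientUtil,
    // simplified and with irrelevant config removed.
    Registry<ConnectionSocketFactory> schemaRegistry =
        HttpClientUtil.getSchemaRegisteryProvider().getSchemaRegistry();

    // Pooling connection manager so the client can be shared between threads.
    PoolingHttpClientConnectionManager cm =
        new PoolingHttpClientConnectionManager(schemaRegistry);
    cm.setMaxTotal(10000);
    cm.setDefaultMaxPerRoute(10000);
    cm.setValidateAfterInactivity(3000);

    RequestConfig.Builder requestConfigBuilder = RequestConfig
            .custom()
            .setConnectTimeout(HttpClientUtil.DEFAULT_CONNECT_TIMEOUT)
            .setSocketTimeout(HttpClientUtil.DEFAULT_SO_TIMEOUT);

    HttpClientBuilder httpClientBuilder = HttpClientBuilder
            .create()
            .setKeepAliveStrategy((response, context) -> -1)
            .evictIdleConnections(50000, TimeUnit.MILLISECONDS)
            .setDefaultRequestConfig(requestConfigBuilder.build())
            .setRetryHandler(new SolrHttpRequestRetryHandler(3))
            .disableContentCompression()
            // This is the call that makes http.proxyHost / http.proxyPort take effect.
            .useSystemProperties()
            .setConnectionManager(cm);

    return httpClientBuilder.build();
  }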

You can then create an HttpSolrClient with

new HttpSolrClient.Builder(url).withHttpClient(httpClient).build()

Note that you should close the HttpClient yourself after calling
HttpSolrClient#close(), because externally created HttpClient instances
are not closed automatically by HttpSolrClient.
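
One way to get that close order without writing it by hand is try-with-resources, which closes resources in the reverse order of their declaration. The following is only a sketch with stand-in classes (StandInHttpClient and StandInSolrClient are hypothetical and exist purely so the example runs without the SolrJ jars). With the real classes, the HttpClient would be declared first and the HttpSolrClient second.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderDemo {

    // Records the order in which close() is called.
    static final List<String> closed = new ArrayList<>();

    // Stand-in for the externally created HttpClient (hypothetical class).
    static class StandInHttpClient implements AutoCloseable {
        @Override
        public void close() {
            closed.add("httpClient");
        }
    }

    // Stand-in for a Solr client built on top of that HttpClient
    // (hypothetical class); like HttpSolrClient with an external
    // HttpClient, its close() does not close the wrapped client.
    static class StandInSolrClient implements AutoCloseable {
        private final StandInHttpClient httpClient;

        StandInSolrClient(StandInHttpClient httpClient) {
            this.httpClient = httpClient;
        }

        @Override
        public void close() {
            closed.add("solrClient");
        }
    }

    static List<String> run() {
        closed.clear();
        // Declared first, closed last: the HTTP client outlives the Solr client.
        try (StandInHttpClient httpClient = new StandInHttpClient();
             StandInSolrClient solrClient = new StandInSolrClient(httpClient)) {
            // Queries against solrClient would go here.
        }
        return new ArrayList<>(closed);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```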

Cheers,
Andreas