Discussion:
Soft commit and new replica types
Vadim Ivanov
2018-12-08 10:42:45 UTC
Before 7.x, all replicas in SolrCloud were of the NRT type,
and the following rules applied:
https://stackoverflow.com/questions/45998804/when-should-we-apply-hard-commit-and-soft-commit-in-solr
and
https://lucene.apache.org/solr/guide/7_5/updatehandlers-in-solrconfig.html#commit-and-softcommit

But the new TLOG and PULL replica types muddle those explanations.
From the Ref Guide we have:
"NRT is the only type of replica that supports soft-commits..."
"If TLOG replica does become a leader, it will behave the same as if it was a NRT type of replica."
Does it mean that, if we do not have NRT replicas in the cluster, the
autoSoftCommit section in solrconfig.xml is ignored completely (even on a TLOG leader)?

<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>

Should we say that in the autoCommit section openSearcher is always true in that case?

<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>30000</maxTime>
  <maxSize>512m</maxSize>
  <openSearcher>false</openSearcher>
</autoCommit>

Does it mean that a new searcher always starts on all replicas when a hard commit happens on the leader?
A few words in the Ref Guide about the new replica types in the #commit-and-softcommit section would be useful.
--
Vadim
Edward Ribeiro
2018-12-08 21:42:14 UTC
Some insights in the new replica types below:

On Sat, December 8, 2018 08:42, Vadim Ivanov wrote:
Post by Vadim Ivanov
"NRT is the only type of replica that supports soft-commits..."
"If TLOG replica does become a leader, it will behave the same as if it
was a NRT type of replica."
Does it mean that, if we do not have NRT replicas in the cluster, the
autoSoftCommit section in solrconfig.xml is ignored completely (even on a TLOG leader)?

No, not completely. Both TLOG and PULL nodes periodically poll the
leader for changes in the index segment files and download those segments
from the leader. If the hard commit max time is defined in solrconfig.xml,
the polling interval of each replica is half that value. Otherwise, if the
soft commit max time is defined, the replicas use half the soft commit
max time as the interval. If neither is defined, the poll interval is
3 seconds (hard-coded). See here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
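
As a rough illustration of that rule, here is how it would play out for the values from the original post. This is only a sketch based on the description above; the comments are my reading of it, and maxDocs/maxSize are left out for brevity.

```xml
<!-- Illustrative solrconfig.xml fragment using the values from the original post. -->
<autoCommit>
  <!-- Hard commit max time is defined, so TLOG/PULL followers would poll
       the leader at half of it: 30000 / 2 = 15000 ms. -->
  <maxTime>30000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <!-- Under the rule above this value would only drive the poll interval
       (60000 / 2 = 30000 ms) if autoCommit maxTime were not defined. -->
  <maxTime>60000</maxTime>
</autoSoftCommit>
```

With neither maxTime defined, the followers would fall back to the hard-coded 3 second poll.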

If a TLOG replica is the leader, it indexes locally and appends the doc to
its transaction log, as an NRT node would, and it also synchronously
replicates the data to the other TLOG replicas' transaction logs (PULL nodes
don't have transaction logs). But TLOG/PULL replicas don't support soft
commits or real-time gets, afaik.
Post by Vadim Ivanov
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>
Should we say that in the autoCommit section openSearcher is always true in that case?
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>30000</maxTime>
  <maxSize>512m</maxSize>
  <openSearcher>false</openSearcher>
</autoCommit>

Does it mean that new Searcher always starts on all replicas when hard
commit happens on leader?


Nope. Or at least, the searcher is not created synchronously. Each
non-leader replica periodically fetches the index changes from the leader
and opens a new searcher to reflect those changes, as seen here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
But it's important to note the potential delay between the leader's
hard commit and the other replicas fetching those changes from the leader
and opening a new searcher to reflect them.

PS: I am still digging into these new replica types, so I may have
misunderstood or missed some aspect of them.

Regards,
Edward
Erick Erickson
2018-12-10 16:40:39 UTC
bq. but not every poll attempt they fetch new segment from the leader

Ah, right. Ignore my comment. Commit will only occur on the followers
when there are new segments to pull down, so you're right, roughly
every second poll would find things to bring down and open a
new searcher...
Post by Edward Ribeiro
Hi Vadim,
There is no commit on TLOG/PULL follower replicas, only on the leader.
Followers fetch the segments and **reload the core** every 150 seconds (if
there were new segments, I suppose). Yeah, followers don't pay the CPU
price of indexing, but there are still cache invalidation, autowarming,
etc., in addition to network and IO demand. Is that right, Erick?
Besides that, Erick is pointing out that under a heavy indexing workload
you could end up with:
1. Very large transaction logs;
2. Very large numbers of segments. If that is the case, you could have the
following sequence:
2.1. follower replicas download segments A and B from the leader;
2.2. leader merges segments A + B into C;
2.3. follower replicas discard A and B and download C on the next poll.
Under the second condition, followers needlessly downloaded segments that
would eventually be merged.
IMO, you should carefully evaluate if the use of TLOG/PULL is really
recommended for your cluster setup, plus indexing and querying workload.
You can very much stay with a NRT setup if it suits you better. The videos
below provide a nice set of hints for when to choose between NRT or some
combination of TLOG and PULL.
http://youtu.be/XIb8X3MwVKc
http://youtu.be/dkWy2ykzAv0
http://youtu.be/XqfTjd9KDWU

Regards,
Edward
If the hard commit max time is 300 sec, then a commit happens every 300 sec
on the TLOG leader, and new segments pop up on the leader every 300 sec
during indexing. The polling interval on the other replicas is 150 sec, but
not every poll attempt fetches a new segment from the leader, afaiu. Erick,
do you mean that on all the other TLOG replicas (non-leaders) a commit occurs
on every poll?
Sunday, 09 December 2018, 19:21 +03:00 from Erick Erickson:
Not quite, 600000. The polling interval is half the commit interval....
This has always bothered me a little bit, I wonder at the utility of a
config param. We already have old-style replication with a
configurable polling interval. Under very heavy indexing loads, it
seems to me that either the tlogs will grow quite large or we'll be
pulling a lot of unnecessary segments across the wire, segments
that'll soon be merged away and the merged segment re-pulled.
Apparently, though, nobody's seen this "in the wild", so it's
theoretical at this point.
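
To make the 600000 figure concrete, here is a minimal sketch of the setting it would imply. It assumes the value under discussion is the autoCommit maxTime and that openSearcher stays false as in the original post; neither assumption is spelled out in the message above.

```xml
<!-- Illustrative only: assumes the figure refers to autoCommit maxTime. -->
<autoCommit>
  <!-- Hard commit on the leader every 600000 ms (10 min); followers would
       then poll at half of that, i.e. every 300000 ms (5 min). -->
  <maxTime>600000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```

The trade-off raised above still applies: under heavy indexing, a longer hard commit interval means larger tlogs.
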
On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
Thanks for the clues, Edward.
What bothers me is the newSearcher start, warming, cache clearing... all that
CPU-consuming stuff in my heavy-indexing scenario.
With NRT I had autoSoftCommit: 300000.
So I had a new searcher no more often than every 5 min on every replica.
To have more or less the same effect with a TLOG/PULL collection,
I suppose I have to have: 300000
(yes, I understand that new searchers start asynchronously on the leader and
the replicas)
Am I right?
--
Vadim
Tomás Fernández Löbbe
2018-12-10 23:14:52 UTC
I think this is a good point. The tricky part is that if TLOG replicas
don't replicate often, their transaction logs will get too big too, so you
want the replication interval of TLOG replicas to be tied to the
auto(hard)Commit interval (by default at least). If you are using them for
search, you may also not want to open a searcher for each fetch... for PULL
replicas, maybe the best way is to use the autoSoftCommit interval to
define the polling interval. That said, I'm not sure using different
configurations is a good idea, some people may be mixing TLOG and PULL and
querying them both alike.

In the meantime, if you have different hosts for TLOG and PULL replicas,
one workaround you can have is to define the autoCommit time with a system
property, and use different properties for TLOGs vs PULL nodes.
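
A minimal sketch of that workaround, assuming solrconfig.xml property substitution with a default value. The property name mirrors the one used in Solr's default configset; the 15000 fallback and the values in the note below are placeholders, not recommendations.

```xml
<!-- Sketch: the commit interval comes from a system property, so each host
     can supply its own value at startup. -->
<autoCommit>
  <!-- Uses the solr.autoCommit.maxTime system property if set, else 15000 ms. -->
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```

Hosts carrying TLOG replicas could then be started with, say, -Dsolr.autoCommit.maxTime=60000 and hosts carrying PULL replicas with -Dsolr.autoCommit.maxTime=600000, so each group polls at half of its own value.
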
Post by Edward Ribeiro
There is no commit on TLOG/PULL follower replicas, only on the leader.
Followers fetch the segments and **reload the core** every 150 seconds
Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches. Are
you seeing reloads?