tstusr
2017-07-17 22:26:57 UTC
Hi
We want to use a copy field as a source for another copy field or some kind
of post processing of a field.
Here is the problem. We have a field that is populated from text content
through a copy field, like this:
<copyField source="attr_content*" dest="species"/>
After analysis, this field contains only the words listed in species.txt.
<field name="species" type="species_type" stored="true" indexed="true"
termVectors="true" termPositions="true" termOffsets="true"/>
<fieldType name="species_type" class="solr.TextField"
positionIncrementGap="0">
<analyzer type="index">
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping/mapping-ISOLatin1Accent.txt"/>
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="[0-9]+|(\-)(\s*)" replacement=""/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="3"
outputUnigrams="true"/>
<filter class="solr.KeepWordFilterFactory" words="species.txt"
ignoreCase="true"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="3"
outputUnigrams="false"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
What we want now is to facet on a post-processed version of this field,
using it as the source for another field:
<copyField source="species" dest="genus"/>
<fieldType name="genus_type" class="solr.TextField"
positionIncrementGap="0">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeepWordFilterFactory" words="genus.txt"
ignoreCase="true"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
As far as I understand, genus has no value because copy fields are not
chained. However, we have also not found a way to run two processing steps:
first capture the words for species, and then run a second capture for the
genus.
As an example, imagine that species contains:
abies durangensis
abies flinckii
After post-processing, we expect genus to contain only:
abies
which is a word in genus.txt.
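For illustration only, here is a minimal sketch of one possible arrangement.
It assumes that copyField copies the raw, unanalyzed source value and that
copies are not chained, so genus would be fed from attr_content* directly
instead of from species. The genus field definition (including its
attributes) is an assumption and is not part of the schema shown above.

```xml
<!-- Hypothetical field definition: the attributes are assumptions.
     multiValued is set because the source glob may match several fields. -->
<field name="genus" type="genus_type" indexed="true" stored="false"
multiValued="true"/>

<!-- Feed genus from the same raw source as species, since a copyField
     destination is assumed not to act as the source of another copyField. -->
<copyField source="attr_content*" dest="species"/>
<copyField source="attr_content*" dest="genus"/>
```

With this arrangement genus would keep every word from genus.txt that appears
in the text, whether or not it belongs to a matched species, so it only
approximates the two-step processing described above.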
I tried to be as clear as possible about the problem, but there may be some
gaps in the explanation.
Hope you can help me.
--
View this message in context: http://lucene.472066.n3.nabble.com/Copy-field-a-source-of-copy-field-tp4346425.html
Sent from the Solr - User mailing list archive at Nabble.com.