David Hastings
2018-12-07 14:18:12 UTC
Hey there, I have a field type defined as such:
<fieldType name="skw2" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ManagedStopFilterFactory" managed="english"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2"
            outputUnigrams="false" fillerToken="" maxShingleSize="2"/>
  </analyzer>
</fieldType>
but what's happening is that the shingles being returned are often
" nonstopword", with the leading space marking where the filler token
was inserted. I was hoping that the ManagedStopFilterFactory would remove
the stop words completely before the tokens reach the shingle filter, so
that an indexed value of "nonstopword1 stopword1 stopword2 nonstopword2"
would return the shingle "nonstopword1 nonstopword2", but obviously that
isn't the case. Is there a way to force it to behave that way?
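To make the behavior concrete, here is a rough Python sketch of what I believe is going on. It is not Solr or Lucene code, just a simplified model: the function names, the tiny stop word set, and the sample phrase are all made up for illustration. The assumption is that the stop filter drops the stop words but keeps the position of each surviving token, and the shingle filter then pads each position gap with the (empty) filler token.

```python
# Hypothetical stand-in for the managed "english" stop word set.
STOP_WORDS = {"the", "of", "a"}


def stop_filter(tokens):
    """Drop stop words but keep each surviving token's original position."""
    return [(pos, tok) for pos, tok in enumerate(tokens) if tok not in STOP_WORDS]


def shingle_filter(positioned, size=2, filler=""):
    """Build fixed-size shingles, padding position gaps with the filler token."""
    padded = []
    prev_pos = -1
    for pos, tok in positioned:
        gap = pos - prev_pos - 1
        # Assumption in this model: at most size - 1 fillers are inserted per gap.
        padded.extend([filler] * min(gap, size - 1))
        padded.append(tok)
        prev_pos = pos
    return [" ".join(padded[i:i + size]) for i in range(len(padded) - size + 1)]


tokens = "king of the hill".split()

# What I am seeing: the gap left by the stop words is filled with "".
observed = shingle_filter(stop_filter(tokens))

# What I was hoping for: positions collapsed, so no gap and no filler.
kept = [tok for tok in tokens if tok not in STOP_WORDS]
wanted = shingle_filter(list(enumerate(kept)))

print(observed)
print(wanted)
```

In this model the first list contains "king " and " hill", while the second contains only "king hill", which is the output I am after.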
Thanks, David