Discussion:
Delete field in Solr
Gurfan
2014-04-14 11:50:23 UTC
Hi,

We have a SolrCloud 4.6 setup. The field's stored value is true.
Now I want to delete a field from the indexed documents. Is there any way
to delete the field?
The field we are trying to delete (extracted from schema.xml):

<field name="SField2" type="string" indexed="true" stored="true"
omitNorms="false" termVectors="false" />

We commented out this field (SField2) entry in schema.xml and
reloaded/optimized the index from the Solr admin UI.

Then we committed the Solr index:
curl http://<<IP>>:8983/solr/update/json?commit=true

We ran the same query again, but the removed field (SField2) still
shows up in the query result.

We followed the links below for this requirement:

https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
https://wiki.apache.org/solr/UpdateJSON


We also tried other commands, first to delete the document by ID:

1> For Deletion:

curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
{
"delete":{
"id":"c7d30e6850c54429b888794f7433e3c5"
}
}
]'

Output: {"responseHeader":{"status":400,"QTime":0},"error":{"msg":"Document is missing mandatory uniqueKey field: id","code":400}}


2> To set an existing indexed field to null:

curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
{"id" : "c7d30e6850c54429b888794f7433e3c5","SField2":{"set":null} }
]'

Output:
{"responseHeader":{"status":500,"QTime":0},"error":{"msg":"For input string: \"8888888888\"","trace":"java.lang.NumberFormatException: For input string: \"8888888888\"\n\tat java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)


Please help!





--
View this message in context: http://lucene.472066.n3.nabble.com/Delete-field-in-Solr-tp4131003.html
Sent from the Solr - User mailing list archive at Nabble.com.
Shawn Heisey
2014-04-14 13:43:01 UTC
Post by Gurfan
We have a setup of SolrCloud 4.6. The fields Stored value is true.
Now I want to delete a field from indexed document. Is there any way from
which we can delete the field??
<field name="SField2" type="string" indexed="true" stored="true"
omitNorms="false" termVectors="false" />
We comment out this field(SField2) entry from schema.xml and reload/optimize
index from solr admin UI.
Commit the solr index.
curl http://<<IP>>:8983/solr/update/json?commit=true
Again fired query for the same but the removed field(SField2) is back
showing in query result.
I would guess based on your experience that the schema is only used at
those moments that field analysis might be required. This means it is
probably only used for two things: 1) Figuring out how to index a field
when you do an update request, and 2) deciding whether a search field is
allowed in a query and how to analyze the terms before the query is
processed.

When results are being determined for the response, it sounds like the
schema is NOT consulted -- I think the code simply reads what's in the
Lucene index and applies the "fl" parameter to decide which fields are
returned. The index doesn't change when you change the schema -- Lucene
does not use a schema. That's something Solr brings to the table.

You'll need to reindex to remove this field from every document that
currently contains it.

http://wiki.apache.org/solr/HowToReindex
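
A minimal sketch of what "reindex without the field" could look like if the documents are rebuilt from their stored values. This is only an illustration under assumptions: it presumes every field you care about is stored (otherwise reindex from the original data source), the `title` field and the `_version_` value are made up, and the helper names are hypothetical. Dropping `_version_` before re-adding is a precaution assumed here, not something the thread states.

```python
import json

def strip_field(docs, field, internal=("_version_",)):
    """Return copies of docs without `field` and without internal bookkeeping fields."""
    drop = {field, *internal}
    return [{k: v for k, v in doc.items() if k not in drop} for doc in docs]

def build_add_body(docs):
    """Serialize documents as a JSON array, the shape used for adding documents."""
    return json.dumps(docs)

# Hypothetical documents, as they might come back from a query returning all stored fields.
stored = [
    {"id": "c7d30e6850c54429b888794f7433e3c5", "SField2": "old", "title": "a", "_version_": 1465},
]

cleaned = strip_field(stored, "SField2")
body = build_add_body(cleaned)
print(body)  # request body to POST to the update handler, followed by a commit
```

The originals are left untouched, so the cleaned copies can be inspected before anything is sent back to Solr.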
Post by Gurfan
curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d
'
[
{
"delete":{
"id":"c7d30e6850c54429b888794f7433e3c5"
}
}
]'
Output: {"responseHeader":{"status":400,"QTime":0},"error":{"msg":"Document
is missing mandatory uniqueKey field: id","code":400}}
I can tell you that the "id" string in the above is *not* a field name.
It refers to entries in the uniqueKey field, which your response seems
to indicate actually is called "id".

One difference that I noted is that you have the entire command
surrounded by square brackets. The "multiple commands" section on the
wiki shows curly braces. I would recommend removing the square brackets
entirely -- you already have the curly braces.
Post by Gurfan
curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d
'
[
{"id" : "c7d30e6850c54429b888794f7433e3c5","SField2":{"set":null} }
]'
\"8888888888\"\n\tat
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
The value there is null ... and it's not surrounded by quotes. I don't
know JSON very well, but I am pretty sure that this means JSON thinks
it's a number, and that string cannot be parsed as a number. I don't
think JSON understands null as a concept.

Even if the atomic update request were to pass JSON parsing, the field
no longer exists, so the index request would most likely fail.

If your search code is not able to simply ignore a field in the response
that it isn't using, then you'll have to reindex. Alternatively, you
could use the fl parameter to limit the results to only certain fields.
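
As a rough illustration of the fl approach, the sketch below builds a select URL that asks only for named fields, so SField2 is never returned even though it is still in the index. The base URL, the `select` path, the `wt=json` choice and the `title` field are assumptions for the example, and `select_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

def select_url(base, query, fields):
    """Build a query URL whose fl parameter lists only the wanted fields."""
    params = {"q": query, "fl": ",".join(fields), "wt": "json"}
    return f"{base}/select?{urlencode(params)}"

url = select_url(
    "http://localhost:8983/solr",            # assumed host and path
    "id:c7d30e6850c54429b888794f7433e3c5",
    ["id", "title"],                          # "title" is a made-up field name
)
print(url)
```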

Thanks,
Shawn
Shawn Heisey
2014-04-14 14:09:28 UTC
Post by Shawn Heisey
When results are being determined for the response, it sounds like the
schema is NOT consulted -- I think the code simply reads what's in the
Lucene index and applies the "fl" parameter to decide which fields are
returned. The index doesn't change when you change the schema -- Lucene
does not use a schema. That's something Solr brings to the table.
It could be argued that this is a bug. It may not be a bug that's easy
to fix for all response writers.

It's still worth an issue in Jira. If it turns out that such an issue
already exists, then it will be marked as a duplicate and closed ...
which means that either way, it will receive some attention.

I haven't looked into the code very far, but I did note that the schema
is already available in BinaryResponseWriter. It does not seem to be
available in the XML or JSON response writers, though. The fix may not
be easy.

Thanks,
Shawn
Chris Hostetter
2014-04-14 18:45:09 UTC
: we tried another commands to delete the document ID:
:
: 1> For Deletion:
:
: curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d
: '
: [

Your use of square brackets here is triggering the syntax sugar that
lets you add documents as objects w/o needing the "add" keyword. Solr
thinks you are trying to add a document containing one field named
"delete"


just send something like this as your entire HTTP request body...

{ "delete": "c7d30e6850c54429b888794f7433e3c5" }

https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-JSONFormattedIndexUpdates
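
To make the difference concrete, here is a small sketch that builds both request bodies and shows that the only structural change is the top-level JSON type: an object for a command, versus an array, which is read as a list of documents to add. The function name is hypothetical.

```python
import json

def delete_by_id_body(doc_id):
    """Build a delete command: a top-level JSON object, not an array."""
    return json.dumps({"delete": doc_id})

doc_id = "c7d30e6850c54429b888794f7433e3c5"

good = delete_by_id_body(doc_id)
# The shape from the original attempt: a top-level array wrapping the command.
bad = json.dumps([{"delete": {"id": doc_id}}])

print(good)
print(type(json.loads(good)).__name__)  # dict: treated as a command
print(type(json.loads(bad)).__name__)   # list: treated as documents to add
```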


-Hoss
http://www.lucidworks.com/
Gurfan
2014-04-17 15:08:17 UTC
Thanks for the quick response. Delete works fine as suggested.

We are still facing an issue with the REST command for updating a document:

Field description in schema.xml:

curl http://192.168.213.52:8983/solr/update -H 'Content-type:application/json' -d '
[
{"id": "969dcba7c3ec49cf9d61017afe8a2768","SField2":{"set":"Bahama"}}
]'



{"responseHeader":{"status":500,"QTime":2},"error":{"msg":"For input string: \"8888888888\"","trace":"java.lang.NumberFormatException: For input string: \"8888888888\"\n\tat java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

Server log:

2069175 [qtp17275133-18] INFO org.apache.solr.update.processor.LogUpdateProcessor – [collection1] webapp=/solr path=/update params={} {} 0 2
2069176 [qtp17275133-18] ERROR org.apache.solr.core.SolrCore – java.lang.NumberFormatException: For input string: "8888888888"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:495)
at java.lang.Integer.valueOf(Integer.java:582)

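The trace ends in Integer.parseInt, and 8888888888 is larger than the largest value a 32-bit signed integer can hold, so the parse fails regardless of what is being set on SField2. One plausible reading, which the thread does not confirm, is that some other integer-typed field in the stored document holds this value and is re-parsed when the document is rebuilt for the update. The sketch below only checks the two facts involved: the range limit, and that `null` is a valid JSON literal.

```python
import json

INT_MIN = -2**31
INT_MAX = 2**31 - 1  # 2147483647, the largest value java.lang.Integer can hold

def fits_java_int(text):
    """Return True if the decimal string fits in a 32-bit signed integer."""
    return INT_MIN <= int(text) <= INT_MAX

print(fits_java_int("8888888888"))   # False
print(fits_java_int("2147483647"))   # True
print(json.loads('{"set": null}'))   # {'set': None}
```

If that reading is right, the field holding 8888888888 would need a long type (and a reindex) rather than any change to the update request itself.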

On the other hand, we are able to update from the
UI (http://<<IP>>:8983/solr/#/collection1/documents). Please find the
attached screenshot.

For example, when we update a field value in a document from the Solr UI
screen, as shown in the attached screenshot, we are able to update the value
of SField2 to Bahama. But when we query with the same id, we see only the
latest value of SField2 in the response, even though the other fields are
defined in the schema as stored=true.

We were expecting the query with the same id to return all stored
fields, including SField2 with its latest value.

Please let us know if we are missing something here.
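
One possible explanation, not confirmed anywhere in the thread: if the document submitted from the UI contained only id and SField2 as plain values, Solr would treat it as a complete new document and overwrite the existing one with the same id, leaving just those fields. An atomic update instead wraps the new value in a modifier such as "set". The sketch below contrasts the two payload shapes; both helper names are hypothetical.

```python
import json

def full_replace_doc(doc_id, field, value):
    """A plain document: adding it overwrites the existing document with this id."""
    return {"id": doc_id, field: value}

def atomic_set_doc(doc_id, field, value):
    """An atomic update: the "set" modifier targets a single field."""
    return {"id": doc_id, field: {"set": value}}

doc_id = "969dcba7c3ec49cf9d61017afe8a2768"

replace_body = json.dumps([full_replace_doc(doc_id, "SField2", "Bahama")])
atomic_body = json.dumps([atomic_set_doc(doc_id, "SField2", "Bahama")])

print(replace_body)  # other fields of the old document would be lost
print(atomic_body)   # other stored fields are meant to be carried over
```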

Thanks,
--Gurfan












Gurfan
2014-04-17 16:01:16 UTC
Sorry, I forgot to include the field entry from schema.xml:

<field name="SField2" type="string" indexed="true" stored="true"
omitNorms="false" termVectors="false" />

Thanks.



