solr

geospatial – What SQL datatype should be used to populate a Solr location (spatial) field when using a DataImportHandler? – Stack Overflow

I have a Solr schema that contains a location field (using the default solr.LatLonType):

<field name="latlng" type="location" indexed="true" stored="true"/>

I am trying to populate it using a DataImportHandler. Currently I SELECT the value as nvarchar in the format 17.74628,-64.70725; however, the Solr field is not populated (it remains empty).

What type and format should this column be in to update the location field in Solr?

Answer:

solr.LatLonType is a multi-dimensional type. You can define the field type as:

<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>

Using your field name of “latlng”, the schema for the coordinate fields will look like this (note how the “subFieldSuffix” is applied for the two-dimensional field type solr.LatLonType):

<field name="latlng" type="location" indexed="true" stored="true" />
<field name="latlng_0_coordinate" type="double" indexed="true" stored="true" />
<field name="latlng_1_coordinate" type="double" indexed="true" stored="true" />

“latlng_0_coordinate” should be the latitude and “latlng_1_coordinate” should be the longitude. Your select statement should load “latlng_0_coordinate” and “latlng_1_coordinate” as doubles.
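
As a rough sketch of what that could look like on the DataImportHandler side, the data-config.xml below maps two floating-point columns onto the two coordinate fields. The table name "places", its column names, and the JDBC driver, URL and credentials are all placeholders that do not come from the question; substitute your own.

```xml
<dataConfig>
  <!-- Placeholder connection details: driver, url, user and password are assumptions -->
  <dataSource type="JdbcDataSource"
              driver="com.example.jdbc.Driver"
              url="jdbc:example://localhost/geo"
              user="solr_reader"
              password="changeme"/>
  <document>
    <!-- Hypothetical table "places" whose latitude/longitude columns are floating-point -->
    <entity name="place"
            query="SELECT id, latitude, longitude FROM places">
      <field column="id" name="id"/>
      <!-- Sub-field 0 holds the latitude, sub-field 1 the longitude -->
      <field column="latitude" name="latlng_0_coordinate"/>
      <field column="longitude" name="latlng_1_coordinate"/>
    </entity>
  </document>
</dataConfig>
```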

The previous answer works because you’re manually creating the fields that Solr uses to store the latitude and longitude individually; however, there is already a dynamic field for that purpose:

<!-- Type used to index the lat and lon components for the "location" FieldType -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false" />

If you check the location field type, you will find that it uses the suffix _coordinate for its values:

<!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>

That works for me in Solr 4 beta, and I believe it has been present since Solr 3.6 or even earlier. Anyway, just another solution!

[solr] Indexed and Stored field configuration

In a nutshell – an indexed field is searchable, and a stored field has its content stored in the index so it is retrievable. Here are some examples that will hopefully give you a feel for how to set the indexed and stored options: 

indexed="true" stored="true"
Use this for information you want to search on and also display in search results – for example, book title or author. 
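
For instance, a minimal sketch of the book example. The field names title and author, and the field type name text, are assumptions standing in for whatever your schema defines:

```xml
<!-- Searchable and returned with results -->
<field name="title" type="text" indexed="true" stored="true"/>
<field name="author" type="text" indexed="true" stored="true"/>
```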

indexed="false" stored="true"
Use this for fields that you want displayed with search results but that don’t need to be searchable – for example, destination URL, file system path, time stamp, or icon image. 
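
A sketch of display-only fields along these lines might look as follows. The field names url and filePath and the type name string are hypothetical:

```xml
<!-- Returned with results, but not searchable -->
<field name="url" type="string" indexed="false" stored="true"/>
<field name="filePath" type="string" indexed="false" stored="true"/>
```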

indexed="true" stored="false"
Use this for fields you want to search on but whose values you don’t need in search results. Here are some common reasons you would want this:

Large fields and a database: Storing a field makes your index larger, so set stored to false when possible, especially for big fields. For this case a database is often used, as the previous responder said. Use a separate identifier field to get the field’s content from the database. 

Ordering results: Say you define <field name="bookName" type="text" indexed="true" stored="true"/>, which is tokenized and used for searching. If you want to sort results by book name, you could copy the field into a separate nonretrievable, nontokenized field that is used just for sorting:
<field name="bookSort" type="string" indexed="true" stored="false"/>
<copyField source="bookName" dest="bookSort"/>

Easier searching: If you define the field <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>, you can use it as a catch-all field that contains all of the other text fields. Since Solr looks in a default field when given a text query without field names, you can support this type of general phrase query by making the catch-all the default field.
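
A sketch of the catch-all setup, assuming two hypothetical source fields, title and author, already exist in the schema. The defaultSearchField element belongs to schema.xml files of the Solr 3.x/4.x era; it was deprecated in Solr 3.6 and later releases prefer the df request parameter, so treat this as illustrative rather than definitive:

```xml
<!-- Catch-all field: searchable only, never returned -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<!-- Copy each searchable text field into the catch-all (source names are assumptions) -->
<copyField source="title" dest="text"/>
<copyField source="author" dest="text"/>

<!-- Queries without an explicit field name are run against the catch-all -->
<defaultSearchField>text</defaultSearchField>
```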

indexed="false" stored="false"
Use this when you want to ignore fields. For example, the following will ignore unknown fields that don’t match a defined field, rather than throwing an error (the default behavior):
<fieldType name="ignored" class="solr.StrField" stored="false" indexed="false"/>
<dynamicField name="*" type="ignored"/>



Here is a summary of available options on a field, broken down by use case. A true or false indicates that the option must be set to the given value for the use case to function correctly.
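use case                                     | indexed | stored  | multiValued | omitNorms | termVectors | termPositions | termOffsets
---------------------------------------------+---------+---------+-------------+-----------+-------------+---------------+------------
search within field                          | true    |         |             |           |             |               |
retrieve contents                            |         | true    |             |           |             |               |
use as unique key                            | true    |         | false       |           |             |               |
sort on field                                | true    |         | false       | true [1]  |             |               |
use field boosts                             |         |         |             | false     |             |               |
document boosts affect searches within field |         |         |             | false     |             |               |
highlighting                                 | true[3] | true    |             |           | [2]         | [2]           | [2]
faceting                                     | true    |         |             |           |             |               |
add multiple values, maintaining order       |         |         | true        |           |             |               |
field length affects doc score               |         |         |             | false     |             |               |
                                             |         | true[5] |             |           | true[5]     |               |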

term frequency[4]
true

document frequency[4]
true

tf*idf[4]
true

term positions[4]
true
true
true

term offsets[4]
true
true
true
Notes:
  1. recommended but not necessary
  2. stored must always be true for highlighting. If you also enable both termVectors and termOffsets, highlighting can run faster. (Without termVectors/termOffsets, Solr needs to reanalyze the whole field to perform highlighting.) If you additionally enable termPositions, a further speedup may be possible. Note that you must index the field in order to use termVectors, termOffsets and termPositions.
  3. a tokenizer must be defined for the field, but it doesn’t need to be indexed
  4. For use with the TermVectorComponent
  5. Uses the term vector if present, otherwise the stored field. Reanalyzes the document if using the stored field.
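
As an illustration of note 2, a field set up for faster highlighting might be declared like this. The field name content and the type name text are assumptions:

```xml
<!-- stored is required for highlighting; the term vector options avoid reanalyzing the field -->
<field name="content" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```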

For further considerations for faceting, see also SolrFacetingOverview. For more information on term frequency, positions, offsets etc. see TermVectorComponent.