dbms

[solr] Indexed and Stored field configuration

Posted on

In a nutshell – an indexed field is searchable, and a stored field has its content stored in the index so it is retrievable. Here are some examples that will hopefully give you a feel for how to set the indexed and stored options: 

indexed=”true” stored=”true” 
Use this for information you want to search on and also display in search results – for example, book title or author. 

indexed=”false” stored=”true” 
Use this for fields that you want displayed with search results but that don’t need to be searchable – for example, destination URL, file system path, time stamp, or icon image. 

indexed=”true” stored=”false” 
Use this for fields you want to search on but don’t need to get their values in search results. Here are some of the common reasons you would want this: 

Large fields and a database: Storing a field makes your index larger, so set stored to false when possible, especially for big fields. For this case a database is often used, as the previous responder said. Use a separate identifier field to get the field’s content from the database. 

Ordering results: Say you define field name=”bookName” type=”text” indexed=”true” stored=”true” that is tokenized and used for searching. If you want to sort results based on book name, you could copy the field into a separate nonretrievable, nontokenized field that can be used just for sorting – 
field name=”bookSort” type=”string” indexed=”true” stored=”false” 
copyField source=”bookName” dest=”bookSort” 

Easier searching: If you define the field <field name=”text” type=”text” indexed=”true” stored=”false” multiValued=”true”/> you can use it as a catch-all field that contains all of the other text fields. Since solr looks in a default field when given a text query without field names, you can support this type of general phrase query by making the catch-all the default field. 

indexed=”false” stored=”false” 
Use this when you want to ignore fields. For example, the following will ignore unknown fields that don’t match a defined field rather than throwing an error by default. 
fieldtype name=”ignored” stored=”false” indexed=”false” 
dynamicField name=”*” type=”ignored” 



Here is a summary of available options on a field, broken down by use case. A true or false indicates that the option must be set to the given value for the use case to function correctly.
use case
indexed
stored
multiValued
omitNorms
termVectors
termPositions
termOffsets

search within field
true

retrieve contents
true

use as unique key
true
false

sort on field
true
false
true [1]

use field boosts
false

document boosts affect searches within field
false

highlighting
true[3]
true
[2]
[2]
[2]

faceting
true

add multiple values, maintaining order
true

field length affects doc score
false

true[5]
true[5]

term frequency[4]
true

document frequency[4]
true

tf*idf[4]
true

term postitions[4]
true
true
true

term offsets[4]
true
true
true
Notes:
  1. recommended but not necessary
  2. stored must always be true for highlighting. If you also add both termVectors and termOffsets, this can be used to boost performance. (Without termVectors/termOffsets, Solr needs to reanalyze the whole field to perform highlighting.) If you furthermore add termPositions, additional speedup may be possible. Note, that you must index the field in order to be able to use termVectors, termOffsets and termPositions.
  3. a tokenizer must be defined for the field, but it doesn’t need to be indexed
  4. For use with the TermVectorComponent
  5. Uses the term vector if present, otherwise the stored field. Reanalyzes the document if using the stored field.

For further considerations for faceting, see also SolrFacetingOverview. For more information on term frequency, positions, offsets etc. see TermVectorComponent.
Advertisements

[mysql] Show all column from all tables

Posted on

Sometime when we work on a large database structure that contains many tables, we don’t know the relationship between tables. If the architect never write any documentation, we may have difficulties to trace the relationship.

Luckily, most of the Database system allow us to query the columns and tables name in the database.
Following is the SQL script to select all tables from a MySQL database.

select * from information_schema.columns where table_schema = ‘l2gs’
 order by table_name,ordinal_position

So when we are frustrated on how  a table is related to another table, we can easily filter by the column name, hopefully the database architect create the same column name for the related table

select * from information_schema.columns where table_schema = ‘l2gs’and column_name like ‘%REASON%’ order by table_name,ordinal_position




Hope it helps all the developers who have same problem