The fundamental premise of
Solr is simple. You give it a lot of information, then later you can ask it
questions and find the piece of information you want. The part where you feed
in all the information is called indexing or updating. When you ask a question,
it's called a query.
Solr allows you to build an index with many
different fields, or types of entries.
The schema is the place where you tell Solr how
it should build indexes from input documents.
Solr's basic unit of information is a document, which
is a set of data that describes something.
In the Solr universe, documents are composed of
fields, which are more specific pieces of information.
You can tell Solr about the kind of data a field
contains by specifying its field type. The field type tells Solr how to
interpret the field and how it can be queried.
When you add a document, Solr takes the
information in the document's fields and adds that information to an index.
When you perform a query, Solr can quickly consult the index and return the
matching documents.
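The indexing and query steps described above can be sketched as two HTTP requests. This is a minimal sketch, not a definitive client: the host and port (localhost:8983), the wbcatalog core name, and the document fields are assumptions for illustration, and the code only builds the requests without sending them.

```python
import json
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed local Solr address
CORE = "wbcatalog"  # assumed core name, matching the one created in these notes


def build_update_request(docs, base=SOLR_BASE, core=CORE):
    """Return (url, body) for indexing a list of documents as JSON."""
    url = f"{base}/{core}/update?{urlencode({'commit': 'true'})}"
    body = json.dumps(docs).encode("utf-8")
    return url, body


def build_query_url(query, base=SOLR_BASE, core=CORE):
    """Return the URL that asks the core for documents matching `query`."""
    params = {"q": query, "wt": "json"}
    return f"{base}/{core}/select?{urlencode(params)}"


# Hypothetical document; the field names are illustrative only.
docs = [{"id": "book-1", "title": "Introduction to Solr"}]
update_url, update_body = build_update_request(docs)
query_url = build_query_url("title:solr")
print(update_url)
print(query_url)
```

To actually index the documents, the body would be sent as a POST with a Content-Type of application/json, for example with urllib.request; the query URL can be fetched with a plain GET.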
Fields and Schema
Field analysis tells Solr what to do with
incoming data when building an index. A more accurate name for this process
would be processing or even digestion, but the official name is analysis.
Field analysis is an important part of a field
type. The Solr Reference Guide section "Understanding Analyzers, Tokenizers, and
Filters" gives a detailed description of field analysis.
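To make the idea of analysis more concrete, here is a toy model of an analysis chain: a tokenizer splits the incoming text, then filters transform the resulting tokens. It is only a sketch of the concept. The function names and the stop-word list are invented for illustration and are not Solr's implementation.

```python
def whitespace_tokenizer(text):
    """Split the raw field value into tokens on whitespace."""
    return text.split()


def lowercase_filter(tokens):
    """Normalize case so that 'Solr' and 'solr' index to the same term."""
    return [token.lower() for token in tokens]


def stop_filter(tokens, stop_words=frozenset({"a", "an", "the", "of"})):
    """Drop very common words (a hypothetical stop-word list)."""
    return [token for token in tokens if token not in stop_words]


def analyze(text):
    """Run the toy chain: one tokenizer followed by two filters."""
    tokens = whitespace_tokenizer(text)
    tokens = lowercase_filter(tokens)
    return stop_filter(tokens)


terms = analyze("The Art of Search")
print(terms)  # ['art', 'search']
```

The terms that come out of the chain are what would end up in the index, which is why the same analysis usually has to be applied to query text as well.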
You should also have JDK 7 or above installed; Solr 5.x requires Java 7 or later.
tar -zxvf solr-5.3.0.tgz
Creating a core
solr create -c wbcatalog -d basic_configs
Advanced full-text search - Users can search for
one or more words or phrases in the content of documents, in specific fields, or
in combinations of fields, so that the results match the user's query.
Faceted Search - Users can narrow down the search
results further by applying filters on the fields (numeric, date fields, unique
fields) when they wish to drill down, which gives a categorized search.
Sort - Users can order the search results
based on the values of a field.
Pagination - Users can display the search results
in pages of fixed size.
Hit-Term Highlighting - Provides highlighting of
the search keyword in the document.
It is optimized for high-volume web traffic.
It supports rich document parsing and indexing
(PDF, Word, HTML, etc.)
Admin UI - It has a very simple and
user-friendly interface for designing and executing queries over the data.
Caching - It caches the results of filter
queries, thus delivering faster search operations.
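Several of the features listed above map onto request parameters of a Solr query. The sketch below assembles one query string that asks for faceting, sorting, pagination and highlighting at once. The field names (category, price, title) are hypothetical, and the helper function is an illustration rather than part of any Solr client library.

```python
from urllib.parse import urlencode


def build_search_params(text, facet_field, sort, page, page_size, highlight_field):
    """Return Solr request parameters for one page of a faceted search."""
    return [
        ("q", text),                              # full-text query
        ("facet", "true"),                        # turn faceting on
        ("facet.field", facet_field),             # count documents per value
        ("sort", sort),                           # e.g. "price asc"
        ("start", str((page - 1) * page_size)),   # offset of the first hit
        ("rows", str(page_size)),                 # hits per page
        ("hl", "true"),                           # turn highlighting on
        ("hl.fl", highlight_field),               # field to highlight
        ("wt", "json"),                           # response format
    ]


# Hypothetical field names, used only for illustration.
params = build_search_params(
    "title:solr", "category", "price asc",
    page=3, page_size=10, highlight_field="title",
)
query_string = urlencode(params)
print(query_string)
```

Pagination is expressed through start and rows: page 3 with 10 results per page starts at offset 20.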
Solr has a mechanism for making copies of fields
so that you can apply several distinct field types to a single piece of
incoming information.
For example, when a copyField directive copies the cat field into the text
field: if the text destination field has data of its own in the input documents,
the contents of the cat field will be added as additional values, just as if all
of the values had originally been specified by the client. Remember to configure
your fields as multiValued="true" if they will ultimately get multiple values
(either from a multivalued source or from multiple copyField directives).
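The "added as additional values" behaviour can be illustrated with a small in-memory model. This is a toy simulation of the semantics described above, not Solr code. The function name and the sample document are invented for illustration; only the cat and text field names come from the text.

```python
def apply_copy_fields(doc, copy_rules):
    """Return a new document with each (source, dest) copy rule applied.

    Every field is stored as a list of values, which mirrors why the
    destination field needs to be multivalued.
    """
    def as_list(value):
        return list(value) if isinstance(value, (list, tuple)) else [value]

    result = {name: as_list(value) for name, value in doc.items()}
    for source, dest in copy_rules:
        if source in doc:
            # Copy from the original input so that copies never chain.
            result.setdefault(dest, []).extend(as_list(doc[source]))
    return result


# Hypothetical input document: `text` already has a value of its own.
doc = {"id": "1", "cat": ["electronics", "camera"], "text": "compact zoom"}
indexed = apply_copy_fields(doc, [("cat", "text")])
print(indexed["text"])  # ['compact zoom', 'electronics', 'camera']
```

The destination ends up with three values, which is exactly the situation in which the field has to be declared multivalued.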
In Apache Solr, elements used for navigational
purposes are called facets.
Facet queries only provide information (a count of
documents) and do not change the result documents.
A facet query is, in effect, a preview of a filter query: define
a facet query to see how many documents to expect if you were to apply
the related filter query.
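The relationship between a facet query and a filter query can be seen by putting the same expression into two different request parameters. In this sketch the price field and the range are hypothetical; facet.query and fq are the standard Solr parameter names.

```python
from urllib.parse import urlencode


def preview_params(main_query, candidate_filter):
    """Ask only for a count: the result documents are not restricted."""
    return [
        ("q", main_query),
        ("facet", "true"),
        ("facet.query", candidate_filter),
        ("wt", "json"),
    ]


def filtered_params(main_query, chosen_filter):
    """Apply the same expression as a filter: the results are restricted."""
    return [
        ("q", main_query),
        ("fq", chosen_filter),
        ("wt", "json"),
    ]


# Hypothetical range over a hypothetical `price` field.
candidate = "price:[0 TO 100]"
preview = preview_params("*:*", candidate)
applied = filtered_params("*:*", candidate)
print(urlencode(preview))
print(urlencode(applied))
```

The count reported for the facet query in the first response is the number of documents the second request would return, provided the main query is the same.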