An in-memory full-text search engine library. It lets you
run full-text queries over a collection of your documents.
Features:
* Keyword queries and auto-complete/auto-suggest queries.
* Can search over any type of "document".
  (You explain how to extract search terms from them.)
* Supports documents with multiple fields
  (e.g. title, body).
* Supports documents with non-term features
  (e.g. quality score, page rank).
* Uses the state-of-the-art BM25F ranking function.
* Adjustable ranking parameters (including field weights
  and non-term feature scores).
* In-memory but quite compact. It does not keep a copy of
  your original documents.
* Quick incremental index updates, making it possible to
  keep your text search in sync with your data.
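
To give a feel for how field weights enter a BM25F-style score,
here is a minimal sketch of the textbook formulation. It is not
this library's implementation or API: the FieldStats type and the
functions weightedTf, idf and termScore are hypothetical names
made up for this illustration, and the library may differ in its
details.

```haskell
-- Illustrative BM25F sketch. All names here are hypothetical and
-- are NOT part of the library's API.

-- Per-field statistics for one term in one document.
data FieldStats = FieldStats
  { fieldWeight :: Double  -- weight of the field (e.g. title > body)
  , fieldB      :: Double  -- length-normalisation strength, in [0,1]
  , avgLength   :: Double  -- average field length in the collection (> 0)
  , docLength   :: Double  -- length of this field in this document
  , termFreq    :: Double  -- occurrences of the term in this field
  }

-- Field-weighted, length-normalised term frequency.
weightedTf :: [FieldStats] -> Double
weightedTf fs = sum [ fieldWeight f * termFreq f / norm f | f <- fs ]
  where
    norm f = 1 - fieldB f + fieldB f * docLength f / avgLength f

-- Inverse document frequency: n documents in total, df of them
-- contain the term.
idf :: Double -> Double -> Double
idf n df = log ((n - df + 0.5) / (df + 0.5))

-- Score contribution of one term; k1 controls how quickly repeated
-- occurrences saturate.
termScore :: Double -> Double -> Double -> [FieldStats] -> Double
termScore k1 n df fs = idf n df * tf / (k1 + tf)
  where
    tf = weightedTf fs
```

In this formulation a document's score is the sum of termScore
over the query terms; scores for non-term features would be added
on top of that sum.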
The library is independent of the document type, so you have
to write the document-specific parts yourself: extracting
search terms, plus any stop-word removal, case normalisation
or stemming. This is quite easy using libraries such as
tokenize and snowball.
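
As a rough idea of what the document-specific part can look
like, here is a minimal sketch that uses only base. The
stopWords list and the extractTerms function are hypothetical
names chosen for this illustration; they are not part of the
library, and stemming is left out (that is the step where a
library such as snowball would come in).

```haskell
import Data.Char (isAlphaNum, toLower)

-- A tiny, illustrative stop-word list (an assumption; a real
-- application would choose its own).
stopWords :: [String]
stopWords = ["a", "an", "and", "of", "the", "to"]

-- Turn raw text into search terms: lower-case everything, treat
-- any non-alphanumeric character as a separator, then drop stop
-- words. No stemming is performed.
extractTerms :: String -> [String]
extractTerms = filter (`notElem` stopWords) . words . map normalise
  where
    normalise c
      | isAlphaNum c = toLower c
      | otherwise    = ' '
```

For example, extractTerms "The Quick, brown fox!" yields
["quick","brown","fox"]. The same normalisation should be
applied to query terms so that they match the indexed terms.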
The source package includes a demo that illustrates how to
use the library. The demo is a simplified version of how
the library is used in hackage-server, where it provides
the backend for the package search feature.