Typeahead and autosuggest with pure Solr and Nginx

A long time ago I wrote about a very simple technique for quickly adding autosuggest to a website with the help of Solr: Incredibly fast Solr autosuggest. It used the terms function of Solr which, surprisingly enough, lets us search for terms.

This solution works well if most of your searches are single-term, or if most of your searchable content is closely related.

But what happens if your website covers a lot of different domains? Let’s say you’re selling both electronics and clothes. Weird suggestions can arise: typing ‘widescreen t’ may in some cases return ‘widescreen top’ or ‘widescreen trousers’. Is this relevant or close to what you were looking for? Probably not. Most likely it won’t even produce a result. You want the full words you have already typed to affect the suggestion, like in the image on the right.

Typeahead with relevancy match

So what we need is a typeahead that takes the previous words into account and only offers suggestions (or typeahead completions) that make sense: ones that further filter the result set and are relevant to the words already typed, like the example below.

Autosuggest relevant to the already typed words

To remedy the situation we can tune our previous query a bit without sacrificing any of the awesomeness of Solr.

We can treat the previous words (if any) as an existing search, and the last word (the one being typed at the moment) as one of the possible facets on the same full-text field. Think of it like this:

Search: widescreen

Facets on full-text terms:

  • tv (312)
  • monitor (27)
  • tablet (12)
  • dvd (3)

So all you have to do is filter the facets down to those that match the prefix you have already typed. Solr has this feature built in, via the facet.prefix parameter.
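To make the idea concrete, here is a tiny Python sketch of what facet.prefix does conceptually, using the facet counts from the example above. It is only an illustration: in practice Solr does this filtering on the server side, and the function name filter_by_prefix is made up for this example.

```python
# Facet counts for the search "widescreen", taken from the example above.
facets = {"tv": 312, "monitor": 27, "tablet": 12, "dvd": 3}


def filter_by_prefix(counts, prefix, limit=5):
    """Keep the terms that start with `prefix`, most frequent first."""
    matching = [(term, n) for term, n in counts.items() if term.startswith(prefix)]
    matching.sort(key=lambda pair: pair[1], reverse=True)
    return matching[:limit]


print(filter_by_prefix(facets, "t"))  # [('tv', 312), ('tablet', 12)]
```

Only ‘tv’ and ‘tablet’ survive the prefix ‘t’, so those are the completions that would be offered after ‘widescreen’.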

Parameter       Value       Description
q               widescreen  The query string without the last word; if only one word has been (partially) typed, this should be * (Solr’s explicit match-all query is *:*)
facet           on          Turns faceting on
facet.prefix    t           The fragment of the last word, the one being typed
facet.field     text        The name of the full-text field that you’re querying against
wt              json        Makes the output JSON, which is easier to parse
omitHeader      true        Drops the response header, which we don’t need
facet.limit     5           Limits the facets (suggestions) to 5
rows            0           Returns no documents, because we are not interested in the results at this point
facet.mincount  1           Only returns facets that would actually produce a result if searched for

All of these parameters go into the query string of a single request to Solr.
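As a rough sketch, the request could be assembled and its response unpacked like this in Python. The host, port and /solr/select path are assumptions about a local default install, and both function names are hypothetical. The sample response at the bottom is hand-written from the facet counts used earlier, not real Solr output; it relies on the fact that, by default, the JSON writer returns facet counts as a flat list that alternates term and count.

```python
from urllib.parse import urlencode

# Assumed Solr endpoint; adjust host, port and core/handler to your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_suggest_url(query, prefix, field="text", limit=5):
    """Build the faceting request described in the parameter table."""
    params = [
        ("q", query or "*:*"),  # explicit match-all when no full word exists yet
        ("facet", "on"),
        ("facet.field", field),
        ("facet.prefix", prefix),
        ("facet.limit", limit),
        ("facet.mincount", 1),
        ("rows", 0),
        ("wt", "json"),
        ("omitHeader", "true"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def parse_suggestions(response, field="text"):
    """Turn the flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hand-written sample in the shape of a Solr JSON response (not real output).
sample = {"facet_counts": {"facet_fields": {"text": ["tv", 312, "tablet", 12]}}}

print(build_suggest_url("widescreen", "t"))
print(parse_suggestions(sample))
```

Using urlencode rather than string concatenation keeps multi-word queries and special characters such as the colon in *:* properly escaped.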

Setting up Nginx rules

As before, we can set up an Nginx location to proxy our query to Solr.

Please note that we use two parameters here: one is the query (the complete words), the other is the fragment, or prefix, of the word being typed.
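How the raw input gets split into those two parameters is left to the caller. One possible way to do it, sketched in Python: everything before the last space becomes the query, the remainder becomes the prefix. The function name split_typed_input is hypothetical, and lowercasing the prefix assumes the full-text field is lowercased at index time, since facet.prefix is compared against the indexed terms as they are.

```python
MATCH_ALL = "*:*"  # used when no complete word has been typed yet


def split_typed_input(typed):
    """Split the raw input into (query, prefix).

    query  -> the completed words, or MATCH_ALL if there are none
    prefix -> the fragment of the word still being typed (may be empty)
    """
    head, _, fragment = typed.lstrip().rpartition(" ")
    query = " ".join(head.split()) or MATCH_ALL  # collapse repeated spaces
    return query, fragment.lower()


print(split_typed_input("widescreen t"))  # ('widescreen', 't')
print(split_typed_input("wid"))           # ('*:*', 'wid')
```

A trailing space yields an empty prefix, in which case the facet list is simply the most frequent terms co-occurring with the words typed so far.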


About charlesnagy

I'm, among many other things, an automation expert, database specialist, system engineer and software architect with a passion for data: searching it, analyzing it, learning from it. I learn by experimenting, and this blog is the result of these experiments and some other random thoughts I have from time to time.