Incredibly fast Solr autosuggest

Autosuggest can easily be a deal breaker in web applications. Normally users interact with the site once every 20-30 seconds, or even less frequently. That means that with 30 concurrent visitors on our site we get roughly 1 request/sec on the webserver. This amount of traffic can be handled by a small EC2 instance on AWS. But what happens if the site has an autosuggest field in the main search form?

Problem

So our visitors want to search, of course. They start typing. An average user types 33 words per minute, which means he/she can hit 2-3 keys per second. With 30 concurrent visitors that adds up to as much as 90 requests/sec, so our server must be able to handle 90(!) times more burst traffic than in the common scenario where people just browse the site.

Solution

Fortunately there are many solutions for that. Here I want to share my favourite one. I like to use Solr for searching because it is a very fast, robust, scalable and extensible search server using Lucene in the background. Most of my sites run on Nginx, so I'm going to provide nginx examples, but this can be done with any webserver.

Step #1 Configure Solr

What we’re going to do first is set up Solr to be able to serve our autosuggest values. For this we just need to add the following to solrconfig.xml:
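The original snippet is missing from the post; a minimal sketch of what it likely looked like, assuming the stock TermsComponent wired into a request handler named /terms (the name matches the /solr/terms/ URL used below):

```xml
<!-- Sketch of the missing solrconfig.xml fragment. Registers Solr's
     built-in TermsComponent and exposes it on /terms, matching the
     /solr/terms/ URL used in this post. -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <bool name="distrib">false</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```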

With this config in place, after restarting Solr we can see the autosuggest results at http://[SOLR_HOST]:8983/solr/terms/?terms.prefix=[SEARCH]&terms.fl=[FIELD_NAME]. That’s it.

Step #2 Setup Nginx to serve directly from Solr

Because a direct connection from the clients to Solr is neither secure nor recommended, we need nginx to proxy the requests from the visitors’ browsers to Solr. An extra benefit of using nginx is that we can hide the Solr URL from the user and use fancy URLs like ‘/suggest/[SEARCH]’.

We need to define an upstream server group. I did this in nginx.conf in the http section, but it can also live in a separate file which is included from there.
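The upstream block itself is missing from the post; a minimal sketch, assuming Solr runs on the same host on its default port 8983 (the name solr_backend is my own placeholder):

```nginx
# Sketch of the missing upstream block (goes in the http section of nginx.conf).
# "solr_backend" is a placeholder name; the address assumes a local Solr
# listening on the default port.
upstream solr_backend {
    server 127.0.0.1:8983;
    keepalive 16;  # reuse connections to Solr for lower latency
}
```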

The next thing is to create a location in our server section. (Usually I have sites-enabled and sites-available directories in nginx and keep every site in a separate file.)
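The location block is also missing; a sketch of what it probably looked like, assuming the upstream is named solr_backend and [FIELD_NAME] is the same field placeholder as above:

```nginx
# Sketch of the missing location block (goes in the server section).
# Captures the typed prefix from /suggest/[SEARCH] and proxies it to
# Solr's terms handler. "solr_backend" and [FIELD_NAME] are assumptions.
location ~ ^/suggest/(.+)$ {
    proxy_pass http://solr_backend/solr/terms/?terms.prefix=$1&terms.fl=[FIELD_NAME]&wt=json&omitHeader=true;
}
```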

Here I used the wt=json and omitHeader=true parameters to make the result easier to parse in javascript. Now we’re almost done. You can check your autosuggest values on your site at http://[SITE-URL]/suggest/[SEARCH].

This took at most 5 minutes and now you have a fully functional, fast as hell, native autosuggest on your site. Pretty cool, isn’t it?

Step #3 Create the JS code

Now you have to integrate this into your site. I won’t give a full example here because everyone has their own approach and style. But you should write something like this:
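The snippet itself is missing from the post; a minimal vanilla-JS sketch, assuming an input with id="search", a list with id="suggestions", the field name 'name', and the flat array format the terms component returns with wt=json (["term1", count1, "term2", count2, ...]) — the element ids and field name are my own placeholders:

```javascript
// Minimal sketch (the original snippet is not in the post).
// Assumes <input id="search"> and <ul id="suggestions"> exist, and that
// /suggest/[prefix] returns Solr's terms response in the default flat
// JSON form: {"terms": {"<field>": ["term1", count1, "term2", count2]}}.

// Turn the flat [term, count, term, count, ...] array into a list of terms.
function parseTerms(response, field) {
  const flat = (response.terms && response.terms[field]) || [];
  const suggestions = [];
  for (let i = 0; i < flat.length; i += 2) {
    suggestions.push(flat[i]);
  }
  return suggestions;
}

// Wire it to the page only when running in a browser.
if (typeof document !== 'undefined') {
  const input = document.getElementById('search');
  const list = document.getElementById('suggestions');
  input.addEventListener('input', function () {
    const prefix = input.value.trim();
    if (prefix.length < 2) { list.innerHTML = ''; return; }
    fetch('/suggest/' + encodeURIComponent(prefix))
      .then(function (res) { return res.json(); })
      .then(function (data) {
        // 'name' is a placeholder for the field configured in nginx.
        list.innerHTML = parseTerms(data, 'name')
          .map(function (t) { return '<li>' + t + '</li>'; })
          .join('');
      });
  });
}
```

In production you would probably also debounce the input events so that fast typists don't fire a request on every single keystroke.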

Enjoy!

ps.: In my test environment this autosuggest is able to serve 90+ requests/sec without any proxy caching.

About charlesnagy

I'm, among many other things, an automation expert, database specialist, system engineer and software architect with a passion for data: searching it, analyzing it, learning from it. I learn by experimenting, and this blog is the result of those experiments and some other random thoughts I have from time to time.
  • Anon

    Veeeery cool post! There’s hardly anything online about using solr, this helped me a lot especially because I’m also using nginx.

    Thanks 🙂

  • Indeed, very nice post, I was looking for exactly this. But I’m trying to find out if I can use nginx to lessen the burden on a solr replication master. The slave is really hitting it hard.