Incredibly fast Solr autosuggest

Autosuggest can easily be a deal breaker in web applications. Normally users interact with the site once every 20-30 seconds, or even less frequently. That means that with 30 concurrent visitors on our site we get roughly 1 request/sec on the webserver. This amount of traffic can be handled by a small EC2 instance on AWS. But what happens if the site has an autosuggest field in the main search form?

Problem

So our visitors want to search, of course. They start typing. An average user types 33 words per minute, which means he/she can hit 2-3 keys per second. With 30 concurrent visitors that adds up to as much as 90 requests/sec, so our server must be able to handle 90(!) times more burst traffic than in the common scenario where people just browse the site.

Solution

Fortunately there are many solutions for that. Here I want to share my favourite one. I like to use Solr for searching because it is a very fast, robust, scalable and extensible search server using Lucene in the background. Most of my sites run on Nginx, so I'm going to provide nginx examples, but this can be done with any webserver.

Step #1 Configure Solr

What we’re going to do first is set up Solr to be able to serve our autosuggest values. For this we just need to add the following to solrconfig.xml:
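The original snippet is missing from the post; a minimal sketch of what it likely looked like, assuming the stock TermsComponent wired into a request handler named /terms (the name matches the /solr/terms/ URL used below):

```xml
<!-- Sketch of the missing solrconfig.xml fragment. Registers Solr's
     built-in TermsComponent and exposes it on /terms, matching the
     /solr/terms/ URL used in this post. -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <bool name="distrib">false</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```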

With this config in place, after restarting Solr we can see the autosuggest results at http://[SOLR_HOST]:8983/solr/terms/?terms.prefix=[SEARCH]&terms.fl=[FIELD_NAME]. That’s it.

Step #2 Setup Nginx to serve directly from Solr

Because a direct connection from the clients to Solr is neither secure nor recommended, we need nginx to proxy the requests from the visitors’ browsers to Solr. An extra benefit of using nginx is that we can hide the Solr URL from the user and use fancy URLs like ‘/suggest/[SEARCH]’.

We need to define an upstream server group. I did this in nginx.conf in the http section, but it can also live in a separate file which is included from there.
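The upstream block itself is missing from the post; a minimal sketch, assuming Solr runs on the same host on its default port 8983 (the name solr_backend is my own placeholder):

```nginx
# Sketch of the missing upstream block (goes in the http section of nginx.conf).
# "solr_backend" is a placeholder name; the address assumes a local Solr
# listening on the default port.
upstream solr_backend {
    server 127.0.0.1:8983;
    keepalive 16;  # reuse connections to Solr for lower latency
}
```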

The next thing is to create a location in our server section. (Usually I have sites-enabled and sites-available directories in nginx and keep every site in a separate file.)
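The location block is also missing; a sketch of what it probably looked like, assuming the upstream is named solr_backend and [FIELD_NAME] is the same field placeholder as above:

```nginx
# Sketch of the missing location block (goes in the server section).
# Captures the typed prefix from /suggest/[SEARCH] and proxies it to
# Solr's terms handler. "solr_backend" and [FIELD_NAME] are assumptions.
location ~ ^/suggest/(.+)$ {
    proxy_pass http://solr_backend/solr/terms/?terms.prefix=$1&terms.fl=[FIELD_NAME]&wt=json&omitHeader=true;
}
```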

Here I used the wt=json and omitHeader=true parameters to make the result easier to parse in javascript. Now we’re almost done. You can check your autosuggest values on your site at http://[SITE-URL]/suggest/[SEARCH].

This took at most 5 minutes and now you have a fully functional, fast as hell, native autosuggest on your site. Pretty cool, isn’t it?

Step #3 Create the JS code

Now you have to integrate this into your site. I won’t give a full example here because everyone has their own approach and style. But you should write something like this:
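The snippet itself is missing from the post; a minimal vanilla-JS sketch, assuming an input with id="search", a list with id="suggestions", the field name 'name', and the flat array format the terms component returns with wt=json (["term1", count1, "term2", count2, ...]) — the element ids and field name are my own placeholders:

```javascript
// Minimal sketch (the original snippet is not in the post).
// Assumes <input id="search"> and <ul id="suggestions"> exist, and that
// /suggest/[prefix] returns Solr's terms response in the default flat
// JSON form: {"terms": {"<field>": ["term1", count1, "term2", count2]}}.

// Turn the flat [term, count, term, count, ...] array into a list of terms.
function parseTerms(response, field) {
  const flat = (response.terms && response.terms[field]) || [];
  const suggestions = [];
  for (let i = 0; i < flat.length; i += 2) {
    suggestions.push(flat[i]);
  }
  return suggestions;
}

// Wire it to the page only when running in a browser.
if (typeof document !== 'undefined') {
  const input = document.getElementById('search');
  const list = document.getElementById('suggestions');
  input.addEventListener('input', function () {
    const prefix = input.value.trim();
    if (prefix.length < 2) { list.innerHTML = ''; return; }
    fetch('/suggest/' + encodeURIComponent(prefix))
      .then(function (res) { return res.json(); })
      .then(function (data) {
        // 'name' is a placeholder for the field configured in nginx.
        list.innerHTML = parseTerms(data, 'name')
          .map(function (t) { return '<li>' + t + '</li>'; })
          .join('');
      });
  });
}
```

In production you would probably also debounce the input events so that fast typists don't fire a request on every single keystroke.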

Enjoy!

ps.: In my test environment this autosuggest is able to serve 90+ requests/sec without any proxy caching.

About charlesnagy

I'm, among many other things, an automation expert, database specialist, system engineer and software architect with a passion for data: searching it, analyzing it, learning from it. I learn by experimenting, and this blog is the result of those experiments and some other random thoughts I have from time to time.
  • Anon

    Veeeery cool post! There’s hardly anything online about using solr, this helped me a lot especially because I’m also using nginx.

    Thanks 🙂

  • Indeed, very nice post, I was looking for exactly this. But I’m trying to find out if I can use nginx to lessen the burden on a solr replication master. The slave is really hitting it hard.