Typeahead and autosuggest with pure Solr and Nginx

A long time ago I wrote about a very simple technique for quickly adding auto-suggest to websites with the help of Solr: Incredibly fast Solr autosuggest. It used Solr's terms function which, surprisingly, enables us to search for terms.
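As a rough illustration of what a terms-based suggestion request can look like, here is a minimal sketch that only builds the request URL. The host, core name (`mycore`) and field name (`title`) are hypothetical, and it assumes a `/terms` request handler is enabled on the core, which may not match the setup in the original post.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, prefix, limit=10):
    """Build a Solr TermsComponent URL that lists indexed terms by prefix."""
    params = {
        "terms": "true",         # enable the terms component
        "terms.fl": field,       # field whose indexed terms are listed
        "terms.prefix": prefix,  # only return terms starting with this
        "terms.limit": limit,    # maximum number of suggestions
        "wt": "json",            # response format
    }
    return f"{base_url}/terms?{urlencode(params)}"


# Hypothetical host, core and field names, for illustration only.
url = build_terms_url("http://localhost:8983/solr/mycore", "title", "sol")
print(url)
```

Since the request is a plain GET with query parameters, it is the kind of call that can sit behind a web server such as Nginx without any application code in between.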

Python MySQLdb vs mysql-connector query performance

Query times for random PK using MySQLdb and mysql-connector

There are a lot of Python drivers available for MySQL, and two stand out the most. The first, traditionally everybody's choice and a sort of industry standard, is MySQLdb. It uses a C module to link to MySQL's client library. Oracle's mysql-connector, on the other hand, is pure Python, so no MySQL libraries and

Multi variant AB testing vs Multi-Armed bandit

I've read a lot of discussions lately about different approaches to experimenting and testing. People seem to be very religious about the subject, and two cardinal questions separate the groups: frequentist vs. Bayesian approaches, and AB testing vs. multi-armed bandit solutions. Right now I'm mostly interested in the second problem
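To make the bandit side of that comparison concrete, here is a minimal sketch of one common strategy, epsilon-greedy. It is not necessarily the variant discussed in the post, and the conversion rates, round count and epsilon value below are made-up assumptions for illustration.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=42):
    """Simulate an epsilon-greedy bandit over variants with Bernoulli rewards."""
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)        # how often each variant was shown
    estimates = [0.0] * len(true_rates)  # running mean reward per variant
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick a variant uniformly at random.
            arm = rng.randrange(len(true_rates))
        else:
            # Exploit: pick the variant with the best estimate so far.
            arm = max(range(len(true_rates)), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls, estimates


# Hypothetical conversion rates for three variants.
pulls, estimates = epsilon_greedy([0.05, 0.10, 0.30])
print(pulls, [round(e, 3) for e in estimates])
```

Unlike a fixed-split AB test, the traffic share of each variant shifts during the run as the estimates change.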

5 steps to self-managing server infrastructure

Self-managing servers

Managing servers is tedious work we all have to do to some extent. But it doesn't have to fill our whole day. What if I told you that you can build a self-managing system with some discipline and effort? I went through implementing a self-managing database infrastructure of thousands of

Group by limit per group in PostgreSQL

In web applications it’s very common to limit results per group. For example, showing all the new posts with the two latest comments on each. Or listing the best-selling categories of an e-commerce website with the 3 most popular products in each of those categories.
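A minimal sketch of the "two latest comments per post" case, using a correlated subquery that counts newer rows in the same group. It runs against an in-memory SQLite database purely so the example is self-contained. The table and column names are hypothetical, and this is only one of several ways to express the query, not necessarily the approach the post settles on for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, created INTEGER, body TEXT)"
)
# Hypothetical sample data: (id, post_id, created, body).
rows = [
    (1, 1, 10, "a"), (2, 1, 20, "b"), (3, 1, 30, "c"),
    (4, 2, 15, "d"), (5, 2, 25, "e"),
    (6, 3, 5, "f"),
]
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", rows)

# Keep a comment only if fewer than 2 newer comments exist on the same post.
query = """
SELECT c.post_id, c.id
FROM comments AS c
WHERE (
    SELECT COUNT(*)
    FROM comments AS newer
    WHERE newer.post_id = c.post_id
      AND newer.created > c.created
) < 2
ORDER BY c.post_id, c.created DESC
"""
latest_two = conn.execute(query).fetchall()
print(latest_two)  # [(1, 3), (1, 2), (2, 5), (2, 4), (3, 6)]
```

The subquery is re-evaluated per row, so on large tables it would likely need an index on `(post_id, created)` or a different formulation.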

Split or leave frequently updated column in PostgreSQL

Make your PostgreSQL run

I have a database migrated from MySQL to PostgreSQL (I had my good reasons, but this post is not about that). In MySQL, because of its MVCC behaviour, it makes sense, and it's actually recommended, to split frequently updated columns from large tables, especially if using a web framework like Django

Pushing logs to Loggly with Fluentd

Setting up a Fluentd log publisher for Loggly is straightforward thanks to the detailed tutorials that can be found online. Some useful readings:

One gotcha: numeric fields in Loggly

Python timeit – when speed matters – SQL IN query with cursor.execute

Although there are always multiple ways to solve a single problem, the most elegant one is not always the best performing. Python provides a perfect tool for checking the speed of primitives (or even slightly more complex structures). This comes in really handy when trying to figure out if a loop or a
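As a small illustration of that kind of check, the sketch below uses the standard `timeit` module to compare two ways of building the `%s` placeholder list for an SQL `IN` clause. The two helper functions and the iteration counts are hypothetical, chosen only to show the measuring pattern, and the timings will differ from machine to machine.

```python
import timeit


def placeholders_loop(n):
    """Build '%s, %s, ...' with an explicit loop and string concatenation."""
    out = ""
    for i in range(n):
        if i:
            out += ", "
        out += "%s"
    return out


def placeholders_join(n):
    """Build the same string with a single str.join call."""
    return ", ".join(["%s"] * n)


# Run each candidate many times and report the total elapsed seconds.
loop_time = timeit.timeit(lambda: placeholders_loop(100), number=2000)
join_time = timeit.timeit(lambda: placeholders_join(100), number=2000)
print(f"loop: {loop_time:.4f}s  join: {join_time:.4f}s")
```

The resulting string would be interpolated into the query text (for example `... WHERE id IN (%s, %s, %s)`), with the actual values passed separately as parameters to `cursor.execute`.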