Typeahead and autosuggest with pure Solr and Nginx

A long time ago I wrote about a very simple technique for quickly adding auto-suggest to websites with the help of Solr: Incredibly fast Solr autosuggest. It used Solr's terms function which, as the name suggests, lets us search for terms.

This solution … Continue reading

Python MySQLdb vs mysql-connector query performance

Query times for random PK using MySQLdb and mysql-connector

There are a lot of Python drivers available for MySQL, and two stand out. The first, traditionally everybody’s choice and something of an industry standard, is MySQLdb. It uses a C module to link to MySQL’s client library. Oracle’s mysql-connector, on the other hand, is pure Python, so no MySQL libraries and … Continue reading

Multi variant AB testing vs Multi-Armed bandit

I’ve read a lot of discussions lately about different approaches to experimenting and testing. People seem to be very religious about the subject, and two cardinal questions separate the groups: Frequentist vs. Bayesian approach, and AB testing vs. multi-armed bandit solutions. Right now I’m mostly interested in the second problem … Continue reading

5 steps to self-managing server infrastructure

Self-managing servers

Managing servers is tedious work that we all have to do to some extent. But it doesn’t have to fill our whole day. What if I told you that you can build a self-managing system with some discipline and effort? I went through implementing a self-managing database infrastructure of thousands of … Continue reading

Group by limit per group in PostgreSQL

In web applications it’s very common to limit results per group. For example, showing all the new posts with the two latest comments on each. Or listing the best-selling categories of an e-commerce website with the 3 most popular products in each category.

In MySQL … Continue reading
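To make the "limit per group" idea concrete, here is a minimal sketch of the posts-and-comments example. It is an illustration under assumptions, not the query from the post: the `comments` table, its columns, and the `latest_comments` helper are hypothetical, and it runs against an in-memory SQLite database through Python's `sqlite3` module (assuming a SQLite build with window function support) rather than PostgreSQL. The `ROW_NUMBER() OVER (PARTITION BY ...)` construct it relies on is available in PostgreSQL as well.

```python
import sqlite3

# Hypothetical schema: each comment belongs to a post; a higher id means newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO comments (post_id, body) VALUES
        (1, 'a1'), (1, 'a2'), (1, 'a3'),
        (2, 'b1'),
        (3, 'c1'), (3, 'c2');
""")

# Number the comments inside each post from newest to oldest,
# then keep only the first N rows of every group.
LATEST_PER_POST = """
    SELECT post_id, body
    FROM (
        SELECT post_id, body,
               ROW_NUMBER() OVER (PARTITION BY post_id ORDER BY id DESC) AS rn
        FROM comments
    ) AS ranked
    WHERE rn <= ?
    ORDER BY post_id, rn
"""

def latest_comments(conn, per_post=2):
    """Return up to `per_post` newest comments for every post."""
    return conn.execute(LATEST_PER_POST, (per_post,)).fetchall()

for post_id, body in latest_comments(conn):
    print(post_id, body)
```

Post 1 has three comments but only the two newest come back, while post 2 keeps its single comment: the limit applies per group, not to the result set as a whole.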

Split or leave frequently updated column in PostgreSQL

Make your PostgreSQL run

I have a database migrated from MySQL to PostgreSQL (I had my good reasons, but this post is not about that). In MySQL, because of MVCC behaviour, it makes sense, and is actually recommended, to split frequently updated columns from large tables, especially if using a web framework like Django … Continue reading

Pushing logs to loggly with fluentd

Setting up Fluentd as a log publisher to Loggly is straightforward thanks to the detailed tutorials that can be found online. Some useful readings:

One gotcha: numeric fields in Loggly

By default everything … Continue reading

Python timeit – when speed matters – SQL IN query with cursor.execute

Although there are always multiple ways to solve a single problem, the most elegant one is not always the best performing. Python provides a perfect tool for checking the speed of primitive (or even slightly more complex) structures. This comes in really handy when trying to figure out if a loop or a … Continue reading
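As a rough sketch of the kind of check `timeit` allows, the snippet below times two ways of building the `%s` placeholder list for an SQL `IN (...)` clause. Both helper functions are hypothetical and written for illustration only; they are not necessarily the variants compared in the post, and no database or cursor is involved.

```python
import timeit

def placeholders_loop(n):
    """Build '%s, %s, ...' by appending to a string in a loop."""
    out = ""
    for i in range(n):
        if i:
            out += ", "
        out += "%s"
    return out

def placeholders_join(n):
    """Build the same string with a single str.join call."""
    return ", ".join(["%s"] * n)

# Timing is only meaningful if both variants produce the same result.
assert placeholders_loop(5) == placeholders_join(5)

for fn in (placeholders_loop, placeholders_join):
    # timeit.timeit runs the callable `number` times and returns total seconds.
    seconds = timeit.timeit(lambda: fn(1000), number=200)
    print(f"{fn.__name__}: {seconds:.4f}s for 200 calls")
```

The absolute numbers depend on the machine and Python version, so it is the relative difference between the two variants that is worth reading.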