MySQL Benchmark – updates by primary vs secondary keys

(Note: when I talk about MySQL I usually assume the InnoDB storage engine. In any other case I explicitly say it is MyISAM, Memory, etc.)

I’ve heard about an interesting approach to using master-slave replication in MySQL.

Thesis

So the theory was that, since updates by primary key are fast and updates by secondary key are slow, the slave has to be queried for the primary keys first, and the updates then have to be run using the fetched primary keys. To put this in context and make it more understandable:

Original query
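
As an illustration (the table and column names below are made up for the example, they are not from the original setup), imagine a users table where email carries a secondary index and the application originally updates rows by that secondary key:

    -- hypothetical example: email is covered by a secondary index, not the primary key
    UPDATE users
       SET status = 1
     WHERE email = 'foo@example.com';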

This query gets split into two different queries. The first one has to be run on the slave to fetch the primary keys (continuing with the same made-up names):
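
    -- run on the slave; say it returns the ids 42 and 43
    SELECT id
      FROM users
     WHERE email = 'foo@example.com';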

Once we have the values, we can go to the master and update the necessary records by primary key:
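
    -- run on the master, using the primary keys fetched from the slave
    UPDATE users
       SET status = 1
     WHERE id IN (42, 43);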

Antithesis

What’s my problem with this? First of all, it adds unnecessary complexity to the application and extra load on the slave server(s). This is the typical case of a developer trying to outsmart MySQL. I have occasionally seen working examples, but those are the exceptions, not the rule. Because of how MySQL and InnoDB work, primary key lookups are indeed lightning fast. But does that mean secondary key lookups aren’t good enough? I don’t think so. Especially with the adaptive hash index (I will go into details in another post), if the available memory is comparable to the size of the dataset, or at least to the total size of the hot spots in the data, secondary key lookups with a covering index are as good as primary key lookups.

Synthesis

The test

So I (again) decided to create a benchmark suite specifically for testing this. It can be found on GitHub.

Schema
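
The actual schema ships with the benchmark suite on GitHub; as a rough sketch (with illustrative names, not the real ones), all that is needed is an InnoDB table with a primary key, an indexed secondary column to look rows up by, and a column to overwrite:

    CREATE TABLE bench (
      id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
      extid   CHAR(32)     NOT NULL,                -- secondary lookup column
      payload VARCHAR(255) NOT NULL,                -- data the test threads overwrite
      PRIMARY KEY (id),
      KEY idx_extid (extid)
    ) ENGINE=InnoDB;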

Testing

After loading data into the table, I started 16 threads that updated random rows with random data at random times. Occasionally during the test I also executed some huge queries in MySQL (usually ones that required a full scan): some of them on the same table, some on other tables in the same database, and some on a different database. My goal was to model the usual workload of a database server exposed to heavy-duty use.
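
In terms of the made-up schema above, the threads were issuing statements of roughly these two shapes, with an occasional heavy query mixed in (illustrative SQL, not the exact statements from the suite):

    -- update by primary key
    UPDATE bench SET payload = 'random data' WHERE id = 123456;

    -- update by secondary key
    UPDATE bench SET payload = 'random data' WHERE extid = 'random-extid-value';

    -- occasional "huge" query that needs a full scan
    SELECT COUNT(*), AVG(LENGTH(payload)) FROM bench WHERE payload LIKE '%xyz%';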

Results

The results speak for themselves.

  • Qtype 0 is the primary key update
  • Qtype 1 is the secondary key update

So the average execution time for updates by primary key lookups is 59.46 milliseconds, while updates by secondary keys took 56.37 milliseconds on average.

Final words

This was an empirical experiment. I didn’t want to solve any issue other than proving that secondary keys cost no more than primary keys for updates, which I managed to do.
