OOM-KILLER on Linux: Protect MySQL

In a typical setup for a PHP CMS or ecommerce site, several memory-hungry services usually share the same server. The main ones are PHP and MySQL. Whether PHP runs as an Apache module or under FPM is beside the point; each administrator will choose the configuration that best fits their requirements. Some prefer to split services across different servers to improve security and fault resilience, but that is another story.

In any case, whenever you configure a system, you try to tune each service so that together they never use more memory than the server has. This is not an easy task. On one hand, you have to plan for the extreme case of maximum load, where every service reaches the upper limit its configuration allows. On the other, you have to consider the cost of memory, even more so today with the VPS/Cloud trend, which takes us back to the era of the old dedicated servers, when RAM was scarce and expensive and you could not afford to waste it. As you well know, it is a difficult calculation, and in general one plays very close to the limit or beyond it, since 99% of the time the services are not all at 100% of their memory usage at once. This could give rise to many opinions, ideas, and possible configurations and strategies, but I will focus on the objective of this post: what happens when you reach the dreaded OOM-KILLER.

OOM stands for "out of memory", and the OOM-killer is a kernel safeguard: when the system is about to run out of memory, the kernel pulls a “simple” algorithm out of its sleeve, gives every process a score, and kills the one it judges to be using the most memory and to be expendable. And which process in a typical PHP+MySQL configuration is usually the best candidate? Yes, MySQL is generally the process using the most memory at any given moment, so it has every chance of winning. And what is the problem? The OOM-killer is not at all subtle: when it kills you, it kills you. It does not wait for you to close files or finish any operation in progress, and for a database this can be catastrophic. The problem is not the call telling you that the website has gone down (something you should have detected before the client did) and having to restart MySQL. The look of panic comes when you see that some fundamental table is corrupted, and so badly that it cannot be easily repaired.
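To see which processes the kernel currently considers the best candidates, you can read the `oom_score` and `oom_score_adj` files that Linux exposes under `/proc/<PID>/`. The following is only a minimal sketch: the function name `oom_candidates` and the `PROC_ROOT` variable are made up for this example (the latter just lets you point the function at another directory), and the output format is arbitrary.

```shell
#!/bin/sh
# Print the processes the kernel currently rates as the most likely OOM victims.
# Columns: oom_score, oom_score_adj, PID, command name.
# PROC_ROOT is a hypothetical override (handy for testing); it defaults to /proc.
oom_candidates() {
    proc_root="${PROC_ROOT:-/proc}"
    limit="${1:-10}"
    for dir in "$proc_root"/[0-9]*; do
        # A process can exit while we iterate, so skip entries we cannot read.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s %s %s %s\n' "$score" "$adj" "${dir##*/}" "$name"
    done | sort -k1,1nr | head -n "$limit"
}

# Show the ten highest-scoring processes.
oom_candidates 10
```

On a loaded PHP+MySQL server you would expect `mysqld` near the top of this list, which is exactly the situation described above.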

Interesting, right? So let us look for a solution.

In practice we cannot stop the kernel from launching the dreaded OOM-killer. What we can do is play with the algorithm to take away the idea of killing MySQL. Here is an example cron task, to be placed in root's crontab, that can save you from several anguishing moments:

* * * * * pgrep -x mysqld | while read -r PID; do echo -1000 > /proc/$PID/oom_score_adj; done

The example is simple: every minute we look up the PID of the running MySQL process and write a value to its oom_score_adj attribute. The attribute ranges from -1000 to 1000, and -1000 is the minimum: with it, the kernel never selects that process as a victim. Running the task every minute matters because a restarted MySQL gets a new PID and starts again with the default value. As I said, this does not disable the OOM-killer; it only makes it look for another candidate. The next one on the list is usually a PHP or Apache process (or Redis, or ES, or…). Losing one of these may ruin a customer's purchase, but that is not as critical as corrupting the database and leaving the site completely KO, possibly for hours if you are slow to react and out of luck.
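If you prefer a script over a one-liner, the same idea can be written a little more defensively, reading the value back to confirm that the kernel accepted it. Treat this as a sketch under assumptions: the function name `protect_pids` and the `PROC_ROOT` variable are invented for the example, and the process is assumed to be called exactly `mysqld` (adjust the name if your server binary is called something else).

```shell
#!/bin/sh
# Exempt the given PIDs from the OOM-killer and print "PID value" for each
# one, where value is what the kernel reports after the write.
# Must run as root: lowering oom_score_adj needs elevated privileges.
# PROC_ROOT is a hypothetical override (handy for testing); it defaults to /proc.
protect_pids() {
    proc_root="${PROC_ROOT:-/proc}"
    for pid in "$@"; do
        file="$proc_root/$pid/oom_score_adj"
        # The process may have exited between the lookup and the write.
        if [ ! -w "$file" ]; then
            echo "skip $pid: $file is not writable" >&2
            continue
        fi
        echo -1000 > "$file" && echo "$pid $(cat "$file")"
    done
}

# pgrep -x matches the process name exactly, so wrappers such as mysqld_safe
# and the shell running this script are left alone.
# Word splitting of the PID list is intentional here.
protect_pids $(pgrep -x mysqld)
```

Saved as an executable file and called from root's crontab every minute, this behaves like the one-liner, with the difference that a PID that vanished in the meantime produces a message on stderr instead of a redirection error.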
