Improve MariaDB Performance using Query Profiling

by Apr 25, 2018

Query profiling is a useful technique for analyzing the overall performance of a database. Considering that a single mid-to-large sized application can execute numerous queries each and every second, query profiling is an important part of database tuning, both as a proactive measure and in diagnosing problems.  In fact, it can become difficult to determine the exact sources and causes of bottlenecks and sluggish performance without employing some sort of query profiling techniques. This blog will present a few useful query profiling techniques that exploit MariaDB server’s own built-in tools: the Slow Query Log and the Performance Schema.

MariaDB vs. MySQL

Needless to say, the techniques that we’ll be covering here today are likely to be equally applicable to MySQL, due to the close relationship between the two products.

The day that Oracle announced the purchase of Sun back in 2010, Michael "Monty" Widenius forked MySQL and launched MariaDB, taking a swath of MySQL developers with him in the process.  His goal was for the relational database management system (DBMS) to remain free under the GNU GPL.

Today, MariaDB is a drop-in replacement for MySQL, one with more features and better performance.

MariaDB used to be based on the corresponding version of MySQL, where one existed. For example, MariaDB 5.1.53 was based on MySQL 5.1.53, with some added bug fixes, additional storage engines, new features, and performance improvements.  As of this writing, the latest version of MariaDB is 10.2.x. Meanwhile, MySQL 8 is still in RC (Release Candidate) mode.

The Slow Query Log

One feature shared by both MariaDB and MySQL is the slow query log.  Queries that are deemed to be slow and potentially problematic are recorded in the log.  A “slow” query is defined as a query that takes longer than the long_query_time global system variable value (of 10 seconds by default) to run. Microseconds are allowed for file logging, but not for table logging.  

Configuring the Slow Query Log via Global System Variables

Besides the long_query_time global system variable mentioned above, there are a few other variables that determine the behavior of the slow query log.

The slow query log is disabled by default. To enable it, set the slow_query_log system variable to 1. The log_output server system variable determines how the output will be written, and can also disable it. By default, the log is written to file, but it can also be written to table.  

Valid values for the log_output server system variable are TABLE, FILE or NONE.  The default name of the file is host_name-slow.log, but can also be set using the –slow_query_log_file=file_name option. The table used is the slow_log table in the mysql system database.

These variables are best set in the my.cnf or mariadb.cnf configuration files, typically stored in the /etc/mysql/ directory on Linux and in the Windows System Directory, usually C:WindowsSystem, on Windows.  (See Configuring MariaDB with my.cnf for all of the possible locations.)  The following settings, appended in the [mysqld] section, will:

  1. Enable the slow query log.
  2. Set time in seconds/microseconds defining a slow query.
  3. Provide the name of the slow query log file.
  4. Log queries that don’t use indexes.

[1] slow_query_log = 1
[2] long_query_time = 5
[3] slow_query_log_file = /var/log/mysql/slow-query.log
[4] log_queries_not_using_indexes

Settings will take effect after a server restart.

Viewing the Slow Query Log

Slow query logs written to file can be viewed with any text editor.  Here are some sample contents:

# Time: 150109 11:38:55

# User@Host: root[root] @ localhost []

# Thread_id: 40  Schema: world  Last_errno: 0  Killed: 0

# Query_time: 0.012989  Lock_time: 0.000033  Rows_sent: 4079  Rows_examined: 4079  Rows_affected: 0  Rows_read: 4079

# Bytes_sent: 161085

# Stored routine: world.improved_sp_log

SET timestamp=1420803535;

SELECT * FROM City;

# User@Host: root[root] @ localhost []

# Thread_id: 40  Schema: world  Last_errno: 0  Killed: 0

# Query_time: 0.001413  Lock_time: 0.000017  Rows_sent: 4318  Rows_examined: 4318  Rows_affected: 0  Rows_read: 4318

# Bytes_sent: 194601

# Stored routine: world.improved_sp_log

SET timestamp=1420803535;

The only drawback to viewing the slow query log with text editor is that it could (and does!) soon grow to such a size that it becomes increasingly difficult to parse through all the data.  Hence, there is a risk that problematic queries will get lost in the log contents. MariaDB offers the mysqldumpslow tool to simplify the process by summarizing the information.  The executable is bundled with MariaDB. To use it, simply run the command and pass in the log path. The resulting rows are more readable as well as grouped by query:

mysqldumpslow /tmp/slow_query.log

Reading mysql slow query log from /tmp/slow_query.log

Count: 1 Time=23.99s (24s) Lock=0.00s (0s) Rows_sent=1.0 (1), Rows_examined=0.0 (0), Rows_affected=0.0 (0), root[root](#)@localhost
     SELECT * from large_table

Count: 6 Time=6.83s (41s) Lock=0.00s (0s) Rows_sent=1.0 (6), Rows_examined=0.0 (0), Rows_affected=0.0 (0), root[root](#)@localhost
     SELECT * from another_large_table

There are various parameters that can be used with the command to help customize the output. In the next example, the top 5 queries sorted by the average query time will be displayed:

mysqldumpslow -t 5 -s at /var/log/mysql/localhost-slow.log

Working with the slow_log Table

Slow query logs written to table can be viewed by querying the slow_log table.

It contains the following fields:

Field Type Default Description
start_time timestamp(6) CURRENT_TIMESTAMP(6) Time the query began.
user_host mediumtext NULL User and host combination.
query_time time(6) NULL Total time the query took to execute.
lock_time time(6) NULL Total time the query was locked.
rows_sent int(11) NULL Number of rows sent.
rows_examined int(11) NULL Number of rows examined.
db varchar(512) NULL Default database.
last_insert_id int(11) NULL last_insert_id.
insert_id int(11) NULL Insert id.
server_id int(10) unsigned NULL The server's id.
sql_text mediumtext NULL Full query.
thread_id bigint(21) unsigned NULL Thread id.
rows_affected int(11) NULL Number of rows affected by an UPDATE or DELETE (as of MariaDB 10.1.2)

Here are some sample results of a SELECT ALL against the slow_log table:

SELECT * FROM mysql.slow_logG

*************************** 2. row ***************************
    start_time: 2014-11-11 07:56:28.721519

     user_host: root[root] @ localhost []

    query_time: 00:00:12.000215

     lock_time: 00:00:00.000000

     rows_sent: 1

 rows_examined: 0

            db: test

last_insert_id: 0

     insert_id: 0

     server_id: 1

      sql_text: SELECT * FROM large_table

     thread_id: 74

 rows_affected: 0

...

Ordering Slow Query Log Rows

If you want to emulate the Linux “tail -100 log-slow.log” command with the slow_log table, which lists the latest queries in the end, you can issue the following query:

SELECT * 

FROM (SELECT * 
      FROM slow_log 
      ORDER BY start_time 
      DESC LIMIT 100) sl 

ORDER BY start_time;

That will list the last 100 queries in the table.
Rather than typing the same SELECT statement every time you want to list the latest queries last, I would recommend creating a stored procedure, something like SHOW_LATEST_SLOW_QUERIES
and use it instead. The number of queries to show can be passed to your proc as an input parameter.

Testing the Slow Query Log

Before attempting to scour through the slow query log in a production environment, it’s a good idea to test its operation by executing a few test queries; perhaps some that should trigger logging and others that should not.

As mentioned previously, when logging is enabled, queries that takes longer than the long_query_time global system variable value to run are recorded in the slow log file or the slow_log table, based on the log_output variable’s value.

You can certainly work with your data to construct SELECT queries that take variable amounts of time to execute, but perhaps an easier approach is to employ the sleep() function:

SLEEP(duration)

The sleep() function pauses query execution for the number of seconds given by the duration argument, then returns 0.  If SLEEP() is interrupted, it returns 1. The duration may have a fractional part given in microseconds, but there is really no need for the purposes of slow log testing.  Here’s an example:

SELECT sleep(5);
+------------+
| sleep(5)   |
+------------+
|          0 |
+------------+
1 row in set (5.0 sec)

Suppose that the long_query_time global system variable has not been explicitly assigned a value. In that instance it would have the default value of 10 seconds. Therefore, the following SELECT statement would be recorded to the slow log:
SELECT SLEEP(11);

Query Profiling with Performance Schema

Another tool that we can use to monitor server performance is the Performance Schema. Introduced in MariaDB 5.5, the Performance Schema is implemented as a storage engine, and so will appear in the list of storage engines available:

SHOW ENGINES;
+--------------------+---------+-------------------------------+--------------+------+------------+
| Engine             | Support | Comment                       | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------+--------------+------+------------+
| InnoDB             | DEFAULT | Default engine                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema            | NO           | NO   | NO         |
| ...                |         |                               |              |      |            |
+--------------------+---------+-------------------------------+--------------+------+------------+

It is disabled by default for performance reasons, but it can easily be enabled as follows:
First, add the following line in your my.cnf or my.ini file, in the [mysqld] section:
performance_schema=on

The performance schema cannot be activated at runtime – it must be set when the server starts, via the configuration file.

The Performance Schema storage engine contains a database called performance_schema, which in turn consists of a number of tables that can be queried with regular SQL statements for a wide range of performance information.

In order to collect data, you need to set up all consumers (starting the collection of data) and instrumentations (what data to collect).  These may be set either on server startup or at runtime.

The following statements set up consumers and instrumentations at runtime:

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES';

UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES';

You can decide what to enable/disable with WHERE NAME like "%what_to_enable"; conversely, you can disable instrumentations by setting ENABLED to "NO".
The following enables all instrumentation of all stages (computation units) in the configuration file:

[mysqld]

performance_schema=ON

performance-schema-instrument='stage/%=ON'

performance-schema-consumer-events-stages-current=ON

performance-schema-consumer-events-stages-history=ON

performance-schema-consumer-events-stages-history-long=ON

With regards to Query Profiling:

  1. Ensure that statement and stage instrumentation is enabled by updating the setup_instruments table as follows:

UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE '%statement/%';

UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE '%stage/%';

      2. Enable the events_statements_* and events_stages_* consumers:

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES'
WHERE NAME LIKE '%events_statements_%';

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES'
WHERE NAME LIKE '%events_stages_%';

Once you’ve narrowed down what you are interested in, there are two ways to start monitoring:

  1. View raw data in the summary views.
    This gives you an overall picture of usage on the instance.
  2. Snapshot data, and compute deltas over time.
    This gives you an idea of the rates of changes for events.

Let’s start with viewing raw summary data:

  1. Run the statement(s) that you want to profile. For example:
SELECT * FROM acme.employees WHERE emp_no = 99;

+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
|     99 | 1979-01-29 | Bill       | Bixby     | M      | 2006-06-05 |
+--------+------------+------------+-----------+--------+------------+

Identify the EVENT_ID of the statement by querying the events_statements_history_long table. This step is similar to running SHOW PROFILES to identify the Query_ID. The following query produces output similar to SHOW PROFILES:

SELECT EVENT_ID, 

       TRUNCATE(TIMER_WAIT/1000000000000,6) as Duration, 

       SQL_TEXT

FROM performance_schema.events_statements_history_long 

WHERE SQL_TEXT like '%99%';

+----------+----------+--------------------------------------------------------+
| event_id | duration | sql_text                                               |
+----------+----------+--------------------------------------------------------+
|       22 | 0.021470 | SELECT * FROM acme.employees WHERE emp_no = 99         |
+----------+----------+--------------------------------------------------------+

Query the events_stages_history_long table to retrieve the statement's stage events. Stages are linked to statements using event nesting. Each stage event record has a NESTING_EVENT_ID column that contains the EVENT_ID of the parent statement.

SELECT event_name AS Stage, 

       TRUNCATE(TIMER_WAIT/1000000000000,6) AS Duration

FROM performance_schema.events_stages_history_long 

WHERE NESTING_EVENT_ID=22;

+--------------------------------+----------+
| Stage                          | Duration |
+--------------------------------+----------+
| stage/sql/starting             | 0.000080 |
| stage/sql/checking permissions | 0.000005 |
| stage/sql/Opening tables       | 0.027759 |
| stage/sql/init                 | 0.000052 |
| stage/sql/System lock          | 0.000009 |
| stage/sql/optimizing           | 0.000006 |
| stage/sql/statistics           | 0.000082 |
| stage/sql/preparing            | 0.000008 |
| stage/sql/executing            | 0.000000 |
| stage/sql/Sending data         | 0.000017 |
| stage/sql/end                  | 0.000001 |
| stage/sql/query end            | 0.000004 |
| stage/sql/closing tables       | 0.000006 |
| stage/sql/freeing items        | 0.000272 |
| stage/sql/cleaning up          | 0.000001 |
+--------------------------------+----------+

Conclusion

This blog presented a few useful query profiling techniques that employ a couple of MariaDB server’s built-in tools: the Slow Query Log and the Performance Schema.
The Slow Query Log records queries that are deemed to be slow and potentially problematic, that is, queries that take longer than the long_query_time global system variable value to run.
The slow query log may be viewed with any text editor. Alternatively, MariaDB’s mysqldumpslow tool can simplify the process by summarizing the information. The resulting rows are more readable as well as grouped by query.
The Performance Schema is a storage engine that contains a database called performance_schema, which in turn consists of a number of tables that can be queried with regular SQL statements for a wide range of performance information. It may be utilized to view raw data in the summary views as well as review performance over time.