Per-database information in performance_schema, please!

by Nov 27, 2011

This is a follow-up om my first Blog on performance_schema that appeared here:

One particular request that we had frequently from our users is to have status variables exposed per database and maybe even per table. Currently this is not possible. Not even with MySQL 5.6.3 with the P_S additions found there. I have frequently experienced that users get frustrated why 'such a simple thing' is not possible and I have the impression that they sometimes tend not to believe me!

Examples: the (GLOBAL) status variables 'slow_queries', 'com_update' etc. has no database/table -level information. For those MySQL users that run a lot of different applications (both standard applications – Forums software, Blogging software etc. – as well as applications dedicated to the particular business of the organization) on same server this would be extremely usable. It is not always easy to identify the reason for a sudden or periodic increase of a status variable. If the database context was exposed it would in most scenarios be possible to identify the application causing this and then it would be much easier to analyze.

It is to some extent possible to get such information from logs of course, but enabling general log has a penalty and parsing logs is complicated alone by the fact that sometimes a 'fully qualified' table name is used and sometimes a USE statement + a non-'fully qualified' table name is used. The USE statement may appear in the general log Gigabytes before statements using it! Also use of log tables (for general log in particular) has penalties.

Retrieving the information from P_S could be framed like this:

SELECT database, table, query, query_time FROM performance_schema.queries WHERE slow_query = 1 and com_insert = 0 ORDER BY query_time DESC;

Also aggregation like ".. GROUP BY database .." could be useful. If memory is a problem the P_S tables could have aggregated information only. Even that would be helpful in some cases.

I don't know how much trouble it would be to add such instrumentation. I realize that queries (JOIN's, UNION's, subqueries) may reference more tables and even different tables in more databases, but even that should be possible to solve somehow (record it twice if there are two tables referenced, for instance).

I would find a feature/instrumentation like this in P_S extremely useful. More useful (for me) than much of the stuff you'll find there now.