“MySQL server has gone away” Part 2: session timeout.

by Sep 2, 2009

This is the second and last article in a small series on reasons for the error message “MySQL server has gone away”. The first article described that "if a client sends a SQL-statement longer than the server max_allowed_packet setting, the server will simply disconnect the client."

This article deals with another common reason: the connection has timed out. This is in principle not difficult to understand. But there is a common misunderstanding and a lot of confusion about the two server variables wait_timeout and interactive_timeout and how they work.

The MySQL 5.0 documentation has these two passages:

1) "interactive_timeout … Both …"
('Both' means that 'interactive_timeout' can be both a GLOBAL and a SESSION variable)

2) "On thread startup, the session wait_timeout value is initialized from the global
wait_timeout value or from the global interactive_timeout value, depending on the type of
client (as defined by the CLIENT_INTERACTIVE connect option to mysql_real_connect())."

Both doc statements are correct, and the server does behave like that, but I think the behaviour is inconsistent and confusing. What is listed in 2) was always the case. 1) is an addition, a change introduced with recent servers.
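The startup rule quoted in 2) can be written down as a tiny toy model. This is only a sketch of the documented behaviour, not MySQL source code; the function name and the example values are made up for illustration.

```python
# Toy model of the documented thread-startup rule; not MySQL source code.

def initial_session_wait_timeout(global_vars, client_interactive):
    """Return the value the session wait_timeout starts with."""
    if client_interactive:
        # The client connected with the CLIENT_INTERACTIVE option.
        return global_vars["interactive_timeout"]
    return global_vars["wait_timeout"]


# Illustrative values only: a short limit for scripts, a long one for humans.
server_globals = {"wait_timeout": 5, "interactive_timeout": 28800}

script_timeout = initial_session_wait_timeout(server_globals, client_interactive=False)
human_timeout = initial_session_wait_timeout(server_globals, client_interactive=True)
print(script_timeout, human_timeout)
```

Note that the global interactive_timeout is only read once, at startup, and only to pick the starting value of the session wait_timeout.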

The interactive_timeout variable was introduced in very early MySQL days, primarily to make it possible to have a low timeout setting for server-side scripts and a higher setting for interactive clients operated by a human. The server scripts use the (typically low) wait_timeout setting and the interactive clients use the (typically higher) interactive_timeout setting. We have seen wait_timeout settings as low as 5 seconds, probably implemented by the server admin to stop idle threads from lingering when somebody forgot to close the connection in an application. A human may not be able to compose queries as fast as the wait_timeout setting, so in its original form the implementation made good sense. However, only 2 different settings were possible, which was of course a limitation.

With the introduction of SESSION variables you can SET both wait_timeout and interactive_timeout for the SESSION. But the problem is that "SET interactive_timeout = somevalidnumber" changes the value of the SESSION variable and nothing else, because SESSION interactive_timeout is never used for any purpose. It has no effect at all on how and when the server will time out the session. The docs actually state it: "the session wait_timeout value is initialized from the global wait_timeout value or from the global interactive_timeout value". The keyword here is *initialized*!

It is ALWAYS the current value of SESSION wait_timeout that matters. Nothing else.
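To make the point concrete, here is a small toy model of one session. It is a sketch of the behaviour described above, not of how the server is implemented; the class and method names are invented for this example.

```python
class ToySession:
    """Toy model of one connection's session variables (not MySQL code)."""

    def __init__(self, wait_timeout, interactive_timeout):
        self.variables = {
            "wait_timeout": wait_timeout,
            "interactive_timeout": interactive_timeout,
        }

    def set_session(self, name, value):
        # Models "SET SESSION <name> = <value>": only the stored value changes.
        self.variables[name] = value

    def is_timed_out(self, idle_seconds):
        # Only the session wait_timeout is ever consulted.
        return idle_seconds > self.variables["wait_timeout"]


session = ToySession(wait_timeout=5, interactive_timeout=28800)

# Changing the session interactive_timeout does not move the idle limit.
session.set_session("interactive_timeout", 100000)
still_short = session.is_timed_out(idle_seconds=10)

# Changing the session wait_timeout does.
session.set_session("wait_timeout", 3600)
now_long = session.is_timed_out(idle_seconds=10)

print(still_short, now_long)
```

In this model a connection that has been idle for 10 seconds is dropped before the wait_timeout change and survives after it, no matter what interactive_timeout was set to.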

So we have reached the first conclusion: simply forget about interactive_timeout. It is largely uninteresting now. Simply SET wait_timeout for the SESSION if you need to. Actually I think it is a bug that interactive_timeout was implemented as a SESSION variable. Not at all a critical bug of course, but somewhat confusing anyway. It looks like the same confusion existed in the developer team too at the time interactive_timeout was implemented as a SESSION variable.

Actually, if you Google this you will find practically only discussions and advice that do not reflect the implementation of wait_timeout as a SESSION variable at all. MySQL 3.23/4.0 'wisdom' still thrives!

Now, some people will probably say here that this is not really worth worrying about. You may set the reconnect flag when connecting, and should the session time out, the client will simply reconnect transparently for the user. With a reasonable connection the user will most often not even notice.

It should work like that and normally does too, but it is our experience that it does not always. It is not a bug with MySQL, I believe. It is some network issue (maybe related to specific hardware/firmware) that we do not claim to fully understand. But we listed in the release notes for SQLyog 8.13: "However we have reports of situations where the network takes very long time to process such reconnect requests."

We have seen situations where a reconnect took up to 2 minutes while a 'fresh' connect was practically instantaneous. In one case a user reported this very slow reconnect when he was connecting through a wireless router, but not when connecting over a traditional wired Ethernet connection to the same server from the same client machine on the same desk in the same office! Sometimes, when users reported this and gave us access to their server, it was reproducible for us, and sometimes it was not. There is no pattern we can identify, except that the problem seems to have grown over the last 1-2 years. Two years ago we had never seen it: SQLyog reconnected promptly if the connection had timed out. But after that we started getting reports that "if I leave SQLyog open then after a while it will hang next time I try to use it". It did not 'hang'; it waited for the connection to the server to be re-established. The longest reconnect time we have experienced was around 2 minutes, but we have seen 15-30 seconds quite a lot of times. I believe the total number of such reports is around 30.

So the statement "network takes very long time to process such reconnect requests" in the release notes is not really a proven fact. It is a theory or an assumption (but the best we have). However, the indications are so strong (and the user reports so many) that something seems to have happened with networking systems lately, and we cannot ignore it. That is the reason why we introduced an option to define the session timeout in SQLyog 8.13: simply to make it possible for the user to prevent the timeout from occurring at all. The option is saved with the connection, so that the user does not need to SET wait_timeout manually every time.
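For readers who want the same effect in their own client code, the idea boils down to sending one statement right after connecting. The helper below is only a sketch of that idea and says nothing about how SQLyog does it internally; the function name and the saved value are hypothetical.

```python
def wait_timeout_statement(seconds):
    """Build the statement a client could send right after connecting."""
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("seconds must be an integer")
    if seconds < 1:
        raise ValueError("seconds must be at least 1")
    return "SET SESSION wait_timeout = {}".format(seconds)


# Hypothetical per-connection setting, e.g. loaded from a saved profile.
saved_timeout = 28800
statement = wait_timeout_statement(saved_timeout)
print(statement)  # SET SESSION wait_timeout = 28800
```

The resulting string would be executed once on every new connection, using whatever client library is at hand, so that the session never idles long enough to be dropped.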

Should somebody have the same experience, or more information about it, we would like to hear about it.