Fun with ANALYZE TABLE

MySQL has been naughty for me lately.

First, I ran into a neat little issue on FreshBooks’ production servers last week involving the table cache and an O(n) algorithm for selecting a table to close. I wrote up a little explanation over on the FreshBooks blog that you might find interesting if you find any of this interesting. The short version is that if you’re going to be running with a full table cache and still opening tables regularly, you’ll be better off with a much smaller table cache, because finding the least-recently-used table to close is big-O of the size of the table cache. Smaller table cache = fewer tables to determine the LRU.

And then last night, out of the blue, a web forum about tinwhistles that I host hit a wall. About 8:30, my mostly-idle Linode went heavily IO-bound — as in one of the four CPUs spinning in diskwait all the time. What had originally been complex but fast (and common) queries were suddenly taking minutes and minutes to run: things like “get a list of topics in a forum”, and especially “get a list of posts for a forum’s RSS feed”.

There’s a lot of EXPLAIN output here, so I’d better put this behind a cut.

I took a look at the RSS feed query, which seemed the worst not only because it took a long time — 600+ seconds — but because it read-locked every important table in the database. Here’s the query, a big inner join:

SELECT t.topic_title, t.topic_last_post_id, t.forum_id,
    f.forum_name, p.post_time, pt.post_text, pt.bbcode_uid,
    u.username, u.user_id
FROM phpbb_topics t, phpbb_posts p, phpbb_posts_text pt,
    phpbb_users u, phpbb_forums f
WHERE t.topic_status != 1
  AND p.post_id = t.topic_last_post_id
  AND pt.post_id = p.post_id
  AND u.user_id = p.poster_id
  AND t.forum_id = f.forum_id
ORDER BY t.topic_last_post_id DESC
LIMIT 0, 15;

Here’s the mk-visual-explain output. I’ve replaced the table aliases with readable things:

Filesort
+- TEMPORARY
   table          temporary(forums,topics,posttext,posts,users)
   +- JOIN
      +- Bookmark lookup
      |  +- Table
      |  |  table          users
      |  |  possible_keys  PRIMARY
      |  +- Unique index lookup
      |     key            users->PRIMARY
      |     possible_keys  PRIMARY
      |     key_len        3
      |     ref            chiffbb.posts.poster_id
      |     rows           1
      +- JOIN
         +- Bookmark lookup
         |  +- Table
         |  |  table          posts
         |  |  possible_keys  PRIMARY,poster_id
         |  +- Unique index lookup
         |     key            posts->PRIMARY
         |     possible_keys  PRIMARY,poster_id
         |     key_len        3
         |     ref            chiffbb.topics.topic_last_post_id
         |     rows           1
         +- JOIN
            +- Bookmark lookup
            |  +- Table
            |  |  table          posttext
            |  |  possible_keys  PRIMARY
            |  +- Unique index lookup
            |     key            posttext->PRIMARY
            |     possible_keys  PRIMARY
            |     key_len        3
            |     ref            chiffbb.topics.topic_last_post_id
            |     rows           1
            +- JOIN
               +- Filter with WHERE
               |  +- Bookmark lookup
               |     +- Table
               |     |  table          topics
               |     |  possible_keys  forum_id,topic_status,topic_last_post_id
               |     +- Index lookup
               |        key            topics->forum_id
               |        possible_keys  forum_id,topic_status,topic_last_post_id
               |        key_len        2
               |        ref            chiffbb.forums.forum_id
               |        rows           2579
               +- Table scan
                  rows           23
                  +- Table
                     table          forums
                     possible_keys  PRIMARY

See that temporary table at the top that gets used in a filesort? Well…

The whole thing was multiple joins which were then ORDERed and LIMITed. So that meant that it had to find all posts to the forum, ever, and shove them in a temporary table, sort that, and take the 15 most recent posts.

“All posts to the forum, ever” is about 500MB of data. That made the temporary table big enough to go to disk. So every time this query ran and couldn’t be answered from the query cache, it had to write that 500MB file. And the cached query was invalidated whenever someone posted to the forum, which is pretty often.

The problem in this case wasn’t (entirely) the SQL. MySQL was optimizing the query poorly because the key distribution statistics were off. An ANALYZE TABLE on the affected tables fixed that, and gave us:

JOIN
+- Bookmark lookup
|  +- Table
|  |  table          forums
|  |  possible_keys  PRIMARY
|  +- Unique index lookup
|     key            forum->PRIMARY
|     possible_keys  PRIMARY
|     key_len        2
|     ref            chiffbb.topics.forum_id
|     rows           1
+- JOIN
   +- Bookmark lookup
   |  +- Table
   |  |  table          users
   |  |  possible_keys  PRIMARY
   |  +- Unique index lookup
   |     key            users->PRIMARY
   |     possible_keys  PRIMARY
   |     key_len        3
   |     ref            chiffbb.posts.poster_id
   |     rows           1
   +- JOIN
      +- Bookmark lookup
      |  +- Table
      |  |  table          posttext
      |  |  possible_keys  PRIMARY
      |  +- Unique index lookup
      |     key            posttext->PRIMARY
      |     possible_keys  PRIMARY
      |     key_len        3
      |     ref            chiffbb.posttext.post_id
      |     rows           1
      +- JOIN
         +- Bookmark lookup
         |  +- Table
         |  |  table          posts
         |  |  possible_keys  PRIMARY,poster_id
         |  +- Unique index lookup
         |     key            posts->PRIMARY
         |     possible_keys  PRIMARY,poster_id
         |     key_len        3
         |     ref            chiffbb.topics.topic_last_post_id
         |     rows           1
         +- Filesort
            +- Filter with WHERE
               +- Bookmark lookup
                  +- Table
                  |  table          topics
                  |  possible_keys  forum_id,topic_status,topic_last_post_id
                  +- Index range scan
                     key            topics->topic_status
                     possible_keys  forum_id,topic_status,topic_last_post_id
                     key_len        1
                     rows           57912

There’s still a filesort, but it’s now a filesort of a single 57k-row table that’s already been filtered. That table is about 5MB, and fits in tmp_table_size, so doesn’t go to disk. The joins all stack, and the ORDER BY just follows that one-table filesort. The query takes about 0.15s now, or about 4000x as fast.

Incidentally, it can still be improved: that “filter with WHERE” is because of the “WHERE t.topic_status != 1″ in the query, and that means “where the topic is not locked”. The idea was that locked topics aren’t going to appear in the last-15-posts anyhow, so may as well exclude them. But if they’re not going to appear because of the sorting, and since you’re sorting anyhow, unless there are a LOT of locked posts that doesn’t matter. Taking out that restriction gets us:

JOIN
+- Bookmark lookup
|  +- Table
|  |  table          users
|  |  possible_keys  PRIMARY
|  +- Unique index lookup
|     key            users->PRIMARY
|     possible_keys  PRIMARY
|     key_len        3
|     ref            chiffbb.posts.poster_id
|     rows           1
+- JOIN
   +- Bookmark lookup
   |  +- Table
   |  |  table          posts
   |  |  possible_keys  PRIMARY,poster_id
   |  +- Unique index lookup
   |     key            posts->PRIMARY
   |     possible_keys  PRIMARY,poster_id
   |     key_len        3
   |     ref            chiffbb.topics.topic_last_post_id
   |     rows           1
   +- JOIN
      +- Bookmark lookup
      |  +- Table
      |  |  table          posttext
      |  |  possible_keys  PRIMARY
      |  +- Unique index lookup
      |     key            posttext->PRIMARY
      |     possible_keys  PRIMARY
      |     key_len        3
      |     ref            chiffbb.topics.topic_last_post_id
      |     rows           1
      +- JOIN
         +- Bookmark lookup
         |  +- Table
         |  |  table          forums
         |  |  possible_keys  PRIMARY
         |  +- Unique index lookup
         |     key            forums->PRIMARY
         |     possible_keys  PRIMARY
         |     key_len        2
         |     ref            chiffbb.t.forum_id
         |     rows           1
         +- Bookmark lookup
            +- Table
            |  table          topics
            |  possible_keys  forum_id,topic_last_post_id
            +- Index scan
               key            topics->topic_last_post_id
               possible_keys  forum_id,topic_last_post_id
               key_len        3
               rows           59900

And with that there isn’t even a filesort and the query finishes in <0.01 seconds, 60000x as fast as the original problem and 15x as fast as the post-ANALYZE optimization. Nice.

Comments 1