SQL_CALC_FOUND_ROWS Performance Issue
If you are working on MySQL (<=5.5) for a long time, you have an experience with performance issue on SQL_CALC_FOUND_ROWS with LIMIT OFFSET.
The core cause is MySQL generates full result set as a temporary table even if user only requested first 20 rows in the query.
This kind of situation often happens when you implement paging functionality.
Please see following example.
SELECT SQL_CALC_FOUND_ROWS tbl_a.a_id, tbl_b.*, tbl_c.*, FROM tbale_a AS tbl_a INNER JOIN table_b AS tbl_b ON tbl_b.a_id = tbl_a.a_id INNER JOIN table_c AS tbl_c ON tbl_c.b_id = tbl_b.b_id LEFT JOIN table_d AS tbl_d ON tbl_d.a_id = tbl_a.a_id WHERE tbl_b.field1 = "ABC" LIMIT 100, 20;In this example, a temporary table which has full result set is generated.
If you configure SQL_CACHE, the second time query might be quite faster.
However it is not a good choice because the query performance depends on status of tables, and not stable.
Assume if tbl_b is frequently updated in the above example, SQL_CACHE does not improve performance (or even worse from the entire system perspective).
Deferred Join
According to High Performance MySQL, the solution is Deferred Join.The strategy of deferred join is simple, 1) only pick up required id column and join other tables based on the ids and create result set.
Or 2) only pick up required id column and create result set by other query based on the picked up ids
. If the query contains "SQL_CALC_FOUND_ROWS", you cannot use the strategy 1) because you need to get result row count after all.
SELECT SQL_CALC_FOUND_ROWS tbl_a.a_id FROM tbale_a AS tbl_a INNER JOIN table_b AS tbl_b ON tbl_b.a_id = tbl_a.a_id WHERE tbl_b.field1 = "ABC" LIMIT 100, 20; SELECT tbl_a.a_id, tbl_b.*, tbl_c.*, FROM tbale_a AS tbl_a INNER JOIN table_b AS tbl_b ON tbl_b.a_id = tbl_a.a_id INNER JOIN table_c AS tbl_c ON tbl_c.b_id = tbl_b.b_id LEFT JOIN table_d AS tbl_d ON tbl_d.a_id = tbl_a.a_id WHERE tbl_a.a_id IN (...picked up ids from the first query...);
Maybe you have a question "MySQL create temporary table which contains the full result set in the first query after all, isn't it? So why deferred join strategy works more effective than original query?".
The answer is characteristics of MySQL.
- The size of temporary table generated in first query. There is only single column in the temporary table. That means the temporary is small and it can be on memory.
- Connection/Disconnection cost of MySQL is really low. That means sometimes small separated query is more effective than a large single query.
コメント