技術日誌

DB,Java,セキュリティ,機械学習など。興味のあることを雑多に学ぶ

SQLのCOUNT,MAX関数はソートを発生させるのか MySQL編

昔々に書いたQiitaの投稿に来た指摘に、今はわからないと回答しました。 https://qiita.com/gooddoog/items/21b221c57ede30f8669a

この記事を書くにあたって参考にした書籍の著者の一人にご質問する機会があったのですが、 結論としてDBごとに違いが大きそうなので検証して見ました。

準備

検証に仕様したデータ

id age country
0001 18 JP
0002 23 US
0003 56 SK
0004 99 SK
0005 11 US
0006 34 JP
create database test;

create table test.people(
  id char(4) not null primary key,
  age int(3) not null,
  country char(2) not null
);

-- test.people
INSERT INTO test.people(id, age, country) VALUES('0001', '18', 'JP');
INSERT INTO test.people(id, age, country) VALUES('0002', '23', 'US');
INSERT INTO test.people(id, age, country) VALUES('0003', '56', 'SK');
INSERT INTO test.people(id, age, country) VALUES('0004', '99', 'SK');
INSERT INTO test.people(id, age, country) VALUES('0005', '11', 'US');
INSERT INTO test.people(id, age, country) VALUES('0006', '34', 'JP');

検証

mysqlでクエリにソートが発生したかどうかを見るには、extraにUsing filesortが表示されているかを見れば良いらしい。

参考:http://nippondanji.blogspot.jp/2009/03/using-filesort.html

mysql> explain select * from test.people order by age; --明示的にソート
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | people | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | Using filesort |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

COUNT(sort発生)

mysql> select country, count(*) from test.people group by country;
+---------+----------+
| country | count(*) |
+---------+----------+
| JP      |        2 |
| SK      |        2 |
| US      |        2 |
+---------+----------+
3 rows in set (0.00 sec)

mysql> explain select country, count(*) from test.people group by country;
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | SIMPLE      | people | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | Using temporary; Using filesort |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

MAX(sort発生)

mysql> select country, max(age) from test.people group by country;
+---------+----------+
| country | max(age) |
+---------+----------+
| JP      |       34 |
| SK      |       99 |
| US      |       23 |
+---------+----------+
3 rows in set (0.00 sec)

mysql> explain select country, max(age) from test.people group by country;
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | SIMPLE      | people | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | Using temporary; Using filesort |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

集約(GROUP BY)なしのMAX関数(sort発生せず)

GROUP BYなしでも試して見ました

mysql> select max(age) from test.people;
+----------+
| max(age) |
+----------+
|       99 |
+----------+
1 row in set (0.00 sec)

mysql> explain select max(age) from test.people;
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | people | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | NULL  |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

感想

結局集約なしのmaxでは最大値をどのように選定しているかということは実行計画からは読み取れないと思う。一応MySQLオープンソースだし、実装を根気よく読めばわかるかもしれないが。。。

GitHub - mysql/mysql-server: MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.

結論

とりあえず

  • group by句と一緒に使う→ソート発生
  • group by句なしに使う→ソート発生せず

となりました。

おうちで学べるデータベースのきほん

おうちで学べるデータベースのきほん