MySQL Events scenario (and a funky subquery)

December 15th, 2006. Tagged: mysql

Coming back to those MySQL events with a sort of more practical example. In case you've missed the previous post, events in brief: turns out that in MySQL starting from 5.1.6. you can setup and schedule cronjobs in the DB server itself. These are called events. An intro article and MySQL manual entry.

What these can be used for? OK, so you have a blog, which gets dugg and slashdotted often and is in the top 10 Google results for "Paris Hilton", in other words you get tons of visitors. You want it to load fast and you want visitor stats as real-time as possible, so that you can display "This posts was viewed 12345678 times". It would be best if you can get all data about a post with a simple super fast SELECT ... WHERE id=123.

To keep track of the stats you have a table `post_stats` with two fields - post ID and timestamp of the visit. This way later you can run all sorts of reports by date, grouped my hour, week, etc. For performance reasons every visit to a post results in one quick SELECT from the `posts` table and one INSERT into the stats, that's it. Doing a JOIN and COUNT()-ing the huuuge `post_stats` table when selecting from `posts` is unthinkable for your busy blog. So you add a field to the `posts` table called `hits` that has the count, this way a simple select is enough. OK, but how do you update the `hits`. Doing an
UPDATE posts SET hits = hits + 1
is not a good idea, because you lock the table to do the update and doing it for every hit will put the other visitors on hold.

The solution would be to have a cron job that counts `post_stats` and updates the `hits` value in `posts`. And now you can do this within MySQL, no need to setup a Unix cronjob and to write a PHP script for this simple task. Let's see.

Connecting to our `blog` DB.

C:\\Development\\Apache\\MySQL\\bin>mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.14-beta-community-nt MySQL Community Server (GPL)

Type 'help;' or '\\h' for help. Type '\\c' to clear the buffer.

mysql> \\u blog
Database changed

Hmm, what do we have here?

mysql> SHOW TABLES;
+----------------+
| Tables_in_blog |
+----------------+
| post_stats     |
| posts          |
+----------------+
2 rows in set (0.00 sec)

Posts? What's in there?

mysql> SELECT * FROM posts;
+----+---------+--------+------+
| id | title   | body   | hits |
+----+---------+--------+------+
|  1 | title 1 | body 1 |    0 |
|  2 | title 2 | body 2 |    0 |
|  3 | title 3 | body 3 |    0 |
|  4 | title 4 | body 4 |    0 |
|  5 | title 5 | body 5 |    0 |
|  6 | title 6 | body 6 |    0 |
+----+---------+--------+------+
6 rows in set (0.04 sec)

The field `hits` was just added and it has all zeros.

What's in `post_stats`?

mysql> SHOW FIELDS FROM post_stats;
+---------+----------+------+-----+---------+-------+
| Field   | Type     | Null | Key | Default | Extra |
+---------+----------+------+-----+---------+-------+
| post_id | int(11)  | NO   | MUL |         |       |
| ts      | datetime | NO   | MUL |         |       |
+---------+----------+------+-----+---------+-------+
2 rows in set (0.00 sec)

Good, let's pretend we have visitors that cause INSERTs into the stats table.

mysql> INSERT INTO post_stats VALUES
    ->  (1,NOW()), (1,NOW()), (1,NOW());
Query OK, 3 rows affected (0.03 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> INSERT INTO post_stats VALUES
    ->  (2,NOW()), (2,NOW()), (2,NOW())
    -> , (2,NOW()), (2,NOW()), (2,NOW());
Query OK, 6 rows affected (0.02 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> INSERT INTO post_stats VALUES
    ->  (3,NOW()), (3,NOW());
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

How many hits per post?

mysql> SELECT post_id, COUNT(*) FROM post_stats
    -> GROUP BY post_id;
+---------+----------+
| post_id | COUNT(*) |
+---------+----------+
|       1 |        3 |
|       2 |        6 |
|       3 |        2 |
+---------+----------+
3 rows in set (0.01 sec)

Now let's write up an UPDATE that updates the hits in `posts`. Hold your breath, this is fancy stuff - a subquery. An UPDATE that uses an aggregate SELECT that COUNTs. This way we update all posts in one shot.

mysql> UPDATE LOW_PRIORITY posts
    -> SET hits = (
    ->  SELECT COUNT(*)
    ->  FROM post_stats
    ->  WHERE posts.id = post_stats.post_id
    -> );
Query OK, 6 rows affected (0.02 sec)
Rows matched: 6  Changed: 6  Warnings: 0

LOW_PRIORITY because this an unobtrusive statement that is as polite as it is powerful and waits for everybody to finish with their SELECTs.

Does it work?

mysql> SELECT * FROM posts;
+----+---------+--------+------+
| id | title   | body   | hits |
+----+---------+--------+------+
|  1 | title 1 | body 1 |    3 |
|  2 | title 2 | body 2 |    6 |
|  3 | title 3 | body 3 |    2 |
|  4 | title 4 | body 4 |    0 |
|  5 | title 5 | body 5 |    0 |
|  6 | title 6 | body 6 |    0 |
+----+---------+--------+------+
6 rows in set (0.00 sec)

Kool! Let's reset the hits.

mysql> UPDATE posts SET hits = 0;
Query OK, 6 rows affected (0.03 sec)
Rows matched: 6  Changed: 6  Warnings: 0

Now let's create the event already. It's scheduled to start right away, to run once an hour (is that close enough to real-time? No? We can alter it after that) and to DO that funky update query every time it fires.

mysql> CREATE EVENT stats_update
    ->     ON SCHEDULE EVERY 1 HOUR
    ->     STARTS CURRENT_TIMESTAMP
    ->     DO
    ->       UPDATE LOW_PRIORITY posts
    ->       SET hits = (
    ->         SELECT COUNT(*)
    ->         FROM post_stats
    ->         WHERE posts.id = post_stats.post_id
    ->       );
Query OK, 0 rows affected (0.00 sec)

Let's see what we did.

mysql> SHOW EVENTS\\G;
*************************** 1. row *********
            Db: blog
          Name: stats_update
       Definer: root@localhost
          Type: RECURRING
    Execute at: NULL
Interval value: 1
Interval field: HOUR
        Starts: 2006-12-14 20:29:17
          Ends: NULL
        Status: ENABLED
1 row in set (0.00 sec)

Now let's ALTER the event a little, to ease the testing.

mysql> ALTER EVENT stats_update
    ->   ON SCHEDULE every 1 MINUTE;
Query OK, 0 rows affected (0.00 sec)

If event_scheduler is not already ON by configuration, let's turn it ON.

mysql> SET GLOBAL event_scheduler = "ON";
Query OK, 0 rows affected (0.00 sec)

Initial state:

mysql> SELECT * FROM posts;
+----+---------+--------+------+
| id | title   | body   | hits |
+----+---------+--------+------+
|  1 | title 1 | body 1 |    0 |
|  2 | title 2 | body 2 |    0 |
|  3 | title 3 | body 3 |    0 |
|  4 | title 4 | body 4 |    0 |
|  5 | title 5 | body 5 |    0 |
|  6 | title 6 | body 6 |    0 |
+----+---------+--------+------+
6 rows in set (0.00 sec)

After a minute:

mysql> SELECT * FROM posts;
+----+---------+--------+------+
| id | title   | body   | hits |
+----+---------+--------+------+
|  1 | title 1 | body 1 |    3 |
|  2 | title 2 | body 2 |    6 |
|  3 | title 3 | body 3 |    2 |
|  4 | title 4 | body 4 |    0 |
|  5 | title 5 | body 5 |    0 |
|  6 | title 6 | body 6 |    0 |
+----+---------+--------+------+
6 rows in set (0.00 sec)

Let's insert some more stats data.

mysql> INSERT INTO post_stats VALUES
    ->  (4,NOW()), (4,NOW()), (4,NOW());
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

... and check after a while.

mysql> SELECT * FROM posts;
+----+---------+--------+------+
| id | title   | body   | hits |
+----+---------+--------+------+
|  1 | title 1 | body 1 |    3 |
|  2 | title 2 | body 2 |    6 |
|  3 | title 3 | body 3 |    2 |
|  4 | title 4 | body 4 |    3 |
|  5 | title 5 | body 5 |    0 |
|  6 | title 6 | body 6 |    0 |
+----+---------+--------+------+
6 rows in set (0.00 sec)

mysql> wicked! awesome!
    -> ;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'wicke
d! awesome!' at line 1
mysql> w00t! roxor
    -> ;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'w00t!
 roxor' at line 1
mysql>

In case you've missed the previous "I [heart] MySQL" posts:

Tell your friends about this post on Facebook and Twitter

Sorry, comments disabled and hidden due to excessive spam.

Meanwhile, hit me up on twitter @stoyanstefanov