Sunday, August 12, 2007

Boost your SQL with Proxy loops

By now, everybody knows that I am quite fond of MySQL Proxy and with reason. Using a Proxy you can add features without need of delving into the intricacies of MySQL server code.
One of the features that you can create with the Proxy, once you are familiar with its basic architecture (if you still aren't, you may check Getting started with MySQL Proxy) is a loop.
So many times I catch myself doing things like
$ for N in `seq 1 10` ; do mysql -e "CREATE TABLE t$N (id int)"; done
or something like that. Why, oh why there is no FOR command in MySQL syntax?

Some time ago, I created a working solution using stored routines, but it is less than perfect. It requires installation, and the routines are limited to one database, thus requiring an unnecessary amount of detail to make it work.

Proxy loops with a counter.

Enter the Proxy, and the solution is surprisingly slim.
What you need to do is start the proxy with the loop module from MySQL Forge. Then, from your client you can do loops:
mysql> set @x =0;
Query OK, 0 rows affected (0.02 sec)

mysql> for 1 10 select @x := @x + 1 as x;
+------+
| x |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+------+
10 rows in set (0.42 sec)
This is the simplest syntax. It requires a command (FOR) a start counter (1) a stop counter (10) and a query (select @x := @x + 1 as x).
The Proxy does the rest, sending the query as many times you required, and collecting the results in practical tabular form. You can actually refer to a counter in the query itself.
mysql> for 1 5 create table t$N (id int);
+--------+--------+---------------+
| inj_id | result | affected rows |
+--------+--------+---------------+
| 1 | 0 | 0 |
| 2 | 0 | 0 |
| 3 | 0 | 0 |
| 4 | 0 | 0 |
| 5 | 0 | 0 |
+--------+--------+---------------+
5 rows in set (0.02 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
| t3 |
| t4 |
| t5 |
+----------------+
5 rows in set (0.00 sec)
The $N variables refers to the current counter. The Proxy script will do the change before sending the query to the server. The result is shown in a dataset. A '0' (zero) means no error. When an actual error occurs, the loop is stopped, and the results up to that point are reported.
mysql> drop table t1;
Query OK, 0 rows affected (0.00 sec)

mysql> for 1 5 create table t$N (id int);
+--------+--------+---------------+
| inj_id | result | affected rows |
+--------+--------+---------------+
| 1 | 0 | 0 |
| 2 | -1 | 0 |
+--------+--------+---------------+
2 rows in set (0.01 sec)

mysql> show errors;
+-------+------+---------------------------+
| Level | Code | Message |
+-------+------+---------------------------+
| Error | 1050 | Table 't2' already exists |
+-------+------+---------------------------+
1 row in set (0.00 sec)
I removed one table (so the first command in the loop succeds) and when the second query is executed, an error is reported. The loop is immediately interrupted and the result is relayed to the client.

Proxy loops with a list.

Counters are useful, but they are not always what you need. Sometimes, you need to loop though a list. The above mentioned module supports a second syntax, using a list of comma separated items instead of a counter.
FOR user_var (items, in, a, list) your query
Easy to use. Really. Want to create three tables with the same structure but different names? Presto!
mysql> FOR mytable (the_first, the_second, the_third, something_else) CREATE TABLE $mytable (id int);
+--------+--------+---------------+
| inj_id | result | affected rows |
+--------+--------+---------------+
| 1 | 0 | 0 |
| 2 | 0 | 0 |
| 3 | 0 | 0 |
| 4 | 0 | 0 |
+--------+--------+---------------+
4 rows in set (0.05 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| something_else |
| the_first |
| the_second |
| the_third |
+----------------+
4 rows in set (0.00 sec)
Even if you don't declare a counter, nonetheless you can use one in your query, if needed. Suppose you want to create the tables in such a way that the first one has a CHAR(1) column, the second one a CHAR(2), and so on.
It's a lame example, but you can do it, using the now familiar $N variable.
mysql> FOR mytable (the_first, the_second, the_third, something_else) CREATE TABLE $mytable (name char($N ) );
+--------+--------+---------------+
| inj_id | result | affected rows |
+--------+--------+---------------+
| 1 | 0 | 0 |
| 2 | 0 | 0 |
| 3 | 0 | 0 |
| 4 | 0 | 0 |
+--------+--------+---------------+
4 rows in set (0.05 sec)

mysql> desc the_first;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| name | char(1) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
1 row in set (0.01 sec)

mysql> desc the_second;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| name | char(2) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
1 row in set (0.01 sec)
As usual, enjoy, and give it a try!

3 comments:

Scott said...

Some folks on Freenode have asked about building a table containing every date within a range, in order to left join some other table. So if I have a table of orders, and on some days there were no orders, my result set can fill in the missing days with 0. We can of course just build a table containing all dates from now until the distant future, but that feels kinda kludgy.

So, can the results of the loop show up as a derived table? SELECT * FROM (FOR 1 5 SELECT $N) AS range...

Giuseppe Maxia said...

Scott,
The macro in this example is a simple syntax enhancement recognized in the proxy.
The server knows nothing about it.
What you ask is feasible, although it would require a level of parsing much deeper and more complex than the one currently being used.
Tables with all the dates from a distant past to a foreseeable future are not a silly thing. For example, Anthony Molinaro's SQL Cookbook uses them cleverly.
I use them as well, and I wrote a post about creating them quickly.

John Hirbour said...

@Scott I use the sproc by Ron Savage here:

http://stackoverflow.com/questions/510012/get-a-list-of-dates-between-two-dates

to do something like what you mentioned.