Archive for the 'mdb2' Category

Performance tuning with PEAR::DB

Tuesday, January 16th, 2007

If you use PEAR::MDB2, you can set a custom debug handler and collect all the queries you execute for debugging and performance tuning purposes, as shown before. But what if you're using PEAR::DB? Well, since PEAR::DB doesn't offer such functionality out of the box, you can hack it a bit to get similar results.

Simple app

Let's say you have a simple app:

<?php 
require_once 'DB.php';
 
$dsn = 'mysql://root@localhost/test';
$db =& DB::connect($dsn);
$db->setFetchMode(DB_FETCHMODE_ASSOC);
 
$sql = 'SELECT * FROM zipcodes';
$result = $db->getAll($sql);
$result = $db->getOne($sql);
$result = $db->getCol($sql);
$result = $db->getAll($sql);
$sql = 'SELECT zipcode FROM zipcodes';
$result = $db->getAll($sql);
$result = $db->getAll($sql);
$sql = 'SELECT CONCAT(zipcode, " - ", city) FROM zipcodes';
$result = $db->getAll($sql);
?>

Of course, this is an oversimplified example. Usually you have more included files, class libraries and such, and it's not difficult to lose track of the database work as your app grows in complexity and size.

Now let's debug this app to see what type of database work it does.

Hacking PEAR::DB

In my case, I'm using MySQL, so I need to find the DB/mysql.php file in my PEAR directory. I open that file and find the simpleQuery() method. That's where all queries end up, sooner or later. I find this piece of code:

<?php
if (!$this->options['result_buffering']) {
    $result = @mysql_unbuffered_query($query, $this->connection);
} else {
    $result = @mysql_query($query, $this->connection);
}
?>

Then I hack this piece of code, adding some lines before and after it. The result:

<?php
// start
$start_time = array_sum(explode(' ',microtime()));
// end
 
if (!$this->options['result_buffering']) {
    $result = @mysql_unbuffered_query($query, $this->connection);
} else {
    $result = @mysql_query($query, $this->connection);
}
 
// start
$query_took = array_sum(explode(' ',microtime())) - $start_time;
@$GLOBALS['global_query_counter']++;
@$GLOBALS['all_the_queries'][$GLOBALS['global_query_counter'] . ' - ' . $query] = $query_took;
//end
?>

Now as my app's pages are executed, I'll collect invaluable DB information.
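If you'd rather try the timing-and-collecting idea before touching any PEAR file, the same pattern can be sketched on its own. The names log_timed_query() and fake_query() are made up for this illustration (fake_query() stands in for the real mysql_query() call), and microtime(true) needs PHP 5 or later.

```php
<?php
// Hypothetical helper, not part of PEAR::DB: runs $run_query, times it and
// stores the timing in the same two globals used by the hack above.
function log_timed_query($query, $run_query)
{
    $start_time = microtime(true); // float seconds, PHP 5+
    $result = call_user_func($run_query, $query);
    $query_took = microtime(true) - $start_time;

    if (!isset($GLOBALS['global_query_counter'])) {
        $GLOBALS['global_query_counter'] = 0;
        $GLOBALS['all_the_queries'] = array();
    }
    $GLOBALS['global_query_counter']++;
    $key = $GLOBALS['global_query_counter'] . ' - ' . $query;
    $GLOBALS['all_the_queries'][$key] = $query_took;

    return $result;
}

// Stand-in for the real database call, so the sketch runs without MySQL.
function fake_query($query)
{
    usleep(1000); // pretend the database worked for a millisecond
    return 'result of ' . $query;
}

log_timed_query('SELECT * FROM zipcodes', 'fake_query');
log_timed_query('SELECT * FROM zipcodes', 'fake_query');
print_r($GLOBALS['all_the_queries']);
```

The counter prefix in the array key is what keeps two identical queries from overwriting each other's timing.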

Reporting

Let's see what we've collected.

You can add different types of reports in the footer of your application, or better yet, you can register a shutdown function to do the same. Here are some reporting ideas:

<?php
// report 1.
echo "<pre>All the queries, by the order they are executed:\n";
print_r($GLOBALS['all_the_queries']);
echo '</pre>';
 
// report 2.
echo "<pre>All the queries, ordered by the time they took, descending:\n";
arsort($GLOBALS['all_the_queries']);
print_r($GLOBALS['all_the_queries']);
echo '</pre>';
 
// report 3.
$sum = 0;
foreach ($GLOBALS['all_the_queries'] AS $t) {
    $sum += $t;
}
echo '<pre>';
echo 'Total number of queries:   ' . $GLOBALS['global_query_counter'] . "\n";
echo 'Total time spend querying: ' . $sum;
echo '</pre>';
 
 
// report 4.
$distinct = array();
foreach ($GLOBALS['all_the_queries'] AS $q=>$t) {
    $parts = explode(' - ', $q);
    unset($parts[0]);
    $query = implode(' - ', $parts);
    @$distinct[$query]++;
}
echo "<pre>How many duplications:\n";
arsort($distinct);
print_r($distinct);
echo '</pre>';
?>
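The shutdown-function variant could look roughly like this. It's only a sketch: build_query_report() and print_query_report() are names invented for the example, and the two sample entries stand in for what the hacked simpleQuery() would have collected.

```php
<?php
// Hypothetical helper: turns the collected queries into a plain-text report.
function build_query_report($all_the_queries)
{
    $lines = array();
    $lines[] = 'Total number of queries:   ' . count($all_the_queries);
    $lines[] = 'Total time spent querying: ' . array_sum($all_the_queries);
    arsort($all_the_queries); // slowest first; sorts the local copy only
    foreach ($all_the_queries as $query => $took) {
        $lines[] = $took . '  ' . $query;
    }
    return implode("\n", $lines);
}

// Runs after the script has finished, so no footer include is needed.
function print_query_report()
{
    $queries = isset($GLOBALS['all_the_queries'])
        ? $GLOBALS['all_the_queries']
        : array();
    echo '<pre>' . htmlspecialchars(build_query_report($queries)) . '</pre>';
}
register_shutdown_function('print_query_report');

// Sample data standing in for what the hacked simpleQuery() collects.
$GLOBALS['all_the_queries'] = array(
    '1 - SELECT * FROM zipcodes'       => 0.25,
    '2 - SELECT zipcode FROM zipcodes' => 0.5,
);
```

Keeping the formatting in a function that returns a string makes it easy to send the same report to a log file instead of the page.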

Report results

Let's see what these reports give us.

All the queries, by the order they are executed:
Array
(
    [1 - SELECT * FROM zipcodes] => 0.00626707077026
    [2 - SELECT * FROM zipcodes] => 0.00730204582214
    [3 - SELECT * FROM zipcodes] => 0.00796985626221
    [4 - SELECT * FROM zipcodes] => 0.00654602050781
    [5 - SELECT zipcode FROM zipcodes] => 0.0058650970459
    [6 - SELECT zipcode FROM zipcodes] => 0.0239379405975
    [7 - SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 0.00581502914429
)

All the queries, ordered by the time they took, descending:
Array
(
    [6 - SELECT zipcode FROM zipcodes] => 0.0239379405975
    [3 - SELECT * FROM zipcodes] => 0.00796985626221
    [2 - SELECT * FROM zipcodes] => 0.00730204582214
    [4 - SELECT * FROM zipcodes] => 0.00654602050781
    [1 - SELECT * FROM zipcodes] => 0.00626707077026
    [5 - SELECT zipcode FROM zipcodes] => 0.0058650970459
    [7 - SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 0.00581502914429
)

Total number of queries:   7
Total time spend querying: 0.0637030601501

How many duplications:
Array
(
    [SELECT * FROM zipcodes] => 4
    [SELECT zipcode FROM zipcodes] => 2
    [SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 1
)

Thanks for reading!

Any comments or suggestions are very welcome!

 

DB-2-MDB2 in Portuguese

Tuesday, January 16th, 2007

Through a trackback I found out that Walter Cruz has translated my DB-2-MDB2 article into a language I was led to believe is Brazilian Portuguese.

Thanks very much Walter, this is very flattering!

Thanks to my buddy Isidoro who enlightened me that the language was Portuguese!

 

Reusing an existing database connection with MDB2

Thursday, January 4th, 2007

This is a follow-up to a question posted by Sam in my DB-2-MDB2 post. The question was whether you can reuse an existing database connection you've already established, so that MDB2 doesn't create a second one.

When using a non-persistent connection

No worries in this case. No new connection will be established. As the PHP manual states:

If a second call is made to mysql_connect() with the same arguments, no new link will be established, but instead, the link identifier of the already opened link will be returned.

That is, if you don't set the fourth parameter to mysql_connect() to true. This parameter forces a new connection. BTW, in MDB2 if you do want to force a new connection, you have to set new_link in the DSN string to true.

Bottom line, if you don't do anything special, the existing connection will be reused by MDB2. You can always verify that this is the case by calling phpinfo(INFO_MODULES); and looking in the "mysql" section.

When using a persistent connection

When using a persistent connection you have to take some additional steps to ensure that the same persistent connection is used by MDB2.

  • Tell MDB2 that you want a persistent connection - $mdb2->setOption('persistent', true);
  • Tell MDB2 which connection you want to use - $mdb2->connection = $link;, where $link is your existing connection
  • Set $mdb2->opened_persistent = true;

Here's an example:

<?php
// somewhere you've established a connection
$link = mysql_pconnect('localhost', 'root', '');
mysql_select_db('test', $link);
echo $link; // e.g. Resource id #5
 
// Create MDB2 object
require_once 'MDB2.php';
$dsn = 'mysql://root@localhost/test';
$mdb2 =& MDB2::factory($dsn);
 
// reuse your connection
$mdb2->setOption('persistent', true);
$mdb2->opened_persistent = true;
$mdb2->connection = $link;
 
// connect
$mdb2->connect();
echo $mdb2->connection; // Resource id #5
 
// check the "mysql" part to be sure
phpinfo(INFO_MODULES);
?>
 

Performance tuning with MDB2

Saturday, December 9th, 2006

This is a follow-up to Lars' comment about the PEAR book. In the MDB2 chapter I showed an example of how you can create custom debug handlers in MDB2 and then suggested a useful application of this functionality for performance tuning. Basically the idea is that your custom debug handler collects all queries that are executed during the life of a given script. Then, once the script finishes execution, the debug handler reports the stats that it has collected. In the book, the example shows how to count the number of times each distinct query is executed; this way you can spot problems caused by the OO abstraction. For example, say you have some class Users that has a method loadUser(), which abstracts the database work. While debugging with the custom debug handler, you might figure out that, without noticing, you're calling this method in a few places and it makes the same query (or queries) over and over again. So you can now optimize, cache results and so on.

The suggestion I made in the book is that, in addition to counting, you might want to try executing all SELECTs again, just to see how much time they take, and then execute them once more, prepended with EXPLAIN, to get some details on possible room for improvement.

Now here's one solution to this suggestion. What you can see in this script is:

  • Setting up MDB2
  • Declaring a custom debug handler class
  • "Attaching" it to the MDB2 instance
  • Registering it for execution at the end of each script
  • Testing it (creating a DB, table, some queries)

I hope you like it and try it out.

Here's the result of executing this script, you can see what you get back.

Room for improvement

Obviously, the method dumpInfo() can be improved. First, it can print out a nice table instead of a lazy print_r(). Then, it can include some logic: my idea is for it to "understand" the EXPLAIN results and give you a hint by using colors, for example a green background for queries that are OK, yellow for warnings and red for queries that definitely need some work. Could be nice, no?
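To make the color idea a bit more concrete, here's a rough sketch of such logic. The function name explain_row_color() and the three rules are my assumptions for the illustration, not something taken from MDB2 or MySQL, so treat it as a starting point rather than a complete EXPLAIN interpreter.

```php
<?php
// Hypothetical helper: maps one row of EXPLAIN output to a traffic-light color.
function explain_row_color($row)
{
    // normalize the keys so both "Extra" and "extra" work
    $row   = array_change_key_case($row, CASE_LOWER);
    $type  = isset($row['type'])  ? $row['type']  : '';
    $key   = isset($row['key'])   ? $row['key']   : null;
    $extra = isset($row['extra']) ? $row['extra'] : '';

    // full table scan without any index: needs work
    if ($type === 'ALL' && empty($key)) {
        return 'red';
    }
    // temporary tables and filesorts are worth a second look
    if (stripos($extra, 'Using temporary') !== false
        || stripos($extra, 'Using filesort') !== false) {
        return 'yellow';
    }
    return 'green';
}

// Made-up sample rows, shaped like what EXPLAIN returns in associative mode.
$rows = array(
    array('type' => 'ALL',   'key' => null,      'Extra' => ''),
    array('type' => 'ref',   'key' => 'PRIMARY', 'Extra' => 'Using filesort'),
    array('type' => 'const', 'key' => 'PRIMARY', 'Extra' => ''),
);
foreach ($rows as $row) {
    echo explain_row_color($row), "\n";
}
```

In dumpInfo() the returned color could then be used as the background of the table row for that query.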

Test script

Kinda longish, but I hope I added enough comments. I also hope I didn't introduce any syntax errors while formatting it for posting here, chopping long lines, etc.

<?php
 
// PEAR error handling setup
require_once 'PEAR.php';
function pearError ($e)
{
  echo '<pre>';
  echo $e->getMessage().': '.$e->getUserinfo();
  echo '</pre>';
}
PEAR::setErrorHandling(
  PEAR_ERROR_CALLBACK,
  'pearError'
);
 
// creating MDB2 instance
require_once 'MDB2.php';
$dsn = 'mysql://root:test@localhost';
$mdb2 =& MDB2::factory($dsn);
$mdb2->setFetchMode(MDB2_FETCHMODE_ASSOC);
 
// The custom debug handler
//
// It will collect all the queries being executed
// in the script, the collection is done by the
// collectInfo() method.
// Once the script finishes executing, we'll call
// the method executeAndExplain() which will
// execute all unique SELECTs once again
// in order to give us an info of how much time
// each query takes.
// Then executeAndExplain() will execute again
// all SELECTs, this time prepending an EXPLAIN
// so that we can get valuable
// optimization-related information
// Not only that but instead of simple EXPLAIN,
// we can use EXPLAIN EXTENDED and after that
// we can call SHOW WARNINGS -
// this will give us even more optimization hints
//
// http://dev.mysql.com/doc/refman/5.1/en/explain.html
// http://dev.mysql.com/doc/refman/5.1/en/show-warnings.html
//
class Explain_Queries
{
  // how many queries were executed
  var $query_count = 0;
  // which queries and their count
  var $queries = array();
  // results of EXPLAIN-ed SELECTs
  var $explains = array();
  // the MDB2 instance
  var $db = false;
 
  // constructor that accepts MDB2 reference
  function Explain_Queries(&$db) {
    $this->db = $db;
  }
 
  // this method is called on every query
  function collectInfo(
    &$db,
    $scope,
    $message,
    $is_manip = null)
  {
    // increment the total number of queries
    $this->query_count++;
    // the SQL is a key in the queries array
    // the value will be the count of how
    // many times each query was executed
    @$this->queries[$message]++;
  }
 
  // print the debug information
  function dumpInfo()
  {
    echo '<h3>Queries on this page</h3>';
    echo '<pre>';
    print_r($this->queries);
    echo '</pre>';
    echo '<h3>EXPLAIN-ed SELECTs</h3>';
    echo '<pre>';
    print_r($this->explains);
    echo '</pre>';
  }
 
  // the method that will execute all SELECTs
  // with and without an EXPLAIN and will
  // create $this->explains array of debug
  // information
  // SHOW WARNINGS will be called after each
  // EXPLAIN for more information
  function executeAndExplain() {
 
    // at this point, stop debugging
    $this->db->setOption('debug', 0);
    $this->db->loadModule('Extended');
 
    // take the SQL for all the unique queries
    $queries = array_keys($this->queries);
    foreach ($queries AS $sql) {
 
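      // for all SELECTs…
      // (only statements that start with SELECT, so an
      // INSERT … SELECT is not executed a second time)
      $sql = trim($sql);
      if (strncasecmp($sql, 'SELECT', 6) === 0) {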
        // note the start time
        $start_time = array_sum(
            explode(" ", microtime())
        );
        // execute query
        $this->db->query($sql);
        // note the end time
        $end_time = array_sum(
            explode(" ", microtime())
        );
        // the time the query took
        $total_time = $end_time - $start_time;
 
        // now execute the same query with
        // EXPLAIN EXTENDED prepended
        $explain = $this->db->getAll(
          'EXPLAIN EXTENDED ' . $sql
        );
 
        $this->explains[$sql] = array();
        // update the debug array with the
        // new data from
        // EXPLAIN and SHOW WARNINGS
        if (!PEAR::isError($explain)) {
          $this->explains[$sql]['explain'] = $explain;
          $this->explains[$sql]['warnings'] =
               $this->db->getAll('SHOW WARNINGS');
        }
 
        // update the debug array with the
        // count and time
        $this->explains[$sql]['time'] = $total_time;
      }
    }
  }
}
 
// instance of the custom debug handler
$my_debug_handler = new Explain_Queries($mdb2);
// set debug option
$mdb2->setOption('debug', 1);
// set debug handler to the method that
// collects all queries
$mdb2->setOption(
  'debug_handler',
  array($my_debug_handler, 'collectInfo')
);
// register functions to be executed on shut down
// after the script has finished execution.
// Now that the show's over, it's the time to
// report what happened in this script db-access-wise
// First shutdown function executes the
// SELECTs again, the other one prints the results
register_shutdown_function(
  array($my_debug_handler, 'executeAndExplain')
);
register_shutdown_function(
  array($my_debug_handler, 'dumpInfo')
);
 
 
//
//
// At this point all MDB2 setup is done,
// time for the actual script to do something
//
//
 
// load the DB manager module
$mdb2->loadModule('Manager');
 
// drop database if it exists
// temporarily change the PEAR error handling
PEAR::pushErrorHandling(PEAR_ERROR_RETURN);
$mdb2->dropDatabase('test_db_explain');
PEAR::popErrorHandling();
 
// create and set a new database
$mdb2->createDatabase('test_db_explain');
$mdb2->setDatabase('test_db_explain');
 
// create table "events" from a definition array
// the table has event ID, name and date/time
$definition = array (
  'id' => array (
    'type' => 'integer',
    'unsigned' => 1,
    'notnull' => 1,
    'default' => 0,
  ),
  'name' => array (
    'type' => 'text',
    'length' => 255
  ),
  'datetime' => array (
    'type' => 'timestamp'
  )
);
 
$mdb2->createTable('events', $definition);
 
// create a primary key - the ID field
$definition = array (
  'primary' => true,
  'fields' => array (
    'id' => array()
  )
);
$mdb2->createConstraint(
  'events',
  'myprimekey',
  $definition
);
 
// load the class that has some static helper
// functions to work with MDB2's cross-RDBMS
// date format
MDB2::loadFile('Date');
 
// INSERT
// some data to insert into the events table
$data = array(
  // using MDB2-managed sequences
  'id'     => $mdb2->nextId('events'),
  'name'     => "Breakfast at Tiffany's",
  'datetime'   => MDB2_Date::unix2Mdbstamp(
    strtotime('Jan 15, 2007')
  )
);
// The "datetime" value shows how you can use
// any date format you wish as long as you're
// able to get a unix timestamp out of it
// In this case I'm using strtotime()
// Then there is a call to MDB2's date helper
// to get the MDB2 timestamp
 
// auto insert
// for the autoExecute() method we need to
// load the Extended module
$mdb2->loadModule('Extended');
$result = $mdb2->autoExecute(
  'events',
  $data,
  MDB2_AUTOQUERY_INSERT
);
 
//
// Time to SELECT something
//
// Using date helpers again
$start_date = MDB2_Date::date2Mdbstamp(0,0,0,12,31,1980);
$end_date   = MDB2_Date::date2Mdbstamp(0,0,0,12,31,2020);
$sql = 'SELECT * FROM %s WHERE %s > %s AND %s < %s';
$sql = sprintf(
  $sql,
  $mdb2->quoteIdentifier('events'),   // quote table name
  $mdb2->quoteIdentifier('datetime'), // quote field name
  $mdb2->quote($start_date, 'date'),  // quote data as date
  $mdb2->quoteIdentifier('datetime'), // quote field name
  $mdb2->quote($end_date,   'date')   // quote data as date
);
$res = $mdb2->getAll($sql); // execute
 
//
// * Bad practice code follows *
// Just some more inserts and selects, bad
// practice because these queries are not
// necessarily portable across various RDBMS,
// lacking proper quoting and preparation
for ($i = 2; $i < 31; $i++) {
  $mdb2->query(
    'INSERT INTO events VALUES('
    . $i
    . ', "test event", "2005-05-05 00:00:00")'
  );
}
$res = $mdb2->getCol(
  'SELECT DISTINCT datetime FROM events'
);
$res = $mdb2->getRow(
  'SELECT * FROM events WHERE id
  IN (SELECT id FROM events WHERE id > 1)'
);
?>
 

DB-2-MDB2

Saturday, February 4th, 2006

Intro

Recently I had to move an existing project from using PEAR::DB to PEAR::MDB2, the new database abstraction layer. I took notes on the parts of the code I needed to change; I hope they can benefit someone who's doing the same. Many thanks go to Lukas Smith, the lead developer, who always responded very quickly to my reports and questions on the PEAR mailing list.

One thing to notice in MDB2 is that it tries not to do any unnecessary work and does many things only on demand. For example when you create an object, that doesn't mean that a connection is established. It is established only when you make the first real database access, a SELECT for example.
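The "connect only on demand" behaviour can be pictured with a tiny stand-alone sketch. Lazy_Db is a hypothetical class written just for this illustration; it is not MDB2's code and it never talks to a real database.

```php
<?php
// Hypothetical class that only mimics the "connect on demand" idea.
class Lazy_Db
{
    public $dsn;
    public $connection = null; // nothing is opened yet

    public function __construct($dsn)
    {
        $this->dsn = $dsn; // just remember the connection details
    }

    public function connect()
    {
        if ($this->connection === null) {
            // a real driver would open the link here
            $this->connection = 'link to ' . $this->dsn;
        }
        return $this->connection;
    }

    public function query($sql)
    {
        $this->connect(); // the first real access triggers the connection
        return 'ran: ' . $sql;
    }
}

$db = new Lazy_Db('mysql://root@localhost/db2mdb2');
var_dump($db->connection === null); // still no connection after creating the object
$db->query('SELECT 1');
var_dump($db->connection === null); // the query has opened it
```

The upside of this design is that a page that ends up not needing the database never pays for a connection.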

I assume you have an idea of PEAR::DB, since this posting illustrates a DB-to-MDB2 endeavour, but even if you don't, I hope the posting will still be useful as an intro to DB and MDB2.

Including the libs

So first off, including the libs (I assume you have PEAR on your machine).

require_once 'DB.php';
require_once 'MDB2.php';

One thing to note here is that installing MDB2 doesn't install any of the database wrappers. So if you use MySQL for example, you'd need to install it separately:
pear install MDB2_Driver_mysql-beta
in addition to
pear install MDB2-beta

MDB2 is now a stable release! So you can now remove the "-beta" moniker when installing the packages.

DSN

Next - the DSN string. It's the same for MDB2 as for DB.

$dsn = 'mysql://root@localhost/db2mdb2';

BTW, MDB2 can also accept an array of all the connection details, as opposed to a DSN string. And so does DB (Thanks for the clarification, Justin!)

Creating instances

$db =& DB::connect($dsn);
$mdb2 =& MDB2::factory($dsn);

MDB2 provides a factory method to create an instance. At this time no database connection is yet established. MDB2 also provides a singleton() method to create an instance.

Fetchmode

It is the same in both DB and MDB2, just note the prefix of the constant.

// set fetchmode
$db->setFetchMode(DB_FETCHMODE_ASSOC);
$mdb2->setFetchMode(MDB2_FETCHMODE_ASSOC);

Simple SELECTs

There are methods to select one row, one column, one cell and a bunch of records. DB prefixes them with get, while MDB2 uses query.

// select several records and shove them into an array
$all = $db->getAll('SELECT * FROM people');
$all = $mdb2->queryAll('SELECT * FROM people');


// select one cell
$one = $db->getOne('SELECT name FROM people WHERE id = 1');
$one = $mdb2->queryOne('SELECT name FROM people WHERE id = 1');


// one row
$row = $db->getRow('SELECT * FROM people WHERE id = 1');
$row = $mdb2->queryRow('SELECT * FROM people WHERE id = 1');


// a column
$col = $db->getCol('SELECT name FROM people');
$col = $mdb2->queryCol('SELECT name FROM people');

Quoting values

In DB, the suggested method to quote is quoteSmart(). In MDB2 it's quote() and it accepts a second parameter, which tells the type of the value to be quoted. If the second parameter is omitted, MDB2 will try to guess the type.

$one = $db->getOne(
         'SELECT name FROM people WHERE id = ' 
         . $db->quoteSmart(1)
       );


$one = $mdb2->queryOne(
         'SELECT name FROM people WHERE id = ' 
         . $mdb2->quote(1, 'integer')
       );

Sequence tables

If you use sequence tables, both libs will provide you with a nextId() method:

echo $db->nextId('people_db');
echo $mdb2->nextId('people_mdb2');

The only difference is that when DB creates a sequence table (with one field and one value), the name of the field is id, whereas MDB2 will use sequence. If you're translating an existing project to MDB2 like me, and the sequence tables are already created by DB, you have the option of renaming this field in the database for all sequence tables, or you can set an MDB2 option and you're good to go.

$mdb2->setOption('seqcol_name','id');

Auto execute

Say you have the data:

$data = array('id' => 5, 'name' => 'Cameron');

To auto-insert it using DB, you'd do:

$db->autoExecute('people', $data, DB_AUTOQUERY_INSERT);

For MDB2, the auto execution is probably not considered an often-used feature, so it's not in the base instance. You need to load an additional module to have access to it:

$mdb2->loadModule('Extended');

Now you can

$mdb2->autoExecute('people', $data, MDB2_AUTOQUERY_INSERT);

The above will work in PHP5 only. In PHP4, due to the limited support of object overloading (Thanks again to Lukas for clarifying this!), you'd need to do:

$mdb2->extended->autoExecute('people', $data, MDB2_AUTOQUERY_INSERT);

Note that the second way will also work in PHP5.
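To see why the shorter call depends on PHP5's overloading, here's a simplified picture of the mechanism. Demo_Db and Demo_Extended are hypothetical classes for illustration only and are not how MDB2 is actually implemented; the point is just that __call() lets the main object forward a method it doesn't define to a loaded module.

```php
<?php
// Hypothetical module class, standing in for a loaded "Extended" module.
class Demo_Extended
{
    public function autoExecute($table, $data)
    {
        return 'INSERT INTO ' . $table
             . ' (' . implode(', ', array_keys($data)) . ')';
    }
}

// Hypothetical main class that forwards unknown methods to the module.
class Demo_Db
{
    public $extended = null;

    public function loadModule($name)
    {
        if ($name === 'Extended') {
            $this->extended = new Demo_Extended();
        }
    }

    // PHP5 calls __call() for methods the class itself doesn't define
    public function __call($method, $args)
    {
        if ($this->extended !== null
            && method_exists($this->extended, $method)) {
            return call_user_func_array(
                array($this->extended, $method),
                $args
            );
        }
        throw new Exception('Unknown method ' . $method);
    }
}

$db = new Demo_Db();
$db->loadModule('Extended');
$data = array('id' => 5, 'name' => 'Cameron');
echo $db->autoExecute('people', $data), "\n";           // forwarded by __call()
echo $db->extended->autoExecute('people', $data), "\n"; // explicit, no overloading needed
```

Both calls end up in the same method; the explicit form simply doesn't rely on __call(), which is why it's the safe choice when the code has to run on PHP4 as well.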

Prepared statements

In DB:

$statement = $db->prepare('INSERT INTO people VALUES (?, ?)');
$data = array(6, 'Chris');
$db->execute($statement, $data);
$db->freePrepared($statement);

In MDB2 it's almost the same, except that the statement becomes an object and you call its methods (as opposed to the main MDB2 object's) to execute and to release memory:

$statement = $mdb2->prepare('INSERT INTO people VALUES (?, ?)');
$data = array(7, 'Dave');
$statement->execute($data);
$statement->free();

Execute multiple

The same applies to executing a statement with multiple "rows" of data from an array. In MDB2, executeMultiple() is in the Extended module, so you need to load it:

DB:

$statement = $db->prepare('INSERT INTO people VALUES (?, ?)');
$data = array(
    array(8, 'James'),
    array(9, 'Cliff')
);

$db->executeMultiple($statement, $data);
$db->freePrepared($statement);

MDB2:

$statement = $mdb2->prepare('INSERT INTO people VALUES (?, ?)');
$data = array(
    array(10, 'Kirk'),
    array(11, 'Lars')
);

$mdb2->loadModule('Extended');
$mdb2->extended->executeMultiple($statement, $data);

$statement->free();

Transactions

In DB:

$db->autoCommit();
$result = $db->query('DELETE people'); // will cause an error

if (PEAR::isError($result)) {
    $db->rollback(); // echo 'rollback';
} else {
    $db->commit(); // echo 'commit';
}

In MDB2 you have to check if transactions are supported in your RDBMS. Then during the transaction, you can always check "Am I in a transaction?"

if ($mdb2->supports('transactions')) {
    $mdb2->beginTransaction();
}
$result = $mdb2->query('DELETE people');
if (PEAR::isError($result)) {
    if ($mdb2->in_transaction) {
        $mdb2->rollback();         // echo 'rollback';
    }
} else {
    if ($mdb2->in_transaction) {
        $mdb2->commit();         // echo 'commit';
    }
}

Example script

You can download a script that has the examples above and play with it. Here's also the sql file to recreate the database:

Any questions or comments are welcome ;) Thanks for reading!