Automating HTTPWatch with PHP #2

March 7th, 2011. Tagged: HTTP, performance, php

In part 1 I demonstrated how you can use PHP to script and automate HTTPWatch, and how you can get data back, either by reading the API docs or by using a quick HAR hack to get a lot of data in one go.

Now I want to share a little class I wrote to make all that a little easier.

The code is here on GitHub.

Basic usage

Open IE, navigate, close:

$http = new HTTPWatch();
$http->go('http://phpied.com/');
$http->done();

To do the same in Firefox, just pass "ff" to the constructor:

$http = new HTTPWatch('ff');
$http->go('http://phpied.com/');
$http->done();

The constructor accepts a second parameter with options, such as emptying the cache or hiding the browser (IE only). These options are largely underused for the time being.

Handle to the HTTPWatch plugin

After $http = new HTTPWatch(); a watch property is available on $http. This is the HTTPWatch instance, which gives you access to all of its APIs, so you can do, for example:

$summary = $http->watch->Log->Entries->Summary;

Data out

My main motivation behind this class, other than a simpler API, has been to provide the ability to just dump all the data that HTTPWatch collects in a quick print_r(). That has been a challenge with the PHP COM bridge, but I found a hack around it. In any event, I've exported most of the HTTPWatch API to a second PHP file, the HTTPWatchAPI.php script. (This is an auto-generated file, created by another script, but let's leave that out for now.)

So after you've navigated to a page you have two convenient methods to grab a bunch of data from HTTPWatch. The first is:

$http->getSummary();

This gives you summary stats for the HTTP observation session. The second is:

$http->getEntries();

It gives you details about every HTTPWatch log entry - be it cached or an actual HTTP request.

Here's an example of what getSummary() can give you. Here's the code that generated it:

$http = new HTTPWatch();
$http->go("http://google.com");
print_r($http->getSummary());
$http->done();

And here's some output print_r()-ed from getEntries(). Here's the code that produced it:

$http = new HTTPWatch();
$http->go("http://google.com");
print_r($http->getEntries());
$http->done();

If you look carefully at the dump, you may notice something like [Stream] => [BYTESTREAM]. Most of the time you don't need the raw HTTP streams (gzipped, chunked, etc.), but you can get them if you want by setting:

$http->skipStreams = false;

Here's the same google.com example, this time including the raw streams. And the code:

$http = new HTTPWatch();
$http->go("http://google.com");
$http->skipStreams = false;
print_r($http->getEntries());
$http->done();
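
If you already have a dump that includes the raw streams and want the compact form back without running the browser again, a small recursive filter can do it. This is only a sketch: maskStreams() is a hypothetical helper, not part of the class, the sample entry is made up, and the only thing taken from the dump above is the Stream key and the [BYTESTREAM] placeholder.

```php
<?php
// Hypothetical helper, not part of the HTTPWatch class: walks a nested
// array and replaces every value stored under the key 'Stream' with a
// short placeholder, so print_r() output stays readable.
function maskStreams(array $data, string $placeholder = '[BYTESTREAM]'): array
{
    foreach ($data as $key => $value) {
        if ($key === 'Stream') {
            $data[$key] = $placeholder;
        } elseif (is_array($value)) {
            // Recurse into nested arrays (entries, content, headers, ...).
            $data[$key] = maskStreams($value, $placeholder);
        }
    }
    return $data;
}

// Made-up sample entry; real getEntries() output has many more fields,
// and the 'URL' and 'Content' keys here are assumptions for illustration.
$entries = [
    [
        'URL'     => 'http://example.com/',
        'Content' => ['Stream' => "\x1f\x8b\x08\x00 raw gzipped bytes"],
    ],
];

print_r(maskStreams($entries));
```

The function returns a copy, so the original array with the full streams stays available if you need it later.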

Free vs. paid

One pain with HTTPWatch is that the free version has restrictions. The summary for example doesn't include TimingSummaries and WarningSummaries properties. The entry log has almost nothing - no headers or content or streams. My class handles that by giving you as much as it can. If you're using the free version, it will return the limited data for the restricted URLs, but still the full data for those URLs that HTTPWatch's demo version allows - the top Alexa sites.

So here's a dump of visiting http://givepngachance.com with my free HTTPWatch edition.

The data has restricted information about the givepngachance.com URL but full data related to the embedded youtube.com resource.

The code:

$http = new HTTPWatch();
$http->go("http://givepngachance.com");
print_r($http->getEntries());
$http->done();
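
Since the free version leaves out TimingSummaries and WarningSummaries, a script can check up front whether it got the restricted summary. The sketch below assumes that getSummary() returns an array keyed by HTTPWatch property names and simply omits the unavailable ones; that is an assumption about the output shape, and missingSummaryKeys() is a hypothetical helper, not something the class provides.

```php
<?php
// Hypothetical helper: lists which of the paid-only summary keys are absent.
// Assumes the summary is an array keyed by HTTPWatch property names and
// that the free edition omits the properties it does not expose.
function missingSummaryKeys(array $summary): array
{
    $paidOnly = ['TimingSummaries', 'WarningSummaries'];
    return array_values(array_diff($paidOnly, array_keys($summary)));
}

// Made-up summary resembling a free-edition result.
$summary = ['BytesSent' => 7102, 'BytesReceived' => 89188];

$missing = missingSummaryKeys($summary);
if ($missing) {
    echo 'Restricted summary, missing: ', implode(', ', $missing), "\n";
}
```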

Again with the video

If you've read part 1, you've probably seen the video, but here's the link again (try the HD version). This is a screen capture of loading FF and IE using my new class. The code that produced it is:

$ie = new HTTPWatch();
$ie->go('http://google.com/');
$sum = $ie->getSummary();
$ff = new HTTPWatch('ff');
$ff->go('http://google.com/');
$sumff = $ff->getSummary();
 
echo "\nRun 1 ";
echo $ie->watch->Log->BrowserName, ' ';
echo $ie->watch->Log->BrowserVersion;
echo "\nSent: ", $sum['BytesSent'], "; Received: ", $sum['BytesReceived'];
 
echo "\nRun 2 ";
echo $ff->watch->Log->BrowserName, ' ';
echo $ff->watch->Log->BrowserVersion;
echo "\nSent: ", $sumff['BytesSent'], "; Received: ", $sumff['BytesReceived'];
 
$ie->done();
$ff->done();

As you can see I'm accessing the HTTPWatch plugin object ($http->watch) directly to get the browser version and name. I didn't think this was worth wrapping in a more convenient API the way I did with getSummary() and getEntries().

The result of this is:

$ php examples.php

Run 1 Internet Explorer 6.0.2900.5512
Sent: 7102; Received: 89188
Run 2 Firefox 3.5.6
Sent: 6388; Received: 166473 

If you're wondering why FF gets about twice the bytes, it's because google.com in IE6 is very basic: no search-as-you-type, so much less JavaScript and one less sprite.
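
To put a number on the difference, you can divide the two BytesReceived values. The snippet below is a small sketch that hardcodes the figures from the run above instead of calling getSummary(), so it works without HTTPWatch installed; receivedRatio() is a hypothetical helper.

```php
<?php
// Hypothetical helper: how many times more bytes $b received than $a.
function receivedRatio(array $a, array $b): float
{
    return $b['BytesReceived'] / $a['BytesReceived'];
}

// Numbers copied from the output of the run above.
$sum   = ['BytesSent' => 7102, 'BytesReceived' => 89188];   // IE
$sumff = ['BytesSent' => 6388, 'BytesReceived' => 166473];  // Firefox

// prints: FF received 1.87x the bytes of IE
printf("FF received %.2fx the bytes of IE\n", receivedRatio($sum, $sumff));
```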

That's all folks

Enjoy, fork and keep an eye on what's up with the HTTP traffic. What goes through the tubes is too important not to be observed and monitored 🙂
