Automating HTTPWatch with PHP

March 5th, 2011. Tagged: HTTP, performance, php

HTTPWatch is a nice tool to inspect HTTP traffic in an easy and convenient way and it works in both IE and FF now. Drawback - Windows-only and paid. But the free version is good enough for many tasks.

HTTPWatch can be automated and scripted, which is pretty cool for a number of monitoring-like tasks. Their site and help section list C# and Ruby+Watir examples. So I was curious - what about PHP (and no Watir)?

In general with PHP you can open/close/navigate IE using COM (whatever that is) which is nice, but you can't do that with Firefox as it doesn't expose a COM interface. But HTTPWatch fills the gap. K, let's see an example.

Prerequisites

OS: Windows
Install: IE, Firefox, HTTPWatch, PHP (command-line is fine, no need for Apache, MySQL, etc.)

Getting started

Create a file, say C:\http.php, open command prompt and go:

cd \
C:\>php http.php

Now all that's left is to put something worth executing in http.php 🙂

Instantiating HTTPWatch

$controller = new COM("HttpWatch.Controller");
if(!method_exists($controller, 'IE')) {
  throw new Exception('failed to enable HTTPWatch');
}

Opening a new Firefox window:

$plugin = $controller->Firefox->New();

BTW, it's the same for IE:

$plugin = $controller->IE->New();

Disabling any filters (filters defined in HTTPWatch, that is):

$plugin->Log->EnableFilter(false);

Clear HTTPWatch's log (the list of requests), clear the browser cache and start recording traffic:

// clear log and cache
$plugin->Clear();
$plugin->ClearCache();
 
// start
$plugin->Record();

Navigate to a URL and wait for it to complete - that means wait a bit after the onload event:

// browse
$plugin->GotoUrl('http://google.com');
$controller->Wait($plugin, -1);

Stop monitoring traffic and quit the browser:

$plugin->Stop();
$plugin->CloseBrowser();

This is nice, we opened the browser, visited a URL and closed it. Now we can even get some meaningful data out of the whole experience.

$plugin->Log->Entries is an object that has a list of all requests. It also has a property Summary. So we can see how many bytes we sent and how many we received as a result of this visit to google.com:

$sum = $plugin->Log->Entries->Summary;
echo "in: {$sum->BytesReceived}, out: {$sum->BytesSent}";

Note: oh, you need to get your data before closing the browser, otherwise the Log object gets destroyed, it seems. So in the actual script these two lines go before the call to CloseBrowser().

So the result:

C:\>php http.php
in: 89185, out: 7102

Yeah!
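Putting the steps above together, with the summary read before the browser closes as the note says, the whole thing might look like the sketch below. The function name measure_page() and its $browser and $clearCache parameters are my own invention for illustration, not part of HTTPWatch. Every HTTPWatch call inside is one already shown in this post, and the controller is passed in rather than created inside.

```php
<?php
// Sketch only: measure_page(), $browser and $clearCache are hypothetical names.
// The HTTPWatch calls are the ones used earlier in this post, nothing more.
function measure_page($controller, $url, $browser = 'Firefox', $clearCache = true)
{
    // same as $controller->Firefox->New() or $controller->IE->New()
    $plugin = $controller->$browser->New();
    $plugin->Log->EnableFilter(false);

    // clear the log, and optionally the browser cache
    $plugin->Clear();
    if ($clearCache) {
        $plugin->ClearCache();
    }

    // record, browse, wait, stop
    $plugin->Record();
    $plugin->GotoUrl($url);
    $controller->Wait($plugin, -1);
    $plugin->Stop();

    // read the data BEFORE closing the browser
    $sum = $plugin->Log->Entries->Summary;
    $result = array(
        'in'  => $sum->BytesReceived,
        'out' => $sum->BytesSent,
    );

    $plugin->CloseBrowser();
    return $result;
}

// Usage, on a Windows box with HTTPWatch installed:
// $controller = new COM("HttpWatch.Controller");
// $r = measure_page($controller, 'http://google.com');
// echo "in: {$r['in']}, out: {$r['out']}";
```

Passing $clearCache as false skips ClearCache(), which is one way to get a full-cache run after the page has been loaded once.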

This may look like nothing, but is pretty impressive in and of itself. At least I know I was happy the first time it worked. Because, you see, any monitoring that doesn't use a real browser is kinda smelly, isn't that right? Plus this is awesome for performance tests, research and experiments. You can create page A and page B and go out for a walk. Meanwhile your script can load the pages 200 times in the two browsers (at least, because you can have FF+IE[678]), with empty and full cache... and you come back for the results! Tired of all the walking, not of hitting REFRESH.
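The go-for-a-walk experiment is really just a few nested loops: browsers, pages, cache state and repetitions. Here's a rough sketch of that loop. run_trials() and its parameters are hypothetical names of mine. $measure stands for any callable that loads a page once (for example with the HTTPWatch steps above) and returns the bytes received. How it primes the cache for the full-cache case is left to that callable.

```php
<?php
// Sketch only: run_trials() and its parameters are hypothetical names.
// $measure is called as $measure($browser, $url, $emptyCache) and should
// return a number, e.g. the bytes received for one page load.
function run_trials($measure, array $browsers, array $urls, $runs)
{
    $results = array();
    foreach ($browsers as $browser) {
        foreach ($urls as $url) {
            foreach (array(true, false) as $emptyCache) {
                $total = 0;
                for ($i = 0; $i < $runs; $i++) {
                    $total += call_user_func($measure, $browser, $url, $emptyCache);
                }
                // one averaged number per browser/page/cache combination
                $key = $browser . ' ' . $url . ' ' . ($emptyCache ? 'empty' : 'full');
                $results[$key] = $total / $runs;
            }
        }
    }
    return $results;
}

// Usage idea:
// $averages = run_trials($myMeasure, array('IE', 'Firefox'),
//     array('http://example.org/a.html', 'http://example.org/b.html'), 200);
// print_r($averages);
```

An average is only the simplest thing to keep. For real experiments you'd probably want to store every run and look at medians too.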

Below you can see an (HD!) video of a script that opens IE and FF, loads Google and then gives you the bytes in/out in the two browsers. This example uses a PHP class I created and will talk about later, but you can still see the idea.

A better experience in IE

One thing I don't like is that HTTPWatch won't let you control the browser very well. Two features I'm looking for: being able to see HTTPWatch's log while running (for testing) and then being able to completely hide the window (for "production"). Luckily IE lets you do that and HTTPWatch lets you "attach" to an already running IE instance.

So. We open IE with its own COM interface:

$browser = new COM("InternetExplorer.Application");
if(!method_exists($browser, 'Navigate')) {
  throw new Exception('didn\'t create IE obj');
}
$browser->Visible = true;

As you can see - not very different. But there's Visible, which can be false if you so like. This way you can still work on something while tests are running in the background without windows popping up all the time.

Also if you open HTTPWatch manually and close the browser, then the next time (in your scripted runs) HTTPWatch will stay open and you can check what's up.

So, connecting HTTPWatch with the IE instance means instantiating HTTPWatch as before and passing the IE object to the Attach() method (was New() before).

// watch this!
$controller = new COM("HttpWatch.Controller");
if(!method_exists($controller, 'IE')) {
  throw new Exception('failed to enable HTTPWatch');
}
 
// enable plugin
$plugin = $controller->IE->Attach($browser);

The rest is all the same.

There's more

The most interesting part is getting data back from HTTPWatch. Dunno about you, but I love just dumping whatever structure I have with print_r() or var_dump() and then deciding what I want from it and how to go about getting it.

That doesn't happen here because these COM objects are Variants and you can't just dump'em. You have to read the API docs. That sucks. So I did a hack (next post) and also read the APIs ("Stoyan: reading the APIs so you don't have to!") to enable just dumping HTTPWatch's log.

Meanwhile...

HAR

HTTPWatch can write you a HAR file with the log. Not everything is in there, but it's still a lot and it's easy. HAR is JSON so you json_decode() it and voila - a log!

// sys_get_temp_dir() instead of '/tmp', since this runs on Windows
$filename = tempnam(sys_get_temp_dir(), 'watchmenowimgoindown');
$plugin->Log->ExportHAR($filename);
$json = file_get_contents($filename);
 
print_r(json_decode($json));

If you're curious as to what that prints - here it is.

Want to see a HAR (from another run)? Here it is.

So here you go - much data can be extracted and dumped for inspection from the HAR output. For the full HTTPWatch data, there's the API.
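As a small taste of what "extracting" can look like, here's a sketch that turns a HAR string into one row per request. summarize_har() is a hypothetical helper of mine. The field names (log, entries, request.url, response.status, response.bodySize, time) are from the HAR format itself, and the sample data is hand-written, not real HTTPWatch output.

```php
<?php
// Sketch only: summarize_har() is a hypothetical helper.
// It relies on standard HAR fields, nothing HTTPWatch-specific.
function summarize_har($json)
{
    $har = json_decode($json, true); // true = arrays instead of objects
    if (!isset($har['log']['entries'])) {
        throw new Exception('not a HAR log');
    }
    $rows = array();
    foreach ($har['log']['entries'] as $entry) {
        $rows[] = array(
            'url'    => $entry['request']['url'],
            'status' => $entry['response']['status'],
            // HAR uses -1 for bodySize when the size is not available
            'bytes'  => max(0, $entry['response']['bodySize']),
            'ms'     => $entry['time'],
        );
    }
    return $rows;
}

// A tiny hand-written HAR fragment, just to show the shape
$sample = '{"log":{"entries":[
  {"time":120,"request":{"url":"http://example.org/"},
   "response":{"status":200,"bodySize":5120}},
  {"time":35,"request":{"url":"http://example.org/logo.png"},
   "response":{"status":304,"bodySize":-1}}
]}}';

foreach (summarize_har($sample) as $row) {
    echo "{$row['status']} {$row['url']} {$row['bytes']} bytes, {$row['ms']} ms\n";
}
```

With a real run you'd pass in the $json read from the exported file instead of $sample.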

(to be continued...)

Meanwhile, hit me up on twitter @stoyanstefanov