Automating HTTPWatch with PHP #3

March 7th, 2011. Tagged: HTTP, performance, php

The first part is here, the second is here. This third post is more about PHP and COM than about HTTPWatch or monitoring web performance, so feel free to skip it if the title misled you :) Keep reading if you want to use and improve/update my HTTPWatch class in the future.

The problem

After running an HTTPWatch-ed browser via a script I wanted an easy way to dump all the data collected. Since the PHP-HTTPWatch bridge goes through a COM interface, all the objects returned by HTTPWatch are Variants and not ready for introspection with the usual PHP functions, like get_object_vars() for example.

The solution

Interestingly, it turns out there exists a function called com_print_typeinfo(). You give it a COM object (it works with the variants HTTPWatch gives you) and it prints the source code of a PHP class describing this COM object. So the hack here is to capture that output, evaluate it with eval() (oh, the horror!) and then inspect the resulting class with get_class_vars(). Luckily there's no naming collision between the PHP built-in classes and those defined by HTTPWatch.

// the input is a class name 
// and an object of that class
$class = "Entry";
$object = $http->watch->Log->Entries->Item(0);
 
// buffer output
ob_start();
// print out class definition
// derived from an object
com_print_typeinfo($object);
$typeinfo = ob_get_contents();
ob_end_clean();
 
// evaluate the generated PHP source
eval($typeinfo);
 
// get the properties
$properties = get_class_vars($class);
 
// Hooray!

In order for this to work, as you can see, we need to know the class names and we need access to an example object of each class. This is where I needed to study the HTTPWatch API and find a suitable example page that will generate enough objects to derive the API from.
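The eval() and get_class_vars() steps can be tried without COM or HTTPWatch installed. Here is a minimal sketch that uses a hand-written class definition as a stand-in for the text com_print_typeinfo() prints. The property names and the exact shape of the generated source are assumptions for illustration; the real output depends on the COM object.

```php
<?php
// Hypothetical stand-in for the source code that com_print_typeinfo()
// prints for an Entry object. The property names are illustrative only.
$typeinfo = <<<'PHP'
class Entry {
    var $URL;
    var $Time;
    var $BytesReceived;
}
PHP;

$class = "Entry";

// eval() fails with a fatal error if the class is declared twice,
// so only evaluate the definition once per class name
if (!class_exists($class, false)) {
    eval($typeinfo);
}

// get_class_vars() maps property names to default values;
// only the names are interesting here
$properties = array_keys(get_class_vars($class));

print_r($properties);
```

The class_exists() guard matters once you loop over several objects of the same class, since evaluating the same definition a second time is a fatal error.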

The free HTTPWatch version

The free version has restrictions where you don't have access to all properties. I wanted my class to be able to do the best job possible with or without the presence of restrictions. That's why I first load google.com which is unrestricted and then an image on my blog, which is restricted.

From the first page I derive the complete API (the part that I'm interested in) and then I use the derived API to study the second URL request. I access each property inside a try-catch. When the access blows up, the property is restricted, so I write it to a second array of API properties, $paidproperties.
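The try-catch probing could look roughly like the sketch below. It is not the code from my class: FakeEntry and split_properties() are hypothetical names, and FakeEntry only imitates a restricted object by throwing on unknown properties, so the example runs without COM. A real COM variant would throw com_exception, which the generic Exception catch also covers.

```php
<?php
// Hypothetical stand-in for an object from the free edition:
// reading a restricted property throws an exception.
class FakeEntry {
    private $data = array('URL' => 'http://example.org/', 'Time' => 0.42);

    public function __get($name) {
        if (!array_key_exists($name, $this->data)) {
            throw new Exception("$name is restricted");
        }
        return $this->data[$name];
    }
}

// Try to read every property; sort the names into readable and restricted
function split_properties($object, array $names) {
    $free = array();
    $paid = array();
    foreach ($names as $name) {
        try {
            $unused = $object->$name; // the read itself is the test
            $free[] = $name;
        } catch (Exception $e) {
            $paid[] = $name;
        }
    }
    return array($free, $paid);
}

list($properties, $paidproperties) = split_properties(
    new FakeEntry(),
    array('URL', 'Time', 'Stream')
);

print_r($paidproperties);
```

With the stand-in above, Stream is the only name that ends up in $paidproperties.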

Source: the end

In the end you run the script dumpapi.php. It uses my HTTPWatch class to derive the API for the HTTPWatch class itself. How meta! You write the result of the run to an API file and then this file is included by the class. Nice and clean after a messy hack :)

Run:

$ php dumpapi.php > HTTPWatchAPI.php

(This is a one-off operation, no need to run it at all if you don't need to change anything in the class)
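For illustration, here is one way a dump script could print an includable PHP file. This is only a sketch: the array layout, the names render_api_dump(), properties and paidproperties as array keys, and the use of var_export() are my assumptions here, not necessarily what dumpapi.php does.

```php
<?php
// Hypothetical shape of the derived API: class name => property names
$api  = array('Entry' => array('URL', 'Time'));
$paid = array('Entry' => array('Stream'));

// Build the source of a PHP file that returns the API as an array
function render_api_dump(array $api, array $paid) {
    $data = array(
        'properties'     => $api,
        'paidproperties' => $paid,
    );
    return "<?php\nreturn " . var_export($data, true) . ";\n";
}

// Printed to stdout so the shell can redirect it into the API file
echo render_api_dump($api, $paid);
```

A file produced this way can be read back with `$api = include "HTTPWatchAPI.php";`, because include returns whatever the included file returns.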

Then, before you instantiate the class, you go:

// point to the API dump
HTTPWatch::$apipath = "HTTPWatchAPI.php";
 
// the usual
$http = new HTTPWatch();

By default the class looks for the API dump in its own directory under the file name HTTPWatchAPI.php, so you can skip that first line unless you have a valid reason to store the API in a different location.


One Response

  1. Hi Stoyan,

    For those people that don’t want to build their own I spent last year working with Site Confidence to build a SaaS version that does exactly this http://www.siteconfidence.com/services/site-wide-performance-analysis.aspx

    * In multiple real browsers – IE 6/7/8 Firefox 2/3/3.6/4beta

    * and crawls your website to test all the pages up to a pre-set limit (# pages, crawl depth etc)

    * Full reporting – waterfall graphs, crawl graphs, cross-browser comparison graphs, drill down all the way to the HTTP headers.

    I think it rocks, but I might be a bit biased!

    There are some screen shots over here on the SC blog – http://blog.siteconfidence.com/2010/11/performance-analyser-is-live-site-wide.html

    cheers,
    Steve
