Content-to-markup ratio bookmarklet
When you care about performance or SEO (or just about doing a good job as a web dev), an interesting data point is the ratio of page content to the markup used to present that content. In other words: how much crap do we put in the HTML in order to present what the users want to see, the content?
So I played tonight with a bookmarklet that reports this stat.
Install
Right-click, add to favourites/bookmarks. Or simply click to see the ratio of this page.
How it works
Since the scripts on the page may modify the content and markup, the bookmarklet makes an Ajax request to get a fresh copy of the page from the server. Then it runs a few regular expressions ("borrowed" from prototype.js) to strip all tags and the scripts/styles content. The first metric it provides is the size of the stripped content divided by the size of the original markup.
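Here is a minimal sketch of that first metric, not the bookmarklet's actual code. The function names (fetchFreshCopy, stripScriptsAndStyles, stripTags, contentToMarkupRatio) are made up for illustration, and the regular expressions are simplified stand-ins rather than the prototype.js originals. Sizes are counted in characters, and whitespace is left untouched, which may differ from what the bookmarklet does.

```javascript
// Browser-only helper (assumed shape): request a fresh copy of a page
// so that script-made changes to the live DOM are not measured.
function fetchFreshCopy(url, callback) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      callback(xhr.responseText);
    }
  };
  xhr.send(null);
}

// Remove <script> and <style> elements together with their content.
function stripScriptsAndStyles(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '')
    .replace(/<style\b[^>]*>[\s\S]*?<\/style\s*>/gi, '');
}

// Remove comments, the doctype, and opening/closing tags.
function stripTags(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<[!\/]?[a-z][^>]*>/gi, '');
}

// First metric: stripped content size divided by original markup size.
function contentToMarkupRatio(html) {
  const content = stripTags(stripScriptsAndStyles(html));
  return {
    totalSize: html.length,
    contentSize: content.length,
    ratio: html.length ? content.length / html.length : 0
  };
}
```

In a browser, something like fetchFreshCopy(location.href, function (html) { alert(contentToMarkupRatio(html).ratio.toFixed(2)); }) would report the ratio for the current page.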
Then the bookmarklet tries to be fairer and counts the alt, title and value attributes as content, including the size of the attribute names themselves. This is the second, "fair", metric. The content attributes are inspected using DOM methods, not regular expressions, so they can be affected by any JavaScript that has modified the page. Oh well, life's not fair.
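The "fair" metric could be sketched roughly like this. This is an illustration under assumptions, not the bookmarklet's source: the names attributeContentSize and fairRatio are hypothetical, and the formula (content size plus attribute size, divided by total size) is my reading of the description above.

```javascript
// Attributes treated as content, per the description: alt, title and value.
const CONTENT_ATTRIBUTES = ['alt', 'title', 'value'];

// `elements` is any array-like of objects exposing getAttribute(name),
// for example document.getElementsByTagName('*') in a browser.
function attributeContentSize(elements) {
  let size = 0;
  for (let i = 0; i < elements.length; i++) {
    for (const name of CONTENT_ATTRIBUTES) {
      const value = elements[i].getAttribute(name);
      if (value) {
        // The attribute name itself is counted as content too.
        size += name.length + value.length;
      }
    }
  }
  return size;
}

// Assumed formula: (stripped content + content attributes) / total markup.
function fairRatio(contentSize, attributeSize, totalSize) {
  return totalSize ? (contentSize + attributeSize) / totalSize : 0;
}
```

Because the attribute sizes come from the live DOM while the total size comes from the fresh server copy, the two numbers may not describe exactly the same markup.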
Code
The bookmarklet code is served from here. The code is also on GitHub.
Results
Here are some random results of running the bookmarklet on different sites.
http://www.cnn.com:
Total size: 92004 bytes
Content size: 11475 bytes
Content-to-markup ratio: 0.12
Fair ratio * : 0.16
http://www.sitepoint.com
Total size: 65989 bytes
Content size: 16199 bytes
Content-to-markup ratio: 0.25
Fair ratio * : 0.60
Article on http://en.wikipedia.org:
Total size: 21648 bytes
Content size: 3315 bytes
Content-to-markup ratio: 0.15
Fair ratio * : 0.35
http://www.phpied.com
Total size: 31899 bytes
Content size: 7933 bytes
Content-to-markup ratio: 0.25
Fair ratio * : 0.48
http://www.google.com SERP
Total size: 29963 bytes
Content size: 3351 bytes
Content-to-markup ratio: 0.11
Fair ratio * : 0.14
This entry was posted on Thursday, March 5th, 2009 and is filed under Ajax, bookmarklets, JavaScript, performance, SEO.

March 5th, 2009 at 6:29 am
A bit slow but an AMAZING idea!! Congrats!!!
It’s the ideal tool for curious people like me
March 5th, 2009 at 8:35 am
Useful, thanks Stoyan
March 5th, 2009 at 9:05 am
Very cool idea, Stoyan. Within what range would you consider the content-to-markup ratio to be reasonable? Optimal?
March 5th, 2009 at 12:01 pm
Thanks, everybody!
Lea, the slowness may come from:
a. my site being slow to respond (because the main js part of the bookmarklet is hosted here)
b. the site you inspect is slow (because the bookmarklet makes an ajax request to get the page content)
If it becomes annoyingly slow, you can improve a. by copying the bookmarklet file to a server of your choice, or even, if you’re not on IE, by stripping the \n characters and keeping the whole code in the properties of the bookmark. (IE has trouble with long URLs.)
Karl, I don’t know about the range. Somehow I was hoping that good sites would have a 0.5 ratio, but it looks like this is way too optimistic. I guess the good range is relative and will emerge after you run the bookmarklet on a bunch of sites you consider nicely marked up. It also depends on the type of page; image galleries cannot do too well, obviously.
March 6th, 2009 at 2:58 am
[...] Stefanov has created a fun little bookmarklet that calculates the content to markup ratio of a webpage: When you care about performance, or SEO (or just doing a good job as web dev) an [...]
March 6th, 2009 at 4:02 am
Nicely done, Stoyan. Good for SEO testing. It might be good to count the number of tags, and also to show the final raw text. I’m guessing it doesn’t take into account content that is loaded into the DOM after the page has loaded, right?
March 6th, 2009 at 5:50 am
[...] Stoyan Stefanov was interested in this and wrote a small JavaScript function that calculates exactly this ratio. [...]
March 6th, 2009 at 8:16 am
Thank you for this bookmarklet. It’s helpful for us.
March 6th, 2009 at 1:57 pm
nice bookmarklet. I think it is very helpful for optimizing webpages. thanks!
March 6th, 2009 at 4:26 pm
Thanks everybody!
@Mojo, no DOM-inserted elements are counted, since I make a fresh request to the server and use regular expressions to strip the tags. You’re right, the number of DOM elements could be another good data point.
April 13th, 2009 at 11:00 am
[...] Content-to-markup ratio bookmarklet / phpied.com [...]
May 19th, 2009 at 10:11 pm
For my homepage:
Content-to-markup ratio: 0.52
Fair ratio * : 0.56
What do I win?
Pretty cool little bookmarklet. I thought I did something wrong first time I clicked it though, since there’s no indication it’s working. Maybe an alert before it grabs the external script so users know to wait it out?
May 20th, 2009 at 2:07 am
@HB you win a virtual pat on the back
May 30th, 2009 at 4:12 pm
Your web site is beautiful. issues are understood. very useful and descriptive. Thanks for sharing.
May 31st, 2009 at 9:09 am
It’s strange, on http://jqueryvsmootools.com/ the bookmarklet found:
Total size: 62503 bytes
Content size: 49764 bytes
Content-to-markup ratio: 0.80
Fair ratio * : 3.24
That’s too fair a ratio, isn’t it?
November 11th, 2009 at 7:13 am
[...] useful tools (but not a must) are HTML Validator, YSlow, and the content-to-markup ratio [...]
April 11th, 2011 at 12:26 pm
[...] is me repurposing two old bookmarklets that gather some interesting stats (one of them was even featured on Ajaxian, [...]
July 2nd, 2011 at 5:27 pm
Thank you very much for this great report!