Content-to-markup ratio bookmarklet

When you care about performance or SEO (or just about doing a good job as a web dev), an interesting data point is the ratio of page content to the markup used to present it. Or, put differently: how much crap do we put in the HTML in order to present what users actually want to see, the content?

So I played tonight with a bookmarklet that reports this statistic.

Install

Right-click the link below and add it to your favourites/bookmarks. Or simply click it to see the ratio of this page.

content/markup

How it works

Since the scripts on the page may modify the content and markup, the bookmarklet makes an Ajax request to get a fresh copy of the page from the server. Then it runs a few regular expressions ("borrowed" from prototype.js) to strip all tags and the scripts/styles content. The first metric it provides is the size of the stripped content divided by the size of the original markup.
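Here is a minimal sketch of that first metric. It is not the bookmarklet's actual code: the regular expressions are written in the style of the prototype.js helpers, the function names are made up for illustration, and sizes are measured in characters of the HTML string, which is an assumption.

```javascript
// Sketch only: these patterns are modeled on prototype.js-style helpers,
// not copied from the bookmarklet. All function names are hypothetical.

// Remove <script> and <style> blocks together with everything inside them.
function stripScriptsAndStyles(html) {
  return html
    .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, "")
    .replace(/<style[^>]*>[\s\S]*?<\/style>/gi, "");
}

// Remove every remaining opening or closing tag, keeping the text between.
function stripTags(html) {
  return html.replace(/<\/?[^>]+>/gi, "");
}

// Size of the stripped content divided by the size of the original markup.
// Assumption: "size" is the string length in characters.
function contentToMarkupRatio(html) {
  const content = stripTags(stripScriptsAndStyles(html));
  return {
    totalSize: html.length,
    contentSize: content.length,
    ratio: html.length === 0 ? 0 : content.length / html.length,
  };
}

const sample =
  "<html><head><style>p{color:red}</style></head>" +
  "<body><p>Hello, world</p><script>var x = 1;</script></body></html>";

const stats = contentToMarkupRatio(sample);
console.log("Total size: " + stats.totalSize);
console.log("Content size: " + stats.contentSize);
console.log("Content-to-markup ratio: " + stats.ratio.toFixed(2));
```

In the browser you would pass in the response text of a request for the current URL instead of the sample string. Regular expressions are a crude way to parse HTML (a ">" inside an attribute value will confuse them), but for a rough ratio that is probably good enough.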

Then the bookmarklet tries to be fairer and counts the alt, title and value attributes as content, including the size of the attribute names themselves. This is the second, "fair", metric. The content attributes are inspected using DOM methods, not regular expressions, so they can be affected by any JavaScript that has modified the page. Oh well, life's not fair.
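The "fair" metric could be sketched roughly like this. Again, this is an illustration and not the bookmarklet's code: the function names are hypothetical, and counting exactly the attribute name plus the attribute value (no quotes or equals sign) is an assumption. The stub elements only exist so the sketch runs outside a browser.

```javascript
// Attributes treated as content, per the description above.
const CONTENT_ATTRIBUTES = ["alt", "title", "value"];

// Sum the sizes of the content attributes over a collection of elements.
// Assumption: each non-empty attribute contributes name length + value length.
function attributeContentSize(elements) {
  let size = 0;
  for (const el of Array.from(elements)) {
    for (const name of CONTENT_ATTRIBUTES) {
      const value = el.getAttribute(name);
      if (value) {
        size += name.length + value.length;
      }
    }
  }
  return size;
}

// Stripped content plus attribute content, divided by the total markup size.
function fairRatio(totalSize, contentSize, elements) {
  return (contentSize + attributeContentSize(elements)) / totalSize;
}

// Stand-in for a DOM element so the sketch can run without a browser.
const fakeElement = (attrs) => ({
  getAttribute: (name) => (name in attrs ? attrs[name] : null),
});

const elements = [
  fakeElement({ alt: "Logo" }),
  fakeElement({ title: "Home", value: "Go" }),
  fakeElement({}),
];

console.log(attributeContentSize(elements));
console.log(fairRatio(100, 27, elements).toFixed(2));
```

In a page you would pass document.getElementsByTagName("*") as the elements. Note that the numerator then comes from the live DOM while the denominator comes from the freshly fetched source, so on script-heavy pages the two can drift apart.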

Code

The bookmarklet code is served from here. The code is also on github.

Results

Here are some random results of running the bookmarklet on different sites.

http://www.cnn.com:

Total size: 92004 bytes
Content size: 11475 bytes
Content-to-markup ratio: 0.12
Fair ratio * : 0.16

http://www.sitepoint.com

Total size: 65989 bytes
Content size: 16199 bytes
Content-to-markup ratio: 0.25
Fair ratio * : 0.60

Article on http://en.wikipedia.org:

Total size: 21648 bytes
Content size: 3315 bytes
Content-to-markup ratio: 0.15
Fair ratio * : 0.35

http://www.phpied.com

Total size: 31899 bytes
Content size: 7933 bytes
Content-to-markup ratio: 0.25
Fair ratio * : 0.48

http://www.google.com SERP

Total size: 29963 bytes
Content size: 3351 bytes
Content-to-markup ratio: 0.11
Fair ratio * : 0.14
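
As a quick sanity check, the content-to-markup ratio in each result is just the content size divided by the total size, rounded to two decimals. The snippet below recomputes it from the sizes reported above; the variable and function names are made up for illustration.

```javascript
// Sizes in bytes, copied from the results listed above.
const results = [
  { site: "http://www.cnn.com", total: 92004, content: 11475 },
  { site: "http://www.sitepoint.com", total: 65989, content: 16199 },
  { site: "Article on http://en.wikipedia.org", total: 21648, content: 3315 },
  { site: "http://www.phpied.com", total: 31899, content: 7933 },
  { site: "http://www.google.com SERP", total: 29963, content: 3351 },
];

// Content size divided by total size, formatted to two decimals.
function formatRatio(result) {
  return (result.content / result.total).toFixed(2);
}

for (const result of results) {
  console.log(result.site + ": " + formatRatio(result));
}
```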

This entry was posted on Thursday, March 5th, 2009 and is filed under Ajax, bookmarklets, JavaScript, performance, SEO.


21 Responses to “Content-to-markup ratio bookmarklet”

  1. Lea Verou Says:

    A bit slow but an AMAZING idea!! Congrats!!!
    It’s the ideal tool for curious people like me :P

  2. Sorin Says:

    Useful thanks stoyan

  3. Karl Swedberg Says:

    Very cool idea, Stoyan. Within what range would you consider the content-to-markup ratio to be reasonable? Optimal?

  4. Stoyan Says:

    Thanks, everybody!

    Lea, the slowness may come from:
    a. my site being slow to respond (because the main js part of the bookmarklet is hosted here)
    b. the site you inspect is slow (because the bookmarklet makes an ajax request to get the page content)
    If it becomes annoyingly slow, you can improve a. by copying the bookmarklet file to a server of your choice or even, if you’re not on IE, by stripping the \n characters and hosting the whole code in the properties of the bookmark (IE has trouble with long URLs).

    Karl, I don’t know about the range. Somehow I was hoping that good sites would have a 0.5 ratio, but it looks like this is way too optimistic. I guess the good range is relative and will emerge after you run it on a bunch of sites you consider nicely marked up. It also depends on the type of page; image galleries cannot do too well, obviously.

  5. Ajaxian » Calculate your content to markup ratio Says:

    [...] Stefanov has created a fun little bookmarklet that calculates the content to markup ratio of a webpage: When you care about performance, or SEO (or just doing a good job as web dev) an [...]

  6. Mojo Says:

    Nicely done, Stoyan. Good for SEO testing. Might be good to count the number of tags, and also show the final raw text. I’m guessing it doesn’t take into account content that is loaded into the DOM after the page has loaded, right?

  7. Ajax Girl » Blog Archive » Calculate your content to markup ratio Says:

    [...] Stefanov has created a fun little bookmarklet that calculates the content to markup ratio of a webpage: When you care about performance, or SEO (or just doing a good job as web dev) an [...]

  8. Verhältnis von Inhalt und Markup berechnen | Ajaxschmiede.de Says:

    [...] Stoyan Stefanov was interested in this and wrote a small JavaScript function that calculates exactly this ratio. [...]

  9. Tuan Anh Says:

    Thank you for this bookmarklet. It’s helpful for us.

  10. Ingo Says:

    nice bookmarklet. I think it is very helpful for optimizing webpages. thanks!

  11. Stoyan Says:

    Thanks everybody!

    @Mojo – no DOM-inserted elements are counted, since I make a fresh request to the server and use regular expressions to strip the tags. You’re right, the number of DOM elements could be another good data point.

  12. Calculate your content to markup ratio | Guilda Blog Says:

    [...] Stefanov has created a fun little bookmarklet that calculates the content to markup ratio of a webpage: When you care about performance, or SEO (or just doing a good job as web dev) an [...]

  13. Feed Stats Processing Caught Up — Hobby Cash: Make Cash Blogging About the Things You Love Says:

    [...] Content-to-markup ratio bookmarklet / phpied.com [...]

  14. An ADSPACE preview — Hobby Cash: Make Cash Blogging About the Things You Love Says:

    [...] Content-to-markup ratio bookmarklet / phpied.com [...]

  15. HB Says:

    For my homepage:

    Content-to-markup ratio: 0.52
    Fair ratio * : 0.56

    What do I win? ;)

    Pretty cool little bookmarklet. I thought I did something wrong first time I clicked it though, since there’s no indication it’s working. Maybe an alert before it grabs the external script so users know to wait it out?

  16. Stoyan Says:

    @HB you win a virtual pat on the back :)

  17. neon Says:

    Your web site is beautiful. issues are understood. very useful and descriptive. Thanks for sharing.

  18. Cerium Says:

    It’s strange, on http://jqueryvsmootools.com/ the bookmarklet found:

    Total size: 62503 bytes
    Content size: 49764 bytes
    Content-to-markup ratio: 0.80
    Fair ratio * : 3.24

    That’s too fair a ratio, isn’t it?

  19. Socialize@CoreMedia OpenSpace am 13. November 2009 | Patricia's Blog Says:

    [...] useful tools (but not a must) are HTML Validator, YSlow, and the content-to-markup ratio [...]

  20. HTTPWatch automation with JavaScript | Follia Digitale Says:

    [...] is me repurposing two old bookmarklets that gather some interesting stats (one of them was even featured on Ajaxian, [...]

  21. Stillen Says:

    Thank you very much for this great report!
