Free-falling waterfalls

December 14th, 2009. Tagged: browsers, performance

2010 update:
Lo, the Web Performance Advent Calendar hath moved

This post is part of the 2009 performance advent calendar experiment. Stay tuned for the articles to come.

In this series of performance posts, so far we've looked at having fewer components in the waterfall (meaning fewer HTTP requests) and also making the components as small as possible. The next task is to make sure that the waterfall is as short as possible - meaning let it fall freely, without interruptions, and have the browser download as many components as possible in parallel.

Some ways to make the waterfall fall free include having:

  • fewer DNS lookups
  • parallel downloads using several domains (this point contradicts the previous, ain't performance fun)
  • fewer redirects (ideally no redirects)
  • non-blocking scripts and styles
  • smaller request/response headers, which includes using fewer cookies

Reducing DNS lookups

When you request a component, the browser needs to resolve the hostname of the component to an IP address. This is known as a DNS lookup and you can see those lookups in the waterfall charts. The DNS time may look negligible (plus DNS lookups get cached by browsers and operating systems), but lookups sometimes take a ridiculous amount of time. It depends on many factors, often beyond your control, so the best thing to do is change what you do control, and that is to require fewer DNS lookups. Carlos Bueno has an excellent writeup on DNS lookups here.

You should limit the number of DNS lookups the browser needs to perform, ideally to no more than 2 to 4, according to Yahoo's studies.
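A quick way to see where a page stands is to count the distinct hostnames its components come from. Here's a minimal sketch; the function name and the example.org URLs are made up for illustration, and it only counts hostnames, it doesn't measure actual lookup time.

```javascript
// Count the distinct hostnames a page's components are served from.
// On a cold cache, each distinct hostname needs its own DNS lookup.
function distinctHostnames(componentUrls) {
  const hosts = new Set(componentUrls.map((u) => new URL(u).hostname));
  return Array.from(hosts).sort();
}

// Hypothetical component list for one page
const hosts = distinctHostnames([
  "http://www.example.org/index.html",
  "http://img1.example.org/a.png",
  "http://img1.example.org/b.png",
  "http://static.example.org/site.css",
]);
console.log(hosts.length); // 3
```

Three hostnames is within the 2 to 4 range mentioned above.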

Parallel downloads

Browsers limit how many components they download from the same domain at the same time. In older browsers, including IE6 and IE7, this limit is 2. This can slow down your waterfall significantly when you have many components to download.

Newer browsers have increased that limit to 4 (Safari, Opera 10) or 6 (FF3, IE8), so this should be less of an issue. But in the end it depends on your page - how many components it has and how many of your visitors are on IE6 and IE7.

Below is an image of how IE7 loads a page with 8 images, where each image is artificially delayed to take 2 seconds. Downloading two components at a time, IE spends at least 8 seconds on the images (2-4-6-8) or a total time of over 9 seconds.

Loading the same page in IE8 is shown below. IE8 loads 6 components at a time, so the 8 images are loaded in two batches (1st - 6 images, 2nd - 2 images), for a total time of 4 seconds spent on images. Overall, the whole page loads in 5 seconds.
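The arithmetic behind those two waterfalls can be sketched in a few lines. This is a simplified model, not how any browser really schedules requests: it assumes every component takes the same time and that downloads happen in strict batches. The function name is made up for this illustration.

```javascript
// Simplified model: all components take the same time and are fetched in
// strict batches of `perHostLimit`. Real browsers start the next request
// as soon as a slot frees up, so treat the result as an estimate.
function batchedDownloadTime(componentCount, perHostLimit, secondsEach) {
  const batches = Math.ceil(componentCount / perHostLimit);
  return batches * secondsEach;
}

const ie7Seconds = batchedDownloadTime(8, 2, 2); // 4 batches of 2 images
const ie8Seconds = batchedDownloadTime(8, 6, 2); // 6 images, then 2 more
console.log(ie7Seconds, ie8Seconds); // 8 4
```

The numbers cover only the time spent on images; the page itself adds about a second on top in both tests.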

Now, to work around this limitation in older browsers, a common technique is to create dummy subdomains, like img1.example.org, img2.example.org and so on, so that more components can be downloaded in parallel (the limitation is per domain). If you're going to do this, remember to balance this optimization against the fewer DNS lookups recommendation and don't spread your components over too many domains. Look carefully at your waterfalls to find the balance point. Again, the general recommendation is that a page should require 2-4 domains tops.
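If you do split components across subdomains, it helps to always map the same component to the same subdomain, otherwise the browser may download (and cache) the same image twice under two URLs. Here's one possible sketch using a simple string hash. The helper name and the hash are my assumptions for illustration, not a standard API; the hostnames are the dummy ones from above.

```javascript
// Dummy shard hostnames, as in the example above.
const SHARDS = ["img1.example.org", "img2.example.org"];

// Map a path to a shard deterministically, so a given image always comes
// from the same hostname and stays cacheable across pages.
function shardUrl(path, shards = SHARDS) {
  let hash = 0;
  for (let i = 0; i < path.length; i++) {
    // Simple multiplicative string hash, kept as an unsigned 32-bit integer
    hash = (hash * 31 + path.charCodeAt(i)) >>> 0;
  }
  return "http://" + shards[hash % shards.length] + path;
}

console.log(shardUrl("/images/logo.png") === shardUrl("/images/logo.png")); // true
```

Keeping the list at two shards plus the main hostname stays inside the 2-4 domains recommendation.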

Quick sidetrack: two URLs for your toolbox.

  1. For up-to-the-moment insight into different browser limits and capabilities, bookmark Google's BrowserScope project. Check the "Network" tab, for example, to see the parallel download limits across browsers.
  2. Cuzillion (by Steve Souders) lets you quickly create test pages. The waterfalls above actually come from a page created with Cuzillion and tested in AOL's WebPageTest.

No redirects

Redirects are bad for your waterfall. They do nothing for the user but just slow down the experience. Think about what happens: the browser makes a request, waits for the response, the response says "no, no, go get your component from way over there", so the poor old browser starts again - makes a new request, waits for the response.

So, avoid redirects, be they server-side redirects or client-side (JavaScript or meta-tag redirects).

As an illustration of how bad redirects can be, consider the waterfall below. It's from a real page, not a made-up case. It looks like the page could finish loading after about 1.1 seconds, but then a redirect occurs at 0.9s, takes half a second and then points to a 1x1 blank GIF. Obviously it's some sort of stats tracking image. The bad parts are that: a/ it's an IMG tag, therefore it delays onload, and b/ there's a redirect. In the end, the page loads in 1.7 seconds instead of 1.1 seconds. The user experience suffers for no reason. The way to fix this is to simply remove the image from the IMG tag and load it with new Image().src = "1x1.gif", this way taking it out of the onload flow. Then remove that redirect. For such stats tracking cases a 204 No Content response is the appropriate way to go.
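Here's a minimal sketch of that fix. The function name and the stats URL are hypothetical, and firing the request after the load event is just one way to make sure it stays out of the onload flow. The second parameter exists only so the sketch can be run outside a browser.

```javascript
// Request a stats URL without putting an IMG tag in the markup.
// `ImageCtor` defaults to the browser's Image; it is injectable only so
// this sketch can be exercised outside a browser.
function fireTrackingPixel(url, ImageCtor = Image) {
  const img = new ImageCtor();
  img.src = url; // assigning src is what starts the request
  return img;
}

// In a browser: wait for the page to load, then hit the (hypothetical)
// stats URL, which ideally answers 204 No Content with no redirect.
if (typeof window !== "undefined") {
  window.addEventListener("load", function () {
    fireTrackingPixel("/stats.gif?page=home");
  });
}
```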

Blocking script and styles

Scripts block downloads, hence slow down your free-falling waterfall. This is an important topic which deserves an article of its own, so stay tuned.

And what about stylesheets, do they block other downloads in the waterfall? Turns out stylesheets are mostly fine, but they could also block in these cases:

  • in Firefox before version 3 (probably no need to worry about it)
  • in all browsers, if followed by an inline script

The second one is as interesting as it is surprising. It's probably not a good idea to have inline script tags scattered all around the HTML to begin with. And since this can cause the stylesheets to block the downloads of the other components, it should be avoided at all costs.

Be sure to check Steve Souders' blog post for more information. Credit to Steve - I believe he was the first to take note and report this issue.

So check your waterfall: if you see a stylesheet that blocks, look around it in the markup for any inline script tags that can be moved further down.

Cookies and other HTTP headers

We talked about making the responses smaller. But we can also optimize the requests by making the HTTP headers smaller. Take a look at your request and response headers and see whether you're sending too much.

Looking at my blog I see this Server header:

Server: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.7a Phusion_Passenger/2.2.4 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635

Seems to me like too much information.

There's also an ETag:
Etag: "9f38013-2af0-453b354a20bc0"

While ETags may help with caching, they are not necessary when you have a far-future Expires header. And if you have a multi-machine setup, ETags can actually be bad (YSlow will warn you about this issue).

While you have some control over the response headers, you have very little control over the request headers. It's your users' browsers that control them. But you do have control over the Cookie header, and this is where the bigger savings will actually come from, because Cookie is often the biggest part of the HTTP headers.

Reduce cookies

You should aim to send as few cookies as possible. Be careful when you write cookies. Make them smaller and write them to the appropriate subdomain. If you have a blog at blog.example.org and a main site at www.example.org, then don't write the blog.example.org cookies at the *.example.org level.
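To make the scoping concrete, here's a small sketch that builds the Set-Cookie header value both ways. The helper and the cookie name are made up for illustration; the Domain and Path attributes are the standard ones.

```javascript
// Build a Set-Cookie header value. If `domain` is left out, most browsers
// send the cookie back only to the host that set it (old IE versions are
// an exception and also send it to subdomains).
function buildSetCookie(name, value, options = {}) {
  const parts = [name + "=" + encodeURIComponent(value)];
  if (options.domain) parts.push("Domain=" + options.domain);
  parts.push("Path=" + (options.path || "/"));
  return parts.join("; ");
}

// Scoped to the blog: www.example.org and static.example.org never see it.
const scoped = buildSetCookie("blog_prefs", "compact", { domain: "blog.example.org" });

// Too wide: every subdomain of example.org gets it with every request.
const tooWide = buildSetCookie("blog_prefs", "compact", { domain: ".example.org" });

console.log(scoped); // blog_prefs=compact; Domain=blog.example.org; Path=/
```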

Hotmail have talked about how they compress their cookies before writing them. That's also an idea if you have big Cookie headers.

Cookie-less domains for components

Better yet, for static components that don't have any use for these cookies, just don't send them. Set up static.example.org and don't write cookies for this domain. Then put your static components there.
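To get a feel for what a cookie-free domain saves, you can estimate how many bytes of Cookie header the browser uploads with the static requests of a single page view. This is a back-of-the-envelope sketch with made-up numbers (the cookie value and the 30 requests are assumptions), so measure your own headers.

```javascript
// Bytes uploaded for the Cookie header across a page's static requests.
// Assumes ASCII-only cookie values (one byte per character).
function cookieUploadBytes(cookieHeaderValue, staticRequestCount) {
  const headerLine = "Cookie: " + cookieHeaderValue + "\r\n";
  return headerLine.length * staticRequestCount;
}

// Hypothetical page: a small cookie and 30 static components
const wastedBytes = cookieUploadBytes("session=abc123; prefs=compact", 30);
console.log(wastedBytes); // 1170
```

Real-world Cookie headers are often much bigger than this one, and upload bandwidth is usually the slower direction for your users.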

A curious piece of stats here - Philip Dixon reported (slides) that after Shopzilla.com moved their static components to a cookie-free domain they made more money.

"Images to non-cookie domain resulted in 0.5% top line revenue increase!"

"top line" means revenue (maybe you know that but I had to check :) ). This is a fascinating idea - that you can make more money by improving something as simple as cookie-less components. So, every little bit helps. Keep making your site faster and ... you never know.

www or no-www

This point also adds to the good old www vs. no-www flamewar. If you opt for no www, then in IE you cannot write cookies to example.org alone; they get written to *.example.org. This means your static.example.org will see all the cookies too. This post has some more info on the topic.

If you've already polluted your top-level domain with long-term cookies, the remedy would be to just buy a new "clean" domain, like examplestatic.org, never ever write cookies to it and use it for your static components.

Thanks!

And that's it for today's post, thank you for reading and may HTTP be with you ;)


8 Responses

  1. Wonderful article as usual ;-)

  2. Excellent article. It seems that recently the number of your blog posts has suddenly increased, and readers like me are very happy about it.

    One small correction – “In this serias of performance posts” It seems that serias should be series.

  3. Just to be sure, I always use a new clean domain to publish static content on. This way, whenever you are using a 3rd-party script that potentially sets a cookie for all subdomains, you are 100% sure that your static server will never be affected.

    A domain only costs like $10 max, so that's money well spent.

  4. You mentioned Yslow will warn about misconfigured etags – do you know if Yslow knows the etags are misconfigured, or if the warning given is shown whenever etags are sent as part of the response?

  5. Nice WebPageTest Tool, Cuzillion I knew.
    Thank you for good article.

  6. Another great article. I’d love to see more coverage of some of the dynamic script loaders that are starting to come out that allow scripts to load in parallel instead of with blocking behavior. It’s an important component in helping the waterfall free fall, as you said.

    My project (which Steve Souders has helped with) is LABjs for exactly that purpose. One key thing this approach provides is giving the same kind of parallel loading behavior that new browsers like FF3.5 have to all other browsers (even IE6 and 7!).

    Here’s a write up I did explaining the basics of how LABjs does its thing: http://blog.getify.com/2009/11/labjs-new-hotness-for-script-loading/

  7. @tim – looking at a string, it’s hard to judge what goes into an ETag. So YSlow only complains when you’re using IIS and the default IIS formatted ETag or you’re using Apache and the default pattern. If it sees the default, there’s a chance that you didn’t touch the ETags and may be at risk. If it’s anything but the default, it won’t complain, assuming the people who configured their ETags know what they are doing.

  8. Hey I am so grateful I found your web site, I really found you by accident, while I was browsing on Aol for something else, Anyhow I am here now and would just like to say many thanks for a fantastic post and a all round thrilling blog (I also love the theme/design), I don’t have time to browse it all at the moment but I have book-marked it and also added in your RSS feeds, so when I have time I will be back to read much more, Please do keep up the awesome jo.
