The performance roadmap / Stoyan's phpied.com

2010 update:
Lo, the Web Performance Advent Calendar hath moved

Dec 1: This is the first in the series of performance articles as part of my 2009 performance advent calendar experiment. Stay tuned for the next articles.

As you've probably heard (and maybe all too often), we live in Web 2.0. This may mean different things to differently inclined folks, but for us developers it means richer Ajaxy pages that communicate more frequently with the server, and one-page types of apps (think Gmail). Where "the web as she was meant to be" used to be a document-serving system, things are different now: web pages are increasingly like applications, and much less document-y.

Let's take a look at what I once called (obviously still under the effect of that dental anesthetic) "The Life of Page 2.0" - a parallel between a human life and the modern-day web app.

The life of Page 2.0

[Figure: Page 2.0 timeline]

On the timeline above you can see the key moments in the life of a page and how they correspond to a human life.

It all starts with the page request. Someone types in a URL or clicks a link. This is the moment of the page's conception.
Next, the server is "pregnant" with that page. Your server-side code fetches data from somewhere - a database, web services - and crunches that data, stitching together a string of HTML output.
Luckily, if there are no complications, abortions, 404s and such, the pregnancy is over, the HTML is sent to the browser and the page is officially born. Look, it even has a <title>, which is hopefully something different than "500 Internal Server Error".
Then comes the waterfall - downloading all the extra components required by the page - images, scripts, styles... This phase ends with the onload event. It roughly corresponds to the childhood and teen years of our little human, who eventually graduates and becomes a full-blown Mr. Page.
Right after onload comes a settling process (the young fella finding his/her identity) - attaching event handlers to DOM elements, some initialization work, maybe fetching a few more components, or getting some data via an Ajax call. The page then settles. The status bar stops showing stuff that's being downloaded, indicators stop spinning, the cursor is not busy. All is good, the young adult is back from backpacking across Western Europe and Tibet and ready to get married, take over that sales position in dad's company and own a barbecue grill.
Then life goes on and the user interacts with the page. Some pages are quite uneventful, others are full of ups and downs (uploads/downloads), always getting more data, updating and self-improving, always on the move.
Sooner or later along comes the Grim Reaper to end it all. The user clicks away from the page, making a new request and our Page is laid to rest after a brief onunload moment.

Some comments on Mr. Page's life

First - if that looks like a lot happening, it is. The good news for the performance folk is that since there's a lot going on, there's a lot to improve. Performance optimization is a fun and challenging activity which is anything but boring.

Next - the onload. While technically onload is a concrete event which should signify when the page is ready, it's not always that simple. The "user onload" is an undefined point in time that could happen before onload or well after it. It depends on the page and on the user. An article type of page could be considered ready when the article title and content are ready. The user happily reads, and hence interacts with the page, while images, ads and whatnot are still being downloaded. Other times the onload may happen relatively quickly, but the actual page content is still being retrieved (Google Reader on the iPhone comes to mind) and the page is far from usable. It's up to you to figure out the "user onload" for your type of page.

Where to improve

For most pages, probably the first and most important place to optimize is the waterfall stage. But you can optimize the page in any of the stages above, prioritizing the ones where the most time is spent.
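To make "where the most time is spent" a bit more concrete, here's a minimal sketch that splits the time before the page settles into the phases from the timeline and picks the longest one. The `PageTimeline` shape, the field names and the numbers are all made up for illustration; they don't come from any browser API or real measurement.

```typescript
// Hypothetical timestamps, in milliseconds since the request was made.
// The field names are illustrative, not part of any browser API.
interface PageTimeline {
  firstByte: number; // HTML starts arriving: the page is "born"
  onload: number;    // the waterfall is done
  settled: number;   // post-onload initialization is done
}

type LoadPhase = "server" | "waterfall" | "settling";

// Time spent in each phase of getting the page ready.
function phaseDurations(t: PageTimeline): Record<LoadPhase, number> {
  return {
    server: t.firstByte,
    waterfall: t.onload - t.firstByte,
    settling: t.settled - t.onload,
  };
}

// The phase that takes the longest, i.e. the first candidate for optimization.
function slowestPhase(t: PageTimeline): LoadPhase {
  const durations = phaseDurations(t);
  const phases: LoadPhase[] = ["server", "waterfall", "settling"];
  return phases.reduce((worst, p) => (durations[p] > durations[worst] ? p : worst));
}

// Made-up numbers, for illustration only.
const example: PageTimeline = { firstByte: 400, onload: 3400, settled: 3900 };
const worstPhase = slowestPhase(example);
```

With numbers like these the waterfall dominates, which matches the hunch above, but the point of measuring is that your page may well be different.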

Below is a summary of the main optimization activities in each stage. Many of these will be discussed in follow-up articles.

at request time - for example, send fewer cookies, and no cookies at all for static requests
optimize the waterfall - this is pretty simple to understand. "Simple laws of physics", as my fellow Yahoo! and perf extraordinaire Venkat says. I would divide the waterfall optimizations into these categories:
1. less stuff - your waterfall will be shorter when there are fewer things to fall. Make fewer HTTP requests, merge components, use sprites, data URIs/MHTML, remove what you don't need, lazy-load the rest. Use caching and a "never expires" policy to improve repeat visits. Remove duplicates, near duplicates, and plain old 404 errors
2. smaller stuff - once you've removed or merged components, the ones that are left should be as small as they can be - that means use compression, minification, image optimization, zero-body 204 components
3. move out of the way - some things in the waterfall harm more than others, and not everything happens in parallel. Parallel is good: it means more stuff is downloaded at the same time and the waterfall finishes earlier. JavaScripts block downloads. Sometimes CSS does too. CSS blocks rendering. Redirects should be burned. DNS lookups cost too, so use fewer domains.
4. start early so you can finish early, kind of obvious, eh? This means use flushing and HTTP chunked encoding to start the waterfall even before the page is born. (You know it's going to be a baby girl, so err on the side of the pink color scheme)
the settling after-onload phase - you can do stuff here too. For example, you can start some of the work at DOMContentLoaded; you don't always need to wait for onload
the interactive/life phase - this includes CSS and JavaScript optimizations to make your user interaction smoother and more pleasant and the UI more responsive. Touching the DOM lightly. Towards the end-of-life you can help too. What is a poor old page to do, but leave a good inheritance by preloading some of the components that the children and grandchildren pages may need
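As a taste of the "less stuff" category, here's a minimal sketch of a far-future caching policy for static components. Everything in it is an assumption made for illustration: the `cachingHeaders` helper is hypothetical, static components are recognized by file extension, the lifetime is one year, and other responses are simply told to revalidate.

```typescript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Assumption: static components can be recognized by their file extension.
const STATIC_EXTENSIONS = /\.(css|js|png|gif|jpe?g|ico)$/i;

// Hypothetical helper: picks the caching headers for a response.
function cachingHeaders(path: string, now: Date): Record<string, string> {
  if (!STATIC_EXTENSIONS.test(path)) {
    // Assumption: anything else (e.g. HTML) is revalidated on each visit.
    return { "Cache-Control": "no-cache" };
  }
  const expires = new Date(now.getTime() + ONE_YEAR_SECONDS * 1000);
  return {
    // Repeat visits within a year can skip the request for this component.
    "Cache-Control": `public, max-age=${ONE_YEAR_SECONDS}`,
    // toUTCString() produces the "Day, DD Mon YYYY HH:MM:SS GMT" date format.
    "Expires": expires.toUTCString(),
  };
}

const logoHeaders = cachingHeaders("/img/logo.png", new Date(Date.UTC(2009, 11, 1)));
```

The catch with a policy like this is that a cached component won't be requested again until it expires, so when the file changes you need to change its URL too (a version number in the file name is one common way).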

End of day 1

Thanks for reading so far! This first article was more high level, an overview of some of the more technical parts to follow. It's good to have a view. It helps you prioritize and adopt a more holistic approach to the optimization effort. At the end of it, it's a marathon, not a sprint.
