Archive for the 'performance' Category

Here’s to a faster Recommendations plugin

Sunday, May 5th, 2013

So I've been part of the quest of making all Facebook social plugins faster, even if it means rewriting them from scratch. After the Send plugin, Like button (perf optimizations described here), Follow plugin, Facepile and Likebox (perf details here), now you have a faster Recommendations plugin.

The techniques used to make it faster are simple and effective: better resource packaging, reducing number of requests, inlining CSS, reducing the amount of CSS and JS by untangling dependencies, cleaning up, and sometimes simply rewriting. You can read more about these in the previous posts.

Now the results. The before-after comparison is accurate because it uses snapshots taken at the same time. This is possible because we kept an old endpoint serving the old code path. The official URL is /plugins/recommendations.php but we kept the legacy URL /widgets/recommendations.php pointing to the old code for a little while.

Before

After

Some analysis

The total payload change is drastic. Mainly due to better packaging. And better JS modules and dependencies.

The number of requests is reduced by 1/3. Not as drastic, but not too bad either. Most of the requests are images, which is ok. They don't block anything and whenever they arrive, they are welcome. But they are not on the critical path. The reduced requests are all JavaScript. We already had a previous optimization so CSS was already inlined. But thanks to rewrites, now the HTML payload (including inline CSS) is 6.2K gzipped, down from 9.4K. Which means the initial paint can start sooner.

The initial paint (render start) now happens in half the time. And, even better, the initial paint is a complete plugin, except for the images, whereas before it was only partial content. This is because CSS is here early and all JS is out of the way (async loaded, also take a peek here)

initial paint screenshots

The fully loaded time is not all that important since the user has a usable list of recommendations already delivered with the initial paint. But it's still 2x faster which makes me happy.

All in all

2x faster plugin overall, 2x faster (and infinitely better) first impression. 7x payload improvement.

Making the web faster and other personal notes

Just want to take a second to mention how good it feels to be working on such high-impact performance optimizations. These social plugins are everywhere on the web. By making them faster I am fortunate to have the opportunity to make the whole web faster. Meaning make millions of sites faster, affecting the lives of billions of people, every day.

What can I say, Facebook is a great place to work. The people, the impact. Every line you write matters. It's also up to you to pick what you want to work on and where your talents and interests will have the greatest impact. And then there are the hackathons and hackamonths which means even more freedom.

I recently finished a hackamonth project, which explains why I've been silent here and on Twitter and everywhere (yet, thanks to O'Reilly folks, even though I missed a few deadlines, we were able to push this baby out the door). Let me tell you - a hack-a-month is better than vacation. Being left alone for a month to explore a completely new (to you) territory - priceless!

(Oh, if that sounds like something you'd like to do, hit me up on ssttoo at ymail with your resume. FB now has engineering offices in NYC, Seattle and London, so if moving was a problem, now there are more options)

 

CSS animations off the UI thread

Tuesday, March 12th, 2013

This excellent Google I/O talk mentions that Chrome for Android moves the CSS animations off of the UI thread, which is, of course, a great idea. Playing around with it, here's what I found:

  • Browser support: Desktop Safari, iOS Safari, Android Chrome.
  • You need to use CSS transforms. Animating regular properties doesn't work.

Update: (see comments) confirmed support in IE10. Reported support in Firefox OS too, but I cannot personally confirm

More details and test page below.

Single UI thread

As you probably know the browser is single threaded. Do something heavy in ECMAScript land and everything freezes.

The big idea

CSS animations should be excluded from the "everything" that freezes.

Test page

Here's a test page with some animations. Click the kill button and see what happens.

Animations

The red box that spins is animated like:

.spin {
  animation: 3s rotate linear infinite;
}
 
@keyframes rotate {
  from {transform: rotate(0deg);}
  to {transform: rotate(360deg);}
}

The green one is also animated with a transform:

.walkabout-new-school {
  animation: 3s slide-transform linear infinite;
}
 
@keyframes slide-transform {
  from {transform: translatex(0);}
  50% {transform: translatex(300px);}
  to {transform: translatex(0);}
}

The blue one is animated using the margin-left property, not a transform:

.walkabout-old-school {
  animation: 3s slide-margin linear infinite;
}
 
@keyframes slide-margin {
  from {margin-left: 0;}
  50% {margin-left: 100%;}
  to {margin-left: 0;}
}

Kill switch

The kill button just pegs the CPU in a busy loop for 2 seconds:

function kill() {
  var start = +new Date;
  while (+new Date - start < 2000){}
}

Results

In non-supporting browsers, which is most of them, the kill switch kills all the animations. Business as usual.

In the supporting browsers (all Safaris and Android Chrome) the kill only affects the blue box, the one that animates a CSS property, as opposed to using a CSS transform. But the animations that use a transform keep on going!

Take aways

  1. Rejoice! The future is here! Drink and dance uncontrollably around the campfire!
  2. After you sober up, make sure your CSS animations use transform: where possible
  3. Keep migrating them JS animations to CSS
  4. Bug your browser vendor to support this
 

C3PO: Common 3rd-party objects

Monday, February 18th, 2013

Problem: too much JavaScript in your page to handle 3rd party widgets (e.g. Like buttons)
Possible solution: a common piece of JavaScript to handle all third parties' needs

What JavaScript?

If you've read the previous post, you saw that most features in a third party widget are possible only if you inject JavaScript from the third party provider into your page. Having "a secret agent" on your page, the provider can take care of problems such as appropriately resizing the widget.

Why is this a problem?

Third party scripts can be a SPOF (an outage), unless you load them asynchronously. They can block onload, unless the provider lets you load it in an iframe (and most don't). There can be security implications because you're hosting the script in your page with all permissions associated with that. And in any case, it's just too much JavaScript for the browser to parse and execute (think of mobile devices)

If you include the most common Like, Tweet and +1 buttons and throw in Disqus comments, you're looking at well over 100K (minified, gzipped) worth of JavaScript (wpt for this jsbin)

This is more than the whole of jQuery, which previous experiments show can take a noticeable 200ms just to parse and evaluate (assuming it's cached) on an iPhone or Android.

What does all of this JS do?

The JavaScript used by third parties is not always all about social widgets. The JS also provides API call utilities, other dialogs and so on. But the tasks related to social widgets are:

  1. Find html tags that say "there be widget here!" and insert an iframe at that location, pointing to a URL hosted by the third party
  2. Listen to requests from the new iframes and fulfill these requests. The most common request is "resize me, please"

Now, creating an iframe and resizing it doesn't sound like much, right? But every provider has to do it over and over again. It's just wasteful code duplication that the browser has to deal with.

Can't we just not duplicate this JavaScript? Can we have a common library that can take care of all widgets there are?

C3PO draft

Here's a demo page of what I have in mind. The page is loading third party widgets: like, tweet, +1 and another one I created just for illustration of the messaging part.

It has a possible solution I drafted as the c3po object. View source, the JS is inline.

What does c3po do?

The idea is that the developer should not have to make any changes to existing sites, other than remove FB, G, Tw, etc JS files and replace with the single c3po library. In other words, only the JS loading part should be changed, not the individual widgets code.

c3po is a small utility which can be packaged together with the rest of your application code, so there will be no additional HTTP requests.

Parsing and inserting iframes

The first task for c3po is to insert iframes. It looks for HTML tags such as

<div class="fb-like" data-href="http://phpied.com"></div>

Similar tags are generated by each provider's "wizard" configuration tools.

In place of this tag, there should be an iframe, so the result (generated html) after c3po's parsing should roughly be like:

<div class="fb-like" data-href="http://phpied.com">
  <iframe 
    src="http://facebook.com/plugins/like.php?href=http://phpied.com">
  </iframe>
</div>

The way to do this across providers is to just have every data- attribute passed as a parameter to the 3rd party URL.

Third parties can be set up using a register() method:

// FB
c3po.register({
  'fb-like': 
    'https://www.facebook.com/plugins/like.php?',
  'fb-send':
    'https://www.facebook.com/plugins/send.php?'
});
 
// Tw
c3po.register({
  'twitter-share-button':
    'https://platform.twitter.com/widgets/tweet_button.html#'
});
 
// ...

The only additional parameter passed to the third party URL is cpo-guid=..., a unique ID so that the iframe can identify itself when requesting services.
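To make the URL building concrete, here is a minimal sketch of how the data- attributes and the GUID could end up on the provider URL. The buildWidgetUrl helper and the plain attribute map are assumptions made for illustration; they are not taken from the c3po source, and in a page the map would be read from the element's attributes.

```javascript
// Hypothetical helper, not part of the c3po source.
// baseUrl:    the URL registered for the widget (ends in "?" or "#")
// attributes: map of attribute name -> value, as found on the tag
// guid:       unique ID generated for this iframe
function buildWidgetUrl(baseUrl, attributes, guid) {
  var pairs = [];
  Object.keys(attributes).forEach(function (name) {
    // only data-* attributes are forwarded; "data-href" becomes "href"
    if (name.indexOf('data-') !== 0) {
      return;
    }
    pairs.push(
      encodeURIComponent(name.slice(5)) + '=' +
      encodeURIComponent(attributes[name])
    );
  });
  // the one extra parameter, so the iframe can identify itself later
  pairs.push('cpo-guid=' + encodeURIComponent(guid));
  return baseUrl + pairs.join('&');
}

var likeUrl = buildWidgetUrl(
  'https://www.facebook.com/plugins/like.php?',
  {'class': 'fb-like', 'data-href': 'http://phpied.com'},
  '2c23263549d648000'
);
```

Values are URL-encoded here, which the simplified generated HTML above leaves out for readability.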

The parsing and inserting frames works today, as the demo shows. The only problem is you don't know how big the iframes should be. You can guess, but you'll be wrong, given i18n labels and different layouts for the widgets. It's best if the widget tells you (tells c3po) how big it should be by sending a message to it.

X-domain messaging

What we need here is the iframe hosted on the provider's domain to communicate with the page (and c3po script) hosted on your page. X-domain messaging is hard, it requires different methods for different browsers and I'm not even going to pretend I know how it works. But, if the browser supports postMessage, it becomes pretty easy. At the time of writing 94.42% of the browsers support it. Should we let the other 5% drag us down? I'd say No!

c3po is meant to only work in the browsers that support postMessage, which means for IE7 and below, the implementers can resort to the old way of including all providers' JS. Or just have less-than-ideally-resized widgets with reasonable defaults.

When the widget wants something, it should send a message, e.g.

var msg = JSON.stringify({
  type: 'resize',
  guid: '2c23263549d648000',
  width: 200, 
  height: 300
});
parent && parent.postMessage(msg, '*');

See the example widget for some working code.

The c3po code that handles the message will check the GUID and the origin of the message and if all checks out it will do something with the iframe, e.g. resize it.
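A rough sketch of what such a handler could look like, assuming c3po keeps a registry of the iframes it inserted, keyed by GUID, together with the origin each one was loaded from. The registry shape and the handleWidgetMessage name are made up for this illustration; the demo code may do it differently.

```javascript
// Hypothetical registry shape: GUID -> {origin: '...', iframe: element}.
// Returns true if the message was accepted and acted upon.
function handleWidgetMessage(event, registry) {
  var msg;
  try {
    msg = JSON.parse(event.data);
  } catch (e) {
    return false; // not JSON, so not one of our messages
  }
  if (!msg || typeof msg !== 'object') {
    return false;
  }
  if (!Object.prototype.hasOwnProperty.call(registry, msg.guid)) {
    return false; // unknown GUID
  }
  var entry = registry[msg.guid];
  if (entry.origin !== event.origin) {
    return false; // known GUID, but sent from the wrong origin
  }
  if (msg.type === 'resize') {
    var width = Number(msg.width);
    var height = Number(msg.height);
    if (!isFinite(width) || !isFinite(height)) {
      return false;
    }
    entry.iframe.style.width = width + 'px';
    entry.iframe.style.height = height + 'px';
    return true;
  }
  return false; // a message type this sketch does not handle
}
```

In a page this would be wired up with something like window.addEventListener('message', ...), passing the event and the registry along. Keeping the checks in a plain function makes them easy to try out without a browser.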

Again, take a look at the demo code to see how it all clicks together

Next?

As you see in the demo, only the example widget is resized properly. This is because it's the only one that sends messages that make sense to c3po.

Next step will be to have all widget providers agree on the messages and we're good to go! The ultimate benefit: one JS for all your widget-y needs. One JS you can package with your own code and have virtually 0 cost during initial load. And when you're ready: c3po.parse() and voila! - widgets appear.

Of course, this is just a draft for c3po, I'm surely missing a lot of things, but the idea is to have something to start the dialogue and have this developed in the open. Here's the github repo for your forking pleasure.

Make sense? Let's talk.

 

Speed geek’s guide to Facebook buttons

Thursday, February 14th, 2013

or "How to help your users share your content on Facebook and not hurt performance"

Facebook's like button is much much faster now than it used to be. It also uses far fewer resources. And lazy-evaluates JavaScript on demand. And so on. But it's still not the only option when it comes to putting a "share this article on Facebook" widgety thing on your site.

The options are listed roughly in order from fastest (and fewest features) to slowest (and most features).

#1: A share link

Note that this feature has been deprecated but it still does work. And you see it all over the place.

A simple link to sharer.php endpoint is all it takes. The u parameter is your URL. E.g.:

<a 
  href="https://www.facebook.com/sharer/sharer.php?u=phpied.com" 
  target="_blank">
  Share on Facebook
</a>

The above is a hardcoded URL. You can, of course, spit the current URL on the server side. A JS-only client-side solution could be to take the document.location. You can also pop a window. And use a button, or an image. Say something like:

<button id="sharer">Share</button>
<script>
document.getElementById('sharer').onclick = function () {
  var url = 'https://www.facebook.com/sharer/sharer.php?u=';
  url += encodeURIComponent(location.href);
  window.open(url, 'fbshare', 'width=640,height=320');
};
</script>


Method #1's performance price: none

This is just a link you host in your HTML or a bit of JavaScript you can inline or package with your own JavaScript (it is, after all, your own JavaScript)

#2: Feed dialog

The feed dialog is the next incarnation of the share popup.

It can also be as simple as a link, like so

https://www.facebook.com/dialog/feed
  ?link=jspatterns.com
  &app_id=179150165472010
  &redirect_uri=http://phpied.com

You need a redirect_uri which can be something like a thank you page. But instead of "thank you", you can simply go back to the article by making redirect_uri and link point to the same URL

Again, a client-only solution could be something like:

  var feed = 'https://www.facebook.com/dialog/feed?app_id=179150165472010';
  var url = encodeURIComponent(location.href);
  feed += '&link=' + url + '&redirect_uri=' + url;
  window.open(feed, 'fbshare', 'width=640,height=480');

The result is a dialog that looks like:

feed

But this feed dialog can also be a popup. You do this by adding &display=popup. This hides the FB chrome. And you can also make the "thank you" page just a simple page that closes the window.
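Putting the parameters together, a small helper along these lines could build either flavor of the feed dialog URL. The feedDialogUrl name and its argument order are just for illustration; the parameters themselves (app_id, link, redirect_uri, display) are the ones described above.

```javascript
// Illustrative helper: builds a feed dialog URL, optionally in popup mode.
function feedDialogUrl(appId, link, redirectUri, asPopup) {
  var url = 'https://www.facebook.com/dialog/feed' +
    '?app_id=' + encodeURIComponent(appId) +
    '&link=' + encodeURIComponent(link) +
    '&redirect_uri=' + encodeURIComponent(redirectUri);
  if (asPopup) {
    url += '&display=popup'; // hides the FB chrome
  }
  return url;
}

var popupUrl = feedDialogUrl(
  '179150165472010',
  'http://phpied.com',
  // a "thank you" page that does nothing but close the window
  'http://phpied.com/files/fb/window.close.html',
  true
);
// In a browser you would then do something like:
// window.open(popupUrl, 'fbshare', 'width=640,height=480');
```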

The result:

feedpopup

The other required thing is the app id. You need one. But that's actually cool because it has side benefits. For example better error messages for you (the app admin) that the users don't see. It also gives you a little "via phpied.com" attribution linked to the App URI which is a nice traffic boost hopefully as your sharer's friends see the story in their newsfeed or timeline and click the "via".

story

So, App ID is good, you can get one here.

Additionally there's a bunch of other params you can pass to the feed dialog to control how the story is displayed. You can provide title, description, image, etc. Full list here.

Method #2's performance price: none

Feed dialog has the same (non-existing) performance requirements as the share links. It's all inline. Any content coming from Facebook is only on user interaction.

BTW, this is the method youtube currently uses.

#3: Feed dialog via JS SDK

Now we move on from simple links and popups to using the JavaScript SDK.

First things first, you absolutely must load the SDK asynchronously. Or non-onload-blocking-asynchronously in an iframe. More on these two later.

After you load the SDK like so:

(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return;
  js = d.createElement(s); js.id = id;
  js.src = "//connect.facebook.net/en_US/all.js";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));

Then, whenever you're ready, you can make a call to get the feed dialog:

FB.ui({
  method: 'feed',
  redirect_uri: 'http://phpied.com/files/fb/window.close.html',
  link: 'http://phpied.com',
  // picture: 'http...jpg',
  caption: 'Awesomesauce',
  // description: 'Must read daily!'
});

For a working example, check this example in jsbin

The result:

jsbin-feed

As you can see, this is now a real properly resized popup. No FB chrome, nice and clean. In general the JS SDK makes everything better. But you need to load it first - the performance price you pay for all the magic.

Method #3's performance price: an async JS

Opening the feed dialog this way requires you to load the Facebook JavaScript SDK. It's one JS file with a short expiration time (20 mins). When it loads, it also makes two additional requests required for cross-domain communication. These requests are small though and with long-expiration caching headers. Since the JS SDK is loaded many times during a regular user's surfing throughout the web, these two additional requests have a very high probability of being cached. So does the JS SDK itself. If it's not cached, at least it's a conditional request, likely with a 304 Not Modified response.

Here's the waterfall of loading the jsbin test page where you can see the JS SDK loading (all.js) and the two x-domain thingies (xd_arbiter.php)

Note that by default the JS SDK sends an additional request checking whether the user is logged in. If you don't need that, make sure you set the login status init property to false, as shown in the test page, like:

FB.init({appId: 179150165472010, status: false});

When loading the JS SDK you must absolutely make sure it's loaded asynchronously, and even better - in an iframe, so the onload of your page is never blocked.

#4: Like button in an iframe

We're coming to the Like button. There are two ways to load it: either you create an iframe and point it to /plugins/like.php or you include the JS SDK and let the SDK create the iframe. Let's take a look at the you-create-iframe option first.

The integration is straightforward: You go to the help page, use the "wizard" configurator found there and end up with something like:

<iframe 
  src="//www.facebook.com/plugins/like.php?href=phpied.com&amp;width=450&amp;show_faces=true&amp;height=80" 
  scrolling="no" 
  frameborder="0" 
  style="border:none; overflow:hidden; width:450px; height:80px;" 
  allowTransparency="true"></iframe>

You're done!

The button comes in three layouts: standard (biggest), box_count and button_count

As you can see, you get quite a bit more features here, e.g. number of likes and social context (who else has liked) in the standard layout. Also in the standard layout you get a little comment input. You don't get one in the other layouts because there's no space in the little iframe. You define the iframe and the code inside the iframe cannot break out of it and do something wild (or useful), e.g. open a big commenting dialog. Or make the iframe bigger because the word "Like" may be significantly longer in some languages. When you "trap" the iframe in your dimensions, it stays there.

Method #4's performance price: iframe content

In this method every time someone loads your page, they also visit a page (like.php) hosted by facebook.com. Now, this page is highly optimized: it only has html, sprite and async lazy-executed JS (which doesn't block onload). 3 requests in total. Maybe some faces (profile photos), depending on the layout and whether the user's friends have liked the URL.

As you probably know, every iframe's onload blocks the parent window's onload. So, if you feel so inclined you can always do any old lazy-load trick in the book. E.g. create the iframe after window.onload, or "double-frame" it, or (for the webkits out there) write the iframe src with a setTimeout of 0.
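Here is a sketch of the first trick, creating the iframe only after the page's load event. The insertFrameAfterLoad name is made up, and the window and document are passed in as arguments purely so the function can be exercised outside a browser; in a page you'd pass the real window and document.

```javascript
// Illustrative helper: defer creating an iframe until the page has loaded,
// so the iframe's own loading cannot delay the parent's onload.
function insertFrameAfterLoad(win, doc, container, src) {
  function insert() {
    var frame = doc.createElement('iframe');
    frame.setAttribute('scrolling', 'no');
    frame.setAttribute('frameborder', '0');
    frame.src = src;
    container.appendChild(frame);
  }
  if (doc.readyState === 'complete') {
    insert(); // page already loaded, nothing left to block
  } else {
    win.addEventListener('load', insert, false);
  }
}

// Typical use in a page (the element id is an assumption):
// insertFrameAfterLoad(window, document,
//   document.getElementById('like-holder'),
//   'https://www.facebook.com/plugins/like.php?href=phpied.com');
```

The trade-off is that the button shows up a bit later, which is usually fine for something that lives below the article.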

Another thing to consider is to always load the iframe via https, so there's no http-https redirect if the user has opted to always use facebook via https.

#5: Like button via SDK

This is building on what you already know about #3 and #4: You load the SDK. You sprinkle <fb:like> (or <div class="fb-like">) where you want buttons to appear. The SDK finds these and replaces them with iframes.

<!-- all defaults -->
<fb:like></fb:like>
 
<!-- layout, send button -->
<div class="fb-like" data-send="true"></div>

If you don't specify the URL to like, it defaults to the current page.

This is the most full-featured button implementation. It will resize the button as required by content and i18n. It will always present a comment dialog. (When people share with their own comment, these stories do better, because it's always nice to see a friend's comment attached to a URL, right?)

The good thing about this method is that you can load any other FB plugin (e.g. follow button by just adding an fb:follow in the HTML) without re-loading the SDK, it's already there and can handle all the plugins, dialogs and API requests.

Method #5's performance price: JSSDK + iframe content

Combining the features of methods #3 and #4 also combines their performance impact. Again, the like.php iframe is heavily optimized and tiny. Also the SDK has a chance of being cached from the user's visit to another page. And, of course, you always load the SDK asynchronously so its impact on your initial page loading is minimal. Or load the SDK in an iframe so the impact is virtually 0.

So the total cost in terms of number of requests in empty cache view is 6. 3 from the iframe + 3 from the SDK. Full cache view should be 1 request - just the like.php frame with the current count, faces and so on.

But again, to minimize the impact, you just load the SDK in an iframe (so the whole widget doesn't block onload and doesn't SPOF) or asynchronously (so it doesn't SPOF and doesn't block onload in IEs)

Summary

# | Method | Features | Cost
1 | Share link | link opens popup, no like count, no social context | none
2 | Feed dialog | link opens page, no like count or context. You can pass customized description, image, etc for the story. Up to you to do a "thank you" page. | none
3 | Feed via SDK | properly resized popup, JS control over the flow. No like count or context | Loading JS SDK
4 | Like button in your frame | like count, social context, but no i18n resizing, comment option only sometimes | like.php iframe (3 requests)
5 | Like button via SDK | All features plus proper resizing, comment dialog, easier to implement via fb:like tags in HTML | like.php + SDK

I mentioned it a few times in the article but let me repeat once again for the TL;DR folks. If you're loading the JS SDK, it's absolutely mandatory that you make sure it's either loaded asynchronously to avoid SPOF, or even better - in an iframe to avoid blocking onload.

 

Run jsperf tests in a bunch of WebPagetest browsers

Monday, February 11th, 2013

Motivation

1. You write a new test to confirm a JavaScript-related performance speculation
2. You click
3. Your test runs in a bunch of browsers

Glossary

JSperf.com is the site where all your JavaScript performance guesswork should go to die or be confirmed. You know how the old wise people say "JSperf URL or it didn't happen! Now off my lawn!". Yup, that jsperf.com

WebPagetest.org (WPT) is the site where you get answers to the ol' question: "Why do people say my oowsome site is slow? And what should I do about it?"

Bookmarklet is a little piece of JavaScript you conveniently access from your browser bookmarks and inject into other non-suspecting sites.

Github is where you host code.

Bookmaker tool makes a bookmarklet from a .js file URL (probably hosted on github)

Trouble in paradise

These days we're so happy and spoiled with all these amazing tools around us. And yet, when you create a JSPerf test, you have to open all these browsers and run the test everywhere. Even IE. And, when on Mac, IE is usually not readily available. Plus it comes in a bunch of versions - from almost-but-not-quite-forgotten IE6, all the way to IE10 The Greatest - and they have different, sometimes contradicting, performance characteristics.

To the rescue: WPT

WebPagetest has: a/ ability to run in a bunch of browsers and b/ an API

The bookmarklet

The bookmarklet. It's here, on github

It starts by inquiring about your WPT API key. I know, you have to get one. You can read the API docs on how to get one, but let me save you the trip: you just need to ask pmeenan@[the tool's domain].org for a key. Politely. Tell him I sent you. Promise not to abuse.

  var key = localStorage.wpt_key;
  if (!key) {
    var prompt = window.__proto__.prompt;
    key = prompt('Your WebPageTest API key, please?');
    if (!key) {
      return gameOver();
    }
    localStorage.wpt_key = key;
  }

The key is stored in your localStorage so you don't have to paste it all the time.

Oh, you may wonder what's up with that:

var prompt = window.__proto__.prompt;
prompt('Message...');

Looks like something somewhere on jsperf is doing window.prompt = function(){}, same for window.open and probably others. Makes sense, you don't want popup-y stuff (by the thousands) while running a test a gazillion times. So the bookmarklet has to go to window.__proto__ for the original prompt
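If the trick looks like magic, here is the same idea with a plain stand-in object instead of a real window. FakeWindow is made up for the illustration; the point is only that assigning a property on an object shadows the method on its prototype without touching it.

```javascript
// Stand-in for the browser's window; not a real window object.
function FakeWindow() {}
FakeWindow.prototype.prompt = function (message) {
  return 'original prompt: ' + message;
};

var fakeWindow = new FakeWindow();

// What jsperf appears to do: shadow the method with a no-op own property.
fakeWindow.prompt = function () {};

// The prototype still holds the original, so grab it from there.
// Object.getPrototypeOf(obj) gives the same object as obj.__proto__.
var originalPrompt = Object.getPrototypeOf(fakeWindow).prompt;

var shadowed = fakeWindow.prompt('Key?'); // undefined, the no-op ran
var recovered = originalPrompt.call(fakeWindow, 'Key?'); // 'original prompt: Key?'
```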

Moving on.

Setting up the constant params of the API call. The variable param will be the location which will tell what browser to use. We also give the (undocumented) time a value of 60s, so that the test has time to run. We also want only one run and just the first run (no full cache run).

The URL to test will be the current page loaded in jsperf.com which is where you run the bookmarklet. And we'll append #run for autorun.

  // base params
  var wpt = 'http://www.webpagetest.org/runtest.php?';
  var params = {
    k: key,
    time: 60,
    runs: 1,
    fvonly: 1,
    url: 'http://jsperf.com' + location.pathname + '#run'
  };
  Object
    .keys(params)
    .forEach(function(key) {
      wpt += key + '=' + encodeURIComponent(params[key]) + '&';
    });

Finally, set up the locations with browsers IE6,7,8,9,10 and open all these browser windows:

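  // localStorage holds strings; a saved list is assumed to be comma-separated
  var locations = localStorage.wpt_locations;
  if (locations) {
    locations = locations.split(',');
  } else {
    locations = ['Dulles_IE6', 'Dulles_IE7', 'Dulles_IE8', 'Dulles_IE9', 'Dulles_IE10'];
  }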
  
  // pop some windows up
  var open = window.__proto__.open;
  locations.forEach(function(loco){
    open(wpt + 'location=' + encodeURIComponent(loco));
  });

Again, the full source is here on github

Github has a "raw" version, e.g. this so we take this url, paste it in the bookmaker tool and we get a nice installable bookmarklet link.

Install

Drag this link to your bookmarks:

jsperf -> wpt

Run

1. Go to any jsperf test, e.g. http://jsperf.com/array-proto-vs/3
2. Click the bookmarklet
3. Observe 5 new tabs with 5 IE versions running your test!

jsperf

More browsers

In addition to the browsers (locations) I've defined you can always add more, like Chrome and Firefox. However you probably have these already handy so no need to kill WPT's servers. But the option is there, just edit your localStorage.wpt_locations

Thanks for reading! Comments welcome!

 

webkit css-on-demand issues

Monday, February 11th, 2013

This post brought to you via Facebook engineers Jeff Morrison and Andrey Sukhachev, who discovered and helped isolate the issue.

Use case

Think a "single page app" use case. You click a button. Content comes via XHR. But content is complex (and app is as lazy-loading as possible) and content requires extra CSS. In an external file.

Only when the external CSS arrives should the app show the content. Otherwise content will be weirdly styled.

Execution

Two "modules" (or "widgets") of the app require two different CSS files. Both modules are requested at about the same time. We listen to onload of the CSS files. Expected behavior: whenever a module and its CSS dependency arrive - show that module. Asynchronously. No one cares which module shows first, as long as they show up as soon as possible.

Experimentation

Two modules. Two CSS files. 1st CSS happens to take one second. The second CSS takes 5 seconds.

Test pages: one and two

Here's the end result. The first module is pinky, the second is yellow. All good.

Same with network panel ON:

The question is what the user sees during the ?-mark: the time after the first CSS is done, while the second one is still loading.

Oh, and here's the load() function that runs when the user clicks the button "load" initiating the new modules to appear:

function load() {
  var these = ['class1.css.php', 'class2.css.php'];
  var classes = ['class1', 'class2'];
  var head = document.getElementsByTagName('head')[0];
  
  for (var i = 0; i < these.length; i++) {
    var url = these[i];
    var link = document.createElement('link');
    
    link.type = "text/css";
    link.rel = "stylesheet";
    link.href = url;
    link.onload = (function (i) {
      return function () {
        console.log(these[i]);
        var div = document.createElement('div');
        div.appendChild(document.createTextNode(these[i]));
        result.appendChild(div);
        div.className = classes[i];
        //s = getComputedStyle(result).height;
      }
    }(i));
    
    head.appendChild(link);
  }
 
}

Expected behavior

In FF, whenever the first CSS is loaded, we see a new module.

Test for yourself (in Firefox)

Issue #1: "efficient" webkit

You know that browsers batch layout and paint tasks because these tend to be expensive. For example they wait for all CSS (even useless print and other @media stylesheets) to arrive and block the rendering of the page. (More on these topics: here, here)

So turns out that here webkit also waits for both CSS files to arrive before rendering anything.

You see in the console we know (in JavaScript) that CSS has arrived. But webkit (chrome, safari, mobile safari) doesn't paint anything, waiting for the second CSS. Bummer!

Issue #2: painting unstyled content

While issue #1 is just a bummer that can be done better for the progressive feedback-y user experience, #2 is a bug. This is the issue that Jeff and Andrey found and were floored by.

If there's a paint going on between the two stylesheets, the browser dumps the unstyled content on the page. Ugly stuff.

This was only happening sometimes, but after forming and testing a hypothesis, I was able to distill a reproducible test case. The only change is: after the load of the first CSS, flush the rendering queue by requesting some style information, e.g.

link.onload = function () {
  s = getComputedStyle(dom).height;
}

Repro and file a bug

You can reproduce for yourself. Use chrome and try:

  1. Issue: no paint till all CSS is here
  2. Bug: unstyled paint while waiting for all CSS

I was faced with a "registration wall" trying to file a webkit bug, hence this post. Someone please file a bug and feel free to use the provided test cases.

The solution, IMO, is to make webkit behave like FF. No waiting for all CSS. This solves both issues. In the worst case, at least the unstyled bug (issue #2) should be addressed.

Interim solution for web developers: inline CSS required by the module together with the module content.
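A minimal sketch of that interim approach, assuming the module arrives as a payload carrying its own CSS text next to its content. The showModule name and the {css, className, text} payload shape are invented for this example, and the document is passed in as an argument only so the function can be tried outside a browser.

```javascript
// Hypothetical payload shape: {css: '...', className: '...', text: '...'}.
// The style element travels with the content, so there is no external
// stylesheet to wait for and no window for an unstyled paint.
function showModule(doc, container, module) {
  var wrapper = doc.createElement('div');

  var style = doc.createElement('style');
  style.appendChild(doc.createTextNode(module.css));

  var content = doc.createElement('div');
  content.className = module.className;
  content.appendChild(doc.createTextNode(module.text));

  // style goes in first, then the content it styles
  wrapper.appendChild(style);
  wrapper.appendChild(content);
  container.appendChild(wrapper);
  return wrapper;
}

// In a page (element id and payload values are assumptions):
// showModule(document, document.getElementById('result'),
//   {css: '.class1 {background: pink}', className: 'class1', text: 'one'});
```

The cost is that the CSS is no longer cached separately from the content, so this fits best when the per-module CSS is small.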

Thanks for reading!

 

Digging into the HTTP archive #2

Friday, December 28th, 2012

Continuing from earlier tonight, let's see how you can use the HTTP archive as a starting point and continue examining the Internet at large.

Task: figure out what % of the JPEGs out there on the web today are progressive vs baseline. Ann Robson has an article for the perfplanet calendar later tonight with all the juicy details.

Problemo: there's no such information in HTTPArchive. However there's a requests table with a list of URLs, as you can see in the previous post.

Solution: Get a list of 1000 random jpegs (mimeType='image/jpeg'), download them all and run imagemagick's identify to figure out the percentage.

How?

You have a copy of the DB as described in the previous post. Now connect to mysql (assuming you have an alias by now):

$ mysql -u root httparchive

Now just for kicks, let's get one jpeg:

mysql> select requestid, url, mimeType from requests \
    where mimeType = 'image/jpeg' limit 1;
+-----------+--------------------------------------------+------------+
| requestid | url                                        | mimeType   |
+-----------+--------------------------------------------+------------+
| 404421629 | http://www.studymode.com/education-blog....| image/jpeg |
+-----------+--------------------------------------------+------------+
1 row in set (0.01 sec)

Looks promising.

Now let's fetch 1000 random images, while at the same time dump them into a file. For convenience let's make this file a shell script so it's easy to run. And the contents will be one curl command per line. Let's use mysql to do all the string concatenation.

Testing with one image:

mysql> select concat('curl -o ', requestid, '.jpg "', url, '"') from requests\
    where mimeType = 'image/jpeg' limit 1;
+-----------------------------------------------------------+
| concat('curl -o ', requestid, '.jpg "', url, '"')         |
+-----------------------------------------------------------+
| curl -o 404421629.jpg "http://www.studymode.com/educ..."  |
+-----------------------------------------------------------+
1 row in set (0.00 sec)

All looks good. I'm using the requestid as file name, so the experiment is always reproducible.

mysql>
 SELECT concat('curl -o ', requestid, '.jpg "', url, '"') 
  INTO OUTFILE '/tmp/jpegs.sh' 
  LINES TERMINATED BY '\n' FROM requests
  WHERE mimeType = 'image/jpeg'
  ORDER by rand() 
  LIMIT 1000;
Query OK, 1000 rows affected (2 min 25.04 sec)

Lo and behold, three minutes later, we have generated a shell script in /tmp/jpegs.sh that looks like:

curl -o 422877532.jpg "http://www.friendster.dk/file/pic/user/SellDiablo_60.jpg"
curl -o 406113210.jpg "http://profile.ak.fbcdn.net/hprofile-ak-ash4/370543_100004326543130_454577697_q.jpg"
curl -o 423577106.jpg "http://www.moreliainvita.com/Banner_index/Cantinelas.jpg"
curl -o 429625174.jpg "http://newnews.ca/apics/92964906IMG_9424--1.jpg"
....
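
For comparison, here's the same string concatenation sketched in JavaScript instead of SQL, in case your URLs come from somewhere other than MySQL. The row shape ({requestid, url}) mirrors the query above; the function names curlLine and toScript are made up.

```javascript
// Hypothetical sketch: same output as the SQL concat() above,
// one curl command per row.
function curlLine(row) {
  return 'curl -o ' + row.requestid + '.jpg "' + row.url + '"';
}

// Joins all commands into the contents of a shell script.
function toScript(rows) {
  return rows.map(curlLine).join('\n') + '\n';
}

console.log(toScript([
  { requestid: 422877532, url: 'http://www.friendster.dk/file/pic/user/SellDiablo_60.jpg' }
]));
```

Like the SQL version, this does no escaping, so a URL containing a double quote, a backtick or a $ would need extra care before it goes into a shell script.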

Now, nothing left to do but run this script and download a bunch of images:

$ mkdir /tmp/jpegs
$ cd /tmp/jpegs
$ sh ../jpegs.sh

curl output flashes by and some minutes later you have almost 1000 images, mostly NSFW. Not 1000 because of timeouts, unreachable hosts, etc.

$ ls | wc -l
     983

Now back to the original task: how many baseline and how many progressive JPEGs:

$ identify -verbose *.jpg | grep "Interlace: None" | wc -l
     XXX
$ identify -verbose *.jpg | grep "Interlace: JPEG" | wc -l
     YYY

For the actual values of XXX and YYY, check Ann's post later tonight :)

Also turns out 983 - XXX - YYY = 26 because some of the downloaded images were not really images, but 404 pages and other non-image files.
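
If you'd rather do the tally in one pass instead of two greps, here's a minimal sketch. It assumes you've saved the output of `identify -verbose *.jpg` to a string, and that the only two lines of interest are the "Interlace: None" and "Interlace: JPEG" ones used above. The function name countInterlace is made up.

```javascript
// Counts baseline ("Interlace: None") and progressive ("Interlace: JPEG")
// lines in the text output of `identify -verbose`.
function countInterlace(identifyOutput) {
  var counts = { baseline: 0, progressive: 0 };
  identifyOutput.split('\n').forEach(function (line) {
    if (line.indexOf('Interlace: None') !== -1) {
      counts.baseline++;
    } else if (line.indexOf('Interlace: JPEG') !== -1) {
      counts.progressive++;
    }
  });
  return counts;
}

var sample = '  Interlace: None\n  Interlace: JPEG\n  Interlace: None\n';
console.log(countInterlace(sample)); // { baseline: 2, progressive: 1 }
```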

 

Digging into the HTTP archive

Friday, December 28th, 2012

Update: Second part

One way to do web performance research is to dig into what's out there. It's a tradition dating back to Steve Souders and his HPWS, where he was looking at the top 10 Alexa sites for proof that best practices are or aren't followed. This involves loading each page and inspecting the body or the response headers. Pros: real sites. Cons: manual labor, small sample.

I've done some image and CSS optimization research grabbing data in any way that looks easy: using the Yahoo image search API to get URLs or using Fiddler to monitor and export the traffic and loading a bazillion sites in IE with a script. Or using HTTPWatch. Pros: big sample. Cons: reinvent the wheel and use different sampling criteria every time.

Today we have httparchive.org which makes performance research so much easier. It already has a bunch of data you can export and dive into immediately. It's also a common starting point so two people can examine the same data independently and compare or reproduce each other's results.

Let's see how to get started with the HTTP archive's data.

speak it

(Assuming MacOS, but the differences with other OS are negligible)

1. Install MySQL
2. Your mysql binary will be in /usr/local/mysql/bin/mysql. Feel free to create an alias. Your username is root and no password. This is of course terribly insecure but for a local machine with no important data, it's probably tolerable. Connect:

$ /usr/local/mysql/bin/mysql -u root

You'll see some text and a friendly cursor:

...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

3. Create your new shiny database:

mysql> create database httparchive;
Query OK, 1 row affected (0.00 sec)

Look into the new DB, it's empty, no tables or data, as expected:

mysql> \u httparchive
Database changed
mysql> show tables;
Empty set (0.00 sec)

4. Quit mysql for now:

mysql> quit;
Bye

5. Go to the archive and fetch the database schema.

$ curl http://httparchive.org/downloads/httparchive_schema.sql > ~/Downloads/schema.sql

While you're there get the latest DB dump. That would be the link that says IE. Today it says Dec 15 and is 2.5GB. So be prepared. Save it and unzip it as, say, ~/Downloads/dump.sql

6. Recreate the DB tables:

$ /usr/local/mysql/bin/mysql -u root httparchive < ~/Downloads/schema.sql

7. Import the data (takes a while):

$ /usr/local/mysql/bin/mysql -u root httparchive < ~/Downloads/dump.sql

8. Log back into mysql and look around:

$ /usr/local/mysql/bin/mysql -u root httparchive;

[yadda, yadda...]

mysql> show tables;
+-----------------------+
| Tables_in_httparchive |
+-----------------------+
| pages                 |
| pagesmobile           |
| requests              |
| requestsmobile        |
| stats                 |
+-----------------------+
5 rows in set (0.00 sec)

Dataaaa!

What's in the requests table I couldn't help but wonder (damn you, SATC)

mysql> describe requests;
+------------------------+------------------+------+-----+---------+----------------+
| Field                  | Type             | Null | Key | Default | Extra          |
+------------------------+------------------+------+-----+---------+----------------+
| requestid              | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| pageid                 | int(10) unsigned | NO   | MUL | NULL    |                |
| startedDateTime        | int(10) unsigned | YES  | MUL | NULL    |                |
| time                   | int(10) unsigned | YES  |     | NULL    |                |
| method                 | varchar(32)      | YES  |     | NULL    |                |
| url                    | text             | YES  |     | NULL    |                |
| urlShort               | varchar(255)     | YES  |     | NULL    |                |
| redirectUrl            | text             | YES  |     | NULL    |                |
| firstReq               | tinyint(1)       | NO   |     | NULL    |                |
| firstHtml              | tinyint(1)       | NO   |     | NULL    |                |
| reqHttpVersion         | varchar(32)      | YES  |     | NULL    |                |
| reqHeadersSize         | int(10) unsigned | YES  |     | NULL    |                |
| reqBodySize            | int(10) unsigned | YES  |     | NULL    |                |
| reqCookieLen           | int(10) unsigned | NO   |     | NULL    |                |
| reqOtherHeaders        | text             | YES  |     | NULL    |                |
| status                 | int(10) unsigned | YES  |     | NULL    |                |
| respHttpVersion        | varchar(32)      | YES  |     | NULL    |                |
| respHeadersSize        | int(10) unsigned | YES  |     | NULL    |                |
| respBodySize           | int(10) unsigned | YES  |     | NULL    |                |
| respSize               | int(10) unsigned | YES  |     | NULL    |                |
| respCookieLen          | int(10) unsigned | NO   |     | NULL    |                |
| mimeType               | varchar(255)     | YES  |     | NULL    |                |
.....

Hm, I wonder what the common mime types are these days. Limiting to 10000 or more occurrences of the same mime type, because there's a lot of garbage out there. If you've never looked into real web data, you'd be surprised how much misconfiguration is going on. It's a small miracle the web even works.

9. Most common mime types:

select count(requestid) as ct, mimeType 
  from requests 
  group by mimeType 
  having ct > 10000 
  order by ct desc;
+---------+-------------------------------+
| ct      | mimeType                      |
+---------+-------------------------------+
| 7448471 | image/jpeg                    |
| 4640536 | image/gif                     |
| 4293966 | image/png                     |
| 2843749 | text/html                     |
| 1837887 | application/x-javascript      |
| 1713899 | text/javascript               |
| 1455097 | text/css                      |
| 1093004 | application/javascript        |
|  619605 |                               |
|  343018 | application/x-shockwave-flash |
|  188799 | image/x-icon                  |
|  169928 | text/plain                    |
|   70226 | text/xml                      |
|   50439 | font/eot                      |
|   45416 | application/xml               |
|   41052 | application/octet-stream      |
|   38618 | application/json              |
|   30201 | text/x-cross-domain-policy    |
|   25248 | image/vnd.microsoft.icon      |
|   20513 | image/jpg                     |
|   12854 | application/vnd.ms-fontobject |
|   11788 | image/pjpeg                   |
+---------+-------------------------------+
22 rows in set (2 min 25.18 sec)

So the web is mostly made of JPEGs. GIFs are still more than PNGs despite all best efforts. Although OTOH (assuming these are comparable datasets), PNG is definitely gaining compared to this picture from two and a half years ago. Anyway.
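
If the group by / having / order by combo looks cryptic, here's the same logic sketched in plain JavaScript over an array of mime type strings. It's only an illustration of what the query does, not something you'd run on millions of rows. The function name commonMimeTypes is made up.

```javascript
// Mirrors the SQL above: count per mime type (group by),
// keep only counts above a threshold (having), sort descending (order by).
function commonMimeTypes(mimeTypes, minCount) {
  var counts = Object.create(null); // no inherited keys to trip over
  mimeTypes.forEach(function (m) {
    counts[m] = (counts[m] || 0) + 1;
  });
  return Object.keys(counts)
    .filter(function (m) { return counts[m] > minCount; })
    .map(function (m) { return { ct: counts[m], mimeType: m }; })
    .sort(function (a, b) { return b.ct - a.ct; });
}

console.log(commonMimeTypes(
  ['image/jpeg', 'image/gif', 'image/jpeg', 'text/css', 'image/jpeg', 'image/gif'],
  1
));
```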

It's you time!

So this is how easy it is to get started with the HTTPArchive. What experiment would you run with this data?

 

Non-onload-blocking async JS

Thursday, June 28th, 2012

Asynchronous JS is cool but it still blocks the window.onload event (except in IE before 10). That's rarely a problem, because window.onload is increasingly less important, but still...

At my Velocity conference talk today Philip "Log Normal" Tellis asked if there was a way to load async JS without blocking onload. I said I don't know, which in retrospect was duh! because I spoke about Meebo's non-onload-blocking frames (without providing details) earlier in the talk.

Stage fright I guess.

Minutes later in a moment of clarity I figured Meebo's way should help. Unfortunately all Meebo docs are gone from their site, but we still have their Velocity talk from earlier years (PPT). There are missing pieces there but I was able to reconstruct a snippet that should load a JavaScript file asynchronously without blocking onload.

Here it goes:

(function(url){
  var iframe = document.createElement('iframe');
  (iframe.frameElement || iframe).style.cssText = "width: 0; height: 0; border: 0";
  var where = document.getElementsByTagName('script');
  where = where[where.length - 1];
  where.parentNode.insertBefore(iframe, where);
  var doc = iframe.contentWindow.document;
  doc.open().write('<body onload="'+
    'var js = document.createElement(\'script\');'+
    'js.src = \''+ url +'\';'+
    'document.body.appendChild(js);">');
  doc.close();
})('http://www.jspatterns.com/files/meebo/asyncjs1.php');

The demo page is right here. It loads a script (asyncjs1.php) that is intentionally delayed for 5 seconds.

Features

  • loads a javascript file asynchronously
  • doesn't block window.onload nor DOMContentLoaded
  • works in Safari, Chrome, Firefox, IE6789 *
  • works even when the script is hosted on a different domain (third party, CDN, etc), so no x-domain issues.
  • no loading indicators, the page looks done and whenever the script arrives, it arrives and does its thing silently in the background. Good boy!

* The script works fine in Opera too, but blocks onload. Opera is weird here. Even regular async scripts block DOMContentLoaded which is a shame.

Drawback

The script (asyncjs1.php) runs in an iframe, so all document and window references point to the iframe, not the host page.

There's an easy solution for that without changing the whole script. Just wrap it in an immediate function and pass the document object the script expects:

(function(document){
 
  document.getElementById('r')... // all fine
 
})(parent.document);

How does it work

  1. create an iframe without setting src to a new URL. This fires onload of the iframe immediately and the whole thing is completely out of the way
  2. style the iframe to make it invisible
  3. get the last script tag so far, which is the snippet itself. This is in order to glue the iframe to the snippet that includes it.
  4. insert the iframe into the page
  5. get a handle to the document object of the iframe
  6. write some HTML into that iframe document
  7. this HTML includes the desired script
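
The steps above can be mapped onto code roughly like this. It's the same idea as the snippet earlier in the post, split in two so the markup part can be looked at (and tested) on its own. The function names are made up, the `doc` argument is normally window.document, and the styling line is simplified compared to the original snippet.

```javascript
// Steps 6-7: the HTML written into the iframe. Its body onload handler
// creates the <script> element that fetches `url`.
function iframeMarkup(url) {
  return '<body onload="' +
    "var js = document.createElement('script');" +
    "js.src = '" + url + "';" +
    'document.body.appendChild(js);">';
}

// Steps 1-5, written against a `doc` argument (normally window.document).
function loadWithoutBlockingOnload(doc, url) {
  var iframe = doc.createElement('iframe');                 // 1. no src set
  iframe.style.cssText = 'width: 0; height: 0; border: 0';  // 2. invisible
  var scripts = doc.getElementsByTagName('script');
  var where = scripts[scripts.length - 1];                  // 3. last script so far
  where.parentNode.insertBefore(iframe, where);             // 4. into the page
  var idoc = iframe.contentWindow.document;                 // 5. iframe's document
  idoc.open().write(iframeMarkup(url));                     // 6-7. write the HTML
  idoc.close();
}

console.log(iframeMarkup('http://example.com/a.js'));
```

Note that the URL is dropped into a quoted string as-is, so this sketch assumes a URL without quotes in it.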
 

3PO

Wednesday, June 27th, 2012

Say hello to the 3PO extension for YSlow. It checks your site for integration with popular 3rd parties, such as Facebook, Twitter widgets, Google Analytics and so on.

3PO (3rd party optimization) extension currently has 5 checks: two of them generic to all 3rd parties and three specific to Facebook plugins. I'm looking forward to adding more checks and more specific to a particular provider's best practices.

The extension is currently available as a bookmarklet, but since YSlow is a platform that runs in many environments, it can be built as a Firefox or Chrome extension, command line tool, etc.

Install

Click this link to test, or drag to your bookmarks to install

YSlow +3PO

And the code is available in my YSlow fork on Github.

Checks

Here's the list of checks along with some explanation.

Load 3rd party JS asynchronously

Category: Common

Use the JavaScript snippets that load the JS files asynchronously in order to speed up the user experience. Most providers offer you an asynchronous version of the script you're including on your page. If they don't, let them know and meanwhile do it yourself.

If you don't include the script asynchronously, you create a SPOF (Single Point of Failure) and your site effectively goes down when the 3rd party goes down. See for yourself.

Load the 3rd party JS only once

Category: Common

Loading the 3rd party JS files more than once per page is not necessary and slows down the user experience. Sometimes people copy-paste snippets multiple times on the page, e.g. when you have one widget per blog post in a blog post listing. The script only needs to load once and serve multiple widgets.
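
One common way to get "only once" is to give the script element an id and bail out if it's already in the page. Here's a minimal sketch of that guard, combined with async loading from the previous check. The helper name loadScriptOnce and the id/URL you pass in are up to you; `doc` is normally window.document.

```javascript
// Hypothetical helper: adds an async <script> only if one with the same id
// is not already in the document. Returns true if a script was added.
function loadScriptOnce(doc, id, src) {
  if (doc.getElementById(id)) {
    return false; // already included, nothing to do
  }
  var js = doc.createElement('script');
  js.id = id;
  js.async = true;
  js.src = src;
  // insert before the first script on the page (there is at least one:
  // the one running this code)
  var first = doc.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(js, first);
  return true;
}
```

With this in place, pasting the same widget snippet ten times on a blog listing page results in a single request for the 3rd party file.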

Define XML namespace

Category: Facebook

If you use tags like <fb:like> you need to define an XML namespace to make the plugin work in old IE versions. The same goes for any tag that has a colon (:) in its name.

Add an #fb-root element

Category: Facebook

The Facebook JS SDK needs an element with id="fb-root". So add this to your page, before you include the Facebook JS SDK:

<div id="fb-root"></div>

Include OG (Open Graph) meta tags

Category: Facebook

Open graph tags let you better describe your content. To learn more, see the documentation. And run the tool to validate your page.

 

<style> tag to inline style="" attrib

Thursday, June 21st, 2012

As you may have noticed, I claim that CSS is bad for performance because:

  1. Most browsers block the very first paint until all screen CSS arrives
  2. Additionally many browsers block rendering until all non-screen (e.g. print) CSS arrives
  3. Sometimes CSS blocks downloads

See "The evil that CSS do" in CSS and the critical path for details.

CSS is the critical path to delivering any UI in the browser. Images arrive whenever, JS can be async.

So any page needs to get CSS out of the way ASAP.

Simple, highly optimized pages (e.g., e.g.) reduce CSS to the bare minimum and then shove it inline in a <style> tag.

ExCeSS

It's a fact of life that there will always be unused CSS, no matter how hard you try to reduce it. (Run PageSpeed for proof.)

Take the simplest CSS: your reset.css

It has stuff like

h1, h2 , h3, ..., abbr, blockquote{ margin:0; ....}

All the HTML tags are in there. But do you have all the tags in the page? Unlikely. So there's excess CSS even at the very base. It usually gets much worse from here. Whole features may or may not be in the page or combined in different ways, but the CSS to handle all combinations is always there, omnipresent.

To style="" attrib

I saw today that Mailchimp has this CSS inliner tool. (Because mail clients often strip <style>). It takes the <style> tags in the markup, strips them and adds style="" attributes where applicable.

I decided to give it a spin with Facebook like and Google search's HTML. Remember: these are two already highly optimized pages.

Assuming the tool works correctly, the results were pretty impressive.

  • Like: 8,133 bytes from 10,115 (20% reduction, 23% after `gzip -9`)
  • Search: 63,508 from 90,846 (30% reduction, 27% post gzip)
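
For the record, the pre-gzip percentages are just (before - after) / before. A quick sketch with the byte counts from the list above (the function name is made up):

```javascript
// Percentage saved when going from `before` bytes to `after` bytes.
function reductionPercent(before, after) {
  return (before - after) / before * 100;
}

console.log(reductionPercent(10115, 8133).toFixed(1));  // Like: 19.6
console.log(reductionPercent(90846, 63508).toFixed(1)); // Search: 30.1
```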

I know, I know what you'll say: inline style="" is an abomination. Should we bring <font> back? What about the cascade? Is this transformation needed on every page view with dynamic content, how's that scalable? What if there's a lot of content with the same class, lot of duplicates?

I know, I know.

But, but... look at the results. 25% reduction of the HTML payload!

With web development moving more and more toward transformations and compilation (CSS preprocessors, CoffeeScript, minification, etc) it may not be unthinkable.

Back to Earth

On a more realistic note, just reduce the CSS to under 2K or thereabouts, inline it in the head, send it with the first server flush (even before any data fetching) and you'll be in a good place already!

 

YSlow development: custom rulesets

Wednesday, June 20th, 2012

(This is part 3. See part one and part two.)

There are two concepts to remember when working on your YSlow extensions and customizations:

  • rules (or "recommendations" if you will, or "best practices" or simply "lint checks"), and
  • rulesets which are lists of rules

An example rule is "Reduce HTTP requests". An example ruleset is "Small site or blog" (which is less strict than the default ruleset, because it assumes a small site has no CDN budget for example)

YSlow has a number of rules defined. How many? Easy to check once you have your setup from the last blog post. Open the console and go:

>>> Object.keys(YSLOW.controller.rules).length
23

And how many rulesets?

>>>Object.keys(YSLOW.controller.rulesets)
["ydefault", "yslow1", "yblog"]

Each ruleset has an id (e.g. ydefault), friendly name, list of rules and list of weights for each rule:

>>> YSLOW.controller.rulesets.ydefault
Object
  id: "ydefault"
  name: "YSlow(V2)"
  rules: Object
  weights: Object

The weights define what is the relative importance of each rule in the final score. And the rules contain rule-name => rule-config pairs. Because each rule is configurable. For an example configuration consider the "Thou shalt use CDN" rule. The patterns that match CDN hostnames are configurable. So is the number of points subtracted from the score for each violation.

(I can talk more about scores, but it's not all that important. The thinking was that people might be offended by and disagree with the scores. So we should let them customize the scoring algo)

Alrighty, enough talking, let's create one new custom ruleset.

New ruleset from the UI

  1. Click "Edit" next to the rulesets dropdown. A list of rules appears, each with a helpful hint on mouseover and a friendly checkbox for your checking pleasure
  2. Click "New Set" to clear all default checks
  3. Check the most "duh!" rules, those that require no effort and are just sanity
  4. Click "Save ruleset as..."
  5. Type a name, like "Duh", save

Congratulations! You have a new ruleset.

If that wasn't the bookmarklet version, YSlow would remember this new ruleset. But YSlow doesn't (yet) remember settings in the bookmarklet version. (Try another YSlow run in a different tab if you don't believe it).

But you can still save your ruleset, and even share it with others in your team.

Coded ruleset

The above was the all-UI way of creating the ruleset. Behind the UI there's a simple JS object (that can be serialized to JSON for future use) that defines the ruleset as explained above.

>>>JSON.stringify(YSLOW.controller.rulesets.Duh)
"{"custom":true,"rules":{"ycompress":{},"yredirects":{},"yno404":{},"yemptysrc":{}},"weights":{},"id":"Duh","name":"Duh"}"

Tada!

Now just take this JSON string, paste into your mystuff/stuff.js (from the previous post), clean it up a little and add a call to the YSlow API to register this new ruleset.

parent.YUI = parent.YUI || YUI;
parent.YSLOW = YSLOW;
 
var duh = {
  id: "duh",
  name: "Duh",
  rules: {
    ycompress: {},
    yredirects: {},
    yno404: {},
    yemptysrc: {}
  },
  weights: {} 
};
 
YSLOW.registerRuleset(duh);
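
If you end up doing the "clean it up a little" part often, it can be scripted. Here's a minimal sketch that takes the JSON string saved from the UI and returns the tidy object used above: the custom flag is dropped and the id is lowercased. The function name cleanRuleset is made up, and which properties to keep is my own choice based on the example above, not a YSlow requirement.

```javascript
// Hypothetical helper: turns the JSON string saved from the UI
// into the cleaned-up ruleset object.
function cleanRuleset(json) {
  var saved = JSON.parse(json);
  return {
    id: saved.id.toLowerCase(),
    name: saved.name,
    rules: saved.rules,
    weights: saved.weights
  };
}

var duhJson = '{"custom":true,"rules":{"ycompress":{},"yredirects":{},' +
  '"yno404":{},"yemptysrc":{}},"weights":{},"id":"Duh","name":"Duh"}';
console.log(cleanRuleset(duhJson));
```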

Then build and push:

$ make bookmarklet config="config-phpied.js"; \
  scp build/bookmarklet/* \
  username@perfplanet.com:~/phpied.com/files/yslow

So we have our own ruleset and we can run it and it can spit out reports.

(Note: Small correction from the previous post: in the Makefile your mystuff.js should go before the bookmarklet controller, which is responsible for the initialization. Because you want your registerRuleset() call to run before the initialization)

(Another Note: Disable Chrome's cache if you're testing with Chrome, because it's pretty aggressive in this bookmarklet scenario)

If we decide to tweak the scores and weights a little bit (take 50 out of 100 points for a single non-gzipped component and increase the rule's relative weight), we can do:

var duh = {
  id: "duh",
  name: "Duh",
  rules: {
    ycompress: {
      points: 50
    },
    yredirects: {},
    yno404: {},
    yemptysrc: {}
  },
  weights: {
    ycompress: 10,
    yredirects: 3,
    yno404: 3,
    yemptysrc: 5
  } 
};

The result of running the tweaked ruleset on the same page is different this time:

You can inspect each rule's default config like:

>>> YSLOW.controller.rules.ycompress.config

Object
  min_filesize: 500
  points: 11
  types: Array[5]
    0: "doc"
    1: "iframe"
    2: "xhr"
    3: "js"
    4: "css"

Adios

This is it for tonight, next time: how to write your own rules.

pssst, a hack to make your new ruleset the default because bookmarklets don't remember preferences:

// HACK
YSLOW.util.Preference.getPref = function(name, def) {
  return name === "defaultRuleset" ? 'duh' : def;
};
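
A slightly gentler variant of the same hack keeps the original getPref around and only overrides the one preference. This is a sketch under the assumption that getPref takes (name, def) as in the hack above. The helper name forcePref is made up, and `prefs` stands in for YSLOW.util.Preference.

```javascript
// Hypothetical helper: forces one preference to a fixed value and
// delegates everything else to the original getPref.
function forcePref(prefs, forcedName, forcedValue) {
  var original = prefs.getPref;
  prefs.getPref = function (name, def) {
    return name === forcedName ? forcedValue : original.call(prefs, name, def);
  };
}

// In stuff.js this would be called as (assumption, untested against YSlow):
// forcePref(YSLOW.util.Preference, 'defaultRuleset', 'duh');
```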

And the final version of mystuff/stuff.js for completeness (and without global variables this time):

 
YSLOW.registerRuleset({
  id: "duh",
  name: "Duh",
  rules: {
    ycompress: {
      points: 50
    },
    yredirects: {},
    yno404: {},
    yemptysrc: {}
  },
  weights: {
    ycompress: 10,
    yredirects: 3,
    yno404: 3,
    yemptysrc: 5
  } 
});
  
 
// HACK
YSLOW.util.Preference.getPref = function(name, def) {
  return name === "defaultRuleset" ? 'duh' : def;
};

// DEBUG 
parent.YUI = parent.YUI || YUI;
parent.YSLOW = YSLOW;
 

YSlow development: setup

Tuesday, June 19th, 2012

As promised, let's set up for YSlow development using the easiest option - the bookmarklet version. The journey of conquering the world with your rules and extensions... starts with the first step.

Checkout

First you need to get teh codez. Go to the Github repository and click that big ol' Fork button. Then check out the repository somewhere.

Alternatively, if you don't have a github account and don't care to install and deal with git on your computer, this is OK. Just download the repository as a 1.1MB zip file from:

https://github.com/marcelduran/yslow/zipball/master

Make

For this next step you need make. Good luck if you're on Windows. On Mac, it seems the most "blessed" version is the one you get by installing the package called Command Line Tools for Xcode. Which (I'm not sure but probably) also requires Xcode. Xcode is in the App Store. It's about 1.5GB. You go, I'll wait.

In the top directory of the code you've downloaded, there's a readme and /src (where it gets interesting) and a Makefile.

Since we're building the bookmarklet we'll go like:

$ make bookmarklet

But. Not yet. First things first.

The bookmarklet consists of one largish JS file and one smallish CSS. The bookmark that you'll click in the browser will load the JS file. Then the JS needs to know where to find the CSS. So you need a bit of config.

If you look in /src/bookmarklet you'll see some config-*.js files. You need a new one for you too.

Let's say you'll host the bookmarklet at http://www.phpied.com/files/yslow.

You start by copying config-local.js:

$ cp src/bookmarklet/config-local.js src/bookmarklet/config-phpied.js

Next you edit one line there so it looks like:

 
YUI.add('yslow-config', function (Y) {
    Y.namespace('YSLOW').config = {
        host: 'http://www.phpied.com/files/yslow/',
        js: '{{BOOKMARKLET_JS}}',
        css: '{{BOOKMARKLET_CSS}}'
    };
});

Now, time to build! All you need is to run `make` pointing to your config:

$ make bookmarklet config="config-phpied.js"

building YUI...
done
building BOOKMARKLET files...
done
merging YUI and BOOKMARKLET...
done

Now look what you've done! You've created a directory build/bookmarklet

$ ls build/bookmarklet/
yslow-bookmarklet.js	yslow-style.css

(you can also do `make pkg-bookmarklet` to create a minified version, but let's keep things readable for now)

Host

Now you need to copy the .js and .css to a server of your choosing, be it localhost or not. I'd go:

$ scp build/bookmarklet/* username@perfplanet.com:~/phpied.com/files/yslow

Install bookmarklet

If you've already installed the YSlow bookmarklet in your browser, you can go and edit the location of the JS file. If not, visit http://yslow.org/mobile for the instructions.

This page will update the hash in the url to:

http://yslow.org/mobile/#javascript:.....more stuff...

All you need to do is change yslow.org to your location, e.g. phpied.com/files/yslow.

Then bookmark the page.

Then edit the bookmark and remove everything up to and including the hash # (http://yslow.org/mobile/#)
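
If hand-editing bookmarks isn't your thing, the two edits (swap the host, drop everything up to and including the #) are easy to script. A minimal sketch follows. The function name is made up, and the sample URL in the usage line is invented, since the real bookmarklet code is elided above.

```javascript
// Hypothetical helper: takes the page URL (with the bookmarklet code after
// the #) and returns the bookmarklet pointing at your own location.
function toOwnBookmarklet(pageUrl, myLocation) {
  var hash = pageUrl.indexOf('#');
  // drop everything up to and including the first #
  var code = hash === -1 ? pageUrl : pageUrl.slice(hash + 1);
  // point every remaining yslow.org reference at your own host
  return code.split('yslow.org').join(myLocation);
}

// Invented sample, only to show the shape of the transformation:
console.log(toOwnBookmarklet(
  'http://yslow.org/mobile/#javascript:load("http://yslow.org/x.js")',
  'phpied.com/files/yslow'
));
```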

Run

  1. Go to a page of your choosing
  2. Click the bookmarklet
  3. See the YSlow UI appear

It works so well that you even need to look at a network analyzer to believe it's really using your own hosted version.

And your own version is just a big javascript really, so there's nothing new and nothing extension-y to learn like XUL or manifest.json. You can just start tinkering immediately. You can even edit that .js file directly and make it like a real tight web programming loop: save-refresh. Or you can edit source files and rebuild, repush combining the make and scp commands. Let's do that.

Console: the best friend

YSlow takes extra care to run unobtrusively to the page. In an iframe, not leaving any globals behind. Meh, I want to play in the console. So I want to access the two globals: YUI and YSLOW. Let's see how you add your codes to YSlow. That's as good an exercise as any.

Create a new file in a new dir like: mystuff/stuff.js with this in it:

parent.YUI = parent.YUI || YUI;
parent.YSLOW = YSLOW;

You know YSlow bookmarklet runs in an iframe, so we want to expose the iframe's two globals to the parent.

Add your new file in the Makefile in the bookmarklet-files part:

 
bookmarklet-files:
  @echo "building BOOKMARKLET files..."
  @if [ ! -d $(BUILD_BOOKMARKLET) ]; then mkdir -p $(BUILD_BOOKMARKLET); fi
  @cat $(SRC_COMMON)/yslow.js \
            $(SRC_COMMON)/version.js \
            [....]
            $(SRC_COMMON)/peeler.js \
            $(SRC_COMMON)/peeler-bm-ch-ph.js \
            $(SRC_BOOKMARKLET)/$(BM_CONFIG) \
            mystuff/stuff.js \
            $(SRC_BOOKMARKLET)/controller.js | \
            sed s/{{YSLOW_VERSION}}/$(YSLOW_VERSION)/ | \
            sed s/{{BOOKMARKLET_JS}}/$(BOOKMARKLET_JS)/ | \
            sed s/{{BOOKMARKLET_CSS}}/$(BOOKMARKLET_CSS)/ \
            > $(BUILD_BOOKMARKLET)/$(BOOKMARKLET_YSLOW_JS)

Then build and deploy:

$ make bookmarklet config="config-phpied.js"; \
  scp build/bookmarklet/* \
  username@perfplanet.com:~/phpied.com/files/yslow

Now you can run the bookmarklet and start exploring what's available to you in the console:

 

YSlow development: getting started

Sunday, June 17th, 2012

Since version 2.0, YSlow is no longer just a tool, it's a platform. You can create your own rules (performance or otherwise), combine them into rulesets, tweak scores to your liking and so on.

Once Marcel took over and did version 3.0, YSlow became able to run in many, many environments: as a Firebug extension (like version 1.0 did), as a Firefox extension, Chrome extension, command-line and so on... including running as a bookmarklet in any browser (including mobile browsers). A funny aside is that YSlow version 0.XYZ was originally just a bookmarklet. Now it's a bookmarklet among everything else.

Now, setting up for browser extension development can be intimidating if you've never done it. But worry not, I want to show you how you can create YSlow extensions and customizations knowing nothing but JavaScript.

We'll be using the bookmarklet version for development.

What's even lovelier is that YSlow is open source now on Github.

Stay tuned

I wish I could tell you more, but it's father's day and the backyard BBQ party (including a rare live appearance from Anaconda Limousine) starts in an hour. And something tells me I won't feel very bloggy after the party. So YSlow would have to wait.

If you cannot wait though here are some pointers:

So welcome to the exciting world of YSlow development, it's fun and games and new rules and new integrations and pure webperf joy!

 

3PO#fail

Saturday, June 16th, 2012

So I was flipping through recent slides from Steve Souders and came across a reference to a nice post from Pat Meenan explaining how he set up blackhole.webpagetest.org and how you can edit your hosts file to send third party scripts to the black hole simulating a firewall-blocked or down third party and the effect on your site. (whew, long sentence)

I thought it would be nice to make that easier and have people see (and demonstrate to bosses and clients) how damaging frontend SPOF (Single Point Of Failure) can be. A browser extension maybe. A Chrome extension, because I've never made one. The idea marinated undisturbed for a few days and last night all of a sudden I got to work.

May I present you...

Now available at the Chrome web store.

3PO?

3PO = 3rd Party Optimization

I find it amusing, hope you do too

#fail?

Well, yeah. What happens to your site when a 3rd party goes down? Does it still work?

Is it true that your site is only down when it's down? Or it's down when:

It's down
or
Facebook is down
or
Google is down
or
Twitter is blocked in your office
or
code.jquery.com is down
...and so on and so on.

This extension helps you check what happens with the click of a button.

What 3PO#fail does

Very simple: it's looking for scripts from a list of suspects (apis.google.com, platform.twitter.com, etc) and redirects them to blackhole.webpagetest.org

The current list of 3rd parties:

var urls = [
  '*://ajax.googleapis.com/*',
  '*://apis.google.com/*',
  '*://*.google-analytics.com/*',
  '*://connect.facebook.net/*',
  '*://platform.twitter.com/*',
  '*://code.jquery.com/*',
  '*://platform.linkedin.com/*',
  '*://*.disqus.com/*'
];
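
To get a feel for what these patterns match, here's a simplified matcher. It's a rough approximation of Chrome's match patterns written for illustration, not Chrome's actual implementation: `*` as scheme is treated as http or https, a leading `*.` in the host matches the host itself and any subdomain, and `*` in the path matches anything. The function names are made up.

```javascript
// Converts a pattern like '*://*.disqus.com/*' to a RegExp (simplified).
function patternToRegExp(pattern) {
  var parts = pattern.split('://');          // [scheme, host + path]
  var slash = parts[1].indexOf('/');
  var host = parts[1].slice(0, slash);
  var path = parts[1].slice(slash);
  function esc(s) {
    return s.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  }
  var schemeRe = parts[0] === '*' ? 'https?' : esc(parts[0]);
  var hostRe = host.indexOf('*.') === 0
    ? '(?:[^/]+\\.)?' + esc(host.slice(2))   // host itself or any subdomain
    : esc(host);
  var pathRe = path.split('*').map(esc).join('.*');
  return new RegExp('^' + schemeRe + '://' + hostRe + pathRe + '$');
}

// True if the URL matches at least one of the patterns.
function isSuspect(url, patterns) {
  return patterns.some(function (p) {
    return patternToRegExp(p).test(url);
  });
}

var sample = ['*://*.google-analytics.com/*', '*://code.jquery.com/*'];
console.log(isSuspect('http://www.google-analytics.com/ga.js', sample));
```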

How?

Install the extension. Load your page. Or mashable.com for example. Then this happens:

It's a button with # on it. Click it. It turns red.

The extension now listens to script requests made to one of the suspect domains.

Now shift-reload the page. If a 3rd party script is found, it's redirected to the black hole and then a counter appears.

Observe whether or not the page is usable when a third party is down. Enjoy and demo to your boss. Tell them: sites do go down, companies ban social networking sites, and btw what do you think will happen when you visit China and load our site?

If you're looking for a page to try, go to mashable or business insider or, ironically, test the extension's page in Chrome web store. Turns out they include Google+'s button synchronously.

Dupe

Here come the LOLz. I blasted this extension out to Steve Souders and back he came with: doh, Pat Meenan also did a Chrome extension to do just this.

Bwahaha. What? You snooze, you miss a whole new tool by Pat Meenan himself.

Here's Pat's extension: SPOF-O-Matic. Try it, use it. It definitely looks more thought out than mine. And there's more code. Maybe Pat spent more than a night on it. Or maybe he didn't, he's an amazing hacker and a half! I mean, uh, webpagetest, hello!

I'll definitely "borrow" his list of 3rd parties which has more entries than mine.

Oh well, you live, you learn (to write Chrome extensions)

Chrome extensions

Creating a Chrome extension was a first for me and was mostly frictionless. Well documented, plenty of samples (try to browse the samples in the repository, because downloading ZIP files is too many clicks). Debugging the extension in the same web inspector is a big plus! Overall I think it's easier to write a Chrome extension than a FF one. Although the last I checked, FF has improved a lot.

Now for the nitpicks.

The API is sometimes irritating. I mean things like

setTitle({title: "My title"});

or

setBadgeText({text: "My text"});

Duplicating title, title, title is annoying. Sometimes it's title, sometimes text, or path or name. The method name appears short but in fact you have to remember one more thing: a property name in a config object. Sounds more like setTitleWithTitle(title) which is just as ridiculous (and popular in Obj-C it seems). Anyway.

The web store asks you for 5 bucks to register and submit an extension. Credit card and all. I didn't like that.

My extension was held for a review which doesn't always happen. The help section says 2-3 business days, but it turned out to be only hours for me. Got a nice email saying the extension is approved and also an explanation why it was held for review. Nice touch.

Code

The code is here: https://github.com/stoyan/3PO-fail. There's not a lot of it. A manifest file and a script that listens to specific URLs and request types in an onBeforeRequest event.

Stripping away UI stuff here's all there is to it.

Callback function which redirects the request:

function failer(info) {
  console.log(info.url); // test
  return {
    redirectUrl: 'https://blackhole.webpagetest.org'
  };
}

There's no logic here because the API allows you to let the browser do request inspection and filtering for you. Here all you do is return an object with a redirectUrl property.

And how do you set up your callback to be invoked?

chrome.webRequest.onBeforeRequest.addListener(
  failer,
  {
    urls: urls,
    types: ['script']
  },
  ["blocking"]
);

You specify your callback to be invoked only for script requests and only those that match a URL pattern in the urls array (see above).
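
To make the filtering a bit more tangible, here's a rough sketch of what the browser does on your behalf before it ever calls failer(). This is a simplified approximation of URL pattern matching, not Chrome's actual implementation; patternToRegExp, isSuspect and the shortened suspects list are names made up for this illustration.

```javascript
// A shortened copy of the suspects list, for illustration only.
var suspects = [
  '*://*.google-analytics.com/*',
  '*://platform.twitter.com/*',
  '*://code.jquery.com/*'
];

// Turn a pattern like '*://*.disqus.com/*' into a RegExp.
// Simplified on purpose: no ports, no file: scheme, no <all_urls>.
function patternToRegExp(pattern) {
  var parts = /^(\*|https?):\/\/([^\/]+)(\/.*)$/.exec(pattern);
  if (!parts) {
    throw new Error('Unsupported pattern: ' + pattern);
  }
  var escape = function (s) {
    return s.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  };
  var scheme = parts[1] === '*' ? 'https?' : parts[1];
  var host = parts[2];
  if (host === '*') {
    host = '[^/]+';
  } else if (host.indexOf('*.') === 0) {
    // '*.example.com' covers example.com and any subdomain
    host = '(?:[^/]+\\.)?' + escape(host.slice(2));
  } else {
    host = escape(host);
  }
  var path = escape(parts[3]).replace(/\*/g, '.*');
  return new RegExp('^' + scheme + ':\\/\\/' + host + path + '$');
}

// Would this request be handed to the callback?
function isSuspect(url, type, patterns) {
  if (type !== 'script') {
    return false;
  }
  return patterns.some(function (p) {
    return patternToRegExp(p).test(url);
  });
}
```

So a script from code.jquery.com gets caught, while an image from the same host or a script from your own domain passes through untouched.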

The end to the SPOF

All you have to do is load third party scripts asynchronously. See here the BFF function for an example. Yet, so many sites are not doing it. There's a need for people to understand this problem. Let's call it demand for advocacy. And now there's a supply of 2 brand new tools that make it in-your-face obvious what the damaging effects are.

Random

I went over some of the pages that Steve has listed in his calendar blog post: Business Insider and O'Reilly. O'Reilly is better now and it uses my BFF script (nice, 'scuse me there's something in my eye). Business Insider is almost there. The social stuff is async now, but code.jquery.com is still a SPOF. Funny enough they have a blocking script tag pointing to twitter, but it has a class "post-load". So a script kicks in before this tag and replaces it with async loading. I wonder: why the trouble and not just go async to begin with?

 

Web Performance Daybook vol. 2

Wednesday, June 6th, 2012

In the spirit of the true high-performance non-blocking asynchronous delivery, we'll have the Web Performance Daybook volume 2 published before volume 1. I hope you'll enjoy reading the book as much as I enjoyed working on it and rubbing (virtual) shoulders with some of the brightest people in our industry.

Back in December of 2009 I wanted to give an overview of the web performance optimization (WPO) discipline. I decided on a self-imposed deadline of an-article-a-day from December 1st to 24th: the format of an advent calendar similar to 24ways.org. As it turned out, 24 articles in a row was quite a challenge and so I was happy and grateful to accept the offers for help from a few friends from the industry: Christian Heilmann (Mozilla), Eric Goldsmith (AOL) and two posts from Ara Pehlivanian (Yahoo!).

The articles were warmly accepted by the community and then the following year, in December 2010, the calendar was already something people were looking forward to reading. The calendar also got a new home at http://calendar.perfplanet.com as a subdomain of the “Planet Performance” feed aggregator. And this time around more people were willing to help. Developers from all around our industry were willing to contribute their time, to share and spread their knowledge, announce new tools, and this way create a much better set of 24 articles than a single person could. This is what soon will become volume 1 of the series of Daybooks.

Then came December 2011 and we had so much good content and enthusiasm that we kept going past December 24, all the way to December 31st and even publishing 2 articles on the last day. This is the content that you have in your hands in a book format as Web Performance Daybook vol.2.

Our WPO community is young, small, but growing, and in need of nourishment in the form of community building events such as the advent calendar. That's why it was exciting to have the opportunity to collaborate on this title with O'Reilly and all 32 authors. I'm really happy with the result and I know that both volumes will serve as a reference and introduction to performance tools, research, techniques and approaches for years to come. There's always the risk with outdated content in offline technical publications but I see references to the calendar articles in the latest conferences today all the time, so I'm confident this knowledge is to remain fresh for quite a while and some of it is even destined to become timeless.

Enjoy the book, prepare to learn from the brightest in the industry and, most of all, be ready to make the Web a better place for all of us!

Authors in order of appearance:

  • Patrick Meenan
  • Nicholas Zakas
  • Guy Podjarny
  • Stoyan Stefanov
  • Tim Kadlec
  • Brian Pane
  • Josh Fraser
  • Steve Souders
  • Betty Tso
  • Israel Nir
  • Marcel Duran
  • Éric Daspet
  • Alois Reitbauer
  • Matthew Prince
  • Buddy Brewer
  • Alexander Podelko
  • Estelle Weyl
  • Aaron Peters
  • Tony Gentilcore
  • Matthew Steele
  • Bryan McQuade
  • Tobie Langel
  • Billy Hoffman
  • Joshua Bixby
  • Sergey Chernyshev
  • JP Castro
  • Pavel Paulau
  • David Calhoun
  • Nicole Sullivan
  • James Pearce
  • Tom Hughes-Croucher
  • Dave Artz

Or, as I like to call them, the sultans of speed!

(So far the expected publication date is Velocity US, end of this month, fingers crossed!)

Oh, and authors' royalties are donated to a charity and a WPO charity at that. (Stay tuned for the announcement.) So feel free to get a copy for everyone in your org, it's for a good cause.

 

CSS and the critical path

Tuesday, June 5th, 2012

Back when I was still actively into speaking at public events (way, way back, something like year and a half ago (which strangely roughly coincides with the time I joined Facebook, hmmm (hmm? (huh? what's with the parentheses? sure all of them are closed at this point?)))) I remember showing this slide:

The reason I'm bringing it up now is this experiment I saw today by Scott Jehl.

media="nonsense"

Scott added LINK elements with non-applicable media attribs, such as tv, too much min-width and pixel ratio of 6 among others:

<link href="inc/tv.css" rel="stylesheet" media="tv">
<link href="inc/min-width-4000px.css" rel="stylesheet" media="(min-width: 4000px)">
<link href="inc/min-device-pixel-ratio-6.css" rel="stylesheet" 
    media="(min-device-pixel-ratio: 6)">

And just for the fun of it, why not a nonsense value:

<link href="inc/nonsense.css" rel="stylesheet" media="nonsense">

Scott observed that (with one happy Opera nonsense exception) all browsers will load all this junk, all this CSS that they don't need.

(BTW, Opera 11.64 loaded nonsense css for me too)

Blocking rendering?

Having recently remembered how browsers block rendering because of print stylesheets, I speculated that all the nonsense media will also block rendering. Unfortunately I was right.

So not only browsers download useless bytes, but they also block the rendering of the page (or block window.onload, or both) until all the crap is downloaded. And by blocked rendering I mean showing a white page of death. Most browsers wait until all CSS is loaded because they don't like doing extra layouts and painting (except Opera).

Here's a test page for you to try:

http://www.phpied.com/files/css-loading/mq.php?mq=all

Replace all with your media query of choice, hit Enter and weep.

E.g.
http://www.phpied.com/files/css-loading/mq.php?mq=tv
http://www.phpied.com/files/css-loading/mq.php?mq=nonsense

This test page loads css with delay: css1 delayed 5 seconds and css2 delayed 10 seconds. The HTML is:

<link rel="stylesheet" href="css1.css.php" type="text/css" media="screen" />
<link rel="stylesheet" href="css2.css.php" type="text/css" 
    media="<?php echo $YOUR_MEDIA_QUERY; ?>" />

The correct browser behavior should be:
1. load only the CSS you need
2. render
3. fire onload

Maybe even:
0. render if step 1. takes too long

Instead, randomness ensues: Firefox treats us to a white page for 10 seconds while downloading nonsense. Chrome takes 15 seconds to fire onload. (see the print CSS post for more)
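
Until browsers behave, step 1 ("load only the CSS you need") can be approximated in page code. A minimal sketch, assuming you keep a list of stylesheets with their media queries: neededStylesheets and the file names below are hypothetical, and the matcher is passed in so the logic also runs outside a browser. In a page you could pass function (q) { return window.matchMedia(q).matches; } and then create LINK elements only for the result.

```javascript
// Hypothetical stylesheet list; the hrefs are placeholders.
var sheets = [
  { href: 'inc/base.css' },
  { href: 'inc/screen.css', media: 'screen' },
  { href: 'inc/tv.css', media: 'tv' },
  { href: 'inc/min-width-4000px.css', media: '(min-width: 4000px)' }
];

// Keep only the stylesheets whose media query applies right now.
// `matches` is a function that takes a media query string and
// returns true or false.
function neededStylesheets(list, matches) {
  return list.filter(function (sheet) {
    return !sheet.media || sheet.media === 'all' || matches(sheet.media);
  });
}
```

The obvious tradeoff: stylesheets added by script are discovered later than plain LINK tags, so this is a workaround to measure, not a free win.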

So what are we to do? First, understand...

The evil that CSS do

  1. Browsers (except Opera) block rendering until all screen CSS arrives. With the worst possible experience: white page.
  2. Browsers download CSS they don't need, e.g. print, tv, device-ratio... And most browsers (except Opera and Webkit) block rendering because of these too
  3. Sometimes CSS blocks the other downloads too (not just block rendering, but block images and scripts that follow):

The critical path

When building high-performance pages we want to stay off the critical path. Critical is the path from the user following a link to the first impression and then the working experience. That's why we load javascript asynchronously and so on.

But I argue that CSS is not only on the critical path, it is the critical path. And because it's a jungle (network, 3g, edge) out there, anything on the critical path will fail. Guaranteed.

Think about this: you have an HTML page and then you have components. Without the HTML, there is no path really. Game over. Without images? Depends on the page, but you can live without images most of the time. Without JavaScript? Well you should build the pages so the important stuff, links, forms, content works without javascript. Without webfonts? You're kidding me, I don't need no stinkin' fonts when I'm late and running to the airport and checking in for the flight on the damn phone with the spotty mobile network while Wifi wants to connect and I have to say no, because if I say yes I'll wait for another page where I have to say "I accept" and aim at a minuscule checkbox with these sweaty fat fingers or worse I have to enter usernameandpassword, and omg-omg-OMG mobile.southwest.com wants to look like native iPhone and won't let me click until mountains of JS arrive, so no, don't talk to me about no damn fonts!

What's left on the critical path is CSS. Not only the page is ugly without CSS, we can live with that, but there is no page without CSS because the browser waits and waits and takes forever to timeout showing us a blank white page.

Get the CSS out of the way

So if you worry about performance, you should get the CSS out of the way as soon as possible. Get off the critical path. Make CSS small, minify, compress, load from the same hostname even (no DNS) and inline, if small enough. Yup, inline.

Take a look at these highly optimized experiences...

Look ma, no CSS!

Yes, these pages make no CSS requests whatsoever.

If your CSS is not puny enough to be all inline (Guy has some observations on what puny means) it should at least be a single file, way at the top of the document, with the first flush. Just get it over with. Your users will love you and praise you and use words like smooth and snappy.
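
The inline-or-single-file decision is easy to automate in whatever generates your HTML. Here's a minimal Node sketch; styleTag is a made-up helper and the 2048-byte default is an arbitrary placeholder, not a number from this post or from Guy's observations.

```javascript
// Inline the CSS when it is small, otherwise emit one <link>.
// Assumes the CSS does not contain the string '</style>'.
function styleTag(css, href, maxInlineBytes) {
  var limit = typeof maxInlineBytes === 'number' ? maxInlineBytes : 2048;
  var size = Buffer.byteLength(css, 'utf8');
  if (size <= limit) {
    return '<style>' + css + '</style>';
  }
  return '<link rel="stylesheet" href="' + href + '">';
}
```

Either way the output goes way at the top of the document, so it leaves with the first flush.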

 

Update a far-future expiring component

Tuesday, May 22nd, 2012

NOTE: This is late night mumbling from a week or two ago, leaving here for posterity. Don't read it. Meanwhile Steve wrote up a proper blog post which is highly recommended: Self-updating scripts. Read his post instead!

tl;dr: Load the component in an iframe, then reload()


Backstory

Steve Souders and I were chatting last week about his blog post and he was expressing his disappointment with the status quo of the short expiration time of those omnipresent third party scripts like the Facebook SDK (which loads Like button among others). We need short expiration on those because we need to be able to push quick fixes to critical bugs. And we cannot ask webmasters to keep up with versioned file names.

So the question is how do you set far-future Expires header of a static URL (like http://connect.facebook.net/en_US/all.js) and at the same time be able to update the content of this file without changing its filename.

Ahaa!

Today I had this sudden moment of clarity that we can use location.reload() of an iframe that contains the resource in question.

Sudden clarity

reload(true) forces unconditional GETs.
reload(false) (default) sends conditional GETs.

So I can load the component in an iframe and then reload() the iframe. Cross-origin restrictions kick in when the static resource is on a CDN domain. This can be solved by loading an HTML page in an iframe and reloading it. All resources referred to by this page will be revalidated with a conditional GET.

Codeh

Far-future expiring component with a static URL:

<img src="http://stevesouders.com/images/book-84x110.jpg">

Version-checking iframe:

<iframe id="i" src="about:blank"></iframe>

Test-links:

<p><a href="javascript:loadit()">load image in iframe</a>
<p><a href="javascript:reloadit()">reload the image in the iframe</a>

And the JS code:

// load the version-checking html
function loadit() {
  document.getElementById('i').src = 'versioncheck.html';
}
 
// reload iframe
function reloadit() {
  window.frames[0].location.reload(false);
}

This is it. Here's the test page.

The versioncheck.html does nothing but include the static resource. Here it is.

Results

Safari, IE9, FF, Chrome are all consistent - when you reload() the versioncheck.html, conditional GETs go out to the HTML and all the resources in it. Then the server can reply 304 if the static resource hasn't changed. Or send a new file if the content has changed.

Not entirely trusting the browser tools, I also validated the 200 and 304 responses by tailing the access log file on the server.
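
For completeness, here's roughly what the server side of that conversation looks like. It's a sketch, not the code behind the test page: respond and the shape of the resource object ({ etag, lastModified, body }) are assumptions made for the illustration, and now is passed in only to keep the far-future Expires value predictable.

```javascript
var TEN_YEARS_MS = 10 * 365 * 24 * 60 * 60 * 1000;

// Answer a GET that may carry If-None-Match or If-Modified-Since.
// Header names in `requestHeaders` are expected in lowercase.
function respond(requestHeaders, resource, now) {
  var ifNoneMatch = requestHeaders['if-none-match'];
  var ifModifiedSince = requestHeaders['if-modified-since'];
  var fresh = false;

  if (ifNoneMatch) {
    fresh = ifNoneMatch === resource.etag;
  } else if (ifModifiedSince) {
    var since = Date.parse(ifModifiedSince);
    // HTTP dates have one-second resolution, so drop the milliseconds.
    var modified = Math.floor(resource.lastModified.getTime() / 1000) * 1000;
    fresh = !isNaN(since) && modified <= since;
  }

  var headers = {
    'ETag': resource.etag,
    'Last-Modified': resource.lastModified.toUTCString(),
    'Expires': new Date(now.getTime() + TEN_YEARS_MS).toUTCString()
  };
  return fresh
    ? { status: 304, headers: headers, body: '' }       // nothing changed
    : { status: 200, headers: headers, body: resource.body }; // new content
}
```

A plain request (nothing cached yet) gets the full 200, a reload() with a matching validator gets the cheap 304, and a changed file gets a fresh 200 even though the cached copy hasn't expired.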

Conclusion

Plus: you can update the content of static URL with far-future Expires fairly easily.

Minus:
1. Additional 304s. Two if CDN domain is different than the page, 1 otherwise.
2. The user sees the updated resource on the next page view, sorta like App Cache

So for those omnipresent third party JavaScripts... They currently have short expiration times - between 20 mins and 2 hours. In that time if the user needs the script again, no requests go out.

But if you use the technique outlined, you can set their Expires header to 10 years in the future and for every single page view make two more conditional requests with likely a 304 Not Modified response. These extra requests, of course will not interfere with the user experience, because they will be lazy.

So you get faster user experience with more HTTP requests and delayed-till-next-page-view update.

 

5 years later: print CSS still sucks

Wednesday, April 25th, 2012

This tweet had me revise a 5 year old experiment on how print CSS affects page loading, especially in the light of mobile browsers.

So I tweaked the test ever so slightly to print out timing info in the document.title and after the page is done.

The test is essentially how does a slow print stylesheet affect the rendering of the page on the screen.

<link rel="stylesheet" href="screen.css" media="screen">
<link rel="stylesheet" href="print.css"  media="print">

In the experiment I have screen.css delayed with 5 seconds and print.css delayed 10 seconds.

Results 5 years ago

Browsers blocked rendering waiting for print.css. Some took 10 seconds (downloading print.css and screen.css in parallel), some took 15. Why, oh why? It's a print CSS, you don't need this sh...eet.

Results today

Good guy Opera, doesn't even wait for screen.css. After some timeout, O renders unstyled page and restyles after screen.css arrives. Yes, brave O takes rendering risks this way, all others wait for the screen.css before styling anything. Still, onload fires in ~10s, so this is bad. All your onload JS code is blocked on a useless print.css

FF blocks rendering on the print.css. Boo! Nothing renders for 10 seconds. And it fires onload after ~10s. Boo-boo! At least it loads both stylesheets in parallel. Galaxy (Android) waits for print.css too. How often do you print from a mobile device? Same in IE8 and IE9. Even worse in IE: the DOMContentLoaded event also waits for 10 seconds. speech=less.

Safari, Chrome, Mobile Safari - render after 5 seconds, meaning only after screen.css. There is hope for the humanity. However the onload fires in 15 seconds. So the two CSS files are downloaded sequentially. Kinda makes sense, print.css is low priority and should give way to everything else. Still could start earlier if there are no other downloads competing for precious resources.

So on the wall of shame - IE worst, FF yuck, Webkit bad, O least bad.
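
The ~10s and ~15s onload numbers fall straight out of the two delays in the experiment. A tiny back-of-the-envelope sketch (function names are made up; network overhead is ignored):

```javascript
// Delays used in the experiment, in seconds.
var delays = { screen: 5, print: 10 };

// Both stylesheets downloaded at the same time:
// onload waits for the slower one.
function parallelOnload(d) {
  return Math.max(d.screen, d.print);
}

// print.css starts only after screen.css is done:
// onload waits for the sum.
function sequentialOnload(d) {
  return d.screen + d.print;
}
```

With these delays parallelOnload gives 10 (what FF, IE and Opera show) and sequentialOnload gives 15 (what the Webkit browsers show).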

Recommendation

Ditch media="print" if you have one! (Hey why isn't this a yslow/pagespeed rule?). Ditch it because in the best case scenario it will only block onload. In the worst case it will block initial paint, onload and DOMContentLoaded. Sitting in front of a white page with no feedback is the worst possible user experience.

Put all (should be minimal anyway) print rules inline in your normal screen stylesheet.

@media print {
  body {font: fit-to-print}
  .big-ads, .sidebar, .menu {display: none}
}
 

Velocity Europe discount

Tuesday, November 1st, 2011

The web performance and operations conference, Velocity Europe, is just around the corner. This always sold-out event is making its EU debut this year in Berlin.

Get your ticket now (with 20% discount no less) or punish your users and clients with slow user experiences :)

Info:

Some highlights from the program:

  • Jon Jenkins will be talking about Amazon Silk.
  • Browser sessions from Chrome, Firefox, and Opera.
  • David Mandelin talking on JavaScript engines.
  • Jeff Veen talking about Designing for Disaster.
  • Estelle Weyl on Mobile UI Performance.

Full list of speakers here

Full disclosure: I'm in the program committee for the US event

 

Social button BFFs

Tuesday, September 27th, 2011

TL;DR: Loading JavaScript asynchronously is critical for the performance of your web app. Below is an idea how to do it for the most common social buttons out there so you can make sure these don't interfere with the loading of the rest of your content. After all people need to see your content first, then decide if it's share-worthy.

Japanese translation by Koji Ishimoto is here

Facebook now offers a new asynchronous snippet to load the JavaScript SDK, which lets you load social plugins (e.g. Like button) among doing other more powerful things.

It has always been possible to load the JS SDK asynchronously but since recently it's the default. The code looks pretty damn nice (I know, right!), here's what it looks like (taken from here):

(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) {return;}
  js = d.createElement(s); js.id = id;
  js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));

Some nice steal-me JS patterns here:

  • immediate (self-invoking) function so as not to bleed vars into the global namespace
  • pass oft-used objects (document) and strings ("script", "facebook-jssdk") to the immediate function. Sort of rudimentary manual minification, while keeping the code readable
  • append script node by using the first available script element. That's 99.99% guaranteed to work unless all your code is in body onload="..." or img onload or something similar (insanity, I know, but let's allow generous 0.01% for it)
  • assign an ID to the node you append so you don't append it twice by mistake (e.g. like button in the header, footer and article)

All buttons' JS files

Other buttons exist, most notably the Twitter and Google+1 buttons. Both of these can be loaded with async JavaScript whether or not this is the default in their respective configurators.

So why not make them all get along and shelter them under the same facebook immediate function? We'll save some bytes and extra script tags in the HTML. For G+/T buttons all we need is a new script node. Google+'s snippet has some additional attribs such as type and async, but these are not really needed. Because type is always text/javascript and async is always true. Plus we kinda take care of the async part anyways.

The end result:

  <div id="fb-root"></div>
  <script>(function(d, s, id) {
    // fb + common
    var js, fjs = d.getElementsByTagName(s)[0];
    if (d.getElementById(id)) {return;}
    js = d.createElement(s); js.id = id;
    js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";
    fjs.parentNode.insertBefore(js, fjs);
    // +1
    js = d.createElement(s); 
    js.src = 'https://apis.google.com/js/plusone.js';
    fjs.parentNode.insertBefore(js, fjs);
    // tweet
    js = d.createElement(s); 
    js.src = '//platform.twitter.com/widgets.js';
    fjs.parentNode.insertBefore(js, fjs);
  }(document, 'script', 'facebook-jssdk'));</script>

So this thing loads all three JS files required by the three buttons/plugins.

Additionally we can wrap the node creation/appending part into a function. So all the code is tighter. Here's the final snippet:

<div id="fb-root"></div><!-- fb needs this -->
<script>(function(d, s) {
  var js, fjs = d.getElementsByTagName(s)[0], load = function(url, id) {
    if (d.getElementById(id)) {return;}
    js = d.createElement(s); js.src = url; js.id = id;
    fjs.parentNode.insertBefore(js, fjs);
  };
  load('//connect.facebook.net/en_US/all.js#xfbml=1', 'fbjssdk');
  load('https://apis.google.com/js/plusone.js', 'gplus1js');
  load('//platform.twitter.com/widgets.js', 'tweetjs');
}(document, 'script'));</script>

All buttons' markup

Next is actually advising the scripts where the widgets should be rendered. Facebook offers XFBML syntax, with tags such as <fb:like>, but it also offers pure HTML(5) with data-* attributes. Luckily, so do all others.

Here's an example:

<!-- facebook like -->
<div class="fb-like" data-send="false" data-width="280"></div>
<!-- twitter -->
<a class="twitter-share-button" data-count="horizontal">Tweet</a>
<!-- g+ -->
<div class="g-plusone" data-size="medium"></div>

G+ requires a div element (with g-plusone class name), Twitter requires an a (with a twitter-share-button class name). Facebook will take any element you like with a fb-like class name (or fb-comments or fb-recommendations or any other social plugin you may need)

Also very important to note that you can (and should) load the JS files once and then render as many different buttons as you need. In Facebook's case these can be any type of plugin, not just like buttons. Economy of scale - one JS file, many plugins.

All together now

So here's the overall strategy for loading all those buttons.

  1. Copy the JS above at the bottom of the page right before /body just to be safe (G+ failed to load when the markup was after the JS). This will also help you make sure there's only one place that loads the JS files, although the snippet takes care of dedupe-ing.
  2. sprinkle plugins and buttons any way you like anywhere on your pages using the appropriate configurator to help you deal with the data-* attributes (FB, G+, Tw)
  3. Enjoy all the social traffic you deserve!

To see it all in action - go to my abandoned phonydev.com blog. Yep, those buttons play nice in mobile too.

 

Overlooked Optimizations: Images

Tuesday, June 14th, 2011

#1 This guest post from Billy Hoffman is the last post in the Velocity countdown series. Velocity starts first thing tomorrow! Hope you enjoyed the ride and please welcome Billy Hoffman!

Billy Hoffman (@zoompf) is the founder and CEO of Zoompf, a web performance startup whose scanning technology helps website owners find and fix performance issues which are slowing down their sites. Previously Billy was a web security researcher at SPI Dynamics and managed a research team at HP. He can open a Coke can without using his hands.

(tl;dr: Images make up the majority of the Internet, yet we consistently fail to apply the most basic of optimizations. Even big sites like Twitter are completely screwing this up. Furthermore, there are huge unexplored areas when it comes to image optimization which would provide significant savings. We should stop worrying about esoteric performance optimizations when there is so much other low hanging fruit.)

Images constitute the bulk of content on the Internet, both in terms of content size and number of resources. Using data from the wonderful HTTP Archive we see that 60% of the bytes that make up an Alexa Top 1000 website are images. The average webpage references 81 external resources, and 64% of these are images. And the dominance of images is growing. In the last 6 months, total page content size increased by 70 kB. 75% of that increase (52 kB) came from images.

We know that lossless image optimization tools reduce content size anywhere from 5-20%. Occasionally you will see 70% savings or more, but that only happens when the image contains an embedded thumbnail. That level of savings doesn't sound all that impressive. After all HTTP compression can save 60-70% on real world text resources like HTML, JavaScript, or CSS. However text resources only make up on average 188 kB, or 24% of total content size. Saving 66% on 24% of content saves about as much as 5-20% savings on 60% of the content. In fact, if you could reduce images by 25%, that would have more of an effect on reducing total content size than using HTTP compression!
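
A quick back-of-the-envelope check of that comparison, using only the figures quoted above (text is 24% of page bytes, images 60%). The function name is made up for the illustration.

```javascript
// Share of total page bytes, as quoted above.
var share = { text: 0.24, images: 0.60 };

// Percent of the whole page saved when one content type
// shrinks by `reduction` (a fraction between 0 and 1).
function pageSavingPercent(shareOfPage, reduction) {
  return (shareOfPage * reduction * 100).toFixed(1);
}

var textAt60 = pageSavingPercent(share.text, 0.60);     // '14.4'
var textAt66 = pageSavingPercent(share.text, 0.66);     // '15.8'
var imagesAt20 = pageSavingPercent(share.images, 0.20); // '12.0'
var imagesAt25 = pageSavingPercent(share.images, 0.25); // '15.0'
```

So a 25% reduction in image bytes lands in the same ballpark as HTTP compression of all text: slightly ahead of it at the 60% end of the compression range and slightly behind at 66%.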

If you work in front-end performance, none of this should be a surprise. Obviously any front-end performance strategy needs to include image optimizations. Image optimization is an old topic. Shouldn't we instead be focusing on more esoteric optimizations, like refactoring CSS rules so that external fonts render faster on Blackberry Webkit? No, we shouldn't, because sadly we collectively suck at optimizing images.

Give PNG a Chance? Nope.

One of the most basic image optimizations that you can make is converting GIFs to PNGs. PNGs can do everything that GIFs can do and more, and the browser issues with PNGs are largely a problem of the past. Even without applying additional lossless optimization tools on the PNG, converting a GIF file will almost without exception create a smaller PNG. This is because the fundamental way graphics data is compressed in a PNG file, using the DEFLATE algorithm, is more efficient than GIF's LZW compression scheme. Once you apply lossless tools on the converted PNGs they get even smaller. Animated GIFs are the exception here, as PNGs are not animated and alternatives for simple animations (MNG, Flash) are either not widely supported or result in larger files. So what is the breakdown of image formats on the web today?

37% of images on the Alexa Top 1000 websites are GIFs. That makes no sense given what we know about PNGs over GIFs. 37% of the images on the Internet are not animated "Under construction" icons or Ajax status thumper animations. People are not being intelligent about file formats they use for images.

The Internet, now with more bloat!

Applying lossless image optimization tools is one of the simplest optimizations to do. Take an image, run a program, get an optimized image. Stoyan and I love optimizations like this because they are so easy to automate. Just add a step to the website build process or to your staging-to-production publishing process that automatically optimizes images. It should be transparent, something you setup once and forget about. So how are we doing?
82% of Alexa Top 1000 websites contain images which were not losslessly optimized. Applying lossless optimizations across all the images from the Alexa Top 1000 would reduce file size by an additional 15%.

Surely it's just a small number of smaller sites that aren't properly optimizing images and are pulling down the statistics, right? Sadly no. Twitter, the ninth largest website in the world by traffic, doesn't losslessly optimize any of their images. 33% of total page load bytes could be eliminated solely by applying lossless image optimization. Let me phrase that a different way: 1 out of every 3 bytes Twitter sends you is unnecessary! This is an incredible waste.

Unplowed Fields

It's clear we are not applying the image optimizations we already know about. However there is much more work to be done with images. This is a largely unresearched or unadvocated area which needs more attention.

Consider choosing the correct image format. Are people saving images as a PNG when they should be saved as a JPEG? Indeed they are. Tumblr's background image is a 76 kB PNG image and it would be 33 kB (55% smaller) if it was a JPEG. This is better than their old 827 kB PNG background image, which would be 47 kB (94% smaller) if it was a JPEG. Unfortunately I know of no other tool besides Zoompf's free performance scan which identifies PNG candidate images for conversion to JPEG.

What about JPEGs saved with a high quality setting? This is a large enough topic for its own blog post. To quickly summarize, JPEG "quality" is an arbitrary, non-linear scale, quality is not a percentage of anything, and "quality of 80" does not mean "discard 20% of graphics data." Thought leaders like Adobe recommend a quality setting of 70-80 for JPEGs published on the web. Zoompf found that 36% of Alexa Top 1000 images have a quality setting over 80, and reducing them to quality 70 would on average reduce image size by 48%! While all of these images might not be able to be reduced in quality, surely some of them can. Again, this is an area that needs more attention, more best practices and guidance, and more tools to help validate.

Not "Instead of" but "In addition to"

I am not saying other performance optimizations are not important. Zoompf checks for over 380 performance issues and we are adding more all the time. Many of them are esoteric and low impact. We flag things like duplicate cookies, unnecessary HTTP headers, and even when your <META> contains duplicate keywords. However these checks are for when you have handled all the other important checks. Image optimizations, and research into new image optimization techniques, should not be done instead of other work, but in addition to it. Just remember to prioritize what you are working on so that it will affect the greatest number of people in the largest possible way.

Conclusions

Images are a huge component of the web and modern web performance. This importance is only growing. Sadly, there are only one or two widely recognized image optimization techniques. Unfortunately, these most basic optimizations are ignored, forgotten, and not uniformly applied by even the largest of websites today. Additionally, there are a lot of unexplored areas of image optimization, including lossy image optimization, with no clear recommendations or best practices and virtually no tool support. Some areas for further research include:

  • Lossy image optimizations
  • Comparison of JPEG encoders
  • PNG-to-JPEG and GIF-to-JPEG best practices, recommendations, and processes
  • Image quality for Desktop vs. mobile browsing experiences
  • Better PNG24 to PNG8 conversion guidelines. (I converted all the figures in this blog post from PNG24 to PNG8 and reduced file size by 52%)
  • Viability of WebP and automated delivery to supported browsers

I will be discussing many of these topics this week during my presentation Take it all Off! Lossy Image Optimization at Velocity 2011 on Wednesday. I hope you all can make it.

 

Sultans of Speed

Monday, June 13th, 2011

#2 This post is part of the Velocity countdown series. Stay tuned for the last one tomorrow.

With only 2 days to Velocity, it's time to drop in the quality of these posts (but the one tomorrow will be great, I promise) with today's announcement of the immediate availability of the project called http://sultansofspeed.com.

I think we've had enough of experts, gurus, ninjas, jedis, pirates and overlords. Time for the sultans to step in!

So there: a slideshow of bios and photos of a number of Web Performance Sultans.

The background music is my heavy metal cover (sorry!) of "Sultans of Swing" by Dire Straits.

The Sultans you see there are the people who have written for the Perfplanet Calendar. But this is just the initial seed. (And because these are the bios/photos I have easy access to.)

Are you a sultan? Add/delete/edit your bio in the Github repo in the sultans.js file.

Want to change something - better slideshow maybe? Yes, the repository is still there.

In the immortal words of Mark Knopfler:

And he makes it fast with one more thing:
"We are the sultans, yeah the sultans of speed"

 

Book of Speed

Sunday, June 12th, 2011

#3 This post is part of the Velocity countdown series. Stay tuned for the articles to come.

Without further ado, please point your browser to the newborn bookofspeed.com.

It's a free (public domain), online, open-source, not yet finished, book about web performance.

Contributions welcome

The source files are on Github - https://github.com/stoyan/Book-of-Speed. I'll be glad to receive any errata, technical mistakes, requests, grammar checks, anything really. Just edit the stuff in /src and send a patch. /src holds the text for the chapters alone; what you see on the site and in the main directory - TOC and chapters - is generated by a build script (in JavaScript, of course).

How did we end up here

A year and a half ago I did this Performance advent calendar experiment (since moved to a new home), writing an article a day for 24 days (sounds vaguely familiar?). PeachPit press approached me about publishing a book based on those. PeachPit publishes mostly web design books (like Designing with Web Standards) and I thought designers should know about performance. Also business folks, product managers. So why not write something more accessible and less technical?

"Speed Matters" was the title.

Fast forward... I kept missing deadlines (a favorite thing, ask Douglas Adams) until eventually after 5 and a half chapters out of 9, the publisher decided to cancel the project. Fair enough. Wasn't meant to be. We're grown ups, no hard feelings. (Well, I did try to save the project by suggesting Marcel Duran who now works on YSlow to finish it, to which PeachPit expressed interest initially but then didn't bother to follow up with a comment or explanation)

So instead of letting PeachPit keep the content and maybe publish it on their site, I decided to keep the chapters and return the money for the royalty advance they had given me. After all, I had wanted to try self-publishing for some time.

Fast forward again... I didn't do anything further. Changing computers, failing disks and non-existing backups convinced me I should let this content free sooner. "Information wants to be free". So I managed to restore from emails (but not the images, had to copy images from Word) and thought the Velocity countdown is a good excuse to release this thing.

I mentioned the project to my good friend and designer Yavor two days ago; he had a few free cycles and sent me a mock. Awesome! The only "brief" I gave him was "it's to be a free online book, like diveintohtml5.com and eloquentjavascript.net". And here's what he came up with, how cool is that! (Oh, and I gave him a turtle drawing, see below.)

(As you can see, he's so humble he doesn't want any credit on the site. But this is my blog and I can give credit as much as I want now, can't I? :) )

So last night between writing last night's post and today, I turned this mock into HTML (not fully complete, missing ego-header and pagination) and converted the 5 chapters I have so far from word docs to HTML.

Audience

If you follow my blog there isn't much new for you. Like I said, the audience was to be less technical. But there are a few new, never-before-seen bits and pieces.

Assuming the html-writing part of the PeachPit audience will be still very attached to XHTML, I decided to do what I generally tend to avoid - closing tags, using type="text/javascript" etc. Further edits should convert these to more compact html5-allowed syntax.

In the markup for the site though, in the rush to convert everything I started not closing P and LI to save time :) Feel free to send a patch.

No credits

I was planning on having one round of credit-giving either as footnotes or appendix once the book is done. But the book is not done, so forgive me if I haven't given you credit where it was due.

No links

It's silly to have no links in an online publication, but given the rush, I didn't edit the content at all to add them. Again I was planning on appendix, or actually a companion site. Will do. Will accept a patch :)

On editing

My editor from PeachPit sent me notes and edits. These are not in the online edition. Partly because I don't think it's fair (what's in it for them?) and partly because, trivially, I didn't have the time.

On reviewing

I got technical reviews from Marcel Duran and Sergey Chikuyonok while working on the book. I haven't incorporated their feedback. Will do ;) (Sergey said my chapter on image optimization was too basic :) It is, especially compared with his articles on smashing magazine and his blog )

But Annie Sullivan from Google went way above and beyond any review I have seen. She actually read the chapter with her husband (not technical) and explained to him what's going on. So I had very eye-opening observations and I'm grateful and indebted for this.

(As you guessed, the feedback is not yet reflected in the text)

PageSpeed

PageSpeed runs on Dreamhost where the site is. So I thought I should check the "use pagespeed" check in DH's panel. Not bad, not bad at all. Having your images and other stuff taken care of for you automagically. I have 99/100 Page Speed score and 94/100 YSlow.

I do minify CSS myself though and inline it, because it's small.

Turtle

I couldn't use the turtle (nor the title) from Speed Matters. But my kid drew a turtle in drawing class so I thought I should use it. Here's what it looked like before my designer friend took over:

Happy reading!

Like I mentioned, regulars on this blog won't find much new information, but feel free to send your junior team members to learn from the free source.

And don't forget to send patches - book editing via GitHub sounds pretty nice to me.

Once again - here's the book of speed and here are its source files.

 

YSlow 2.0: the first sketches

Saturday, June 11th, 2011

#4 This post is part of the Velocity countdown series. Stay tuned for the articles to come.

I'm working on tomorrow's kind of big thing, so will take it easy today, with a stroll down memory lane.

I was clearing up my space at home a few days ago and came across this oldish notepad. In there (among the usual amount of lists of todos and ideas in the spirit of i-wanna-do-this-tool/site/experiment!) I found these early sketches of what has since become YSlow 2.0. These are all still pretty relevant, so why not take a minute to review them and get acquainted with the YSlow internals.

Back at the time Steve Souders and I had just released YSlow 0.9 and Steve had moved to Google. It was the right time to have a quick bugfix-or-two release of YSlow 1.0 and in parallel get cranking on a complete YSlow 2.0 rewrite.

The motivation behind the total rewrite (aside from the usual "I didn't make this mess" ego-driven desire to start fresh and do a better job the second time around) was that we were getting a lot of "meh, these are Yahoo's problems/rules, not yours". For example a normal mere mortal blog with no CDN budget should still try to get an A in most other checks. Another reason, somewhat forward-thinking as opposed to reactive, was that I was a big fan of letting others contribute rules and checks of their own. The idea was to make YSlow your own tool, not only Yahoo's. For example if you want to set a rule that there should be no more than 5 images on a page, you should be able to codify this into a rule. And share the rule with the rest of the team or the world. (Here's an example). Another thing was also to decouple the tool from Firebug. Make it work without Firebug and even without Firefox. Go back to having a bookmarklet version (Steve's original very first version) and versions for other browsers. (Thanks to Marcel Duran this is also becoming a reality now.)

So the new architecture (a big name for a bunch of objects) was conceived on these sketches while en route to Bulgaria on a red-eye flight. My little kids were asleep taking over my seat as well, so here I was standing up in the aisle on the plane or sitting on the seat's handrail, scribbling these notes.

The main idea was divide and conquer. Split this monolithic piece of code into smaller components.

When you run YSlow, it starts by "peeling off" the page, extracting all possible information. Hence the Peeler singleton.

Peeler has methods such as getJavaScript() and getDocuments() (as in document + any frames). This can work most anywhere (bookmarklet too). Then if YSlow is running inside Firebug and has access to Net Panel (or any other browser or environment that lets you access stuff happening on the network, not only DOM crawling), it can also find things such as XHR requests or image beacons, which are not part of the DOM, using a NetMonitor listener object of some sort.
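To make the DOM-crawling part a bit more concrete, here is a minimal sketch of what a getJavaScript() method could look like. It is not YSlow's actual code: passing the document in as an argument, the shape of the returned objects and the fake document used to run it outside a browser are all assumptions for illustration.

```javascript
// Hypothetical Peeler sketch; not taken from YSlow's source.
var Peeler = {
  // Collect external scripts from a document-like object.
  // Inline scripts (no src attribute) are skipped in this sketch.
  getJavaScript: function (doc) {
    var scripts = doc.getElementsByTagName('script');
    var found = [];
    for (var i = 0; i < scripts.length; i++) {
      if (scripts[i].src) {
        found.push({ type: 'js', url: scripts[i].src });
      }
    }
    return found;
  }
};

// Fake document so the sketch runs outside a browser (an assumption;
// in a bookmarklet you would pass the real `document`).
var fakeDoc = {
  getElementsByTagName: function (tag) {
    if (tag !== 'script') {
      return [];
    }
    return [
      { src: 'http://example.org/app.js' }, // external script
      { src: '' }                           // inline script, no src
    ];
  }
};

var peeledScripts = Peeler.getJavaScript(fakeDoc);
console.log(peeledScripts.length, peeledScripts[0].url);
```

Anything that needs the network (XHR, beacons) would not show up this way, which is exactly why the NetMonitor-style listener is a separate piece.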

Whatever Peeler finds, it sticks into a ComponentSet which is just an array of components along with some convenience methods such as getComponentsByType('css').
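A rough sketch of such a ComponentSet could look like the code below. Only getComponentsByType() is named above; the constructor, the add() method and the fields on each component are assumptions made for the example, not the real YSlow API.

```javascript
// Hypothetical ComponentSet sketch; not taken from YSlow's source.
function ComponentSet() {
  this.components = [];
}

// Add one component. Here a component is a plain object with
// at least `type` and `url` (field names are assumptions).
ComponentSet.prototype.add = function (component) {
  this.components.push(component);
};

// Convenience method named in the text: only components of one type.
ComponentSet.prototype.getComponentsByType = function (type) {
  return this.components.filter(function (c) {
    return c.type === type;
  });
};

var cset = new ComponentSet();
cset.add({ type: 'css', url: 'http://example.org/a.css', headers: {} });
cset.add({ type: 'js', url: 'http://example.org/a.js', headers: {} });
cset.add({ type: 'css', url: 'http://example.org/b.css', headers: {} });

var stylesheets = cset.getComponentsByType('css');
console.log(stylesheets.length); // 2
```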

Moving on, the ComponentSet contains Component objects which have all the data, like headers, type, content, URL, the whole thing.

K, now we have a bunch of components waiting and willing to be inspected. To make this inspection as lego-like as possible, there's no big-ass inspector, but there are many little Rule objects. Each Rule object has a bunch of properties like name, URL with more info, etc, but the main thing is - it needs to implement a lint() method. The lint() method takes a reference to the ComponentSet and then returns a Result object.
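Here is a sketch of one such Rule, using the "no more than 5 images" idea as the example. The property names, the scoring (10 points off per offender) and the stand-in component set are assumptions for illustration, not what YSlow actually ships.

```javascript
// Hypothetical rule; property names and scoring are assumptions.
var maxImagesRule = {
  id: 'max-images',
  name: 'Use no more than 5 images',
  url: 'http://example.org/rules/max-images', // placeholder "more info" URL
  config: { max: 5, pointsPerOffender: 10 },

  // lint() takes the component set and returns a result object.
  lint: function (componentSet) {
    var images = componentSet.getComponentsByType('image');
    var offenders = images.slice(this.config.max); // every image past the limit
    var score = Math.max(0, 100 - offenders.length * this.config.pointsPerOffender);
    return {
      score: score,
      message: offenders.length
        ? offenders.length + ' image(s) over the limit of ' + this.config.max
        : '',
      components: offenders
    };
  }
};

// Minimal stand-in for a ComponentSet holding seven images.
var fakeSet = {
  getComponentsByType: function (type) {
    var images = [];
    for (var i = 1; i <= 7; i++) {
      images.push({ type: 'image', url: 'http://example.org/' + i + '.png' });
    }
    return type === 'image' ? images : [];
  }
};

var ruleResult = maxImagesRule.lint(fakeSet);
console.log(ruleResult.score, ruleResult.message);
```

The rule knows nothing about rendering or about other rules. All it sees is the component set, which is what keeps the whole thing lego-like.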

The Result objects are fairly simple - they have a grade/score, message and optionally a list of offending components (e.g. images without Expires header). A bunch of result objects make a ResultSet which has methods to get the final total score.

A bunch of Rule objects go into a RuleSet. The idea is to mash those up as you wish. So a Rule object is for example "Use CDN". (It's also configurable, e.g. how many score points to take away for each offender.) Also within a RuleSet you can define the relative weight of each Rule. E.g. is an F on the "Expires" rule as bad as an F on "CSS expressions"? You can create your own RuleSets (e.g. "Small blog"), including and configuring any of the existing rules you like, and also add more custom Rules. It's one big happy pool of Rules to pick from and configure. In fact YSlow 2.0 shipped with three rulesets - the new one with more rules, the old yslow1 and a "small site or blog".
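To illustrate the weights, here is a small sketch of a ruleset and one way the per-rule scores could be combined. The rule ids, the weight values and the weighted-average formula are all assumptions made for the example; I'm not claiming this is the exact math YSlow uses.

```javascript
// Hypothetical ruleset; ids and weights are made up for illustration.
var smallBlogRuleSet = {
  id: 'small-blog',
  name: 'Small blog',
  weights: { 'expires': 3, 'css-expressions': 1 }
};

// Weighted average of per-rule scores (0-100). This is one plausible
// way to combine weights, an assumption rather than YSlow's formula.
function totalScore(ruleSet, scoresByRule) {
  var weighted = 0;
  var totalWeight = 0;
  Object.keys(ruleSet.weights).forEach(function (ruleId) {
    var weight = ruleSet.weights[ruleId];
    weighted += weight * scoresByRule[ruleId];
    totalWeight += weight;
  });
  return totalWeight ? weighted / totalWeight : 0;
}

var overall = totalScore(smallBlogRuleSet, {
  'expires': 50,
  'css-expressions': 100
});
console.log(overall); // (3 * 50 + 1 * 100) / 4 = 62.5
```

With these made-up weights a poor "Expires" score drags the total down three times as hard as a poor "CSS expressions" score would.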

At the end there is one central lint() method which takes a RuleSet, loops over the Rules in it, calls each Rule's lint() and collects the results into a ResultSet.
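The loop itself is tiny. Below is a sketch under a few assumptions: the component set is passed in as a second argument, a plain array stands in for ComponentSet, the total is a plain unweighted average, and the two toy rules are invented for the example.

```javascript
// Hypothetical central lint(); not taken from YSlow's source.
function lint(ruleSet, componentSet) {
  var results = ruleSet.rules.map(function (rule) {
    var result = rule.lint(componentSet);
    result.ruleId = rule.id; // remember which rule produced the result
    return result;
  });
  return {
    results: results,
    // Plain average to keep the sketch short; weights are left out.
    getTotalScore: function () {
      if (!results.length) {
        return 0;
      }
      var sum = results.reduce(function (acc, r) {
        return acc + r.score;
      }, 0);
      return sum / results.length;
    }
  };
}

// Two toy rules working on a plain array of components.
var toyRuleSet = {
  id: 'toy',
  rules: [
    {
      id: 'has-components',
      lint: function (components) {
        return { score: components.length ? 100 : 0, message: '', components: [] };
      }
    },
    {
      id: 'no-bmp',
      lint: function (components) {
        var bad = components.filter(function (c) {
          return /\.bmp$/.test(c.url);
        });
        return {
          score: bad.length ? 0 : 100,
          message: bad.length + ' BMP image(s)',
          components: bad
        };
      }
    }
  ]
};

var resultSet = lint(toyRuleSet, [
  { type: 'image', url: 'http://example.org/logo.bmp' }
]);
console.log(resultSet.getTotalScore());
```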

From there it's a question of rendering the ResultSet, grades, offenders, etc. Additionally there are tools that can run on the ComponentSet (e.g. JSLint) and stats. In addition to the YSlow UI, you should be able to render these results in any way you like, including exporting a JSON or whathaveyou.
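As an example of rendering results "any way you like", here is a sketch of a JSON export. The input data and all field names (ruleId, offenders and so on) are hypothetical and only meant to show the idea.

```javascript
// Hypothetical result data; field names are assumptions for illustration.
var lintResults = [
  {
    ruleId: 'expires',
    score: 50,
    message: '1 component without Expires header',
    components: [
      { type: 'image', url: 'http://example.org/logo.png', headers: {} }
    ]
  },
  { ruleId: 'cdn', score: 100, message: '', components: [] }
];

// Serialize results, keeping only offender URLs so the export stays small.
function exportResults(results) {
  return JSON.stringify({
    results: results.map(function (r) {
      return {
        rule: r.ruleId,
        score: r.score,
        message: r.message,
        offenders: r.components.map(function (c) {
          return c.url;
        })
      };
    })
  }, null, 2);
}

var exported = exportResults(lintResults);
console.log(exported);
```

Since the export only reads the result objects, the same data can feed the YSlow UI, a report in a build log, or whatever else you come up with.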

Whew!

I may have missed some details but that's about all there is to the core of YSlow 2.0.

Here's also a presentation that talks about these things and offers some diagrams that hopefully clarify even further.

Thanks for reading!

That was it for today, only 4 days to go to Velocity. Hope you learned something you can use and you're ready to start coding your own rules and create rulesets to customize what YSlow can do for you.

To stay connected, there's now a Facebook page for YSlow and there's always the YDN (Yahoo! Developer Network) section about YSlow