Duplicates and near-duplicates

December 9th, 2009. Tagged: CSS, images, JavaScript, performance

2010 update:
Lo, the Web Performance Advent Calendar hath moved

This post is part of the 2009 performance advent calendar experiment. Stay tuned for the next articles.

One of Yahoo!'s first batch of performance best practices has always been "Avoid duplicate scripts" (check Steve Souders' post). Later we added "... and styles". This is a pretty obvious, "Duh!" kind of recommendation, like saying "Avoid sleep() in your server-side scripts". But it didn't come out of thin air: duplicates were noticed on some quite high-profile sites.

Duplicates are easy to spot (and YSlow will warn you), but let's talk a bit about another concept - let's call it near-duplicates - when two components are similar, almost the same, but not quite.

Duplicate scripts and styles

As a refresher and a quick illustration of the effects of duplicate scripts and styles, fire up your HTTP sniffer and hit this test page.

(btw, this is a simple page I put up to test different YSlow scenarios; you can actually use it as a web service of sorts to create any type of component with different options)

Firefox 2 downloading both duplicate styles and scripts:

FF2 duplicate scripts and styles

IE6 and duplicate scripts:

IE duplicate scripts

The exact details of when and which browsers choose to download duplicates are not that interesting: it's obviously bad to waste time downloading the same resource twice. Even if no repeated download happens, the browser still has to parse and execute the script/style a second time.
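If you want to check a page for exact duplicates yourself, here's a minimal sketch. It assumes you already have the list of script or stylesheet URLs; the function name `findDuplicateUrls` and the sample file names are made up for illustration.

```javascript
// Hypothetical helper: report every URL that appears more than once
// in a list of script or stylesheet URLs collected from a page.
function findDuplicateUrls(urls) {
  const counts = new Map();
  for (const url of urls) {
    counts.set(url, (counts.get(url) || 0) + 1);
  }
  // Keep only the URLs that were included more than once.
  return [...counts].filter(([, n]) => n > 1).map(([url]) => url);
}

// Example input; the file names are made up for illustration.
const scripts = ["/js/menu.js", "/js/ads.js", "/js/menu.js"];
console.log(findDuplicateUrls(scripts));
```

In a browser console, something like `Array.from(document.scripts, s => s.src).filter(Boolean)` should give you the list of external script URLs to feed in.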

Even if you have iframes, you don't need to repeat the same JS/CSS in each frame; you can "borrow" them from the parent page (here's an example).


Near-duplicates can be:

  • components with the exact same response bodies but different URLs causing the browser to do double work
  • components (images) that are too close to each other - in terms of looks or purpose. Only one component should be selected in this case.

Same component, different URLs

This happens especially with user-generated content, such as profile photos and avatars uploaded to social sites and forums, images people put in comments on MySpace, and so on.

The same goes for images of stuff for sale (Craigslist, eBay): often different sellers offering the same item take the same photo from the manufacturer's site and upload it over and over again.

Luckily, PageSpeed warns about components with identical content, so those can be identified:

In the screenshot above, you see one image (2.3K) repeated 3 times, another (the iPhone, 1.7K) is repeated 4 times, and yet another one (2.8K) repeated 2 times.

It's not exactly trivial to avoid this type of duplication with user-generated content (for example, the first poster may delete the photo, in which case the second poster's photo will need to "shine through"). But it's not impossible, for example by using a hash of the component's content as an identifier.
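Here's a rough sketch of the hash idea, written for Node.js. Everything in it is an assumption for illustration: the `ImageStore` class, the in-memory `Map`, the choice of SHA-256, and the reference counting used to handle the "first poster deletes the photo" case. A real site would keep this in a database and on disk.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: one copy per distinct content, identified
// by a hash of the bytes, with a count of the uploads pointing at it.
class ImageStore {
  constructor() {
    this.blobs = new Map(); // hash -> { data, refs }
  }

  // Store an upload and return its content-derived identifier.
  add(data) {
    const id = crypto.createHash("sha256").update(data).digest("hex");
    const entry = this.blobs.get(id);
    if (entry) {
      entry.refs += 1; // same bytes uploaded again: reuse the stored copy
    } else {
      this.blobs.set(id, { data, refs: 1 });
    }
    return id;
  }

  // Drop one uploader's reference; the copy survives while others use it.
  remove(id) {
    const entry = this.blobs.get(id);
    if (!entry) return;
    entry.refs -= 1;
    if (entry.refs === 0) this.blobs.delete(id);
  }

  has(id) {
    return this.blobs.has(id);
  }
}

const store = new ImageStore();
const first = store.add(Buffer.from("same photo bytes"));
const second = store.add(Buffer.from("same photo bytes"));
console.log(first === second); // both uploads share one identifier
store.remove(first); // the first poster deletes the photo
console.log(store.has(second)); // the second poster's copy is still stored
```

If the identifier is also used in the image URL, every copy of the same photo is served from a single URL, so the browser downloads and caches it once.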


Ajax loading indicators are a great idea to give feedback to the user that something is happening. They come in all shapes and sizes... sometimes on the same page, unfortunately. And again, sometimes it's the same stock image but used at different stages of gradual "ajaxification" of the page and with different URLs.

As we're moving more and more towards modular pages and client-side logic, different modules on the same page are often coded by different teams at different times, independently, without being aware of each other's assets. This way of building pages has its challenges, and one of them is that common components, such as Ajax loading indicators, should be shared.

Too similar modules

Along the same lines - different modules are sometimes created by different designers at different times. The result - one rounded corner box with 1px shadow and one with 2px shadow, both on the same page. Or two different shades of the same gray color, which no one can tell apart. That's just a waste. (See Nicole Sullivan's presentation for illustration, e.g. slides 44, 45)
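One way to hunt for the "two shades of the same gray" problem is to compare the colors used across your stylesheets. The sketch below is only an illustration: the function names, the sample palette and the tolerance of 4 per channel are my assumptions, and it only handles six-digit `#rrggbb` values.

```javascript
// Parse a "#rrggbb" color into [r, g, b] numbers.
function parseHex(color) {
  const n = parseInt(color.slice(1), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// Hypothetical check: list pairs of colors whose channels all differ by
// at most `tolerance` (an arbitrary threshold chosen for illustration).
function findSimilarColors(colors, tolerance = 4) {
  const pairs = [];
  for (let i = 0; i < colors.length; i++) {
    for (let j = i + 1; j < colors.length; j++) {
      const a = parseHex(colors[i]);
      const b = parseHex(colors[j]);
      const close = a.every((channel, k) => Math.abs(channel - b[k]) <= tolerance);
      if (close) pairs.push([colors[i], colors[j]]);
    }
  }
  return pairs;
}

// Made-up palette, as it might be collected from a stylesheet.
const palette = ["#cccccc", "#cdcdcd", "#333333", "#ff0000"];
console.log(findSimilarColors(palette));
```

Any pair this reports is a candidate for merging into a single color, and the same approach could be applied to shadow offsets or corner radii.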

Below is an example. Can you tell that these 5 rounded corner boxes are all different, each with a slightly different shadow, color or radius? How many different boxes does this page need?

Different sizes of the same image

It's highly recommended to not scale images in HTML (or CSS). If you need a 100x100 image you don't use a 400x400 one with <img width="100" height="100" ... />. That's a good rule of thumb... to break sometimes 😉

In cases where the same image is used at different sizes, likely even on the same page, it may be beneficial to reuse the same bigger image and scale it down, because this saves the extra HTTP requests for downloading the same (but slightly smaller) image.

Facebook is an example: the same hairy guy in the screenshot appears in two different sizes. It's actually the same image, resized in CSS.

The relevant CSS which shows the profile image in LARGE and SMALL (and looks like there's even a TINY view, although I couldn't find an example on this page)


Thank you!

Thanks for reading! Reducing HTTP requests is critical for page performance. You've merged your scripts and styles as much as reasonable, you've crafted CSS sprites and inlined images with data URIs. Now it's time to look at what's left - are there components that are way too similar, are there any near-duplicates or even exact duplicates? Same image on different backgrounds? Ever-so-subtle gradients and shadows? Time to pick up the old axe and cut.

Hit me up on Twitter @stoyanstefanov