Uncompressed data in base64? Probably not / Stoyan's phpied.com

The beauty of experimentation is that failures are just as fun as successes. Warning: this post is about a failure, so you can skip it altogether 🙂

The perf advent calendar was my attempt to flush out a bunch of stuff, tools and experiments I was doing but never had the time to talk about. I guess 24 days were not enough. Here's another little experiment I made some time ago and forgot about. Let me share before it disappears to nothing with the next computer crash.

I've talked before about base64-encoded data URIs. I mentioned that according to my tests base64 encoding adds on average 33% to the file size, but gzipping brings it back, sometimes to less than the original.

Then I saw a comment somewhere (reddit? hackernews?) that the content before base64-encoding better be uncompressed, because it will be gzipped better after that. It made sense, so I had to test.

"Whoa, back it up... beep, beep, beep" (G. Constanza)

When using data URIs you essentially do this:

take a PNG (which contains compressed data),
base64 encode it
shove it into a CSS
serve the resulting CSS gzipped (compressed)

See how it goes: compress - encode - compress again. Compressing already compressed data doesn't sound like a good idea, so it sounds believable that skipping the first compression might give better results. Turns out it's not exactly the case.

Uncompressed PNG?

The PNG format contains information in "chunks". At the very least there's header (IHDR), data (IDAT) and end (IEND) chunks. There could be other chunks such as transparency, background and so on, but these three are required. The IDAT data chunk is compressed to save space, but it looks like it doesn't have to be.

PNGOut has an option to save uncompressed data, like
$ pngout -s4 -force file.png

This is what I tried - took several compressed PNGs, uncompressed them (with PNGOut's -s4), then encoded both with base64 encoding, put them in CSS, gzip the CSS and compared file sizes.

Code

<?php
// images to work with
$images = array(
  'html.png',
  'at.png',
  'app.png',
  'engaged.png',
  'button.png',
  'pivot.png'
);
//$images[] = 'sprt.png';
//$images[] = 'goog.png';
//$images[] = 'amzn.png';
//$images[] = 'wiki.png';
 
// css strings to write to files
$css1 = "";
$css2 = "";
 
foreach ($images as $i) {
  
  // create a "d" file, d as in decompressed
  copy($i, "d$i");  
  $cmd = "pngout -s4 -force d$i";
  exec($cmd);
  
  // selector
  $sel = str_replace('.png', '', $i);
 
  // append new base64'd image 
  $file1 = base64_encode(file_get_contents($i));
  $css1 .= ".$sel {background-image: url('data:image/png;base64,$file1');}\n";
  $file2 = base64_encode(file_get_contents("d$i"));
  $css2 .= ".$sel {background-image: url('data:image/png;base64,$file2');}\n";
 
}
 
// write and gzip files
file_put_contents('css1.css', $css1);
file_put_contents('css2.css', $css2);
exec('gzip -9 css1.css');
exec('gzip -9 css2.css');
 
?>

Results

I tried to keep the test reasonable and used real life images - first the images that use base64 encoding in Yahoo! Search results. Then kept adding more files to grow the size of the result CSS - added Y!Search sprite, Google sprite, Amazon sprite and Wikipedia logo.

test	with compressed PNG, bytes	with uncompressed PNG, bytes	difference, %
Y!Search images	700	1506	54%
previous + Y!Search sprite	5118	8110	36%
previous + Google sprite	27168	40836	33%
previous + Amazon sprite + Wikipedia logo	55804	79647	29%

Clearly starting with compressed images is better. Looks like the difference becomes smaller as the file sizes increase, it's possible that for very big files starting with uncompressed image could be better, but shoving more than 50K of images inline into a CSS file seems to be missing the idea of data URIs. I believe the idea is to use data URIs (instead of sprites) for small decoration images. If an image is over 50K it better be a separate request and cached, otherwise s small CSS tweak will invalidate the cached images.

Tell your friends about this post on Facebook and Twitter

Sorry, comments disabled and hidden due to excessive spam.

Meanwhile, hit me up on twitter @stoyanstefanov