Inline MHTML+Data URIs

October 3rd, 2010. Tagged: CSS, HTTP, IE, images, performance

Using MHTML and data URIs in the same CSS file is totally doable and gives us nice support for IE6+ and all modern browsers. But the question is: what about inline styles? In other words, can you have a single-request web application that bundles together markup, inline styles, inline scripts and inline images? With data URIs, yes, clearly. But MHTML?

I remember hacker extraordinaire Hedger Wang coming up with a test page that proved it's doable. The problems with that test are that a/ I can't find the page anymore (his domain has expired), b/ there was some funky IE7/Vista stuff in there (probably solvable now), which even included an undesired redirect, and c/ it was complex: the whole HTML becomes a multipart document and, if I remember correctly, something required the HTML to be served as text/plain.

So I tried something simple: shove an MHTML doc inside a comment in an inline style. It so totally worked! Including in IE6, and in IE8 in IE7 mode on Windows 7 (which in my experience behaves as badly as IE7 proper on Vista).

Here's the test page. Look ma', no extra HTTP requests :)

So it's a simple HTML doc:

<!doctype html>
<html>
  <head>
    <title>Look Ma' No HTTP requests</title>
    <style type="text/css">
 
/* magic here */
 
    </style>
  </head>
  <body>
    <h1>MHTML + Data:URIs inline in a <code>style</code> element</h1>
    <p class="image1">hello<br>hello</p>
    <p class="image2">bonjour<br>bonjour</p>
  </body>
</html>

The magic has two parts: the MHTML doc inside a CSS comment, and the actual CSS, which uses data URIs for normal browsers and refers to the MHTML parts in IE6 and IE7 (through the *-prefixed declarations, which only IE7 and below parse).

/*
Content-Type: multipart/related; boundary="_"
 
--_
Content-Location:locoloco
Content-Transfer-Encoding:base64
 
iVBORw0KGgoAAAAN ... [more crazyness]... QmCC
--_
Content-Location:polloloco
Content-Transfer-Encoding:base64
 
iVBORw0KGgoAAAANSUh ... [moarrr] ... ggg==
--_--
*/
.image1 {
  background-image: url(" ... QmCC"); 
  *background-image: url(mhtml:http://phpied.com/files/mhtml/mhtml-html.html!locoloco); 
}
 
.image2 {
  background-image: url(" ... ggg=="); 
  *background-image: url(mhtml:http://phpied.com/files/mhtml/mhtml-html.html!polloloco); 
}
 
body {
  font: bold 24px Arial;
}
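
A block like this is tedious to assemble by hand, so here is a minimal Python sketch of a build step that generates the contents of the style element. The function name build_inline_css, its parameters and the image/png type in the data URIs are assumptions made for illustration; this is not the script used for the test page. It joins every line with CRLF, because commenters below report that IE's MHTML parsing depends on CRLF line breaks.

```python
import base64

CRLF = "\r\n"


def build_inline_css(images, page_url, boundary="_"):
    """Build the text that goes inside the page's <style> element.

    images   -- dict: CSS class name -> (MHTML part name, raw image bytes)
    page_url -- absolute URL the finished HTML page will be served from
    """
    # Part 1: the MHTML document, hidden from CSS parsers inside a comment.
    mhtml = ["/*", f'Content-Type: multipart/related; boundary="{boundary}"', ""]
    rules = []
    for class_name, (part_name, image_bytes) in images.items():
        b64 = base64.b64encode(image_bytes).decode("ascii")
        # One MHTML part per image: headers, blank line, base64 body.
        mhtml += [
            f"--{boundary}",
            f"Content-Location:{part_name}",
            "Content-Transfer-Encoding:base64",
            "",
            b64,
        ]
        # Part 2: data URI for modern browsers; the *-prefixed line is for IE6/IE7.
        # image/png is an assumption; use the real MIME type of each image.
        rules += [
            f".{class_name} {{",
            f'  background-image: url("data:image/png;base64,{b64}");',
            f"  *background-image: url(mhtml:{page_url}!{part_name});",
            "}",
        ]
    mhtml += [f"--{boundary}--", "*/"]
    return CRLF.join(mhtml + rules) + CRLF


# Hypothetical usage with placeholder bytes and a placeholder URL.
css = build_inline_css(
    {"image1": ("locoloco", b"fake image bytes")},
    "http://example.com/page.html",
)
print(css)
```

The base64 alphabet contains neither "*" nor "_", so the encoded images cannot accidentally close the CSS comment or collide with the "_" boundary. The rest of the HTML page would still have to be written out with consistent CRLF line breaks.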

How cool is that!

Please report any issues you find in any browser/OS combination.

The obvious drawback is repeating the long base64'd image content twice, but it's solvable with either server-side sniffing or... one crazy hack, found on the Russian site habrahabr.ru. I should talk about it separately and help spread the word to the larger English-speaking audience, but for the impatient: click!
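
For the server-side sniffing option, a minimal Python sketch might look like the following. The function names, the regular expression and the image/png type are assumptions for illustration, not code from the test page. The idea is to look at the User-Agent header and emit only one copy of each image: an mhtml: reference for IE6/IE7-style browsers, a data URI for everyone else.

```python
import re

# "MSIE 6." or "MSIE 7." in the User-Agent string. IE8 in IE7 compatibility
# mode also reports "MSIE 7.0", which is the desired outcome here.
_OLD_IE = re.compile(r"MSIE [67]\.")


def css_variant(user_agent):
    """Return "mhtml" for IE6/IE7-style user agents, otherwise "datauri"."""
    # Assumption: old Opera builds could masquerade as MSIE, so rule them out.
    if "Opera" in user_agent:
        return "datauri"
    return "mhtml" if _OLD_IE.search(user_agent) else "datauri"


def background_rule(class_name, part_name, b64, page_url, user_agent):
    """Render a single CSS rule with only the variant this browser needs."""
    if css_variant(user_agent) == "mhtml":
        value = f"url(mhtml:{page_url}!{part_name})"
    else:
        # image/png is an assumption; use the real MIME type of the image.
        value = f'url("data:image/png;base64,{b64}")'
    return f".{class_name} {{ background-image: {value}; }}"


print(background_rule(
    "image1", "locoloco", "iVBORw0KGgo=", "http://example.com/page.html",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
))
```

With this approach the MHTML comment only needs to be sent to browsers that get the "mhtml" variant, and the *-prefixed hack is no longer required because each browser receives exactly one declaration.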

So there you go: MHTML inline in CSS inline in HTML, or how to build single-request cross-browser web apps :)


23 Responses

  1. It is possible to include only one base64 chunk in the document; http://bolknote.ru/2009/03/21/~2050#21 explains more.

  2. This is very useful for hard-core optimizations such as inlining the core CSS in your homepage to speed up start render / progressive rendering: a single request (html+css+datauris) and you’re done, your homepage shows some content to the user.

    After that you can download the real CSS files in the background, ready for the user’s next click, which will load a standard page.

    Thanks !

  3. Btw, repeating the base64 content twice is not an issue as long as the 2nd repetition comes right after (or close enough to) the 1st one. In that case gzipping eliminates the difference:

    Samples: http://sandbox.extjs-ux.org/datauri-gzip/

    1) Images embedded into datauris (one repetition for each), size 174K: http://sandbox.extjs-ux.org/datauri-gzip/once.js

    Gzipped, size 129K: http://sandbox.extjs-ux.org/datauri-gzip/once.js.gz

    2) Images as data-uris 2 repetitions for each, 2nd is being placed right after original, size: 325K: http://sandbox.extjs-ux.org/datauri-gzip/twice.js

    Gzipped, size: 131K: http://sandbox.extjs-ux.org/datauri-gzip/twice.js.gz

    3) Images as data-uris, whole file content is repeated twice, repetitions are far from each other, size 348K: http://sandbox.extjs-ux.org/datauri-gzip/twice-whole.js

    Gzipped, size: 259K: http://sandbox.extjs-ux.org/datauri-gzip/twice-whole.js.gz

    As you can see, in 2) the difference between the gzipped files is negligible even though the source file is twice as big. And in 3) the gzipped file is twice as big, although the source file is roughly the same size as in 2).

    The question is: how far from the 1st repetition can the 2nd be placed? It probably depends on some kind of “buffer size” in the gzip algorithm.

  4. So maybe instead of that hack it would be possible to just create two styles next to each other: one for normal browsers and another for IE. CSS frameworks like Less or SASS can ease this task.

  5. Nice.

    Although it’s not ideal, to prevent having the base64 strings included twice you could have the strings as JavaScript variables (or properties of a hash lookup) and have JavaScript write out the style element for you.

  6. Stoyan,

    imagine you use @font-face in the CSS for one special font, and you want to minimize HTTP requests by inlining images with data URIs & MHTML and inlining the whole CSS.

    In your opinion, is there a good reason *not* to base64 encode the OTF, TTF and WOFF files and inline those as well, just like the images? SVG and EOT font files have to stay external, so not all users will benefit, but many will.
    The main benefit is the font file being available in the browser sooner and the Flash of Unstyled Text happens less often.

  7. *Very* Interesting! Thanks for sharing this.

    Two things I have noticed:

    1) After playing around with this further, I have worked around the problem that the mhtml: protocol has with relative URLs, as described in your previous post at http://www.phpied.com/data-uris-mhtml-ie7-win7-vista-blues/ . My solution uses IE’s CSS expressions.

    background-image: expression("url(mhtml:" + document.location + "!polloloco)");

    I have a working example here:

    http://www.useragentman.com/tests/dataURL/selfContained.html

    Unfortunately, this workaround doesn’t work with @font-face, although it’s fun to know one can embed fonts using absolute URLs:

    http://www.useragentman.com/tests/dataURL/fontFace.html (IE 6 and 7 only)

    I usually wouldn’t want to embed fonts this way (the HTML file is *huge*) but it’s neat to know this can be done. If we could work around the relative URL issue, maybe it could be used for HTML emails.

    2) I noticed that if you transfer your resultant HTML from a Windows to a non-Windows environment using FTP, it must be done in binary mode. Using an ASCII transfer will cause it not to work (I assume because IE’s implementation of MHTML only works with Windows-style “\r\n” (CRLF) line endings).

  8. I created a test page for inlining fonts. It’s an HTML page with inline CSS and 2 EOT font files base64 encoded and inlined in the CSS. I tested it on IE7 on Vista and it works fine.
    All in a single HTTP request.

    Test page is here with my notes on file size increase, effect of Gzip and some early conclusions: http://www.aaronpeters.nl/sandbox/base64-fonts-eot.html

    Yes, I know, I wrote in a previous comment that EOT files have to stay external. This appears not to be true. I’ve asked FontSquirrel for their opinion/experiences via Twitter. Will post their response here.

  9. Hi Stoyan,

    You mentioned that Hedger Wang’s site was down and you couldn’t find the page. It looks like Ajaxian covered it as well, and they copied some of the code. Is this what you were referring to?

    http://ajaxian.com/archives/using-base64-encoded-images-on-ie-too

    Thanks for the awesome article.

  10. Stoyan! Copying and pasting your code does not work … At least not in Webkit/Safari on a Macintosh. That is because MHTML requires CRLF for the line breaks, but the CRLFs disappear when copy-pasting. It took some rounds of scratching my head before I found out …

    The MIME standard [MIME2] requires that e-mailed documents of
    “Content-Type: Text/” MUST be in canonical form before a
    Content-Transfer-Encoding is applied, i.e. that line breaks are
    encoded as CRLFs, not as bare CRs or bare LFs or something else.
    This is in contrast to [HTTP] where section 3.6.1 allows other
    representations of line breaks.

  11. Stoyan: If you create a line break between the style element and the preceding element (title) in your demo page, then the trick stops working. It is something related to the semantics of two consecutive CRLF line breaks in MHTML, I think.

    Zoltan Hawryluk: Your demo page is invalid: neither HTML4, HTML5 nor XHTML permits two hyphens inside a comment.

    To keep arbitrary double consecutive CRLFs from mattering, one option is to use CRLF only where it matters (I have not tested this). But that is difficult to keep control over … Preferably all line breaks should be the same, no?

    Another alternative is to make the entire page a polyglot; then two consecutive CRLFs are somehow ignored except when there is a preceding --boundary-string. Hence, the MHTML Content-Type string should preferably be placed before the DOCTYPE. However, doing so puts IE in Quirks Mode, even when the string is placed inside a comment … So, we must bring out our box of conditional comment tricks to hide it from IE … In addition, my goal was to be able to place the MHTML inside a script element. Below I show a valid HTML5 example.

    <!--[if !IE]><!--><!--
    Content-Type: multipart/related; boundary="=_data-uri"
    ;This is a MIME comment
    ;This page must use CRLF line-breaks in order to work.
    ;I did not care to offer DATA URIs here - this demo code is only for IE6 and IE7
    ;--><!--<![endif]-->
    <!DOCTYPE html>

    <title>And valid HTML too!</title>

    <style type="text/css">
    .image1 {
    width:100px;border:dotted 3px brown;
    background-image: expression("url(mhtml:" + document.location + "!locoloco)");/**/
    }
    </style>

    <p class="image1">hello<br>world</p>
    <script type="multipart/related">
    --=_data-uri
    Content-Location:locoloco
    Content-Transfer-Encoding:base64

    iVBORw0KGgoAAAANSUhEUgAAAAQAAAADCAIAAAA7ljmRAAAAGElEQVQIW2P4DwcMDAxAfBvMAhEQMYgcACEHG8ELxtbPAAAAAElFTkSuQmCC
    --=_data-uri--
    </script>

  12. PS: Stoyan, remember to replace – with two -- before you publish that code … I am thinking about the 3 occurrences of –=_data-uri, each of which should be changed to --=_data-uri. Would be nice if you did something to your script. ;-)

  13. [...] summarize IMHO dataUris/MHTML of Stoyan is a better technique  because you save the sprite request. However, using dataUris you [...]

  14. [...] In my opinion, Stoyan’s dataUris/MHTML is a much better technique, because we save the sprite request. However, when [...]

  15. Thank You for Sharing Your Knowledge

  16. It’s super page, I was looking for something like this

  17. It’s good page, I was looking for something like this

  18. [...] For very small images we use data:URI+MHTML: http://www.phpied.com/inline-mhtml-data-uris/ [...]

  19. [...] [...]

  20. [...] very small images we use data:URI+MHTML: http://www.phpied.com/inline-mhtml-data-uris/ (this does not work in all [...]

  21. [...] some other solution is needed for them. In IE browsers the problem can be solved with, for example, MHTML. In practice, however, this means extra work compared with a CSS sprite, which works [...]

  22. Hi all, here every person is sharing such knowledge, thus it’s good to read this web site, and I used to pay a visit this blog daily.

  23. Hello there,

    Are you still correcting the posts of yours?

    I encountered 20 grammar mistakes.

    Sincerely,
    MarthaS
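
On the question raised in comment 3 (how far apart the two copies can be before gzip stops deduplicating them): gzip uses DEFLATE, whose back-references can reach at most 32,768 bytes back, so that is the “buffer size” to keep in mind. The Python sketch below illustrates this with made-up data; the sizes and the fake_base64 helper are assumptions chosen for the demonstration, not the commenter's sample files.

```python
import base64
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable


def fake_base64(n_bytes):
    """Base64 text of pseudo-random bytes, standing in for an encoded image."""
    raw = bytes(rng.randrange(256) for _ in range(n_bytes))
    return base64.b64encode(raw)


image = fake_base64(15_000)   # 20,000 base64 characters
filler = fake_base64(30_000)  # 40,000 characters of unrelated content

# Same total content, different placement of the repeated image.
near = image + image + filler  # repeat starts 20,000 bytes after the original
far = image + filler + image   # repeat starts 60,000 bytes after the original

near_size = len(zlib.compress(near, 9))
far_size = len(zlib.compress(far, 9))

print("near:", len(near), "->", near_size)
print("far: ", len(far), "->", far_size)
```

Both inputs are 80,000 bytes long, but only in the "near" layout does the second copy fall inside the 32 KB window, so it compresses to a noticeably smaller size. In practice this suggests keeping the data URI and MHTML copies of the same image within about 32 KB of each other in the output.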
