Suddenly structured articles

April 7th, 2006. Tagged: JavaScript

This post is about a piece of JavaScript that can be used on any web or blog page to auto-generate a Table of Contents.

Motivation

Here's the idea I've been playing with: say you have a relatively long web page, where the content uses H1-H6 tags to structure the copy. How about running a JavaScript when the page is loaded and getting a Table of Contents (TOC) generated for you, based on the heading tags that you've used? Wikipedia generates this kind of TOC too, but on the server side, and from wiki markup rather than H tags.

Anyway, I decided it's a cool idea and rolled up the JS sleeves. Then, once I had the TOC sorted out, I added a list of external references, meaning a list of all external links contained within the content.

Demo and files

Integration

If you like the idea, you're free to get the files and experiment. All you need to do is:

  1. include the JS
  2. create two divs in your document - one for the TOC (with the ID "suddenly-structured-toc") and one for the list of external links (with the ID "suddenly-structured-references")
  3. call suddenly_structured.parse();
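
Put together, the three steps might look something like the sketch below. It creates the two divs from JavaScript instead of writing them in the HTML, which is an assumption made for this example and not something the script requires. ensureContainer() and setUpSuddenlyStructured() are made-up helper names; only the two IDs and suddenly_structured.parse() come from the script itself.

```javascript
// Hypothetical helper (not part of the script): make sure a div with
// the given ID exists in the document, creating it if needed.
function ensureContainer(doc, id) {
  var el = doc.getElementById(id);
  if (!el) {
    el = doc.createElement('div');
    el.id = id;
    doc.body.appendChild(el);
  }
  return el;
}

// Hypothetical wiring: step 2 (the two divs) followed by step 3 (parse).
function setUpSuddenlyStructured(doc, parser) {
  ensureContainer(doc, 'suddenly-structured-toc');
  ensureContainer(doc, 'suddenly-structured-references');
  parser.parse();
}

// In a browser, after the script file has been included (step 1):
if (typeof window !== 'undefined' && typeof suddenly_structured !== 'undefined') {
  window.onload = function () {
    setUpSuddenlyStructured(document, suddenly_structured);
  };
}
```

If the two divs are already in your HTML, the helper leaves them alone and the only thing left to do is the parse() call.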

The code

The script is not 100% finished yet; I was thinking of adding some more features, such as the ability to create TOCs starting only from H3, for example. But if you want to play with the code, you're more than welcome to.
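
As a rough idea of what the "start from H3" feature could look like, here is a minimal sketch. It is not in the script; the function names and the { level: ... } shape of a heading entry are assumptions made for illustration.

```javascript
// Hypothetical helper: read the heading level off a tag name such as "H3".
// Returns 0 for anything that is not H1-H6.
function headingLevel(tagName) {
  var match = /^h([1-6])$/i.exec(tagName);
  return match ? parseInt(match[1], 10) : 0;
}

// Hypothetical helper: keep only headings within a range of levels.
// headings is assumed to be an array of objects with a numeric "level".
function filterHeadingsByLevel(headings, minLevel, maxLevel) {
  var min = minLevel || 1; // default: start from H1
  var max = maxLevel || 6; // default: go down to H6
  return headings.filter(function (h) {
    return h.level >= min && h.level <= max;
  });
}
```

Calling filterHeadingsByLevel(headings, 3) would then drop the H1 and H2 entries before the TOC is built.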

What's happening in the code? There is an object/class called suddenly_structured; its main "controller" method is parse(), which calls the other methods as needed. You can peek at the code for more, but basically the work is done by these methods:

  • init() - initializes the "environment": which element (by ID) holds the content, and where to print the TOC and the links.
  • traverse() - goes through the document and, when it finds a heading, adds it to a list, first making sure that the heading has an ID. If there's no ID, a random one is generated.
  • generateTOC() - once we have a list of all the headings, we can generate a TOC tree.
  • appendReferences() - goes through all the links, checks the URL protocol and host to make sure they are external links, and adds them to the list of external references. When generating the list, I'm using the title attribute of the A tag to make the list nicer.
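
To make the traverse() and generateTOC() steps a bit more concrete, here is a simplified sketch of the same idea. It works on a plain array of headings and returns an HTML string instead of touching the DOM, so it is not how the actual script is written; all function names and the { level, id, text } shape are assumptions made for this example.

```javascript
// Hypothetical: give a heading a random ID if it doesn't have one yet.
// usedIds is a plain object used as a set, to avoid collisions.
function ensureHeadingId(heading, usedIds) {
  if (!heading.id) {
    var candidate;
    do {
      candidate = 'heading-' + Math.random().toString(36).slice(2, 10);
    } while (usedIds[candidate]);
    heading.id = candidate;
  }
  usedIds[heading.id] = true;
  return heading.id;
}

// Hypothetical: turn a flat, document-ordered list of headings into a tree.
// A heading becomes a child of the closest preceding heading with a lower level.
function buildTocTree(headings) {
  var root = { level: 0, children: [] };
  var stack = [root];
  headings.forEach(function (h) {
    while (stack[stack.length - 1].level >= h.level) {
      stack.pop();
    }
    var node = { level: h.level, id: h.id, text: h.text, children: [] };
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  });
  return root.children;
}

function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical: render the tree as nested lists; listType is 'ul' or 'ol'.
function renderToc(nodes, listType) {
  if (!nodes.length) {
    return '';
  }
  var items = nodes.map(function (n) {
    return '<li><a href="#' + escapeHtml(n.id) + '">' + escapeHtml(n.text) + '</a>' +
      renderToc(n.children, listType) + '</li>';
  });
  return '<' + listType + '>' + items.join('') + '</' + listType + '>';
}
```

The stack keeps track of the chain of "open" headings, so an H3 that follows an H1 directly still ends up nested under that H1.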

Here is the parse() controller:

/**
 * Script flow controller
 *
 * @param {string} start_id      The ID of the element that contains the content.
 *                               Default is the BODY element
 * @param {string} output_id     ID of the element that will contain
 *                               the result TOC
 * @param {string} output_ref_id ID of the element that will contain the result
 *                               list of external links
 * @param {int}    heading_level From which heading level to start (1 to 6),
 *                               not yet implemented
 */
parse: function (start_id, output_id, output_ref_id, heading_level)
{
    // initialize the environment, passing all parameters
    this.init(start_id, output_id, output_ref_id, heading_level);

    // if the content is found, run through it to extract the headings
    if (this.the_element) {
        this.traverse();
    }

    // run through the extracted headings and generate TOC
    this.generateTOC();

    // add the TOC to the element specified
    if (this.toc_div) {
        this.toc_div.appendChild(this.stack[0].list);
    }

    // run through all the links and list them
    if (this.ref_div) {
        this.appendReferences();
    }
}

For the rest of the high-quality (*cough-cough*) JavaScript, check the source.
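
The external-link check described for appendReferences() (protocol plus host) could be sketched along these lines. This version uses the modern URL class, which the original script does not use, and takes plain { href, title } objects instead of A elements; the function names are made up for this example.

```javascript
// Hypothetical: a link is "external" if it is http(s) and points to a
// different host than the page it appears on.
function isExternalLink(href, pageUrl) {
  var page = new URL(pageUrl);
  var link;
  try {
    link = new URL(href, pageUrl); // relative hrefs resolve against the page
  } catch (e) {
    return false; // unparsable href: don't list it
  }
  if (link.protocol !== 'http:' && link.protocol !== 'https:') {
    return false; // skips mailto:, javascript:, and the like
  }
  return link.host !== page.host;
}

// Hypothetical: build the list of references, preferring the title
// attribute as the label and falling back to the URL itself.
function collectReferences(links, pageUrl) {
  var refs = [];
  links.forEach(function (a) {
    if (!isExternalLink(a.href, pageUrl)) {
      return;
    }
    var url = new URL(a.href, pageUrl).href;
    refs.push({ url: url, label: a.title || url });
  });
  return refs;
}
```

In a browser, the links array could come from the A elements inside the content element, and pageUrl from location.href.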

Misc

At some point I found out that quirksmode.org also has an auto-generated TOC script. Its author grabs only the h2-h4 tags. His TOC is made of links with different styles, not a semantic HTML list. Here's his post about how he coded the script. He also has a show/hide TOC, which is a very slick idea.

I also made my TOC and references lists show/hide-able, and left the references hidden by default.
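
A show/hide toggle of this kind can be as small as flipping the display style. A minimal sketch follows; the function name is made up for illustration and is not taken from the script.

```javascript
// Hypothetical: flip an element between hidden and visible.
// Returns true if the element is hidden after the call.
function toggleVisibility(el) {
  var wasHidden = el.style.display === 'none';
  el.style.display = wasHidden ? '' : 'none';
  return !wasHidden;
}
```

To start with a list hidden, call the function once on that list right after it has been generated.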

After I did the script (of course!) I decided to google for other similar scripts. It turned out quite a few exist. But I didn't see any that use UL or OL for the actual TOC. They all use DIVs and As with different styles to do the indentation. My script uses semantically correct list tags, UL or OL (this can be changed on the fly by setting suddenly_structured.list_type = 'ul', for example), and LIs. But that, I guess, is because until recently no one was really losing any sleep over semantic markup. The web was young ... 😉

Thanks for reading!

That's it. Enjoy the script! Of course, any feedback is welcome.

I personally would like to integrate the script into this blog. I like using heading tags and this way my articles will become ... suddenly structured, TOC-ed and beautiful 😉

Hit me up on twitter @stoyanstefanov