Suddenly structured articles

This post is about a JavaScript script that can be used on any web or blog page to auto-generate a Table of Contents.

Motivation

Here's the idea I've been playing with: say you have a relatively long web page where the content uses H1-H6 tags to structure the copy. How about running a JavaScript when the page is loaded and getting a Table of Contents (TOC) generated for you, based on the heading tags you've used? Wikipedia generates this kind of TOC too, but on the server side, and from wiki tags rather than H tags.

Anyway, I decided it's a cool idea and rolled up the JS sleeves. Then, once I had the TOC sorted out, I added a list of external references, meaning a list of all external links contained within the content.

Demo and files

Integration

If you like the idea, you're free to get the files and experiment. All you need to do is:

  1. include the JS
  2. create two divs in your document: one for the TOC (with the ID "suddenly-structured-toc") and one for the list of external links (with the ID "suddenly-structured-references")
  3. call suddenly_structured.parse();
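
If you'd rather not edit the page markup by hand, steps 2 and 3 can be scripted as well. This is only a minimal sketch: setUpSuddenlyStructured is a hypothetical helper name, not part of the script, and it assumes that parse() called without arguments picks up the two default IDs listed above.

```javascript
// Hypothetical helper (not part of suddenly_structured): makes sure the two
// output containers exist, then hands over to parse().
function setUpSuddenlyStructured(doc, lib) {
    var ids = ['suddenly-structured-toc', 'suddenly-structured-references'];
    ids.forEach(function (id) {
        if (!doc.getElementById(id)) {
            var div = doc.createElement('div');
            div.id = id;
            doc.body.appendChild(div);
        }
    });
    // Assumption: with no arguments, parse() falls back to the default IDs.
    lib.parse();
}

// In a browser, once the script file is included, run it after page load.
if (typeof window !== 'undefined' && typeof suddenly_structured !== 'undefined') {
    window.addEventListener('load', function () {
        setUpSuddenlyStructured(document, suddenly_structured);
    });
}
```

The helper appends missing divs to the end of the body, so if you care where the TOC shows up, put the div in the markup yourself as described in step 2.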

The code

The script is not 100% finished yet; I was thinking of adding some more features, such as the ability to create TOCs starting only from H3, for example. But if you want to play with the code, you're more than welcome to.

What's happening in the code? There is an object/class called suddenly_structured; its main "controller" method is parse(), which calls the other methods as needed. You can peek at the code for more, but basically the work is done by these methods:

  • init() - initializes the "environment": where the content is (in which element ID), and where to print the TOC and the links.
  • traverse() - goes through the document and, when it finds a heading, adds it to a list, but first makes sure the heading has an ID. If there's no ID, a random one is generated.
  • generateTOC() - once we have a list of all the headings, we can generate a TOC tree.
  • appendReferences() - goes through all the links, checks the URL protocol and host to make sure they are external links, and adds them to the list of external references. When generating the list, I'm using the title attribute of the A tag to make the list nicer.

Here is parse() itself:
/**
 * Script flow controller
 *
 * @param {string} start_id      The ID of the element that contains the content.
 *                                  Default is the BODY element
 * @param {string} output_id     ID of the element that will contain
 *                                  the result TOC
 * @param {string} output_ref_id ID of the element that will contain the result
 *                                  list of external links
 * @param {int}    heading_level From which heading level to start (1 to 6),
 *                                  not yet implemented
 */
parse: function (start_id, output_id, output_ref_id, heading_level)
{
    // initialize the environment, passing all parameters
    this.init(start_id, output_id, output_ref_id, heading_level);
    // if the content is found, run through it to extract the headings
    if (this.the_element) {
        this.traverse();
    }
    // run through the extracted headings and generate TOC
    this.generateTOC();
    // add the TOC to the element specified
    if (this.toc_div) {
        this.toc_div.appendChild(this.stack[0].list);
    }

    // run through all the links and list them
    if (this.ref_div) {
        this.appendReferences();
    }
}
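
To make the traverse() and generateTOC() steps a bit more concrete, here is a rough, DOM-free sketch of the same idea. It is not the code from the script: the function names, the { level, text, id } heading shape and the "ss-" ID prefix are all made up for illustration. It gives every heading an ID (a random one if missing) and then nests the flat list into a tree by heading level.

```javascript
// Give a heading an ID if it lacks one. The "ss-" prefix and the random
// suffix are illustrative choices, not what the script actually generates.
function ensureId(heading, usedIds) {
    if (!heading.id) {
        var candidate;
        do {
            candidate = 'ss-' + Math.random().toString(36).slice(2, 10);
        } while (usedIds[candidate]);
        heading.id = candidate;
    }
    usedIds[heading.id] = true;
    return heading.id;
}

// Turn a flat, document-ordered list of headings into a nested tree.
// Each heading is a plain object: { level: 1..6, text: '...', id: optional }.
function buildTocTree(headings) {
    var root = { level: 0, children: [] };
    var stack = [root];
    var usedIds = Object.create(null);

    // Register the IDs that already exist so random ones never clash.
    headings.forEach(function (h) {
        if (h.id) { usedIds[h.id] = true; }
    });

    headings.forEach(function (h) {
        var node = {
            level: h.level,
            text: h.text,
            id: ensureId(h, usedIds),
            children: []
        };
        // Pop back up to the nearest heading with a smaller level.
        while (stack[stack.length - 1].level >= node.level) {
            stack.pop();
        }
        stack[stack.length - 1].children.push(node);
        stack.push(node);
    });

    return root.children;
}
```

In a browser the input list would come from the heading elements themselves, with the level taken from the digit in the tag name. The sketch keeps that part out so the nesting logic can be read (and tried in a console) on its own.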

For the rest of the high-quality (*cough-cough*) JavaScript, check the source.
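
The external-link check behind appendReferences() is also easy to sketch in isolation. Again, this is an approximation rather than the script's own code: isExternalLink and linkLabel are hypothetical names, it relies on the standard URL constructor, and it assumes that "external" means an http(s) link whose host differs from the page's host.

```javascript
// Decide whether a link points outside the current page's site.
// Assumption: "external" = http/https protocol AND a different host.
function isExternalLink(href, pageUrl) {
    var link;
    try {
        // Relative hrefs are resolved against the page URL.
        link = new URL(href, pageUrl);
    } catch (e) {
        return false; // an href that cannot be parsed is not listed
    }
    var page = new URL(pageUrl);
    var isWeb = link.protocol === 'http:' || link.protocol === 'https:';
    return isWeb && link.host !== page.host;
}

// Prefer the title attribute as the label, falling back to the URL.
function linkLabel(link) {
    return link.title || link.href;
}
```

The protocol test keeps things like mailto: and javascript: links out of the references, and the title fallback mirrors the "make the list nicer" trick described above.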

Misc

At some point I figured out that quirksmode.org also has an auto-generated TOC script. Its author grabs only the h2-h4 tags. His TOC is a set of links with different styles, not a semantic HTML list. Here's his post about how he coded the script. He also has a show/hide TOC, which is a very slick idea.

I also made my TOC and references lists show/hide-able, and left the references hidden by default.

After I did the script (of course!) I decided to google other similar scripts. It turned out quite a few exist. But I didn't see any that use UL or OL for the actual TOC. They all use DIVs and As with different styles to do the indentation. My script uses semantically correct list tags, UL or OL (this can be changed on the fly by setting suddenly_structured.list_type = 'ul', for example), and LIs. I guess that's because until recently no one was really losing any sleep over semantic markup. The web was young … ;)
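
To show what "a real list instead of styled DIVs" means in practice, here is a small sketch that renders a TOC tree as nested UL or OL markup. It is not lifted from the script: renderTocList and escapeHtml are hypothetical names, the tree nodes are assumed to look like { id, text, children }, and falling back to 'ol' when no type is given is just this sketch's choice.

```javascript
// Escape text so heading titles cannot break the generated markup.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Render TOC nodes ({ id, text, children }) as a nested list.
// listType plays the role of list_type: 'ul' or 'ol'.
function renderTocList(nodes, listType) {
    if (!nodes || nodes.length === 0) {
        return '';
    }
    var tag = listType === 'ul' ? 'ul' : 'ol';
    var items = nodes.map(function (node) {
        return '<li><a href="#' + escapeHtml(node.id) + '">' +
            escapeHtml(node.text) + '</a>' +
            renderTocList(node.children, listType) + '</li>';
    });
    return '<' + tag + '>' + items.join('') + '</' + tag + '>';
}
```

Because sub-sections become lists inside LIs, the indentation comes from the browser's default list styling, and switching between numbered and bulleted TOCs is a one-word change.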

Thanks for reading!

That's it. Enjoy the script! Of course, any feedback is welcome.

I personally would like to integrate the script into this blog. I like using heading tags and this way my articles will become … suddenly structured, TOC-ed and beautiful ;)



11 Responses to “Suddenly structured articles”

  1. Phil Renaud Says:

    This is effing phenomenal.

    I intend to use this on a good many upcoming web projects. Thank you!

  2. Stoyan Says:

    Thanks, this is very flattering!

    If you need any tweaks or feature requests, do not hesitate to ask ;)

  3. Thame Says:

    This is simply amazing. I'm going to be doing an article soon that will use this. I tried making something myself but hit a brick wall about 1/100th of where I wanted to get…this is perfect!

  4. Stoyan Says:

    Cheers, Thame! Don't forget to drop a link when you publish the article.

  5. Thame Says:

    I'll try to get the article done tonight…I'll definitely send you a little link love :D

  6. Stoyan Says:

    Cool. I actually meant the other way around, to give me the URL of your article, just for me to see my baby in action ;) But thanks anyway!

  7. andrew Says:

    Hey, cheers for this bit of code! I want to have a discography of a band, (Poor Old Lu), on my website, with links to lyrics. This is just what I needed!

  8. otro blog más » Unos cuantos de desarrollo web (XCII) Says:

    […] DHTML bits and pieces, starting with DOM Builder, yet another way to edit or add content to a page with JavaScript. We continue with a table of contents generator (which works by walking through the page's heading tags) and BoxOver, for making 'tooltips'. And we close the mini-section with a "drag'n'drop" one and an online book: From DHTML to DOM scripting. […]

  9. John Drummond Says:

    Is this script supposed to fail when the header elements are not children of the body element? Header elements inside divs are not recognized.

  10. Stoyan Says:

    Thanks John, unfortunately you're right: the script won't find headers that are in a div. If you look at the traverse() method, I have:
    this.the_element.nextSibling
    so only siblings of the first child of the body will be "inspected".
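
One possible way around the limitation discussed in the last two comments would be to walk the tree recursively instead of hopping from sibling to sibling. The sketch below is not the script's traverse() method; collectHeadings is a hypothetical name, and it only relies on the standard childNodes, nodeType and nodeName properties of DOM nodes.

```javascript
// Collect H1-H6 elements in document order, however deeply they are nested.
function collectHeadings(node, found) {
    found = found || [];
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType !== 1) {
            continue; // skip text nodes, comments and so on
        }
        if (/^H[1-6]$/i.test(child.nodeName)) {
            found.push(child);
        } else {
            // Not a heading: look inside it (divs, sections, ...).
            collectHeadings(child, found);
        }
    }
    return found;
}
```

Calling collectHeadings(document.body) in a browser would then return headings inside divs as well, in the order they appear on the page.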
