Using Track Changes With Version History

There is a common misconception that EditLive's track changes functionality is intended to be a substitute for the version history built into most content management systems. In reality, the two systems work together to achieve different aims.

Version history is the corporate compliance that you need; track changes is the collaboration that you want.

Where version history precisely tracks every change so that you can ensure accountability, track changes puts users in control, giving them a tool they can use to highlight important changes that need review. A common problem with version history is that when you make formatting changes to a large section of the document, that entire section shows up as a change, and smaller changes to the actual content are often missed. With track changes, the author can leave track changes off while making the formatting changes and then turn it on so that only the important content changes are marked. This ensures the reviewer can see the important changes and consider them appropriately.

Track changes also handles multiple people collaborating on a single document at once. Each person's changes are shown in a separate color, so it's easy to see who changed what. Version history can reveal this information if you look at what changed in each individual revision, but track changes lets you see it all together in one place. When documents go through an extended collaboration process this can be extremely useful. You can also accept and reject tracked changes during the collaboration process as consensus is formed and the document takes shape, again focusing authors on just the changes that really matter, regardless of which version the change was made in.

Finally, the track changes data is shown right in the editing interface so users don't have to flip between editing the document and reviewing the previous changes.

So how does track changes work with version history at the technical level? Usually, seamlessly. The track changes data is stored right in the HTML, so it's versioned along with the content without any changes to the CMS. Additionally, the track changes markup is designed to be completely hidden when viewed in a browser, so you can see the intended final version of the document without the markup getting in the way. We've also documented the track changes format so you can parse it and manipulate it in any way you want (it's based on the OpenDocument format's track changes, but modified to work with HTML instead of ODF).

If you want to see track changes in action, try out our demos or watch the recording of our recent quarterly product update webinar which steps through a common use case for track changes.

Track Changes and Express Edit

One of the major functions of EditLive! that isn't available in Express Edit is track changes. Since the JavaScript-based editor has no knowledge about how to work with the track changes information, the change information may become inaccurate or in some cases corrupted [1]. Fortunately, it is still possible to use track changes in an environment where Express Edit is in use by adding just a little bit of JavaScript logic when loading the editor.

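First, there are two important techniques for working with track changes information that we'll use: detecting whether a document contains track changes markup, and accepting all changes to strip the change information out of a document.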

With these tools in hand we can make sure that if track changes information is detected, we always load the full editor:

var hasChangeData = documentContent.indexOf("<trackchanges ") >= 0;
editlive.setExpressEdit(!hasChangeData);

This is enough to ensure that the track changes data is safe, but it will cause the full editor to load even for users that don't have Java installed, which is what Express Edit was designed to prevent in the first place. Another option is to warn the user that track changes information has been detected and offer to strip out the change information (using either JavaScript or XSLT). There are many different ways to prompt the user, and which is best will depend on your particular application, but with the articles above and a bit of knowledge of JavaScript you should be able to create pretty much any interface you want.
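One way to structure the prompt-and-strip option is sketched below. It is a minimal sketch rather than a definitive implementation: planEditorLoad, confirmStrip and stripChanges are hypothetical names introduced here for illustration, and the two callbacks are passed in so that the prompt and the stripping routine can be whatever suits your application.

```javascript
// Decide how to load the editor for a document that may contain change data.
// confirmStrip: hypothetical callback that asks the user whether the change
//               information may be discarded; returns true or false.
// stripChanges: hypothetical callback that returns the content with the
//               track changes markup removed.
function planEditorLoad(documentContent, confirmStrip, stripChanges) {
    var hasChangeData = documentContent.indexOf("<trackchanges ") >= 0;
    if (!hasChangeData) {
        // Nothing to protect, so Express Edit is safe to use.
        return { expressEdit: true, content: documentContent };
    }
    if (confirmStrip()) {
        // The user agreed to drop the change data before editing.
        return { expressEdit: true, content: stripChanges(documentContent) };
    }
    // Keep the change data intact and require the full editor.
    return { expressEdit: false, content: documentContent };
}
```

The expressEdit value of the result can then be passed to editlive.setExpressEdit in place of !hasChangeData, and in a browser confirmStrip could simply wrap window.confirm.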

1 - the way change information is stored is designed to be as robust as possible so that it doesn't require special handling when processed on the server side, but editing the document can still cause problems.

Accepting All Changes With JavaScript

In previous installments we've looked at ways to detect the presence of track changes data and accept the changes using XSLT. This time round, we'll use JavaScript string parsing to accept all the changes. This technique is useful for any environments where an XSLT processor isn't readily available [1]. It should be reasonably straightforward to port the example code to other programming languages.

The first step is to remove the <change> tags within the document content. As the tag is always an empty tag, this is a very simple regular expression:

source = source.replace(/<change[^>]*>/gim, '');

First, <change[^>]*> is a pattern we'll use regularly to match a specific tag regardless of what attributes it has. Second, the three flags we specify, gim, make the expression a global match, so it replaces every tag rather than just the first one (g), case insensitive (i) and multiline (m). Strictly speaking, the m flag only changes how the ^ and $ anchors behave, so it has no effect on this pattern; a tag that spans several lines is already matched because [^>] matches line breaks.
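To see the flags at work, here is a small sketch run against made-up content. The id attributes in the sample are placeholders for illustration, not the documented track changes format.

```javascript
// Sample content only: the id attributes are placeholders.
var sample = 'One <change id="a">two <CHANGE\n id="b">three';

// g replaces every match and i ignores case. The negated class [^>]
// also matches the line break inside the second tag.
var cleaned = sample.replace(/<change[^>]*>/gim, '');

// For comparison, without the g flag only the first tag is removed.
var firstOnly = sample.replace(/<change[^>]*>/im, '');
```

Here cleaned ends up as 'One two three', while firstOnly still contains the upper-case tag.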

The second step is to find the start of the div containing the track changes element:

var startTrackChangesDiv = source.search(/<div [^>]*>\s*<trackchanges [^>]*>/gim);

We again use the standard pattern for matching a specific tag, but this time we combine two of them separated by whitespace. First, we find the div start tag (<div [^>]*>), then any amount of whitespace (\s*), then the trackchanges start tag (<trackchanges [^>]*>). The startTrackChangesDiv variable now contains the offset of the div start tag, or -1 if the document contains no track changes div.
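The behaviour of search() can be checked on a small sample, as in the sketch below. The style and version attributes are placeholders, not the documented format. Note that search() ignores the g flag and always reports the first match, so the flag is left off here.

```javascript
// Sample content only: the style and version attributes are placeholders.
var sampleSource = '<p>Text</p>\n' +
    '<div style="display:none">\n' +
    '  <trackchanges version="1"></trackchanges>\n' +
    '</div>';
var divPattern = /<div [^>]*>\s*<trackchanges [^>]*>/i;

// Offset of the "<div " that wraps the trackchanges element.
var foundAt = sampleSource.search(divPattern);

// search() returns -1 when nothing matches.
var notFound = '<p>No changes here</p>'.search(divPattern);
```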

To remove the DIV we need to know where it ends. This is much simpler: the DIV is always inserted as the last thing in the body, so its end tag is always the last closing div tag in the document.

var endTrackChangesDiv = source.lastIndexOf('</div>') + '</div>'.length;

Finally, we remove the entire div by extracting just the content before the start of the div and after the end of the div:

source = source.substring(0, startTrackChangesDiv) +
        source.substring(endTrackChangesDiv);
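The two offsets and the splice can be tried out on a sample document, as sketched below. The attributes on the sample tags are placeholders, and the check for a negative start offset is an addition for safety rather than part of the steps above: substring() treats -1 as 0, so splicing a document without a track changes div would silently drop everything before its last closing div tag.

```javascript
// Sample content only: the attributes on the div and trackchanges tags are placeholders.
var doc = '<body><p>Text</p>' +
    '<div style="display:none"> <trackchanges version="1"></trackchanges></div>' +
    '</body>';

var startDiv = doc.search(/<div [^>]*>\s*<trackchanges [^>]*>/i);
var endDiv = doc.lastIndexOf('</div>') + '</div>'.length;

// Only splice when the div was actually found.
var result = startDiv >= 0
    ? doc.substring(0, startDiv) + doc.substring(endDiv)
    : doc;
```

For this sample, result is the body with just the paragraph left in it.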

The complete function winds up as:

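function acceptAllChanges(source) {
    // Remove every <change> marker tag, whatever attributes it has.
    source = source.replace(/<change[^>]*>/gim, '');
    // Find the div that wraps the trackchanges element.
    var startTrackChangesDiv = source.search(/<div [^>]*>\s*<trackchanges [^>]*>/gim);
    if (startTrackChangesDiv < 0) {
        // No track changes div, so there is nothing more to strip.
        return source;
    }
    // The div is the last thing in the body, so it ends at the last </div>.
    var endTrackChangesDiv = source.lastIndexOf('</div>') + '</div>'.length;
    source = source.substring(0, startTrackChangesDiv) +
        source.substring(endTrackChangesDiv);
    return source;
}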

The end result is the document as if all the changes had been accepted in the editor, ready for publishing to the world.

1 - while XSLT is available in modern browsers via JavaScript, it's often easier to use this string parsing approach than to handle the browser incompatibilities with XSLT.

Accepting All Changes With XSLT

When publishing documents that contain track changes information, it's a good idea to make sure that all the changes have been accepted before you push the document out to the world. We've previously shown how you can detect that change information is present, but we can go one step further and remove that information, publishing the document as if all the changes had been accepted.

In this case, we'll use XSLT to remove the track changes markup, but it's also possible to use plain string parsing - we'll show that in a future article.

The first step is to set up a simple identity transform style sheet, so that by default all the content is copied over unchanged:

<?xml version="1.0" ?>
<xs:stylesheet xmlns:xs="http://www.w3.org/1999/XSL/Transform"
  version="1.0" xmlns:html="http://www.w3.org/1999/xhtml">
    <xs:template match="@*|node()">
      <xs:copy>
        <xs:apply-templates select="@*|node()" />
      </xs:copy>
    </xs:template>
</xs:stylesheet>

Next, let's strip out the <change> elements that mark where changes occurred in the document. This is just a simple template that matches the change element and outputs nothing:

<xs:template match="html:change" />

Next, we'll use the same technique to strip out the <div> element that contains the actual track changes information. The match attribute is slightly more complex here because we only want to match the <div> element that has a <trackchanges> element as a child:

<xs:template match="html:div[./html:trackchanges]" />

We could of course just remove the track changes element, but that would leave an empty <div> tag around unnecessarily.

The complete style sheet is then:

<?xml version="1.0" ?>
<xs:stylesheet xmlns:xs="http://www.w3.org/1999/XSL/Transform"
       version="1.0" xmlns:html="http://www.w3.org/1999/xhtml">

    <xs:template match="html:change" />

    <xs:template match="html:div[./html:trackchanges]" />

    <xs:template match="@*|node()">
        <xs:copy>
            <xs:apply-templates select="@*|node()" />
        </xs:copy>
    </xs:template>
</xs:stylesheet>

One catch: you need to make sure that your document is being output as XML or XHTML instead of plain HTML. Use the outputXML and outputXHTML attributes of the htmlFilter element in your configuration file to control this.

Detecting Track Changes Markup In Documents

Before a document is published outside of a content management system, it is good policy to ensure that all track changes data has been handled, by either accepting or rejecting the changes. While it's easy for users to select "Accept All" before publishing, it is often useful to include an automated check before publishing to ensure that changes haven't slipped through.

Detecting the presence of change information is simple: any document with track changes data includes a "trackchanges" element in a hidden div at the end of the body. In nearly all cases you can just check if the string "<trackchanges " is present in the document. In rare cases where there is a custom tag called "trackchanges" in the document, you may need to additionally check for the presence of the DIV before the trackchanges element.

In JavaScript you can implement this check with the code:

var hasChangeData = documentContent.indexOf("<trackchanges ") >= 0;

where the documentContent variable contains the content of the document. We'll be publishing a tip on stripping out the track changes data automatically when publishing, without having to first load it into the editor.
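For the rare case of a custom "trackchanges" tag, a stricter check can require the wrapping DIV as well. The following is a sketch only: hasTrackChangesDiv is a hypothetical helper name, and the sample attributes used to exercise it are placeholders rather than the documented format.

```javascript
// A div start tag followed, after optional whitespace, by a trackchanges
// start tag. The i flag makes the match case insensitive.
var TRACK_CHANGES_DIV = /<div [^>]*>\s*<trackchanges [^>]*>/i;

// Returns true only when the trackchanges element sits directly inside a div.
function hasTrackChangesDiv(documentContent) {
    return TRACK_CHANGES_DIV.test(documentContent);
}

// A custom tag that merely shares the name passes the simple string check
// but not the stricter one.
var customTag = '<p><trackchanges kind="custom">text</trackchanges></p>';
var simpleCheck = customTag.indexOf("<trackchanges ") >= 0;
var strictCheck = hasTrackChangesDiv(customTag);
```

The regular expression is deliberately created without the g flag, so repeated calls to test() do not carry state from one document to the next.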