Amol Kher
Microsoft Corporation
July 2004

Applies to: the XML Diff and Patch GUI tool

Summary: This article shows how to use the XmlDiff class to compare two XML files and show the differences as an HTML document in a .NET Framework 1.1 application. The article also shows how to build a WinForms application for comparing XML files.

A code sample is available for download with this article.

Introduction

There is no good command-line tool for comparing two XML files and viewing the differences. There is an online tool called XML Diff and Patch, available on the GotDotNet website under the XML Tools section. If you have not seen it, you can find it there. It is a very convenient tool for comparing two XML files.
Comparing XML files is different from comparing regular text files because one wants to compare logical differences in the XML nodes, not just differences in text. For example, one may want to compare XML documents and ignore white space between elements, comments, or processing instructions. The XML Diff and Patch tool allows such comparisons, but it is primarily available as an online web application; it cannot be used from the command line as it stands. This article focuses on developing a command-line tool by reusing code from the XML Diff and Patch installation and samples. The tool works much like the WinDiff utility: it presents the differences in a separate window and highlights them.
The XML Diff and Patch tool includes a library with an XmlDiff class, which can be used to compare two XML documents. The Compare method on this class takes two files and returns true if the files are equal; otherwise it generates an output file, called an XML diffgram, containing a list of differences between the files.
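As a rough illustration of the Compare call described above, here is a minimal sketch. It assumes the XML Diff and Patch library is referenced and exposes XmlDiff in the Microsoft.XmlDiffPatch namespace, with a Compare overload that takes two file names, a fragments flag, and an XmlWriter for the diffgram. The class name CompareSketch and the method CompareFiles are hypothetical names chosen for this example.

```csharp
using System.Text;
using System.Xml;
using Microsoft.XmlDiffPatch; // assumption: namespace of the XML Diff and Patch library

public class CompareSketch
{
    // Compares two XML files and writes the diffgram to diffgramPath.
    // Returns true when XmlDiff considers the two files equal.
    public static bool CompareFiles(string originalPath, string changedPath, string diffgramPath)
    {
        XmlDiff diff = new XmlDiff();
        XmlTextWriter diffgramWriter = new XmlTextWriter(diffgramPath, Encoding.UTF8);
        diffgramWriter.Formatting = Formatting.Indented;
        try
        {
            // false: treat the inputs as whole documents rather than XML fragments
            return diff.Compare(originalPath, changedPath, false, diffgramWriter);
        }
        finally
        {
            // Always release the diffgram file, even if the comparison throws
            diffgramWriter.Close();
        }
    }
}
```

A caller might invoke CompareSketch.CompareFiles("a.xml", "b.xml", "vxd.out") and inspect the diffgram file only when the result is false.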
An Overview of the XML Diff and Patch API

The XmlDiff class implements a Compare method. The class can also be given a set of XmlDiffOptions values that control how the files are compared. Because XmlDiffOptions is an enumeration, several options are combined with the bitwise OR operator:

XmlDiff.Options = XmlDiffOptions.IgnorePI | XmlDiffOptions.IgnoreChildOrder;

So much for a quick primer! To follow this article you only need a working idea of the XmlDiff class and XmlDiffOptions, so I think we are ready to understand our simple app.

XML Diff and Patch Meets Winforms

We built a small Windows application, which comprises two forms. One form prompts the user to specify two files. The other form hosts an Internet Explorer control, which displays the highlighted differences between the two files side by side, like any other file compare tool we know. The UI design is kept very simple, since the UI is not the focus of this article.

You can always download the code and make it more usable for yourself. The idea of this article is to demonstrate the XmlDiff code and show the differences in an IE control.
Figure 1 shows what our main screen looks like.

Figure 1. The main screen

The File menu has an Exit command. The Diff Options menu lets the user select the options that are passed directly to the Compare method through the XmlDiffOptions enumeration. This keeps the utility simple and easy to understand. The following screen shot shows the available options, which map directly to the XmlDiffOptions enumeration.
Figure 2. Available options

Here's a quick primer on some of the options and what they mean. For more detailed information, see the XML Diff and Patch documentation.

• Ignore processing instructions: Processing instructions are not compared.
• Ignore white space (normalize text values): Insignificant white space is not compared. White space marked by xml:space='preserve' is still compared, but white space after element tags, or any other possibly insignificant white space, is ignored.
• Ignore prefixes: The prefixes of element and attribute names are not compared. When this option is selected, two names that have the same local name and namespace URI but a different prefix are treated as the same name.
• Ignore namespaces: The namespace URIs of element and attribute names are not compared. This option also implies that name prefixes are ignored. When this option is selected, two names with the same local name but a different namespace URI and prefix are treated as the same name.
• Ignore child order: The order of the child nodes of each element is ignored. When this option is selected, two nodes with the same value that differ only in their position among sibling child nodes are treated as the same node.
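To make a few of these options concrete, the following sketch compares small in-memory documents with and without an option set. The XML strings are hypothetical inputs made up for this example, and the sketch assumes the library offers an XmlDiff constructor that takes XmlDiffOptions and a Compare overload that takes two XmlReader instances. OptionsSketch, AreEqual, and Report are illustrative names, not part of the tool.

```csharp
using System;
using System.IO;
using System.Xml;
using Microsoft.XmlDiffPatch; // assumption: namespace of the XML Diff and Patch library

public class OptionsSketch
{
    // Compares two XML strings under the given options and reports equality.
    public static bool AreEqual(string left, string right, XmlDiffOptions options)
    {
        XmlDiff diff = new XmlDiff(options);
        XmlTextReader leftReader = new XmlTextReader(new StringReader(left));
        XmlTextReader rightReader = new XmlTextReader(new StringReader(right));
        try
        {
            return diff.Compare(leftReader, rightReader);
        }
        finally
        {
            leftReader.Close();
            rightReader.Close();
        }
    }

    // Prints how each hypothetical pair compares with and without an option.
    public static void Report()
    {
        // Same children, different order
        string ordered = "<a><b/><c/></a>";
        string reordered = "<a><c/><b/></a>";
        Console.WriteLine("child order, no options: " + AreEqual(ordered, reordered, XmlDiffOptions.None));
        Console.WriteLine("child order, IgnoreChildOrder: " + AreEqual(ordered, reordered, XmlDiffOptions.IgnoreChildOrder));

        // Same element, different processing instruction
        string firstPI = "<a><?app one?><b/></a>";
        string secondPI = "<a><?app two?><b/></a>";
        Console.WriteLine("PI, no options: " + AreEqual(firstPI, secondPI, XmlDiffOptions.None));
        Console.WriteLine("PI, IgnorePI: " + AreEqual(firstPI, secondPI, XmlDiffOptions.IgnorePI));
    }
}
```

Printing both results for each pair, rather than asserting them, keeps the sketch honest about behavior that depends on the library version in use.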
The following is the basic control flow of the application. When the user clicks the Compare button, the following actions take place:

• Both input files are verified to exist, since the paths may have been entered by hand and could be wrong.
• The XmlDiffOptions value is set from the checked items on the Diff Options menu. This is done by a SetDiffOptions method.
• DoCompare is called to compare the two files.
• The two files are compared and the diffgram is written to a temporary file (vxd.out). This file is used to work out the differences.
• The sample code mentioned earlier is called to work out the differences. It takes the original file and the diffgram file as inputs and generates the output, which consists of HTML-encoded rows that show the differences between the two files side by side.
• The HTML is written to a temporary file and displayed in the IE control in a separate window. This HTML shows the diff in the desired manner.
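The first two steps of this flow can be sketched as follows. This is not the article's SetDiffOptions method, which reads the menu items directly; BuildOptions is a hypothetical stand-in that takes the checked states as plain booleans, and VerifyInputs is likewise an illustrative name. The sketch assumes XmlDiffOptions lives in the Microsoft.XmlDiffPatch namespace.

```csharp
using System.IO;
using Microsoft.XmlDiffPatch; // assumption: namespace of the XML Diff and Patch library

public class CompareFlowSketch
{
    // Step 1: fail early if either path typed by the user does not exist.
    public static void VerifyInputs(string file1, string file2)
    {
        if (!File.Exists(file1))
            throw new FileNotFoundException("First input file not found.", file1);
        if (!File.Exists(file2))
            throw new FileNotFoundException("Second input file not found.", file2);
    }

    // Step 2: translate the checked menu items into one XmlDiffOptions value.
    // Each flag is OR-ed in only when its menu item is checked.
    public static XmlDiffOptions BuildOptions(
        bool ignoreChildOrder, bool ignorePI, bool ignoreWhitespace,
        bool ignoreNamespaces, bool ignorePrefixes)
    {
        XmlDiffOptions options = XmlDiffOptions.None;
        if (ignoreChildOrder) options |= XmlDiffOptions.IgnoreChildOrder;
        if (ignorePI) options |= XmlDiffOptions.IgnorePI;
        if (ignoreWhitespace) options |= XmlDiffOptions.IgnoreWhitespace;
        if (ignoreNamespaces) options |= XmlDiffOptions.IgnoreNamespaces;
        if (ignorePrefixes) options |= XmlDiffOptions.IgnorePrefixes;
        return options;
    }
}
```

Keeping the option mapping free of UI types makes it easy to reuse from a command-line entry point as well as from the form.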
Working with XML DiffGrams

Before we move on to the sample code that gives us our HTML, we should discuss what the diffgram looks like. The diffgram doesn't really tell us the visual differences; it isn't a file of the actual differences. What it does tell us is how, given a file A and a diffgram file, you can get to file B by applying the patches specified in the diffgram. In other words, the diffgram shows us how to incrementally build the target file, which is the file we originally compared against. The diffgram itself is written in XML, which can be parsed and applied to the original file to get the target file. The diffgram consists of tags such as add, remove, and change.
For more information on the diffgram tags, see the XML Diff and Patch documentation. See the following sample taken from the XML Diff and Patch site. The concept will be familiar to XPath users. Every tag has a match attribute, which works like a select operation. It allows you to move to a specific location in the original file.

The other tags then work relative to the position where you are placed. So, for instance, match='2' would mean go to the second child node from this location. An add tag adds specific text or markup, while a remove tag removes specific text or markup. There are other helper tags such as change, which is used to update contents.

Some text 4 Some text 5 Changed text new value changed attribute value

As you can see, parsing this code and applying the changes specified in the diffgram is not trivial. Thankfully, we don't have to do all that ourselves.
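To make the match, add, remove, and change mechanics concrete before handing the real work to the sample library, here is a small sketch that walks a diffgram-like document and lists its operations. The diffgram string is a simplified, hypothetical example written for this sketch, not output captured from the tool, and the namespace URI on it is an assumption. Real diffgrams carry more attributes and richer match expressions, which this sketch treats as opaque strings.

```csharp
using System.Collections;
using System.Xml;

public class DiffgramWalkSketch
{
    // Hypothetical, simplified diffgram used only for illustration.
    public const string SampleDiffgram =
        "<xd:xmldiff version='1.0' xmlns:xd='http://schemas.microsoft.com/xmltools/2002/xmldiff'>" +
        "<xd:node match='1'>" +
        "<xd:node match='2'>" +
        "<xd:remove match='1' />" +
        "<xd:add>new text</xd:add>" +
        "</xd:node>" +
        "<xd:change match='3'>Changed text</xd:change>" +
        "</xd:node>" +
        "</xd:xmldiff>";

    // Returns one human-readable line per add, remove, or change operation.
    public static ArrayList Describe(string diffgramXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(diffgramXml);
        ArrayList lines = new ArrayList();
        Walk(doc.DocumentElement, "", lines);
        return lines;
    }

    // path accumulates the match values of the enclosing node tags,
    // so each operation is reported relative to where it applies.
    private static void Walk(XmlNode parent, string path, ArrayList lines)
    {
        foreach (XmlNode child in parent.ChildNodes)
        {
            XmlElement op = child as XmlElement;
            if (op == null) continue; // skip text and comments

            string match = op.GetAttribute("match");
            switch (op.LocalName)
            {
                case "node":   // move to a position, then process nested operations
                    Walk(op, path + "/" + match, lines);
                    break;
                case "add":    // new content inserted at the current position
                    lines.Add("add under " + path + ": " + op.InnerXml);
                    break;
                case "remove": // matched node is deleted
                    lines.Add("remove " + path + "/" + match);
                    break;
                case "change": // matched node gets new content
                    lines.Add("change " + path + "/" + match + " to: " + op.InnerText);
                    break;
            }
        }
    }
}
```

Calling DiffgramWalkSketch.Describe(DiffgramWalkSketch.SampleDiffgram) yields three lines, one for the remove, one for the add, and one for the change. Even this reduced form shows why applying a real diffgram by hand is more work than it first appears.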
The XML Diff and Patch utility ships with sample code that does all this work for us. It can be found in the Samples\XmlDiffView directory. We compiled that source code and then copied the generated library (XmlDiffPatch.View.dll) to our directory so that we could link to it and reuse it. It contains one class called XmlDiffView. XmlDiffView has a method called Load, which takes the original XML file and the diffgram file.
Load internally loads the original file and applies the diffgram patches to it to reach the target file. While doing so, it also stores the HTML required to show the differences in two columns for each line that was read.
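A minimal sketch of driving the sample library might look like this. It assumes XmlDiffView is exposed in the Microsoft.XmlDiffPatch namespace, that Load accepts two XmlReader instances, and that the HTML method is spelled GetHtml and takes a TextWriter; check these against the compiled sample before relying on them. DiffViewSketch and WriteHtml are hypothetical names, and the surrounding table markup is this sketch's own choice for wrapping the generated rows.

```csharp
using System.IO;
using System.Xml;
using Microsoft.XmlDiffPatch; // assumption: XmlDiffView from the compiled sample library

public class DiffViewSketch
{
    // Applies the diffgram to the original file and writes side-by-side HTML.
    public static void WriteHtml(string originalPath, string diffgramPath, string htmlPath)
    {
        XmlTextReader original = new XmlTextReader(originalPath);
        XmlTextReader diffgram = new XmlTextReader(diffgramPath);
        StreamWriter html = new StreamWriter(htmlPath);
        try
        {
            XmlDiffView view = new XmlDiffView();
            view.Load(original, diffgram);

            // The view emits table rows, so wrap them in a page and a table.
            html.Write("<html><body><table>");
            view.GetHtml(html);
            html.Write("</table></body></html>");
        }
        finally
        {
            // Release all three files even if loading or rendering fails
            html.Close();
            diffgram.Close();
            original.Close();
        }
    }
}
```

The resulting file can then be opened in the hosted IE control, or in any browser, to inspect the differences.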
The desired output HTML is obtained by invoking the GetHTML method, which takes a TextWriter to write the HTML to. For the interested reader, the bulk of the parsing work is done in a private method called ApplyDiffgram, found in the XmlDiffView.cs file. It is quoted here to show what is going on.