Comparing XSD Files

Comparing XML Schema Files

Comparing XSD files is a common technique for change impact analysis. However, comparing them as XML or plain text only goes so far. XSD offers many ways to express the same thing, there is no canonicalization method for XSD components, and many technologies rely on XSD to generate further artifacts; together, these make traditional approaches fall short of what change management expects.

Even a specialized XSD comparison engine, such as the one provided by QTAssistant XSR, may seem to fail some tests. This mostly happens when the expected result is hard to define, particularly when it depends on the impact a change has on external systems rather than on the XML validated by the schema.

The two "tiny" XSD samples below illustrate a simple case where people may disagree on what the comparison outcome should be.

Tiny 1.0

<xsd:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" xmlns="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="tiny">
    <xsd:sequence>
      <xsd:sequence>
        <xsd:element name="a"/>
      </xsd:sequence>
      <xsd:element name="b"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Tiny 1.1

<xsd:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" xmlns="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="tiny">
    <xsd:sequence>
      <xsd:element name="a"/>
      <xsd:element name="b"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

In terms of the XML they describe, these two XSDs are the same. However, for a JAXB solution whose custom binding file relies on the superfluous <xsd:sequence> to generate a special class, version 1.1 is no longer the same.
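The equivalence of the two samples at the level of the described XML can be checked with a small script. The following is a minimal sketch, not how QTAssistant XSR works internally: it assumes that a nested xsd:sequence occurring exactly once can be inlined into its parent sequence, and it only understands xsd:sequence and xsd:element particles. The function names content_model and flatten are made up for this illustration.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

TINY_1_0 = """<xsd:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" xmlns="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="tiny">
    <xsd:sequence>
      <xsd:sequence>
        <xsd:element name="a"/>
      </xsd:sequence>
      <xsd:element name="b"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

TINY_1_1 = """<xsd:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" xmlns="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="tiny">
    <xsd:sequence>
      <xsd:element name="a"/>
      <xsd:element name="b"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def flatten(sequence):
    """List the particles of a sequence, inlining nested sequences that occur exactly once."""
    items = []
    for child in sequence:
        occurs = (child.get("minOccurs", "1"), child.get("maxOccurs", "1"))
        if child.tag == XSD + "sequence":
            inner = flatten(child)
            if occurs == ("1", "1"):
                items.extend(inner)  # redundant wrapper: same described XML
            else:
                items.append(("sequence", occurs, tuple(inner)))
        elif child.tag == XSD + "element":
            items.append(("element", child.get("name"), occurs))
        # Assumption: other particles (choice, any, group) are ignored here.
    return items


def content_model(xsd_text):
    """Flattened content model of the first global complex type in the schema."""
    root = ET.fromstring(xsd_text)
    type_def = root.find(XSD + "complexType")
    return flatten(type_def.find(XSD + "sequence"))


print(TINY_1_0 == TINY_1_1)                                # False: the text differs
print(content_model(TINY_1_0) == content_model(TINY_1_1))  # True: same described XML
```

A text or XML diff reports a change here, while the flattened content models are identical. Whether that counts as "the same" still depends on who consumes the schema, which is exactly the disagreement described above.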

The rest of this document describes the basics of comparing XSD files in QTAssistant.

The XSD comparison capability available in QTAssistant XSR supports the XSD features most commonly requested by users. If you have requirements not met by the current version of QTAssistant XSR, we can evaluate them for inclusion in our product and/or in the next version of this paper.

Simple Compare

The following two XSD files are used to illustrate basic concepts.

Base.xsd

<xsd:schema targetNamespace="urn:paschidev-com:XSD" xmlns="urn:paschidev-com:XSD" attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="GlobalElem1" type="CType1"/>
  <xsd:complexType name="CType1">
    <xsd:sequence>
      <xsd:element name="Elem1" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="GlobalElem2">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Elem1" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="GlobalElem3" type="xsd:int"/>
  <xsd:element name="GlobalElem4" type="xsd:string"/>
</xsd:schema>

Revision.xsd

<xsd:schema xmlns="urn:paschidev-com:XSD" attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="urn:paschidev-com:XSD" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="GlobalElem1" type="CType1"/>
  <xsd:complexType name="CType1">
    <xsd:sequence>
      <xsd:element name="Elem1" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="GlobalElem2">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Elem1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="GlobalElem3" type="xsd:int"/>
  <xsd:element name="GlobalElem5" type="xsd:string"/>
</xsd:schema>

To compare these files, the first step is to create a new XML Schema Refactoring file.

Create new XSR file

 Using the context menu in the Document Explorer tool window, create a new XML Schema Collection, named Simple.

Steps to create a named XML Schema Collection

Add Base.xsd to Version 1.0.0 of the collection.

Steps to add XSD files to Base version

Create Version 1.1.0 of the Simple collection and attach Revision.xsd to it.

Steps to add a new version and files to it

To compare, right-click Version 1.0.0, invoke the Compare with Version command and, at the prompt, select Version 1.1.0.

Invoking the Compare with Version command and the prompt

 The generated report is shown below.

XSD Set Diff Report: Base vs. Revision

 

A compare report has the following columns:

Icon: Graphical representation of the compared XSD component's type.
Name: The local name of the XSD component being compared. The value is empty for a local type declaration.
Xmlns: XML Namespace. The XML namespace of the XSD component being compared. The value is empty for unqualified names.
Xmlns Alias: XML Namespace Alias. The XML namespace alias associated with the namespace of the name of the XSD component being compared.
Status: The result of comparing the XSD component's definition. This field indicates whether a change directly affected the definition of the component. Possible values: Same, New, Deleted, Modified.
Extended Status: The result of comparing the XSD component's signature. This field indicates whether any of the dependent components changed. An empty value means "Not Applicable/Not evaluated"; Depends Modified means that referenced global components, directly or indirectly, were Modified.
Source URI: The location of the XSD component. It points to a location from the base set when the XSD component exists in both sets, or when it was deleted. For added components, it points to a location from the revision set.
Line Number: The line number of the XSD component at the Source URI.
Column Number: The position in the line of the XSD component at the Source URI.
Compared Source URI: The location of the other XSD component. This field is populated only when the XSD component exists in both sets, and it represents a location from the revision set.
Compared Line Number: The line number of the XSD component at the Compared Source URI. This field is populated only when the XSD component exists in both sets.
Compared Column Number: The position in the line of the XSD component at the Compared Source URI. This field is populated only when the XSD component exists in both sets.
Compared With: The fully qualified name of the other XSD component. Normally this field is empty.
Reason: Free-form text describing which comparison test caused the status to change.

 

The first row shows an empty Name cell, which means anonymous types were compared; the Icon indicates a complex type. Indeed, the source, line and column number point to the anonymous complex type of the GlobalElem2 element. The Status is Modified since the type of the Elem1 element changed from xsd:int to xsd:anyType (the type attribute was removed, and an element declared without a type defaults to xsd:anyType).

CType1 is a complex type, defined globally. Similar to the anonymous complex type before, the Status is Modified since the type of the Elem1 element changed from an xsd:int to an xsd:string.

The first Elem1 corresponds to the local element definition nested within the CType1 definition. The Status is Modified since its type changed from an xsd:int to an xsd:string.

The second Elem1 corresponds to the local element definition nested within the anonymous complex type for the GlobalElem2 element. The Status is Modified since its type changed from xsd:int to xsd:anyType.

The GlobalElem1 element has a Status of Same. However, the Extended Status shows Depends Modified, meaning that one of its dependencies was Modified: the underlying CType1 in this case. Most people seem to agree with this outcome. Our rationale is that the XML markup describing this element did not change, so most tools generating content from it are unlikely to be affected. The Extended Status provides the mechanism to indicate that overall, considering all dependencies transitively, the GlobalElem1 element is not quite the same.
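The transitive nature of the Extended Status can be illustrated with a short sketch. It is only an approximation of the idea and makes no claim about how QTAssistant XSR computes the value: the STATUS and DEPENDS_ON dictionaries are written by hand for this example, and the function names are hypothetical.

```python
# Hand-written inputs for this illustration (assumed, not produced by the tool).
STATUS = {"GlobalElem1": "Same", "CType1": "Modified", "GlobalElem3": "Same"}
DEPENDS_ON = {"GlobalElem1": ["CType1"]}  # GlobalElem1 is of type CType1


def depends_modified(name, status, depends_on):
    """True if any component reachable from `name` through references is Modified."""
    seen, stack = set(), list(depends_on.get(name, ()))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # guards against circular references
        seen.add(dep)
        if status.get(dep) == "Modified":
            return True
        stack.extend(depends_on.get(dep, ()))
    return False


def extended_status(status, depends_on):
    """Map each component to 'Depends Modified' or to an empty value."""
    return {
        name: "Depends Modified" if depends_modified(name, status, depends_on) else ""
        for name in status
    }


print(extended_status(STATUS, DEPENDS_ON))
```

In this sketch GlobalElem1 keeps its Status of Same and receives Depends Modified because CType1 changed, while GlobalElem3, which references nothing that changed, gets an empty value.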

The GlobalElem2 element has a Status of Modified. The major difference from GlobalElem1 is that, with an anonymous type definition, the changed type is part of the GlobalElem2 element's own markup.

GlobalElem3, with a Status of Same and an empty Extended Status, shows a case where a component was not changed, directly or transitively.

GlobalElem4, with a Status of Deleted, is in Base.xsd but not in Revision.xsd.

GlobalElem5, with a Status of New, is in Revision.xsd but not in Base.xsd.
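The Status values discussed above for the global components can be approximated with a few lines of code. This is a minimal sketch under simplifying assumptions, not the QTAssistant XSR algorithm: it compares only the markup of top-level components (ignoring attribute order and whitespace), it does not report nested components as separate rows, and it treats QName values such as xsd:int as plain text. The function names are invented for the example.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

BASE = """<xsd:schema targetNamespace="urn:paschidev-com:XSD" xmlns="urn:paschidev-com:XSD" attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="GlobalElem1" type="CType1"/>
  <xsd:complexType name="CType1">
    <xsd:sequence>
      <xsd:element name="Elem1" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="GlobalElem2">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Elem1" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="GlobalElem3" type="xsd:int"/>
  <xsd:element name="GlobalElem4" type="xsd:string"/>
</xsd:schema>"""

REVISION = """<xsd:schema xmlns="urn:paschidev-com:XSD" attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="urn:paschidev-com:XSD" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="GlobalElem1" type="CType1"/>
  <xsd:complexType name="CType1">
    <xsd:sequence>
      <xsd:element name="Elem1" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="GlobalElem2">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Elem1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="GlobalElem3" type="xsd:int"/>
  <xsd:element name="GlobalElem5" type="xsd:string"/>
</xsd:schema>"""


def signature(elem):
    """Markup fingerprint: tag, sorted attributes, children in document order."""
    return (elem.tag, tuple(sorted(elem.attrib.items())),
            tuple(signature(child) for child in elem))


def global_components(xsd_text):
    """Map (kind, name) of each top-level component to its markup fingerprint."""
    root = ET.fromstring(xsd_text)
    return {(c.tag.replace(XSD, ""), c.get("name")): signature(c) for c in root}


def compare(base_text, revision_text):
    """Classify every global component as Same, New, Deleted or Modified."""
    base, rev = global_components(base_text), global_components(revision_text)
    report = {}
    for key in sorted(base.keys() | rev.keys()):
        if key not in rev:
            report[key] = "Deleted"
        elif key not in base:
            report[key] = "New"
        else:
            report[key] = "Same" if base[key] == rev[key] else "Modified"
    return report


for (kind, name), status in compare(BASE, REVISION).items():
    print(f"{kind:12} {name:12} {status}")
```

Because GlobalElem1 only references CType1 by name, its own markup is unchanged and the sketch reports Same, whereas GlobalElem2 carries its anonymous type inline and is reported as Modified.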

Real-life Example

The XSD comparison engine in QTAssistant XSR can compare file sets that have different layouts. In other words, how the XSD files are composed through import/include/redefine doesn't matter. For illustration, we use some public industry standards.

 ACORD is a global, non-profit organization in the insurance industry. Version 2.21 of its Life & Annuity XSD was released as a multi-file set while version 2.26 (arbitrarily picked) was released as a single file XSD.

ACORD Life & Annuity v2.21 XSD files layout

Comparing these two versions is no different from the simple compare scenario: follow the same steps (the one difference being that version 2.21 requires adding more files) to get a report similar to the one below (shown grouped by the Status field).

ACORD Life & Annuity v2.21 vs. v2.26
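Why the file layout need not matter can be sketched as follows: if components are identified by namespace, kind and name rather than by the file they live in, a multi-file set and a single-file set yield the same component inventory. The snippet below is an illustrative sketch only. The urn:example:layout schemas are invented stand-ins (not ACORD content), the component_keys function is hypothetical, and xsd:redefine as well as includes of schemas without a target namespace are deliberately left out.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"
SKIPPED = {XSD + "include", XSD + "import", XSD + "annotation"}

# Invented example: the same two components split across two files...
MULTI_FILE = [
    """<xsd:schema targetNamespace="urn:example:layout" xmlns="urn:example:layout" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="types.xsd"/>
  <xsd:element name="Order" type="OrderType"/>
</xsd:schema>""",
    """<xsd:schema targetNamespace="urn:example:layout" xmlns="urn:example:layout" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Id" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>""",
]

# ...and released as a single file.
SINGLE_FILE = [
    """<xsd:schema targetNamespace="urn:example:layout" xmlns="urn:example:layout" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Order" type="OrderType"/>
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Id" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>""",
]


def component_keys(documents):
    """Collect (namespace, kind, name) for every global component in a file set."""
    keys = set()
    for text in documents:
        root = ET.fromstring(text)
        namespace = root.get("targetNamespace", "")
        for child in root:
            if child.tag == XSD + "redefine":
                raise NotImplementedError("xsd:redefine is outside this sketch")
            if child.tag not in SKIPPED:
                keys.add((namespace, child.tag.replace(XSD, ""), child.get("name")))
    return keys


print(component_keys(MULTI_FILE) == component_keys(SINGLE_FILE))  # True
```

Once both sets are reduced to such an inventory, the comparison can proceed component by component, regardless of how many files each version was shipped in.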