
Kirk Allen Evans' XML Blog

.NET From a Markup Perspective

XSLT Performance in .NET (or "Why to use XPathDocument for XSLT")

No idea why, but Mike Gunderloy's The Daily Grind 125 showed up as an unread item in SharpReader today, dated 12/22/2003. The original date of Mike's article was July 15, 2003.

I am glad the post re-emerged, because it includes a link to Dan Frumin's article “XSLT Performance in .NET” on O'Reilly's OnDotNet.com. The premise of the article is that XSLT is slow in .NET compared to MSXML, and the summary suggests that you should consider COM interop to MSXML instead of using XSLT natively in .NET. The comments on the article give Dan a pretty good thrashing, because the conclusions he draws are fundamentally flawed: he measures performance using the System.Xml.XmlDocument type as the transformation source. That choice was addressed in a KB article some time ago, which notes known XSLT performance issues and the appropriate use of the different classes in the System.Xml stack.

I commend Dan for publishing the article and doing some comparison work. It shows that people really are still interested in XSLT and are thinking about performance. For others considering XSLT best practices within the .NET Framework, see the XSLT Roadmap (INFO: Roadmap for Executing XSLT Transformations in .NET Applications). This is a good jumping-off point to find documentation (through link surfing) on the following:

  1. Do not use XmlDocument as the source of an XSLT transformation: use XPathDocument. This also cures the mentioned problem of using preceding-sibling axis with XmlDocument in Q325689: using XPathDocument avoids the problem altogether.
  2. XmlDataDocument inherits from XmlDocument. So, the same logic follows as in #1: do not use XmlDataDocument as the source of an XSLT transformation: use XPathDocument. XmlDataDocument is slower than XmlDocument as the source of XSLT transformations due to the DataSet synchronization that must occur.
  3. Do not use msxsl:script elements in your stylesheets in server-side applications (ASP.NET apps). If you must use msxsl:script, cache the XSLT transformation and reuse it.
  4. If you are running version 1.0 of the framework, make sure you double-check Q325689. It details the issue of using xsl:key and the preceding-sibling axis.
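
As a rough illustration of points 1 and 3, here is a minimal sketch that loads a stylesheet once and transforms an XPathDocument. The XML and stylesheet strings are made up for the example, and XslTransform is the .NET 1.x class discussed in this post (later framework versions mark it obsolete), so treat this as a sketch rather than a recommended template.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

class XPathDocumentTransform
{
    static void Main()
    {
        // Hypothetical input and stylesheet, inlined so the sketch is self-contained.
        string xml = "<orders><order id='1'/><order id='2'/></orders>";
        string xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:output method='text'/>" +
            "<xsl:template match='/'>" +
            "<xsl:value-of select='count(/orders/order)'/>" +
            "</xsl:template>" +
            "</xsl:stylesheet>";

        // Load the stylesheet once. The loaded XslTransform can be kept
        // (for example in a static field) and reused across transformations.
        XslTransform transform = new XslTransform();
        transform.Load(new XmlTextReader(new StringReader(xslt)), null, null);

        // Use XPathDocument, not XmlDocument, as the transformation source.
        XPathDocument source = new XPathDocument(new StringReader(xml));

        StringWriter result = new StringWriter();
        transform.Transform(source, null, result, null);
        Console.WriteLine(result.ToString());
    }
}
```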

XPathDocument includes (essentially) a single method, XPathDocument.CreateNavigator, implementing the interface member IXPathNavigable.CreateNavigator. There really is not much you can do with an XPathDocument other than call its CreateNavigator method and use the returned XPathNavigator. So XPathDocument itself is rather uninteresting; the interesting actor is the XPathNavigator type.

XPathNavigator provides a read-only cursor model based on the XPath data model. Of course, the XPathNavigator implementation navigates over XML documents natively through XPathDocument, but you can also create specialized implementations that navigate over a variety of stores, such as zip files or any arbitrary object graph. The XPathDocument type is specifically optimized for XPath queries, which are read-only by nature. So, XPathDocument.CreateNavigator creates an XPathNavigator to navigate over a read-only store that is highly optimized for read operations using XPath as the search syntax. You can imagine that performance using XPathNavigator to navigate over other stores would not necessarily be nearly as optimized.
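
To make the cursor model concrete, here is a small sketch that creates a navigator over an XPathDocument and runs an XPath query. The book catalog is invented for the example.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

class NavigatorQuery
{
    static void Main()
    {
        // Hypothetical document used only for illustration.
        string xml =
            "<books>" +
            "<book price='15'><title>Cheap</title></book>" +
            "<book price='30'><title>Pricey</title></book>" +
            "</books>";

        // XPathDocument is the read-only store; the navigator is the cursor over it.
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // Select returns an iterator; Current is itself a navigator
        // positioned on each matching node in turn.
        XPathNodeIterator titles = nav.Select("/books/book[@price > 20]/title");
        while (titles.MoveNext())
        {
            Console.WriteLine(titles.Current.Value);
        }
    }
}
```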

As we saw earlier, any type that implements IXPathNavigable supports the CreateNavigator method. Another type that implements this interface is the System.Xml.XmlNode type. An implementation of the XmlNode type is the most popular member of the System.Xml stack, the System.Xml.XmlDocument class. This class represents a tree that is used for in-memory updates to XML documents, representing the W3C Core DOM Level 1 and Level 2 recommendations. That means that we can create an XPathNavigator to navigate using a cursor model over a store that is optimized for in-memory edits.
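
A short sketch of that last point, assuming a made-up document: the XmlDocument is edited in memory through the DOM API, and then the same CreateNavigator call yields a cursor over the edited tree.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

class DomNavigator
{
    static void Main()
    {
        // Hypothetical starting document.
        XmlDocument dom = new XmlDocument();
        dom.LoadXml("<books><book><title>First</title></book></books>");

        // In-memory edit: this is what XmlDocument is designed for.
        XmlElement book = dom.CreateElement("book");
        XmlElement title = dom.CreateElement("title");
        title.InnerText = "Second";
        book.AppendChild(title);
        dom.DocumentElement.AppendChild(book);

        // XmlNode implements IXPathNavigable, so the DOM can hand out a navigator too.
        XPathNavigator nav = dom.CreateNavigator();
        XPathNodeIterator titles = nav.Select("/books/book/title");
        while (titles.MoveNext())
        {
            Console.WriteLine(titles.Current.Value);
        }
    }
}
```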

Hopefully you noticed the last part of that sentence: XmlDocument is optimized for in-memory edits, not for read operations. Instead of being a jack-of-all-trades (such as the MSXML2.DOMDocument.4.0 COM class), the classes used as data stores in the System.Xml namespace are each specialized for a different behavior:

  • System.Xml.XPath.XPathDocument - optimized for reads involving XPath queries, requiring the entire document to be in-memory.
  • System.Xml.XmlDocument - optimized for in-memory editing of XML documents, requiring the entire document to be in-memory.
  • System.Xml.XmlReader - optimized for forward-only reads while minimizing memory usage.
  • System.Xml.XmlWriter - optimized for forward-only writing of XML documents while minimizing memory usage.
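
For contrast with the two in-memory stores, here is a minimal sketch of the streaming pair, using the concrete XmlTextReader and XmlTextWriter classes. The input document and the output element names are invented for the example. Nothing is held in memory beyond the current node.

```csharp
using System;
using System.IO;
using System.Xml;

class StreamingCopy
{
    static void Main()
    {
        // Hypothetical input, inlined to keep the sketch self-contained.
        string xml =
            "<books>" +
            "<book><title>First</title></book>" +
            "<book><title>Second</title></book>" +
            "</books>";

        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        XmlTextWriter writer = new XmlTextWriter(Console.Out);

        // Forward-only pass: pull each title from the reader and push it to the writer.
        writer.WriteStartElement("titles");
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
            {
                writer.WriteElementString("title", reader.ReadString());
            }
        }
        writer.WriteEndElement();
        writer.Flush();
        reader.Close();
    }
}
```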

I did not include XPathNavigator in this list because I qualified it with “classes utilized as data stores”. XPathNavigator only navigates over the underlying store; it does not actually contain data. As mentioned before, you can implement an XPathNavigator over a multitude of stores, including an XmlReader (allowing navigation over a forward-only store that is optimized for minimizing memory consumption). The point here is that XPathNavigator does not contain anything. It just serves as an interface for navigation.

With this specialization in mind, let's look at some of the overloads for XslTransform.Transform.

public void Transform(System.Xml.XPath.XPathNavigator input, System.Xml.Xsl.XsltArgumentList args, System.Xml.XmlWriter output, System.Xml.XmlResolver resolver);

The above method signature accepts an XPathNavigator as the source of the transformation. So, you can guess from the context of this post what happens if you pass in an XPathDocument type, because XPathDocument implements IXPathNavigable:

public void Transform(System.Xml.XPath.IXPathNavigable input, System.Xml.Xsl.XsltArgumentList args, System.Xml.XmlWriter output, System.Xml.XmlResolver resolver);

This method signature accepts any type that implements IXPathNavigable. Internally, the Transform method just calls the IXPathNavigable.CreateNavigator() method and forwards the call to the first signature. So, consider what happens if you just specify a string URL for the input file's location and the output file's location:

public void Transform(string inputfile, string outputfile, System.Xml.XmlResolver resolver);

I bet you can guess where I am going with this one. The inputfile parameter is used to create an XPathDocument, which is forwarded to the second method signature as an XPathDocument, which is then forwarded to the first method signature by calling the CreateNavigator method.
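
The forwarding chain can be spelled out by hand. The sketch below is only an approximation of what the post describes, not the framework's actual source: it calls the string-based overload, then performs the same steps manually (XPathDocument, then CreateNavigator, then the navigator overload). The temp files, the sample XML, and the identity-style stylesheet are all assumptions made for the example, and File.WriteAllText requires .NET 2.0 or later.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

class OverloadChain
{
    static void Main()
    {
        // Hypothetical stylesheet that copies its input.
        string xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:template match='@*|node()'>" +
            "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>" +
            "</xsl:template>" +
            "</xsl:stylesheet>";

        // Hypothetical input and output files in the temp directory.
        string input = Path.GetTempFileName();
        string outputA = Path.GetTempFileName();
        string outputB = Path.GetTempFileName();
        File.WriteAllText(input, "<root><item>1</item></root>");

        XslTransform transform = new XslTransform();
        transform.Load(new XmlTextReader(new StringReader(xslt)), null, null);

        // 1. The convenience overload: file name in, file name out.
        transform.Transform(input, outputA, null);

        // 2. Roughly the same work done by hand.
        XPathDocument doc = new XPathDocument(input);   // store
        XPathNavigator nav = doc.CreateNavigator();     // cursor over the store
        XmlTextWriter writer = new XmlTextWriter(outputB, Encoding.UTF8);
        try
        {
            transform.Transform(nav, null, writer, null);
        }
        finally
        {
            writer.Close();
        }

        Console.WriteLine(File.ReadAllText(outputA));
        Console.WriteLine(File.ReadAllText(outputB));
    }
}
```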

Look at the overload list for XslTransform.Transform(). You might notice that there is no overload that explicitly accepts the XmlDocument type as one of its parameters. XmlDocument inherits from XmlNode, which in turn implements IXPathNavigable. That means that when you pass in an XmlDocument as the source for an XSLT transformation, XslTransform internally calls CreateNavigator. So aren't you just using an XPathNavigator by proxy? The answer is a qualified “yes.” Even though an XPathNavigator is used internally as the source for the transformation, what really matters is what the XPathNavigator is navigating over. Remember that XPathNavigator is simply the cursor model over the underlying store, so it is the underlying store's speed that matters. Since XmlDocument is not optimized for reads, it is not as good a candidate for XSLT transformations or XPath queries as the XPathDocument type is.
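
If you want to see the difference on your own machine, a rough timing harness along these lines is one way to do it. Everything here is an assumption made for the example: the generated document, the stylesheet (which deliberately exercises the preceding-sibling axis), the item and run counts, and the use of Stopwatch (which needs .NET 2.0 or later). No particular result is claimed, and timings will vary with framework version and document shape.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

class StoreComparison
{
    // Runs the same transformation several times over the given store
    // and returns the elapsed milliseconds. Output is discarded.
    static long TimeTransform(XslTransform transform, IXPathNavigable source, int runs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            transform.Transform(source, null, TextWriter.Null, null);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Build a hypothetical document with many sibling elements.
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < 2000; i++)
        {
            sb.Append("<item>").Append(i).Append("</item>");
        }
        sb.Append("</items>");
        string xml = sb.ToString();

        // Hypothetical stylesheet that walks the preceding-sibling axis for every item.
        string xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:output method='text'/>" +
            "<xsl:template match='/'>" +
            "<xsl:for-each select='/items/item'>" +
            "<xsl:value-of select='count(preceding-sibling::item)'/>" +
            "</xsl:for-each>" +
            "</xsl:template>" +
            "</xsl:stylesheet>";

        XslTransform transform = new XslTransform();
        transform.Load(new XmlTextReader(new StringReader(xslt)), null, null);

        // Same content, two different underlying stores.
        XmlDocument dom = new XmlDocument();
        dom.LoadXml(xml);
        XPathDocument xpathDoc = new XPathDocument(new StringReader(xml));

        Console.WriteLine("XmlDocument:   {0} ms", TimeTransform(transform, dom, 5));
        Console.WriteLine("XPathDocument: {0} ms", TimeTransform(transform, xpathDoc, 5));
    }
}
```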

Back to the beginning of this post: XmlDocument is not the best source for an XSLT transformation, which is why the article on OnDotNet.com was flawed. A fair comparison has to apply the best practices for each platform, which is why comparisons between XSLT processors are so frequently and hotly disputed.

An interesting point about XPathDocument and XPathNavigator is that there are more reasons to favor them in your applications than simply XSLT and XPath capabilities. In the June 2003 MSDN TV episode, Don Box asserts that you should use XPathNavigator to pass XML within AppDomains in lieu of passing lower-level API types such as XmlDocument or XmlReader. The idea is that XPathNavigator is an abstraction over both, allowing you to change the internal implementation without creating versioning issues for yourself later. For now, your documents may be small and XmlDocument works fine, but you might later find that XmlTextReader or XmlValidatingReader (or SgmlReader, or XIncludeReader) suits your implementation better.
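
A small sketch of that idea: a component exposes a method that takes an XPathNavigator, so callers can switch the underlying store without the method signature changing. The CountItems method and the sample XML are hypothetical names invented for this example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

class NavigatorBoundary
{
    // Hypothetical component boundary: it knows only about the cursor,
    // not about which store the cursor is navigating over.
    static int CountItems(XPathNavigator nav)
    {
        return nav.Select("//item").Count;
    }

    static void Main()
    {
        string xml = "<items><item/><item/></items>";

        // Today the caller happens to hold an XmlDocument...
        XmlDocument dom = new XmlDocument();
        dom.LoadXml(xml);
        Console.WriteLine(CountItems(dom.CreateNavigator()));

        // ...tomorrow it can switch to XPathDocument without touching CountItems.
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        Console.WriteLine(CountItems(doc.CreateNavigator()));
    }
}
```

Both calls go through the same method, so the choice of store stays an implementation detail of the caller.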

If this isn't proof enough to use the XPathDocument (and, by proxy, the XPathNavigator), then you should also consider that Whidbey's implementation of XPathDocument will become the premier member in the System.Xml stack for XML storage. A new type of navigator called the XPathChangeNavigator will allow navigation over changes within a document, as well as a host of other capabilities.

Published Saturday, January 03, 2004 9:44 PM by kaevans

Comments

 

kaevans said:

Just to clear up the tiny mystery, the old Daily Grind showed up in the RSS feed because I added some extra information on the Microsoft Exchange problem that I mentioned in it - which continues to draw a steady stream of e-mail my way.
January 4, 2004 12:28 AM
 

kaevans said:

Kirk, thanks for this post. Good explanation of the internals!
January 6, 2004 12:33 AM
 

kaevans said:

There is also XmlDataDocument, which afaik is even slower than XmlDocument. It is still used in XSLT, generating perf-related user complaints.
Actually there is another side of the coin: XSLT is not always the central spot of the application, and choosing the XML store to enable fast XSLT is not always reasonable. Copying a potentially huge XmlDocument into an XPathDocument to run a transformation may be even slower than transforming over the XmlDocument, not to mention the memory issue.
So tradeoffs, tradeoffs.
January 7, 2004 7:17 AM
 

kaevans said:

Sorry, this probably isn't at all appropriate to post here, but I have been searching like mad to find some things and Kirk seems to be the one with answers.

Kirk you made a post a while back: "You can also use client-side JavaScript to do lots of cool stuff, including
using a client-side IE Web Service Behavior that can call web services from
client-side JavaScript. You can also use IE Data Islands and work with XML.
You can create an MSXML.DOMDocument object on the client and work with XML,
then use the MSXML.XMLHTTP class to post the data to your web server. There
is lots of stuff you might do on the client, but the real question is "do
you really need to?" The example above shows how to update the XML document
using the XmlDocument class, is that enough? Or do you really need lots of
client-side JavaScript to manipulate the DOM in the client?"

That really cool stuff is exactly what I would like to be able to do.
If you could point me in the right direction as to where some examples of that really cool stuff are, it would be greatly appreciated.

Thank you.
January 28, 2004 5:52 PM
 

kaevans said:

Your article is great.
Thanks.
May 28, 2005 1:39 AM
 

kaevans said:

Wow, superb!
Thanks.
June 5, 2005 7:13 PM
 


TrackBack said:

Kirk Allen has written a good article on XSLT Performance in .NET. If you're looking to get the best out of your XSLT functions in .NET, I recommend you take a peek. It also discusses Dan Frumin's article from last June on the same subject.
January 17, 2004 4:13 PM
 
