Wednesday, June 22, 2005

Overuse of the XPath "//" Shortcut?

I see a lot of people writing XPaths with the "//" shortcut (lazy paths). But consider what the processor must do to evaluate one, and what it can hand back. For example:

[test]
  [a]
    [b]
      [c]
        [d /]
      [/c]
    [/b]
    [e]
      [b]
        [f /]
      [/b]
    [/e]
  [/a]
[/test]



The XPath /test/a/b returns exactly one node. I'm in total control over which node(s) come back, because only nodes that follow my path exactly can match:

[b]
  [c]
    [d /]
  [/c]
[/b]



However, using the XPath //b will return two nodes:

[b]
  [c]
    [d /]
  [/c]
[/b]



and

[b]
  [f /]
[/b]


This is okay if it's what I expected. However, I would guess that in most cases, having the second [b] node returned would not be expected, and could break an application's code.
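
To make the difference concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. The choice of Python is mine, not something the queries depend on. ElementTree only supports a subset of XPath and evaluates paths relative to an element, so "a/b" evaluated on the [test] element stands in for /test/a/b, and ".//b" stands in for //b.

```python
import xml.etree.ElementTree as ET

# The sample document from above, written with real angle brackets.
DOC = """
<test>
  <a>
    <b><c><d/></c></b>
    <e>
      <b><f/></b>
    </e>
  </a>
</test>
"""

root = ET.fromstring(DOC)  # root is the <test> element

# Stand-in for /test/a/b: only <b> children of <a> children of <test>.
explicit = root.findall("a/b")

# Stand-in for //b: every <b> anywhere below the root, in document order.
lazy = root.findall(".//b")

print(len(explicit))  # 1
print(len(lazy))      # 2

# The first child of each matched <b> shows which nodes came back.
print([child.tag for b in lazy for child in b])  # ['c', 'f']
```

Code that assumes a single [b] and takes the first match would still work here, but only because of document order. Code that loops over every match would silently pick up the [b] under [e] as well.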

Also, with the "//" shortcut, the processor generally has to crawl every node of your tree looking for matches, unless it has an index or some other optimization to fall back on. By specifying an XPath that is rooted at the document element, you let the processor follow only the branches you named and skip the rest, which can improve query performance considerably on large documents.
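
As a rough illustration of the crawling cost, the sketch below counts how many elements a naive evaluator would have to look at in each case. This is a simplified model of my own, not how any particular XPath processor is implemented, and the two helper functions are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

DOC = "<test><a><b><c><d/></c></b><e><b><f/></b></e></a></test>"
root = ET.fromstring(DOC)


def visits_for_descendant_search(element):
    """Elements a naive //name search inspects: the whole subtree."""
    return sum(1 for _ in element.iter())


def visits_for_explicit_path(element, steps):
    """Elements inspected when following child steps one level at a time."""
    visited = 1  # the starting element itself
    current = [element]
    for step in steps:
        next_level = []
        for node in current:
            for child in node:
                visited += 1  # each child is compared against the step name
                if child.tag == step:
                    next_level.append(child)
        current = next_level  # branches that did not match are never entered
    return visited


print(visits_for_descendant_search(root))         # 8
print(visits_for_explicit_path(root, ["a", "b"]))  # 4
```

On an eight-element document the gap is trivial. The point is that the first number grows with the size of the whole document, while the second grows only with the branches along the path you asked for.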

Oh, and as a side note, I'm starting to get irritated with Blogger trying to be smarter than me in handling my code snippets. I'm the human, dammit! You don't always have to change my formatting.