XPath


Fonto uses XPath 3.1 to query documents. XPath is a query language defined by the W3C and used by many XML-related standards, such as XSLT. The 3.1 update makes it especially powerful.

XPath selectors are often used as tests in components like fontoxml-families. These components will perform the traversal to find all nodes matching this selector. The selector should be thought of as being wrapped in a boolean() function.

The XPath expression self::someElement will resolve to true for any node having the nodeName ‘someElement’. The querying equivalent //someElement would resolve to true for any node in a document that contains an element with a nodeName equal to someElement.

We host a playground containing fontoxpath, our XPath implementation, on xpath.playground.fontoxml.com.

Type conversion

Most XPath 1.0 selectors are compatible with XPath 3.1. The main difference is the absence of implicit type conversion. This means a selector like "false" = false() will not return false, but will throw a type error instead. Likewise, 2.0 < '10' will not return true, but will throw the same type error.

XPath support

At the moment, not all features from XPath 3.1 are available. The missing features will be implemented in due time.

Supported language constructs

  • Absolute and relative paths
  • Variables using let and variables passed to the engine at runtime
  • Function calls
  • Dynamic function calls
  • Arrow function operator (=>)
  • The bang (!) operator
  • Quantified expressions (some and every)
  • Conditional expressions (the if operator)
  • All unary and binary operators
  • Maps and Arrays
  • Inline function declarations
  • The for operator

Unsupported features

  • The simplified map lookup operator ( ? )
  • The preceding and following axes

Datatypes

All datatypes used by the supported functions are supported. Note that since JavaScript does not distinguish between numeric types, all numeric types (xs:float, xs:double, xs:decimal and xs:integer) share the same precision.

The schema is not used to resolve attribute types, so all nodes (including attribute nodes) have the type xs:untypedAtomic.

Axes

ancestor Addresses all ancestors of the context node.
ancestor-or-self Addresses all ancestors, and the context node.
attribute Addresses all attributes of the context node. Is abbreviated by @.
child Addresses the childNodes of the context node. Is also the implicit axis.
descendant Addresses all descendants of the context node.
descendant-or-self Addresses all descendants, and the context node.
parent Addresses the immediate parentNode of the context node.
self Addresses the context node itself.
preceding-sibling Addresses all preceding siblings of the context node, in reverse document order.
following-sibling Addresses all following siblings of the context node.


The XPath spec defines the child axis as the implicit axis. This means the XPath expression * will match all nodes having at least one child element, which is just about everything in, and including, the document. It does not match elements without any element children.

Node tests

The following node tests are supported:

  • document-node()
  • element()
  • element(NCName)
  • processing-instruction()
  • processing-instruction(NCName)
  • processing-instruction(StringLiteral)
  • node()
  • text()
  • comment()
  • attribute()
  • attribute(price)

The following node tests are not supported:

  • namespace-node()
  • schema-element()
  • element(NCName, TypeName)
  • attribute(NCName, TypeName)
  • document-node(element(book))

Built-in operators

The following operators are supported, all in their operator form ("1 eq 2" instead of "op:numeric-equal(1, 2)"):

Unsupported operators

The following operators are not supported:

As all date types are unsupported, the operators referencing them are also unsupported.

The binary types are unsupported:

The NOTATION type is unsupported:

Node comparisons cannot be used yet either:

Built-in functions

Functions with an fn prefix should be called without any prefix: the fn:true#0 function is called by executing true(). The following functions are supported:

The following math functions are supported:

The new array and map functions are supported, as well as the JSON functions. The array and map functions are available using the array and map prefixes, respectively.

The following functions are not supported:

The datetime functions are not supported:

Extra built-in functions

There are a couple of fonto-specific functions built in:


fonto:dita-class($arg0 as node(), $class as xs:string) as xs:boolean

Will match any node inheriting from the given class. Can be used like fonto:dita-class(., 'topic/ph').

For obvious reasons, this function is only available in DITA instances.

fonto:block-layout($arg0 as node()) as xs:boolean

Matches any node having block layout. Nodes in a block-level family are of block layout, like frame, structure or block.

fonto:in-inline-layout($arg0 as node()) as xs:boolean

Matches any node being inside an inline layout. Nodes allowing text nodes in them define inline layout. Nodes inside of these nodes will match this selector.

fonto:is-removed($arg0 as node()) as xs:boolean

Matches any node which is configured as removed. This selector can be used to prevent matching things in a metadata part of the document, for instance.

fonto:markup-label($arg0 as node()) as xs:string

Returns the markup label of the given node.

fonto:get-column-index($arg0 as node()) as xs:integer?

Returns the table column index of a given table cell, taking column spans into account. Returns the empty sequence when the node is not known as a table cell.

Examples

The following XPath expressions are valid selector XPath expressions:

Match anything

self::node()

Match 'someElement'

self::someElement

Match anything having a 'someAttribute' set to 'someValue'

self::*[@someAttribute="someValue"]

Match any descendant of 'someAncestor'

ancestor::someAncestor

Match anything being a descendant of 'someAncestor', or having an attribute 'processAnyway'

self::*[ancestor::someAncestor or attribute::processAnyway]

Match anything matching some custom node test

//self::node()[fonto:some-custom-node-test(.)]

Match hovercrafts full of eels

self::hovercraft[eel and not(*[not(self::eel)])]

This matches any node with a nodeName equal to hovercraft that contains a child element with a nodeName equal to eel, and that does not contain any child elements with a nodeName other than eel.
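
The double negation in this selector can be hard to read. As a rough illustration only, the same logic can be spelled out in JavaScript. The plain-object node shape and the element helper below are hypothetical stand-ins for this sketch, not part of any Fonto API.

```javascript
// Hypothetical plain-object stand-in for an element node (assumption for this sketch).
function element(nodeName, children = []) {
	return { nodeType: 1, nodeName, children };
}

// Mirrors self::hovercraft[eel and not(*[not(self::eel)])].
function isHovercraftFullOfEels(node) {
	// self::hovercraft
	if (node.nodeType !== 1 || node.nodeName !== 'hovercraft') {
		return false;
	}
	// The * test only considers element children; text nodes are ignored.
	const childElements = node.children.filter(child => child.nodeType === 1);
	// eel: at least one eel child element exists.
	const hasEel = childElements.some(child => child.nodeName === 'eel');
	// not(*[not(self::eel)]): there is no child element that is not an eel.
	const onlyEels = childElements.every(child => child.nodeName === 'eel');
	return hasEel && onlyEels;
}
```

Note that an empty hovercraft does not match: the every check passes on an empty list, which is why the selector also needs the plain eel test.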

XPath 3.1 features

Comments

XPath 3.1 allows comments in XPath expressions.

(: This is a comment (: which allows nesting :) :)

Variables

XPath 3.1 introduces the let operator. Using this operator, it is possible to define variables. These variables make XPaths a lot more readable.

(let $x := 10, $y := 20 return $x * $y) = 200

(: Retrieve all elements referencing the current context node, which do not have the marked attribute set. :)
let $id := ./@id, $references := //*[@refid = $id] return $references[not(@marked)]

Arrow operator

The arrow operator can be used to pipe the output of a function to the next function.

("A piece of text" => tokenize(' ') => count()) = 4

Quantified expressions

(every $x in (1 to 10) satisfies $x < 10) = true()

some $x in //* satisfies $x/@someAttribute

Bang operator

("abc", "def")!concat("123", .) (: becomes "123abc", "123def" :)

Conditional expressions

if ($condition) then 'true' else 'false'

Maps and arrays

map {'a': 1, 'b': 2, 'c': [1,2,3]}('c') = [1,2,3]

map:merge(//someNode!map:entry(@id, .))

array {1, 2, 3}

[1, 2, [3, 4]](1) = 1

Do note that the lookup operator (?) has not yet been implemented.

Extensibility

To provide customer / schema specific extensibility, XPath selectors provide custom functions. These can be necessary when the result of an XPath should depend on state outside of the document.

For predictability of the function, refrain from accessing the XML DOM using APIs like fontoxml-selection/selectionManager#getCommonAncestorContainer in a function. Rather, use transforms like setContextNodeIdToSelectedElement and start querying from there.

Note that, for functions accepting a node, all DOM relations should be accessed using the domFacade argument, which is passed as a property of the dynamicContext parameter. This facade uses the same interface as Blueprint and can even be filled using a ReadOnlyBlueprint.

Custom XPath functions can be configured for an instance.

A simple, albeit quite useless custom test can be configured as such:

import registerCustomXPathFunction from 'fontoxml-selectors/src/registerCustomXPathFunction.js';

// Register a function accepting zero or more numbers and a single number as parameters, returning an xs:boolean.
registerCustomXPathFunction('fonto:possibly', ['xs:numeric*', 'xs:numeric'], 'xs:boolean', function(
	dynamicContext,
	a,
	b
) {
	// Supplying an initial value keeps reduce from throwing when the sequence is empty.
	return (
		a.reduce(function() {
			return Math.random();
		}, 0) < 10
	);
});

// Register a function accepting one or more nodes, returning a string or null.
registerCustomXPathFunction('fonto:things', ['node()+'], 'xs:string?', function(
	dynamicContext,
	nodes
) {
	if (dynamicContext.domFacade.getParentNode(nodes[0]) === null) return null;

	return nodes.length > 3 ? null : 'Things';
});

These can then be used like this:

//*[fonto:possibly(3, 2)[fonto:possibly((1 to 6), 1)]] or fonto:things(//*)

Custom functions registered in an instance should be namespaced with an instance-specific prefix, to prevent duplicate function entries. The instance 'my-instance' should register custom tests like 'my-instance:possibly'.

Custom functions can be used anywhere, including in family configuration. This configuration can be evaluated when running commands, on nodes which are not in the document (yet). This means functions like DocumentsManager#getDocumentIdByNodeId may return null. Rather, use the relations in the blueprint to prevent unexpected behaviour.
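
The two recommendations above (an instance-specific prefix, and DOM relations read only through the domFacade) could be combined roughly as follows. This is a sketch under assumptions: the function name my-instance:has-ancestor-named and the helper hasAncestorNamed are made up for illustration, and registerCustomXPathFunction is passed in as an argument only to keep the snippet self-contained. In an instance it would be imported as shown earlier.

```javascript
// Walks up the tree using only the domFacade, never node.parentNode, so the
// result follows the blueprint even for nodes that are not in the document yet.
function hasAncestorNamed(domFacade, node, ancestorName) {
	let ancestor = domFacade.getParentNode(node);
	while (ancestor) {
		if (ancestor.nodeName === ancestorName) {
			return true;
		}
		ancestor = domFacade.getParentNode(ancestor);
	}
	return false;
}

// Hypothetical registration helper. The 'my-instance' prefix follows the
// naming advice above; the function name itself is an illustration.
function registerMyInstanceFunctions(registerCustomXPathFunction) {
	registerCustomXPathFunction(
		'my-instance:has-ancestor-named',
		['node()', 'xs:string'],
		'xs:boolean',
		function(dynamicContext, node, ancestorName) {
			return hasAncestorNamed(dynamicContext.domFacade, node, ancestorName);
		}
	);
}
```

Assuming the registration succeeds, a selector such as self::p[my-instance:has-ancestor-named(., 'note')] could then use it.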

Selector specificity

Selectors can be used to configure nodes, for example to assign them a family. Because a single node can match multiple selectors, specificity is needed to resolve conflicts. Note that XPathPriority can be used to override specificity.

Rules

A specificity scheme like that of CSS is used. To compare the specificity of two selectors, take the following steps:

  1. Group the tests of the selector by type:
    1. Functions. This includes not(), true() and false().
    2. Attribute tests, like @someAttribute or attribute::someAttribute="someValue".
    3. Node-name tests, like someElementName.
    4. Node-type tests, like comment() and processing-instruction().
    5. Universal tests: item() and node().
  2. Per group, in the order listed above, count the sub-expressions in each selector. If one has more tests of that type, it is more specific and will win. If both are equal, continue to the next group.
  3. If the selectors have equal specificity, the selector declared as last will win. If the selectors originate from different packages, use dependencies to force ordering between the packages.

The or operator uses the maximum specificity of the operands. @someAttribute or (@someOtherAttribute and @yetAnotherAttribute) has a specificity of two attributes.

Attribute existence tests and attribute value tests have equal specificity.
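
The comparison steps above can be sketched in JavaScript. This is not how Fonto computes specificity internally. It is a minimal model that assumes the per-group counts are supplied by hand, and all names in it (SPECIFICITY_GROUPS, specificity, compareSpecificity, and, or) are hypothetical. That and adds up the counts of its operands is inferred from the "two attributes" example above.

```javascript
// Groups in order of importance, following the list in the rules above.
const SPECIFICITY_GROUPS = ['functions', 'attributes', 'nodeNames', 'nodeTypes', 'universals'];

// Turns hand-supplied counts, like { attributes: 1, nodeNames: 1 }, into an ordered list.
function specificity(counts = {}) {
	return SPECIFICITY_GROUPS.map(group => counts[group] || 0);
}

// Returns 1 when a is more specific, -1 when b is, and 0 on a tie
// (in which case the selector declared last wins).
function compareSpecificity(a, b) {
	for (let i = 0; i < SPECIFICITY_GROUPS.length; i++) {
		if (a[i] !== b[i]) {
			return a[i] > b[i] ? 1 : -1;
		}
	}
	return 0;
}

// and: the counts of both operands add up (assumption inferred from the text).
function and(a, b) {
	return a.map((count, i) => count + b[i]);
}

// or: the operand with the highest specificity determines the result.
function or(a, b) {
	return compareSpecificity(a, b) >= 0 ? a : b;
}
```

With this model, specificity({ nodeNames: 1, attributes: 1 }) for self::X[@Y] compares as stronger than specificity({ nodeNames: 1 }) for self::X, and a single function test outranks any number of attribute tests.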

Examples

  • The selector self::X[@Y] is stronger than self::X.
  • The selector self::X[@Y and @Z] is stronger than self::X[@Y].
  • The selector ancestor::*[@Y] is as strong as self::X[@Y].
  • The selector fonto:something(.) is stronger than self::X[@Y="Z"].
  • The selector fonto:something(.) and self::fonto:something(.) is stronger than self::fonto:something(.).
  • The selector @someAttribute or (@someOtherAttribute and @yetAnotherAttribute) is stronger than @someAttribute.

Asynchronous selector compilation

Should an instance use a large number of selectors (in the order of thousands), compiling them asynchronously can significantly reduce the load time of the instance. Compiling them asynchronously also persists them over reloads of the editor. This can be done using the precompileXPath function.

Evaluating XPaths

XPath queries can be evaluated using a number of functions. They all share the same signature: evaluateXPathTo...(XPathSelector: string, contextNode: Node, domFacade: IDomFacade, variables: Object).

If an XPath is run in a context where no blueprint is available, a ReadOnlyBlueprint may be passed as the domFacade. In other cases, a normal Blueprint also works.
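
As a sketch of the shared signature, a call might look like the following. The evaluator name evaluateXPathToNodes is an assumption based on the evaluateXPathTo... pattern above, and it is passed in as an argument only to keep the snippet self-contained. The helper findUnmarkedReferences is hypothetical, as is the assumption that variables are keyed by their name without the leading $.

```javascript
// Finds all elements that reference the given id and are not marked yet.
// evaluateXPathToNodes is assumed to follow the documented argument order:
// (selector, contextNode, domFacade, variables).
function findUnmarkedReferences(evaluateXPathToNodes, contextNode, domFacade, id) {
	return evaluateXPathToNodes(
		'//*[@refid = $id][not(@marked)]',
		contextNode,
		domFacade,
		// Assumption: the key "id" becomes available as $id inside the XPath.
		{ id: id }
	);
}
```

Passing the id in as a variable, rather than concatenating it into the selector string, keeps the selector constant and avoids quoting problems.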

The following functions can be used to evaluate XPaths:

Use your programming skills

Using the CVK can result in repetitive instructions in your code for elements that allow variation on a number of configurable aspects while being the same on others. Use the configureProperties API for this. It is not advised to repeat the same configuration over and over.

The DITA <note> allows for a considerable number of types that add a particular connotation to it. Assigning each its own markup label and background color, while reusing two other configuration options (its default text container and the markup label widget), could be done like so:

var NOTE_VISUALIZATION_BY_TYPE = {
	attention: { label: 'Attention', backgroundColor: 'yellow' },
	danger: { label: 'Danger', backgroundColor: 'red' },
	warning: { label: 'Warning', backgroundColor: 'orange' }
};

configureAsFrame(sxModule, 'self::note', 'note', {
	defaultTextContainer: 'p',
	blockHeaderLeft: [
		createMarkupLabelWidget()
	]
});

Object.keys(NOTE_VISUALIZATION_BY_TYPE).forEach(function (noteType) {
	var noteVisualization = NOTE_VISUALIZATION_BY_TYPE[noteType];
	configureProperties(sxModule, 'self::note[@type="' + noteType + '"]', {
		backgroundColor: noteVisualization.backgroundColor,
		markupLabel: noteVisualization.label
	});
});

Care for speed

Be aware that certain combinations of selectors must traverse large parts of the document to find applicable nodes. This may cost a lot of processing power of the device the author is working on. Keep performance in mind at all times and find the most efficient way to reach your goal.