The basics

This page introduces some concepts that you will need to be familiar with before we can get started with the cool stuff. Other pages will regularly refer back to the concepts introduced on this page. This page is important!

Type system

XPath and XQuery share the same type system. The most abstract data type available in this system is called item. All other data types are (indirectly) derived from item. The type system can be divided into three groups of data types. One group consists of all node types, another group consists of function types, and the last group consists of all atomic types.

All node types are derived from the abstract data type node, which itself is derived from item. There are seven different types of nodes derived from the abstract node type: attribute, comment, document, element, namespace, processing instruction, and text.

Then there are functions. There is one abstract function data type, derived from item, and all functions are derived from it. Maps and arrays are derived from this abstract function data type as well. The map and array data types are themselves abstract data types.
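A small sketch, assuming an XQuery 3.1 processor, of what it means for maps and arrays to be functions: they pass an instance of test against the generic function type, and they can be called like a function. The sample keys and values are made up for illustration.

```xquery
(
  (: A named function reference is a function item. :)
  fn:count#1 instance of function(*),        (: true :)

  (: Maps and arrays are function items too. :)
  map { "a": 1 } instance of function(*),    (: true :)
  [10, 20] instance of function(*),          (: true :)

  (: Calling a map looks up a key, calling an array looks up a position. :)
  map { "a": 1 }("a"),                       (: 1 :)
  [10, 20](2)                                (: 20 :)
)
```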

The last group consists of all atomic types. There is one generic atomic type called xs:anyAtomicType. This data type is derived from item. The following table is an overview of all available simple types.

Document order

Document order is the order in which nodes appear in an XML document. There's also a reverse document order, which is simply the reverse of the document order.

The document order is stable. This means that the relative order of any two nodes will not change while processing a query.

Document order satisfies the following constraints within a tree.

1. The root node is the first node. The root node, in this case <xml>, will always be the first node in document order.

XML

<xml>
	<paragraph>Paragraph content.</paragraph>
</xml>


2. A node always occurs before its children and descendants. The root node, in this case <xml>, will be followed by its child <paragraph>. The <paragraph> element will be followed by its text node child containing "Paragraph content.".

XML

<xml>
	<paragraph>Paragraph content.</paragraph>
</xml>


3. Namespace nodes immediately follow their element node. The relative order of namespace nodes is stable but implementation-dependent. The root node <xml> will be followed by its namespace node.

XML

<xml xmlns:prefix="http://example.com/prefix"></xml>


4. Attribute nodes immediately follow their element node's namespace nodes. The relative order of attribute nodes is stable but implementation-dependent. The root node <xml> will be directly followed by its id attribute, and the <paragraph> element will be directly followed by its id attribute.

XML

<xml id="doc-1">
	<paragraph id="p-1">Paragraph content.</paragraph>
</xml>


5. The relative order of siblings is the order in which they occur in the children property of their parent node. The <paragraph> elements will appear in document order in the same order as they occur in the document.

XML

<xml>
	<paragraph id="p-1" />
	<paragraph id="p-2" />
	<paragraph id="p-3" />
</xml>


6. Children and descendants occur before the node's following siblings. The text nodes in the <paragraph> elements will occur directly after their parent, before their parent's following sibling.

XML

<xml>
	<paragraph>Paragraph content 1.</paragraph>
	<paragraph>Paragraph content 2.</paragraph>
	<paragraph>Paragraph content 3.</paragraph>
</xml>

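The constraints above can be checked with the node comparison operator <<, which is true when its left operand precedes its right operand in document order. The following is a minimal sketch, not part of the original examples. The variable names $root, $first and $second are made up for illustration, and the tree is built inline instead of being loaded from a document.

```xquery
let $root :=
  <xml id="doc-1">
    <paragraph id="p-1">Paragraph content 1.</paragraph>
    <paragraph id="p-2">Paragraph content 2.</paragraph>
  </xml>
let $first := $root/paragraph[1]
let $second := $root/paragraph[2]
return (
  $root << $first,            (: true: the root node comes first :)
  $root << $root/@id,         (: true: an attribute follows its element :)
  $root/@id << $first,        (: true: attributes come before children :)
  $first << $first/text(),    (: true: a node comes before its children :)
  $first/text() << $second,   (: true: children come before following siblings :)
  $second << $first           (: false: siblings keep their original order :)
)
```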

Expressions

The expression is the basic building block in both XPath and XQuery. Everything is an expression in both languages, apart from a few constructs used in XQuery to define modules.

Consider the following example:

Expression example

XQuery

1 + 2


This example consists of a total of three expressions. The 1 and 2 are literal expressions. The + is an arithmetic expression. The arithmetic expression expects two operands. In this example, its operands are the two literal expressions.

The example illustrates how expressions are nested. The literal expressions are nested in the arithmetic expression. The arithmetic expression is the top-level expression in this example. It could, in turn, be nested in another expression. There usually is only one top-level expression.
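To make the nesting a bit more concrete, here is a small sketch in which the addition is itself an operand of another arithmetic expression. The parentheses only group the inner expression. The multiplication is now the top-level expression.

```xquery
(: The addition 1 + 2 is nested in the multiplication. :)
(1 + 2) * 3   (: 9 :)
```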

Expression context

Every expression has its own context. The expression context consists of all the information that can affect the result of an expression. Information in the context can be organized into two categories called the static context and the dynamic context.

Context item

The context item is part of the dynamic context. It is the most important piece of information in the expression context. When the context item is a node, it may also be called the context node instead of the context item. Examples of how to use the context item will be given in later chapters.

Context position

The context position is part of the dynamic context. It indicates the position of the context item in the sequence currently being processed. The position changes whenever the context item changes. The context position is returned by the fn:position function.

Context size

The context size is part of the dynamic context. The context size is the number of items in the sequence currently being processed. Its value is always an integer greater than zero. The context size is returned by the fn:last function.
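A short sketch of the three pieces of context information working together, assuming a processor that supports XQuery 3.0 or later (for the simple map operator ! and the string concatenation operator ||). For every item on the left of the !, the expression on the right is evaluated with that item as the context item.

```xquery
(: "." is the context item, position() the context position,
   and last() the context size. :)
("a", "b", "c") ! (position() || " of " || last() || ": " || .)
(: "1 of 3: a", "2 of 3: b", "3 of 3: c" :)
```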

Sequences

All data types in XPath and XQuery are derived from item, except for the sequence. A sequence is a collection of zero or more items. A value returned by an expression is always a sequence.

Creating a sequence

Creating a sequence is simple. You'll only need one pair of parentheses:

Creating an empty sequence

XQuery

()


This pair of parentheses is a valid expression. This expression will resolve to an empty sequence, and that's it. An empty sequence is often used to return nothing. This is similar to returning null in many other languages.

Sequences can also be created containing items. A sequence containing only one item is often called a singleton. An item is always equal to the singleton sequence containing that item. It's like the sequence is not there in that case.

Creating a singleton sequence

XQuery

(1)


Creating a sequence

XQuery

(1, 2, 3)


Creating a sequence with mixed data types

XQuery

("string", 123, true())


The last example shows that the items in a sequence don't have to be of the same data type. Note that a lot of the operators and functions in XPath and XQuery do require all items in a sequence to be of the same type.

Accessing members

Accessing a member of a sequence can be done like this:

Accessing a sequence member

XQuery

("a", "b", "c", "d", "e")[1]


Note that, in contrast to many other languages, indices in XPath and XQuery are one-based instead of zero-based. For this example, it means that the first item ("a") is returned instead of the second item as you might expect.

The syntax used here looks similar to accessing a member of an array in many other languages. But it actually is a filter expression.
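Because the square brackets hold a filter rather than an index, they can contain more than a number. The sketch below shows a few variations. A numeric predicate such as [1] behaves like [position() = 1]. The variable name $letters is made up for illustration.

```xquery
let $letters := ("a", "b", "c", "d", "e")
return (
  $letters[position() = 1],   (: "a", the same as $letters[1] :)
  $letters[position() > 3],   (: "d", "e" :)
  $letters[last()],           (: "e" :)
  $letters[. = ("b", "d")]    (: "b", "d" :)
)
```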

Nested sequences

Sequences cannot exist as a nested structure. Consider the following example:

Nested sequences

XQuery

((1, 2), 3, ((("a"), "b"), "c"))


The result of this expression will be a single sequence containing three integers followed by three strings. The sequences will all be flattened to one sequence.
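A quick way to convince yourself of the flattening is to count the items and compare the result with a flat sequence. This is only a sketch. The variable name $nested is made up for illustration.

```xquery
let $nested := ((1, 2), 3, ((("a"), "b"), "c"))
return (
  count($nested),                                  (: 6 :)
  deep-equal($nested, (1, 2, 3, "a", "b", "c")),   (: true :)
  $nested[4]                                       (: "a" :)
)
```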

Range expression

If you want to create a sequence containing a range of consecutive integers, you can use the range expression.

Range expression

XQuery

(1 to 5)

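A few more things you can do with a range, as a small sketch. Note that a range only counts upwards: if the first number is greater than the second, the result is an empty sequence.

```xquery
(
  count(1 to 5),                        (: 5 :)
  empty(5 to 1),                        (: true: a descending range is empty :)
  reverse(1 to 5),                      (: 5, 4, 3, 2, 1 :)
  (for $i in 1 to 3 return $i * $i)     (: 1, 4, 9 :)
)
```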

Atomization

Some operators depend on a process called atomization. Atomization is applied to a value when a sequence of atomic values is required. The result of atomizing is either a sequence of atomic values or a type error (FOTY0012 or FOTY0013). Atomizing a value is defined as the result of invoking the fn:data function on it.

Atomization works according to these rules:

  • If the item is an atomic value, it is returned.

  • If the item is a node, its typed value is returned. If the node does not have a typed value, a type error (FOTY0012) is raised.

  • If the item is a function, including maps but excluding arrays, a type error (FOTY0013) is raised.

  • If the item is an array, atomization is applied to each member of the array. This happens recursively for nested arrays.
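The rules above can be sketched by calling fn:data directly, assuming an XQuery 3.1 processor and untyped (not schema-validated) nodes. The <price> element is made up for illustration. The arithmetic line shows atomization happening implicitly: the + operator atomizes the element before adding.

```xquery
(
  data("text"),               (: "text": atomic values are returned as-is :)
  data(<price>10</price>),    (: "10" as xs:untypedAtomic, the typed value of the node :)
  <price>10</price> + 5,      (: 15: the element is atomized before the addition :)
  data([1, [2, 3]])           (: 1, 2, 3: nested arrays are atomized recursively :)
)
```

Passing a map or another function to fn:data, for example data(map { "a": 1 }), raises the FOTY0013 type error instead.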

Effective boolean value

Some expressions use the effective boolean value of a value. The effective boolean value of a value is defined as the result of applying the fn:boolean function to the value. This works according to the following rules.

Empty sequence

The effective boolean value of an empty sequence is false.

Effective boolean value of an empty sequence

XQuery

boolean(())


Nodes

The effective boolean value of a sequence whose first item is a node is true.

Effective boolean value of a singleton sequence containing one node

XQuery

boolean(/xml)


Effective boolean value of a sequence containing nodes

XQuery

boolean(/xml/child::*)


Effective boolean value of a sequence whose first item is a node

XQuery

boolean((/xml, false()))


Boolean values

The effective boolean value of a singleton sequence containing a value of type xs:boolean, or of a type derived from xs:boolean, is that value unchanged.

Effective boolean value of true

XQuery

boolean(true())


Effective boolean value of false

XQuery

boolean(false())


String values

The effective boolean value of a singleton sequence containing a value of type xs:string, xs:anyURI, xs:untypedAtomic, or a type derived from one of these is false if the value has a length of zero and true otherwise.

Effective boolean value of an empty string

XQuery

boolean("")


Effective boolean value of a string

XQuery

boolean("string")


Numeric values

The effective boolean value of a singleton sequence containing a value of a numeric type is false if the value is NaN or zero and true otherwise.

Effective boolean value of zero

XQuery

boolean(0)


Effective boolean value of NaN

XQuery

boolean(xs:double("NaN"))


Effective boolean value of a positive integer

XQuery

boolean(10)


Effective boolean value of a negative integer

XQuery

boolean(-10)


Effective boolean value of a decimal value

XQuery

boolean(1.2345)


Other values

Any value not covered by the rules above raises a type error (FORG0006) when its effective boolean value is requested. This includes sequences of more than one item whose first item is not a node, and functions, including maps and arrays.

Effective boolean value of a non-singleton sequence raises a type error

XQuery

boolean((1, 2))


Effective boolean value of an array raises a type error

XQuery

boolean(array {})


Effective boolean value of a map raises a type error

XQuery

boolean(map {})


Effective boolean value of a function raises a type error

XQuery

boolean(function () {})

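In practice you will rarely call fn:boolean yourself. The sketch below shows a few places where the effective boolean value is taken implicitly: the condition of an if expression, the argument of fn:not, the operands of or, and a predicate whose value is not numeric.

```xquery
(
  (if (()) then "yes" else "no"),         (: "no": an empty sequence is false :)
  (if ("string") then "yes" else "no"),   (: "yes": a non-empty string is true :)
  not(0),                                 (: true: zero is false :)
  "" or 10,                               (: true: "" is false, 10 is true :)
  (1 to 5)[. mod 2 = 1]                   (: 1, 3, 5: the predicate is a boolean :)
)
```

Be careful with numeric predicates: a predicate such as [1] is a position test and does not use the effective boolean value.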