The basics
This page introduces some concepts that you will need to be familiar with before we can get started with the cool stuff. Other pages will regularly refer back to the concepts introduced on this page. This page is important!
Type system
XPath and XQuery share the same type system. The most abstract data type available in this system is called item. All other data types are (indirectly) derived from item. The type system can be divided into three groups of data types. One group consists of all node types, another group consists of function types, and the last group consists of all atomic types.
All node types are derived from the abstract data type node, which itself is derived from item. There are seven different types of nodes derived from the abstract node type: attribute, comment, document, element, namespace, processing instruction, and text.
Then there are functions. There is one abstract function data type, derived from item. All functions are derived from this data type. But not only functions are derived from this abstract data type. Maps and arrays are too. The map and array data types are abstract data types too.
The last group consists of all atomic types. There is one generic atomic type called xs:anyAtomicType. This data type is derived from item. The following table is an overview of all available simple types.
String types | Number types | Time types | Other types |
---|---|---|---|
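The three groups described above can be checked with the instance of operator. This is a minimal sketch, assuming a processor that supports XQuery 3.1 (needed for maps and arrays); the element name and map key are made up for illustration.

```xquery
(: Each line tests a value against one branch of the type hierarchy. :)
(
  1 instance of item(),                      (: every value is an item :)
  1 instance of xs:anyAtomicType,            (: atomic branch :)
  "text" instance of xs:anyAtomicType,
  <paragraph/> instance of node(),           (: node branch :)
  <paragraph/> instance of element(),
  map { "key": 1 } instance of function(*),  (: maps are functions :)
  [1, 2, 3] instance of function(*),         (: arrays are functions too :)
  1 instance of node()                       (: an atomic value is not a node :)
)
```

Every test except the last one should yield true; the last one yields false because an integer does not belong to the node branch.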
Document order
Document order is the order in which nodes appear in an XML document. There's also a reverse document order, which is simply the reverse of the document order.
The document order is stable. This means that the relative order of any two nodes will not change while processing a query.
Document order satisfies the following constraints within a tree.
1. The root node is the first node
The root node, in this case <xml>, will always be the first node in document order.
XML
<xml>
<paragraph>Paragraph content.</paragraph>
</xml>
2. A node always occurs before its children and descendants
The root node, in this case <xml>, will be followed by its child <paragraph>. The <paragraph> element will be followed by its text node child containing "Paragraph content.".
XML
<xml>
<paragraph>Paragraph content.</paragraph>
</xml>
3. Namespace nodes immediately follow their element node
The relative order of namespace nodes is stable but implementation-dependent. The root node <xml> will be followed by its namespace node.
XML
<xml xmlns:prefix="http://example.com/prefix"></xml>
4. Attribute nodes immediately follow their element node's namespace nodes
The relative order of attribute nodes is stable but implementation-dependent. The root node <xml> will be directly followed by its attribute id and the <paragraph> element will be directly followed by its id attribute.
XML
<xml id="doc-1">
<paragraph id="p-1">Paragraph content.</paragraph>
</xml>
5. The relative order of siblings is the order in which they occur in the children property of their parent node
The <paragraph> elements will be sorted in the order they occur in the document.
XML
<xml>
<paragraph id="p-1" />
<paragraph id="p-2" />
<paragraph id="p-3" />
</xml>
6. Children and descendants occur before the node's following siblings
The text nodes in the <paragraph> elements will occur directly after their parent, before their parent's following sibling.
XML
<xml>
<paragraph>Paragraph content 1.</paragraph>
<paragraph>Paragraph content 2.</paragraph>
<paragraph>Paragraph content 3.</paragraph>
</xml>
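Some of the constraints above can be observed with the node comparison operator <<, which is true when its left operand precedes its right operand in document order. The following is a sketch under the assumption that the tree is built inline with an XQuery direct element constructor; the variable names are illustrative only.

```xquery
let $doc :=
  <xml>
    <paragraph id="p-1">Paragraph content 1.</paragraph>
    <paragraph id="p-2">Paragraph content 2.</paragraph>
  </xml>
let $first := $doc/paragraph[1]
let $second := $doc/paragraph[2]
return (
  $doc << $first,                (: constraint 2: a node precedes its children :)
  $first/@id << $first/text(),   (: constraint 4: attributes precede the element's children :)
  $first/text() << $second,      (: constraint 6: descendants precede following siblings :)
  (: a path expression returns its nodes in document order,
     even though $second is listed first here :)
  (($second, $first)/.) ! string(@id)
)
```

The three comparisons should yield true, and the last line should yield "p-1" followed by "p-2".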
Expressions
The expression is the basic building block in both XPath and XQuery. Everything is an expression in both languages, apart from a few constructs used in XQuery to define modules.
Consider the following example:
Expression example
XQuery
1 + 2
This example consists of a total of three expressions. The 1 and 2 are literal expressions. The + is an arithmetic expression. The arithmetic expression expects two operands. In this example, its operands are the two literal expressions.
The example illustrates how expressions are nested. The literal expressions are nested in the arithmetic expression. The arithmetic expression is the top-level expression in this example. It could, in turn, be nested in another expression. There usually is only one top-level expression.
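To illustrate one more level of nesting, the addition from the example can itself be used as an operand of another arithmetic expression. This is only a sketch; the parentheses make the nesting explicit.

```xquery
(: The arithmetic expression 1 + 2 is the left operand of the multiplication. :)
(1 + 2) * 3
```

Here the multiplication is the top-level expression, and the result is 9.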
Expression context
Every expression has its own context. The expression context consists of all the information that can affect the result of an expression. Information in the context can be organized into two categories called the static context and the dynamic context.
Context item
The context item is part of the dynamic context. It is the most important piece of information in the expression context. When the context item is a node, it may also be called the context node instead of the context item. Examples of how to use the context item will be given in later chapters.
Context position
The context position is part of the dynamic context. It indicates the position of the context item in the sequence currently being processed. The position changes whenever the context item changes. The context position is returned by the fn:position function.
Context size
The context size is part of the dynamic context. The context size is the number of items in the sequence currently being processed. Its value is always an integer greater than zero. The context size is returned by the fn:last function.
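The context item, position, and size can be seen together in one small sketch. It assumes a processor that supports the simple map operator (!) from XPath 3.0, which evaluates its right-hand side once for every item on the left, with that item as the context item.

```xquery
(: "." is the context item, position() the context position, last() the context size. :)
("a", "b", "c") ! concat(position(), " of ", last(), ": ", .)
```

This should return the three strings "1 of 3: a", "2 of 3: b", and "3 of 3: c".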
Sequences
All data types in XPath and XQuery are derived from item, except for the sequence. A sequence is a collection of zero or more items. A value returned by an expression is always a sequence.
Creating a sequence
Creating a sequence is simple. You'll only need one pair of parentheses:
Creating an empty sequence
XQuery
()
This pair of parentheses is a valid expression. This expression will resolve to an empty sequence, and that's it. An empty sequence is often used to return nothing. This is similar to returning null in many other languages.
Sequences can also be created containing items. A sequence containing only one item is often called a singleton. An item is always equal to the singleton sequence containing that item. It's like the sequence is not there in that case.
Creating a singleton sequence
XQuery
(1)
Creating a sequence
XQuery
(1, 2, 3)
Creating a sequence with mixed data types
XQuery
("string", 123, true())
The last example shows that the items in a sequence don't have to be of the same data type. Note that a lot of the operators and functions in XPath and XQuery do require all items in a sequence to be of the same type.
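The statements above about empty, singleton, and mixed sequences can be verified with a few standard functions. This is a minimal sketch using only fn:count and fn:deep-equal.

```xquery
(
  count(()),                       (: 0: the empty sequence has no items :)
  count((1)),                      (: 1: a singleton :)
  deep-equal(1, (1)),              (: true: an item equals its singleton sequence :)
  (1) instance of xs:integer,      (: true: the parentheses add nothing :)
  count(("string", 123, true()))   (: 3: mixed types are allowed :)
)
```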
Accessing members
Accessing a member of a sequence can be done like this:
Accessing a sequence member
XQuery
("a", "b", "c", "d", "e")[1]
Note that, in contrast to many other languages, indices in XPath and XQuery are one-based instead of zero-based. For this example, it means that the first item ("a") is returned, not the second item as you might expect from a zero-based language.
The syntax used here looks similar to accessing a member of an array in many other languages. But it actually is a filter expression.
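Because the brackets hold a filter expression, they can contain more than a plain index. The sketch below shows a few variations; the variable name $letters is made up for illustration.

```xquery
let $letters := ("a", "b", "c", "d", "e")
return (
  $letters[1],                (: "a": a number is shorthand for a position test :)
  $letters[position() = 1],   (: "a": the same filter written out :)
  $letters[last()],           (: "e": the last item :)
  $letters[position() gt 3],  (: "d", "e" :)
  $letters[. = ("b", "d")]    (: "b", "d": filtering on the item itself :)
)
```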
Nested sequences
Sequences cannot exist as a nested structure. Consider the following example:
Nested sequences
XQuery
((1, 2), 3, ((("a"), "b"), "c"))
The result of this expression will be a single sequence containing three integers followed by three strings. The sequences will all be flattened to one sequence.
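The flattening can be confirmed by counting the items and comparing against the flat form. This is a small sketch; the variable name $nested is illustrative only.

```xquery
let $nested := ((1, 2), 3, ((("a"), "b"), "c"))
return (
  count($nested),                                 (: 6 :)
  deep-equal($nested, (1, 2, 3, "a", "b", "c")),  (: true :)
  count(((), (), ()))                             (: 0: nested empty sequences vanish :)
)
```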
Range expression
If you want to create a sequence containing a range of consecutive integers, you can use the range expression.
Range expression
XQuery
(1 to 5)
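A few properties of the range expression, sketched with standard functions only. Note that a range whose start is greater than its end produces an empty sequence rather than counting down.

```xquery
(
  deep-equal(1 to 5, (1, 2, 3, 4, 5)),  (: true :)
  count(5 to 1),                        (: 0: a descending range is empty :)
  reverse(1 to 5),                      (: 5, 4, 3, 2, 1 :)
  (1 to 5)[. mod 2 = 0]                 (: 2, 4: a range can be filtered like any sequence :)
)
```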
Atomization
Some operators depend on a process called atomization. Atomization is applied to a value when a sequence of atomic values is required. The result of atomizing is either a sequence containing atomic values or a type error (FOTY0012 or FOTY0013). Atomizing a value is defined as the result of invoking the fn:data function.
Atomization works according to these rules:
- If the item is an atomic value, it is returned.
- If the item is a node, its typed value is returned. If the node does not have a typed value, a type error (FOTY0012) is raised.
- If the item is a function, including maps, excluding arrays, a type error (FOTY0013) is raised.
- If the item is an array, atomization will be applied to each member of the array. This will happen recursively for nested arrays.
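The rules above can be sketched by calling fn:data directly. This assumes an XQuery 3.1 processor (for arrays, maps, and try/catch); the element names and the message in the catch clause are made up for illustration, and some processors may report the map case at compile time instead of letting it be caught.

```xquery
(
  data((1, "a")),                      (: 1, "a": atomic values are returned as is :)
  data(<paragraph>42</paragraph>)
    instance of xs:untypedAtomic,      (: true: the typed value of an untyped element :)
  <number>2</number> + 1,              (: 3 as xs:double: + atomizes its operands implicitly :)
  data([1, [2, 3]]),                   (: 1, 2, 3: arrays are atomized recursively :)
  try { data(map {}) } catch * { "a map cannot be atomized" }
)
```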
Effective boolean value
Some expressions use the effective boolean value of a value. The effective boolean value of a value is defined as the result of applying the fn:boolean function to the value. This works according to the following rules.
Empty sequence
The effective boolean value of an empty sequence is false.
Effective boolean value of an empty sequence
XQuery
boolean(())
Nodes
The effective boolean value of a sequence whose first item is a node is true.
Effective boolean value of a singleton sequence containing one node
XQuery
boolean(/xml)
Effective boolean value of a sequence containing nodes
XQuery
boolean(/xml/child::*)
Effective boolean value of a sequence whose first item is a node
XQuery
boolean((/xml, false()))
Boolean values
The effective boolean value of a singleton sequence containing a value of type xs:boolean, or of a type derived from xs:boolean, is that value unchanged.
Effective boolean value of true
XQuery
boolean(true())
Effective boolean value of false
XQuery
boolean(false())
String values
The effective boolean value of a singleton sequence containing a value of type xs:string, xs:anyURI, xs:untypedAtomic, or a type that derives from one of these is false if the value has a length of zero and true otherwise.
Effective boolean value of an empty string
XQuery
boolean("")
Effective boolean value of a string
XQuery
boolean("string")
Numeric values
The effective boolean value of a singleton sequence containing a value of a numeric type is false if the value is NaN or zero and true otherwise.
Effective boolean value of zero
XQuery
boolean(0)
Effective boolean value of NaN
XQuery
boolean(xs:double("NaN"))
Effective boolean value of a positive integer
XQuery
boolean(10)
Effective boolean value of a negative integer
XQuery
boolean(-10)
Effective boolean value of a decimal value
XQuery
boolean(1.2345)
Other values
Values other than the ones listed above raise a type error (FORG0006) when their effective boolean value is requested. This includes sequences with more than one item, unless the first item is a node, as well as functions, including maps and arrays.
Effective boolean value of a non-singleton sequence raises a type error
XQuery
boolean((1, 2))
Effective boolean value of an array raises a type error
XQuery
boolean(array {})
Effective boolean value of a map raises a type error
XQuery
boolean(map {})
Effective boolean value of a function raises a type error
XQuery
boolean(function () {})
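In practice, fn:boolean is rarely called explicitly. Expressions such as if, and, or, and fn:not take the effective boolean value of their operands implicitly. The following is a sketch assuming an XQuery 3.0 or later processor for try/catch; the result strings are made up for illustration, and a processor may report the last case at compile time instead of letting it be caught.

```xquery
(
  if (()) then "truthy" else "falsy",        (: "falsy": empty sequence :)
  if ("string") then "truthy" else "falsy",  (: "truthy": non-empty string :)
  not(0),                                    (: true: zero is false :)
  "" or 10,                                  (: true: the right operand is true :)
  try { boolean((1, 2)) } catch * { "no effective boolean value" }
)
```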