Compositors

Compositors are analytics that are composed of other analytics. They differ from each other in the way they execute their child analytics.

Sequential analytic

The sequential analytic executes the analytics it contains in order.

The following example shows a Sequential compositor containing a RegexAnnotator that annotates Fonto email addresses, followed by a RemoveTextAnnotationsIntersectingXmlElements filter that removes the annotations that overlap with an Anchor element.

XML

<sequential>
	<regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
	<removeTextAnnotationsIntersectingXmlElements elements="a"/>
</sequential>
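
Because the analytics run in order, the position of a filter matters. The following sketch reuses the elements from the example above with the order reversed, as a counterexample. Assuming a filter only acts on the annotations that exist at the moment it runs, the filter here would find no email annotations to remove, and the annotations added afterwards would stay in place, including those inside a elements.

XML

<sequential>
	<!-- Runs first: no fontoxml-email annotations exist yet, so there is nothing to remove (assumed behavior). -->
	<removeTextAnnotationsIntersectingXmlElements elements="a"/>
	<!-- Runs second: its annotations are no longer filtered. -->
	<regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
</sequential>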

Parallel analytic

The parallel analytic executes the analytics it contains in parallel. It is recommended for non-CPU-bound analytics that have no dependency on each other. No dependency means that none of the analytics inside the parallel analytic requires the annotations of an annotator that is inside that same parallel analytic.

The following example shows a Parallel compositor containing two HttpApiAnnotators. Because they are placed in a parallel compositor, the HTTP requests are executed in parallel, which improves performance.

XML

<parallel>
	<httpApiAnnotator endpoint="https://my-custom-annotator/annotate"/>
	<httpApiAnnotator endpoint="https://my-other-custom-annotator/annotate"/>
</parallel>

Partitioner analytic

The partitioner analytic partitions the fragments that are being analysed based on a given set of annotations and exposes those new fragments within the scoped context of the analytic. Annotators within a partitioner will only be able to analyse fragments that overlap with the given set of annotations. It is useful for annotating only the fragments that overlap with a certain annotation, for example executing a RegexAnnotator only on the text that overlaps with the result of an XPathAnnotator.

  • The annotation that is used to partition the fragments will not be included within the scope of the analytic.

  • Only newly added annotations will be made available after executing the partitioner. Any preexisting annotations that are removed within the partitioner will not be removed once the partitioner has finished executing.

The following example shows an XPathAnnotator that annotates <h1> elements, followed by a Partitioner that partitions based on those annotated headings. Within the partitioner is a RegexAnnotator that only annotates numbers within the headings. The result is that only the numbers that are within an <h1> will be annotated.

XML

<sequential>
	<xpathAnnotator annotationTypeId="heading" test="self::h1" />
	<partitioner annotationTypeIds="heading">
		<regexAnnotator annotationTypeId="heading-number" pattern="[0-9]+" />
	</partitioner>
</sequential>

If analytic

The if analytic executes its child analytic if the given condition is satisfied. It is useful, for example, when you only want to check for spelling errors if the user selected the spelling category. You could achieve the same with filters, but that would waste CPU cycles on unnecessary work.

Configuration

condition

The XPath 1.0 expression that will be used to determine whether the child analytic should be executed (true()) or not (false()).

The context node passed to this XPath 1.0 expression is the first node within the scoped context of the if analytic.

If the condition does not return an xs:boolean result type, it will be wrapped in the boolean() function to get the effective boolean value of the expression.

The if analytic is primarily designed to work with category filtering.

The enabled-categories-includes() function is available in the namespace http://schemas.fontoxml.com/content-quality/1.0/functions.xsd. It takes a single required argument, categoryId, of type xs:string. It returns true() if the category is selected by the user, otherwise false(). If no category is selected by the user, it implicitly means all categories are selected.

Required: Yes

Default: N/A
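
To illustrate the effective boolean value rule, the following sketch uses a condition that evaluates to a node-set instead of an xs:boolean. In XPath 1.0, boolean() converts an empty node-set to false() and a non-empty node-set to true(), so the child analytic would only run when the first node within the scoped context is an h1 element. The choice of self::h1 and of a RegexAnnotator as the child analytic is purely illustrative.

XML

<!-- self::h1 yields a node-set; it is evaluated as boolean(self::h1). -->
<if condition="self::h1">
	<regexAnnotator annotationTypeId="heading-number" pattern="[0-9]+" />
</if>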

Example

The following example uses the if analytic to only check for spelling errors using the spellCheckAnnotator if the category spelling is enabled by the user:

XML

<?xml version="1.0" encoding="utf-8"?>
<analysis
	xmlns="http://schemas.fontoxml.com/content-quality/1.0/analysis-configuration.xsd"
	xmlns:functions="http://schemas.fontoxml.com/content-quality/1.0/functions.xsd">

	<if condition="functions:enabled-categories-includes('spelling')">
		<spellCheckAnnotator languages="en" />
	</if>

</analysis>
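
The same pattern can be repeated for several categories. The sketch below is a variation on the example above that rests on two assumptions: that if analytics may be placed inside a sequential compositor like any other analytic, and that a category with the hypothetical ID email exists (it does not appear elsewhere on this page).

XML

<?xml version="1.0" encoding="utf-8"?>
<analysis
	xmlns="http://schemas.fontoxml.com/content-quality/1.0/analysis-configuration.xsd"
	xmlns:functions="http://schemas.fontoxml.com/content-quality/1.0/functions.xsd">

	<sequential>
		<!-- Runs when the spelling category is enabled, or when no category is selected at all. -->
		<if condition="functions:enabled-categories-includes('spelling')">
			<spellCheckAnnotator languages="en" />
		</if>

		<!-- 'email' is a hypothetical category ID used for illustration only. -->
		<if condition="functions:enabled-categories-includes('email')">
			<regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
		</if>
	</sequential>

</analysis>

If the user selects only the spelling category, the second if analytic skips its RegexAnnotator entirely rather than computing annotations that a filter would have to discard afterwards.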

Combining compositors

It is also possible to combine compositors. In the following example, a Parallel compositor executes two branches in parallel: the Sequential compositor (containing the RegexAnnotators) and the LanguageToolAnnotator.

XML

<parallel>
	<sequential>
		<regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
		<regexAnnotator annotationTypeId="google-email" pattern="[a-zA-Z0-9_.+-]+@google\.com"/>
		<removeTextAnnotationsIntersectingXmlElements elements="a"/>
	</sequential>

	<languageToolAnnotator baseUrl="https://my-languagetool-instance:8010/v2/">
		<spellingErrorMapping categories="TYPOS"/>
	</languageToolAnnotator>
</parallel>