Compositors are analytics that are composed of other analytics. They differ from each other in the way they execute the analytics they contain.
The sequential analytic executes the analytics it contains in order.
The following example shows a Sequential compositor containing a RegexAnnotator that annotates Fonto email addresses, followed by a RemoveTextAnnotationsIntersectingXmlElements filter that removes the annotations that overlap with an Anchor element.
<sequential>
  <regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
  <removeTextAnnotationsIntersectingXmlElements elements="a"/>
</sequential>
The parallel analytic executes the analytics it contains in parallel. It is recommended for analytics that are not CPU-bound and that have no dependency on each other. No dependency means that no analytic inside the parallel analytic requires the annotations of an annotator that is also inside that same parallel analytic.
The following example shows a Parallel compositor containing two HttpApiAnnotators. Because they are placed in a parallel compositor, the HTTP requests are executed in parallel, which improves performance.
<parallel>
  <httpApiAnnotator endpoint="https://my-custom-annotator/annotate"/>
  <httpApiAnnotator endpoint="https://my-other-custom-annotator/annotate"/>
</parallel>
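The dependency rule can be illustrated with elements that are used elsewhere on this page. In this sketch, the Partitioner needs the heading annotations produced by the XPathAnnotator, so the two are kept in order inside a Sequential compositor, which then forms a single branch of the Parallel compositor next to an independent HttpApiAnnotator. It is an illustration of the rule under those assumptions, not a tested configuration.

```xml
<parallel>
  <!-- Independent branch: requires no annotations from its sibling. -->
  <httpApiAnnotator endpoint="https://my-custom-annotator/annotate"/>
  <!-- Dependent analytics: the partitioner requires the "heading"
       annotations, so both are ordered inside one sequential branch
       instead of being direct children of the parallel analytic. -->
  <sequential>
    <xpathAnnotator annotationTypeId="heading" test="self::h1"/>
    <partitioner annotationTypeIds="heading">
      <regexAnnotator annotationTypeId="heading-number" pattern="[0-9]+"/>
    </partitioner>
  </sequential>
</parallel>
```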
The partitioner analytic partitions the fragments that are being analysed based on a given set of annotations and exposes those new fragments within the scoped context of the analytic. Annotators within a partitioner will only be able to analyse fragments that overlap with the given set of annotations. It is useful for annotating only the fragments that overlap with a certain annotation, for example executing a RegexAnnotator only on the text that overlaps with the result of an XPathAnnotator.
The annotation that is used to partition the fragments will not be included within the scope of the analytic.
Only newly added annotations will be made available after executing the partitioner. Preexisting annotations that are removed within the partitioner are not actually removed: they are still present after the partitioner has finished executing.
The following example shows an XPathAnnotator that annotates <h1> elements, followed by a Partitioner that partitions based on those annotated headings. Within the partitioner is a RegexAnnotator that only annotates numbers within the headings. The result is that only the numbers that are within an <h1> will be annotated.
<sequential>
  <xpathAnnotator annotationTypeId="heading" test="self::h1" />
  <partitioner annotationTypeIds="heading">
    <regexAnnotator annotationTypeId="heading-number" pattern="[0-9]+" />
  </partitioner>
</sequential>
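The scoping rules of the partitioner can be sketched by extending the example above. This is an illustrative sketch rather than a verified configuration: it assumes that preexisting annotations (here fontoxml-email) are visible inside the partitioner scope and that a filter may be placed inside a partitioner. The comments restate the rules described above; they are not observed output.

```xml
<sequential>
  <!-- Creates annotations that exist before the partitioner runs. -->
  <regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
  <xpathAnnotator annotationTypeId="heading" test="self::h1"/>
  <partitioner annotationTypeIds="heading">
    <!-- The "heading" annotations used for partitioning are not
         included within this scope. -->
    <regexAnnotator annotationTypeId="heading-number" pattern="[0-9]+"/>
    <!-- Assumption: this filter may remove fontoxml-email annotations
         inside the scope, but that removal does not carry over. -->
    <removeTextAnnotationsIntersectingXmlElements elements="a"/>
  </partitioner>
  <!-- After the partitioner: the remaining heading-number annotations
       are available, and every preexisting fontoxml-email annotation
       is still present. -->
</sequential>
```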
The if analytic executes its child analytic only if the given condition is satisfied. It is useful, for example, when you only want to check for spelling errors if the user selected the spelling category. You could achieve the same with filters, but that would waste CPU cycles on unnecessary work.
The XPath 1.0 expression that will be used to determine whether the child analytic should be executed (
The context node passed to this XPath 1.0 expression is the first node within the scoped context of the
If the condition does not return a
The following example uses the if analytic to check for spelling errors using the spellCheckAnnotator only if the category spelling is enabled by the user:
<?xml version="1.0" encoding="utf-8"?>
<analysis
  xmlns="http://schemas.fontoxml.com/content-quality/1.0/analysis-configuration.xsd"
  xmlns:functions="http://schemas.fontoxml.com/content-quality/1.0/functions.xsd">
  <if condition="functions:enabled-categories-includes('spelling')">
    <spellCheckAnnotator languages="en" />
  </if>
</analysis>
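Because the if analytic has a single child analytic, guarding several analytics with one condition presumably means wrapping them in a compositor. The following sketch assumes that a Sequential compositor is accepted as the child of the if analytic, which the text above does not state explicitly. The filter on a elements is included only for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<analysis
  xmlns="http://schemas.fontoxml.com/content-quality/1.0/analysis-configuration.xsd"
  xmlns:functions="http://schemas.fontoxml.com/content-quality/1.0/functions.xsd">
  <if condition="functions:enabled-categories-includes('spelling')">
    <!-- Assumption: a compositor may be the single child of the if
         analytic, so both steps run only when the condition holds. -->
    <sequential>
      <spellCheckAnnotator languages="en"/>
      <removeTextAnnotationsIntersectingXmlElements elements="a"/>
    </sequential>
  </if>
</analysis>
```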
It is also possible to combine compositors. In the following example, a Parallel compositor executes two branches in parallel: the Sequential compositor (containing the RegexAnnotators) and the LanguageToolAnnotator.
<parallel>
  <sequential>
    <regexAnnotator annotationTypeId="fontoxml-email" pattern="[a-zA-Z0-9_.+-]+@fontoxml\.com"/>
    <regexAnnotator annotationTypeId="google-email" pattern="[a-zA-Z0-9_.+-]+@google\.com"/>
    <removeTextAnnotationsIntersectingXmlElements elements="a"/>
  </sequential>
  <languageToolAnnotator baseUrl="https://my-languagetool-instance:8010/v2/">
    <spellingErrorMapping categories="TYPOS"/>
  </languageToolAnnotator>
</parallel>