Friday, 6 February 2009

FLOWR - A Beauty Of Nature

If you ever plan to write XQuery, the FLOWR expression will become your bread and butter.

What does FLOWR stand for:

F - for
L - let
O - order by
W - where
R - return

You could say it is just a conventional procedural-language for loop with a couple of SQL-like features, "order by" and "where". Here's an example:

for $child in ("Merlyn", "Kai", "Eden")
return <child name="{$child}"/>

The above FLOWR expression returns:

<child name="Merlyn"/>
<child name="Kai"/>
<child name="Eden"/>

Let's now try a more complex example using some of the other features of FLOWR:

let $children as xs:string* := ("Merlyn", "Kai", "Eden")
return element children {
  for $child as xs:string at $position in $children
  (: position 0 does not exist, so the first child gets an empty sequence :)
  let $older-sibling as xs:string? := $children[($position - 1)]
  where fn:contains($child, "e")
  order by $position descending
  return element child {
    attribute name {$child},
    if ($older-sibling)
    then attribute older-sibling {$older-sibling}
    else ()
  }
}

This FLOWR expression returns :

<children>
<child name="Eden" older-sibling="Kai"/>
<child name="Merlyn"/>
</children>

A number of interesting things are happening in this expression. The goal is to get a sequence of child elements in reverse order, each carrying an older-sibling attribute that names the child's next older sibling. Oh, and to throw in some random complexity, we only want children whose names contain the character "e".

First the variable $children is declared with the type xs:string*. The * appended to the type is an occurrence indicator meaning zero or more items of that type, in other words a sequence of strings, roughly equivalent to an array or vector of strings in other languages. $children holds a sequence of my children's names from oldest to youngest: Merlyn is my oldest son, and Eden my youngest daughter.
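
Since the * is easy to mix up with its neighbours, here is a minimal sketch of the three occurrence indicators side by side. The variable names are made up for illustration and are not part of the example above.

```xquery
(: Occurrence indicators on sequence types; all variable names are illustrative. :)
let $none-or-one  as xs:string? := ()                 (: ? = zero or one item   :)
let $any-number   as xs:string* := ("Merlyn", "Kai")  (: * = zero or more items :)
let $at-least-one as xs:string+ := ("Eden")           (: + = one or more items  :)
return (fn:count($none-or-one), fn:count($any-number), fn:count($at-least-one))
```

This returns the sequence 0, 2, 1. Binding an empty sequence to the xs:string+ variable would raise a type error instead.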

The next step is to create the children element as the root element, holding the result sequence returned by the FLOWR expression. Inside it resides our beloved FLOWR.

The FLOWR expression starts by declaring the $child variable bound to a single string in each iteration of the loop.

Then you may notice the "at $position" declaration. This positional variable holds the position of the current item in the loop's input sequence. If you're used to XSLT, think of it as the equivalent of position() inside an xsl:for-each or a matched xsl:template.
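
To see the positional variable on its own, here is a small sketch that uses the hard-coded sequence from the first example and simply labels each name with its position.

```xquery
(: $position is bound to 1, 2, 3 as the loop walks the input sequence. :)
for $child at $position in ("Merlyn", "Kai", "Eden")
return fn:concat($position, ": ", $child)
```

This returns "1: Merlyn", "2: Kai", "3: Eden". Positions in XQuery start at 1, not 0.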

Lastly, you'll notice that in this example we're using the $children variable instead of the hard-coded sequence expression. The $children variable could be a reference to some XML document, the output of a module's function (we'll cover that in a later post), the result of an XPath expression, etc.

Now that the "F" in FLOWR has been initiated, it is possible to start filtering and shaping its output.

We'll first use the "L" in FLOWR by declaring the variable $older-sibling, which holds the older sibling of each child. It uses the $position value to select the item just before the current one, the equivalent of stepping along the preceding-sibling axis. In this case we're dealing with strings, so we can't use axes; if the items in the sequence were nodes rather than atomic values, we could.
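
For comparison, here is a minimal sketch of the node-based alternative. It assumes the names live in a made-up children document bound to a hypothetical $family variable; neither is part of the example above.

```xquery
(: Hypothetical input: the same three names, but as element nodes. :)
let $family :=
  <children>
    <child name="Merlyn"/>
    <child name="Kai"/>
    <child name="Eden"/>
  </children>
for $child in $family/child
(: [1] on a reverse axis picks the nearest preceding sibling :)
return fn:string($child/preceding-sibling::child[1]/@name)
```

This returns "", "Merlyn", "Kai": the first child has no preceding sibling, so fn:string() receives an empty sequence and yields a zero-length string.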

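Next the "W" in FLOWR. This is where FLOWR lets you express filters on the input sequence, and where our example keeps only the children whose names contain the "e" character, using the fn:contains() function. To be fair, I usually prefer to use predicates in the input sequence, . . . $children[fn:contains(.,"e")] . . . which filters the same way. Just be aware that with a predicate, $position counts items in the already filtered sequence, so it only works if you don't rely on the original positions as this example does.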
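
Here is a small sketch that puts the where clause and the predicate side by side. It reuses $children from the example; the position labels are only there to make the difference visible.

```xquery
let $children as xs:string* := ("Merlyn", "Kai", "Eden")
return (
  (: where clause: $position counts items of the original sequence :)
  for $child at $position in $children
  where fn:contains($child, "e")
  return fn:concat($position, ": ", $child),

  (: predicate: $position counts items of the already filtered sequence :)
  for $child at $position in $children[fn:contains(., "e")]
  return fn:concat($position, ": ", $child)
)
```

This returns "1: Merlyn", "3: Eden" for the where clause and "1: Merlyn", "2: Eden" for the predicate. Both keep the same names, but only the where version could still look up $children[$position - 1] correctly.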

I'm not sure if the "O" in FLOWR was swapped round for aesthetic reasons (the specification actually spells the acronym FLWOR, which matches the real clause order), but the fact is that if you put the "order by" before the "where" you'll get: syntax error in #...scending where fn:contains#: expected "return", found "null"
So "order by $position descending" is self-explanatory: gimme stuff in reverse order . . .
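
As a minimal sketch of the clause order that does parse, here is the same filter with the sort key switched to the name itself, just for illustration.

```xquery
(: The clauses go in this order: for, where, order by, return. :)
for $child in ("Merlyn", "Kai", "Eden")
where fn:contains($child, "e")
order by $child descending
return $child
```

Assuming the processor's default collation sorts "M" after "E", as the plain codepoint collation does, this returns "Merlyn" followed by "Eden".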

And now to the meat. The big "R", return me stuff. This is where we format the output of the FLOWR expression. In this case we are building a sequence of child elements, each containing a sequence of attributes: name and, where there is one, older-sibling.

You may notice in the output that Merlyn hasn't got an older-sibling attribute. That's thanks to the conditional expression in the attribute sequence that checks whether $older-sibling exists.
For "Merlyn", $position is 1, so the $children[($position - 1)] expression doesn't return anything (or rather, it returns an empty sequence, whose effective boolean value is false). The "?" in the type declaration of $older-sibling allows for that empty sequence, so the older-sibling attribute is simply left out of the child element for "Merlyn".
Of course Merlyn is my oldest son, and hasn't got any older siblings . . . not that I'm aware of, anyway.
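
The test that "if ($older-sibling)" relies on can be sketched in isolation like this. fn:boolean() makes the effective boolean value explicit, and fn:exists() is shown as a stricter alternative.

```xquery
let $children as xs:string* := ("Merlyn", "Kai", "Eden")
return (
  fn:boolean($children[0]),  (: no item at position 0: empty sequence, false :)
  fn:boolean($children[1]),  (: "Merlyn": non-empty string, true :)
  fn:exists($children[0])    (: explicit emptiness test, false :)
)
```

This returns false, true, false. One design note: a zero-length string also has an effective boolean value of false, so if empty names were possible, fn:exists($older-sibling) would be the safer check.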

So there, what a beautiful FLOWR can do for you in XQuery. Enjoy . . .

1 comment:

  1. Great stuff. Not a lot of great FLOWR out there. The only thing I would say is that given the nature of XML, with children, siblings, descendants, etc., the example variables should NOT include child or sibling as element names. But that is a quibble. Thx, kaiser