Sort operator 2
I started the day with an interesting talk by Michael Schwartzbach about 'The design space of type checkers for XML transformations'. But the rest of the day was not quite enough to have the sort operator optimisation working. I will continue working on it tomorrow. But in the meantime, here is a little comment about this operator:
Consider this SLinks expression: 'sort_up [set x.#a | ^x <bag (table "tata" with {#a:int,#b:int} from db)];'. Optimising the 'sort_up' operator into the query would yield something like that: '[lst x.#a | ^x <lst (table "tata" with {#a:int,#b:int} order [#a:asc] from db)];'. The problem that needs to be solved for this optimisation is on what column the ordering must be made, and to do this, there must be a system to inspect the expression of the body of the list comprehension to determine exactly what the list is made of. Using the type of the list is not sufficient (it can easily be seen in the above example).
However, if one considers this expression: 'sort_up [lst x | ^x <bag (table "tata" with {#a:int,#b:int} from db)];', the problem is different: what is the correct ordering? It depends on how the comparison between two records is made for the ordering. Probably the most intuitive solution is to optimise it to '[lst x | ^x <lst (table "tata" with {#a:int,#b:int} order [#a:asc,#b:asc] from db)];'. This implies that the comparison between two records with fields a and b must be implemented as 'if (a1 < a2) then true else (b1 < b2)'. This has two problems: Firstly, it is by no means clear whether this is the right comparison for a particular data type. Secondly (and more importantly), since records have (conceptually at least) unordered fields, this comparison is not well-defined. Or more precisely, it requires to make assumptions about the implementation (that the original ordering of fields is kept).