« System F | Main | A happy Christmas »

December 22, 2004

Typing, typing and retyping

I continued working on the type-based SQL projection optimisation. I haven't finished but I believe I know exactly where I am going; and I would like to make a few comments:

If you consider the following example where type-based optimisation would be superior to the current term-based version (as soon as a function is applied to an iteration variable, the current optimisation goes to a default worst-case state where all fields are selected): [bag (fun ^x -> x.#c)(x) | ^x <bag (table "tata" with {#a:int,#b:string,#c:float} from db)]. When running the optimiser, and when the apply 'fun(x)' node is processed, the optimiser needs to look at the type of the function to determine what 'x' needs to be. Unfortunately, the type kept in the AST is a neatly unified type: '{#a:int,#b:string,#c:float} -> ;float'. This doesn't give any information: I know that x can have an a, b and c field because the table provides them, that's not the point. What I want is the type of the function alone ('{#c:'1,'2} -> '1'). This means that I have to re-type this sub-expression every time (or change the type information stored in the AST).

Secondly, In this example, we are happy to have the function directly available. But in this case '[bag extract(x) | ^x <bag (table "tata" with {#a:int,#b:string,#c:float} from db)]' it isn't. If we want to have the type of the 'extract' function, we need to have access to the typing environment (or in the case of a compiled program, to a 'memory' of earlier values). This adds a new dependency to the optimiser.

For now, I am working on a solution with re-typing and environment-passing because it's easier to implement rapidly, and because it shows how to solve the problem (if the type is recalculated or is stored in the AST, it doesn't change the principle), but it is a solution that really isn't pretty. Yesterday's post was speaking about System F. I don't know if this is exactly the right solution for type information in the AST, but something like this will certainly need to be done in the long term.