« September 2004 | Main | November 2004 »

October 29, 2004

Yet another creeping bug

Unfortunately, it wasn't my most efficient day of work today. I didn't do as much as I hoped.

In particular, I discovered a new bug with my unification algorithm. The cases where it causes problem are not straightforward and were not properly tested earlier. The problem is that when unifying two rows that have row variables, my old algorithm created a substitution that transformed one of the row variable into the other. This is incorrect because the rows created by the returned substitution do not have either the left row variable or the right one, but the union of the two, which is a new row variable. Unfortunately even though it seems a simple modification, it will require rewriting the row unification algorithm, and I hadn't have time to start this yet.

But there are also good news: Val Tannen has promised that I would get my hands on Kleisli's source code shortly. I hope it will be the case.

October 28, 2004

Comprehension syntax

I continued thinking about collections in SLinks. I did not find the perfect solution yet in spite of discussing it with Jeremy. I checked out P. Buneman's (and others) paper 'Comprehension Syntax' which makes me understand better some things in CPL, but doesn't solve my problem. I also (re-)read P. Wadler's 'Comprehending Monads'. I must admit (maybe I shouldn't say that considering who will read this post) that the monad concept still isn't completely natural for me. I am getting closer to a solution though and hopefully tomorrow I will have a first draft of what I might look like.

I also started to modify my system to accept more base types (integers to start with), and corrected a resistant bug that I hadn't noticed in the typing algorithm.

October 27, 2004

Back to work: lists

After a lengthened week-end (for anyone who likes Scottish nature, the Cairngorm mountains is the right place to go), I came back to work late today. I started looking for the right way to add collections to SLinks by checking out the references about collections in various Kleisli papers.

I am not quite sure how to do it yet: in Kleisli, the core operators are very limited, and are not sufficient by themselves (for example, there is no operator to extract an element of a list). The language becomes useful when used with it's 'external primitives': but there are many of them (26 just for lists). I have to find a compromise: a small number of core operators but with sufficient expressive power. Furthermore, with the database access part in mind, I probably should consider it's particular needs: using only the 'cons', 'head' and 'tail' operators, even though they would be sufficient, do not seem very suitable to produce an SQL mapping.

October 22, 2004

Sweetening things up

The last bugs with variants are now dead and buried (hopefully). I have also modified the SLinks interpreter to accept 'Sweet SLinks' expression, which are SLinks expression with expanded, easier to use, operators for records and variants. The internal representation of expressions has not been modified to do this (except for a little bug that appeared when testing these new operators), it is purely syntactic sugar. The latest version of the SLinks language reference is available here and defines core and sweet grammar, translation from sweet to core grammar, as well as type inference rules (with a number of corrections since the last version).

October 21, 2004

It's starting to look pretty

If one doesn't use variants (why would anyone do that?), the typing-algorithm works beautifully now. It is able to calculate lovely types like (#tata:pre['23], '22) -> ('22) (the function from a record with field '#tata' and possibly more to a record with the same possibly more). Unfortunately, the expression fun v . case v of <#tata = x>.x | y.y;, which should receive type <#tata:pre[<'1>], '1> -> <'1> doesn't. But I am pretty sure there is just a unification or substitution missing in the variant selection part of algorithm W. Too late to find it tonight though, but I hope to have the whole thing working early tomorrow.

October 20, 2004

There was no point in hoping

I got a first executable version as expected. But I couldn't squash all bugs today: there was no point in hoping, I guess. I will continue this tomorrow. Once I have finished this, I will add the syntactic sugar to the language to support 'usable' records and variants (with access to more than one element at a time). Then I will have to add support for lists. I am not sure yet what base operators for list I should support (comprehension concatenation and head maybe). I will have to think about this a bit more, and look around for papers about it.

October 19, 2004

Nearly there (hopefully)

I continued implementing the SLinks typing algorithm. I hope to have a first executable version tomorrow morning: I only have to tie up some loose ends in my code and it should compile. I hope I didn't make too many bad mistakes and that debugging won't be too difficult.

October 18, 2004

Luis Damas

I continued implementing the SLinks type system. I also discovered today that Milner's original W algorithm had been upgraded in a 1982 paper by Luis Damas and himself called 'Principal type-schemes for functional programs'. The W algorithm is nearly the same, except for the environment which uses type-schemes instead of the original types with a let or lambda flag attached. Type schemes are simply types with the generic variables marked. This is a much better way to handle generic variables than what had been done in the original paper. I changed (nearly: there is still some work to do tomorrow) all the code to use that new system, since I had some troubles with the original system in my case.

October 15, 2004

The type of types

I continued the implementation of the SLinks type system. Not much to say, except that I didn't advance as much as I would have liked. The first implementation I started used a type structure to represent SLinks types that proved to be impractical, and had to be rewritten.

October 14, 2004

Writing code again

After more than a week of reading, I have started coding again. I started implementing a type system for a simple lambda calculus with records and variants, that I call SLinks (simple links). The grammar and type inference rules can be found here. All operators on records or variants are applied to only one field at a time, I intend to implement operators on many fields as syntactic sugar. I use a type representation based on Rémy's 1994 paper, but the typing algorithms should be simpler since SLinks does not support free record extension or record concatenation.

I also had news from Limsoon Wong about the typing algorithm he used for Kleisli. He confirms his algorithm doesn't use 'lacks' constraints on fields, or any other fancy typing mechanism (except for the minimalist row variables described in his papers). In order to implement the SLinks syntax (the record extension), the SLinks type system as described above will actually be more complicated than Kleisli's.

October 13, 2004

Some housekeeping and more reading

B. Pierce and M. Jones both answered Philip Wadler's request for a simple typing algorithm. Jones pointed to his article 'A polymorphic Type System for Extensible Records and Variants' that describes an implementation for his 1994 type system. After reading it, I don't believe it is simpler than other systems. Pierce proposed to read the last chapter in a book he just published. The chapter is written by D. Rémy and should give an somewhat easy introduction to state-of-the-art typing systems. It is very long and I didn't have time to read it yet. I am slightly sceptical that I will find a simple, easy, type inference algorithm for records in there. I suspect it will mostly be a rewrite of Rémy's and other's already published algorithms. I am waiting for an answer of the 'Kleisli bunch' about their type system. I hope this one will actually really be simpler.

I also spent quite a lot of time today doing 'housekeeping' tasks: classifying articles, reinstalling and relearning LaTeX, etc. It is scary how fast one forgets LaTeX when not using it for a little while.

October 12, 2004

Kinds, flagged fields and subtyping

I finished the reading of the long article list. Bad news is that none of the list provides a simple algorithm for a simple case of record/variant typing. Rémy's 'Type Inference for Records in a Natural Extension of ML' and it's complement 'Typing Record Concatenation for Free' is in my opinion the most satisfactory version with row variables. It is particularly attractive since it allows to modify the behaviour of basic operators without changing the types or type inference algorithm. It is quite complicated though, especially the version with concatenation (which is by no means 'free').

It has been decided with Prof. Wadler that I should try to implement a simpler type inference algorithm from scratch. I also should try to obtain the inference algorithm of Kleisli which is said to be simpler than those from the list.

October 11, 2004

Reading, reading and more reading

Not much to say for today. I continued reading the articles from October the sixth's list. I have added a new article to the nominated list: 'D. Rémy; Type Inference for Records in a Natural Extension of ML, 1994'. It had been dropped from the original list because I thought it was a reprint of an older article with a similar title, which was not true.

October 08, 2004

Plenty to read

I have been given a little more time to read all the papers. So, that is what the day was all about. A short post for once: more to come when the reading is over.

October 07, 2004

One day is not enough

I had been given one day to find the perfect (at least for now) typing algorithm for records and variants. I fear that wasn't quite enough. Typing records is a subject that many people have written long articles about: there simply was to much to read for me. The printer didn't want to cooperate either.

From all possible papers, obtained by looking on potential author's web sites, and through referencing by other potential papers, I've nominated 13 for the perfect algorithm final. Currently, Buneman & Ohori's 'Polymorphism and Type Inference in Database Programming' is the favourite, since it claims that Kleisli's type system derives from it's own. I still have to read it in more detail as well as some other nominated articles that I didn't have time to read in detail.

The 13 finalists are (from oldest to youngest):

  • M. Wand; Complete type inference for Simple Objects, 1987.
  • A. Ohori & P. Buneman; Type Inference in a Database Programming Language, 1988.
  • A. Jategaonkar & C. Mitchell; ML with Extended Pattern Matching and Subtypes, 1988.
  • D. Rémy; Type-checking records and variants in a natural extension of ML, 1989.
  • M. Wand; Type Inference for record Concatenation and Multiple Inheritance, 1991.
  • R. Harper & B. Pierce; A Record Calculus Based on Symmetric Concatenation, 1991.
  • L. Cardelli & J. Mitchell; Operations on Records, 1991.
  • A. Ohori & P. Buneman & V. Tannen; Database Programming in Machiavelli - a Polymorphic Language with Static type Inference, 1992.
  • D. Rémy; Typing Record Concatenation for Free, 1993.
  • P. Buneman & A. Ohori; Polymorphism and Type Inference in Database Programming, 1994.
  • A. Ohori; A Polymorphic Record Calculus and it's Compilation, 1995.
  • J. Garrigue; Programming with Polymorphic Variants, 1998.
  • J. Garrigue; Simple Type Inference for Structural Polymorphism, 2002.

October 06, 2004

More typing algorithms

I discussed today with Prof. Wadler what path to follow for the typing of the first prototypal Links language, so that I can start the database part of my work. Even though Rémy's and the updated Wand algorithm provide good typing results, they might be quite complicated to implement.

Wand's original algorithm is wrong because of one operator in the language it must support: it allows a 'with' operator on records that can both update an existing field or add a new field, two operations that have to be typed differently. Currently, Links is not supposed to support such an operator and CPL doesn't provide at all such a complex record typing mechanism.

I therefore started today looking for yet another typing algorithm that would be at the same time simple enough (like Wand's original algorithm), correct and powerful. I have plenty of papers to look at, and I am by far not finished yet. I will publish what I find in tomorrow's post here.

October 05, 2004

Type constraints and record well-typings

What I thought yesterday ended up being wrong today: Wand's corrigendum was indeed important, and I need to implement another algorithm to type records. In order to find the right algorithm to use for typing my records, today was a reading day.

I studied Rémy's 1989 paper 'Typechecking records and variants in a natural extension of ML' which proposes a solution to Wand's 1987 algorithm's problem. This solution doesn't use normal types but uses 'kinds' instead that add a notion of 'flag' to the standard types which more or less define whether a particular field exists for a particular row. The problem with this solution is that record concatenations are not handled correctly and that it only supports a finite set of labels, known in advance.

I also read Wand's 1991 paper 'Type Inference for Record Concatenation and Multiple Inheritance' which generalises Rémy's algorithm to support concatenation and to solve the label limitation. He uses yet another type representation that adds a notion of constraint, which is a set of equations that a type must solve to be valid.

Finally, I tried pushing further my quest for the ultimate algorithm: after all, if three algorithm have been proposed, there is no reason why there weren't more. However, I didn't really find any algorithm that claimed to be more general than the 1991 Wand algorithm.

October 04, 2004

Wand's simple objects

I finished Wand's 1987 paper 'Complete type inference for Simple Objects'. I am a little puzzled about the corrigendum, which more or less says that Wand's simple objects actually can have multiple types at the same time. This is not taken into account at all in the original paper. I haven't seen anywhere a type system that produces multiple well-typings for a single expression. I don't know whether this corrigendum raises an interesting theoretical problem that I can forget about for my (practical) work, or whether I actually have to implement yet another algorithm.

Since I didn't find anything that looked like the promised revised version of Wand's paper (and since 1987, he would have had enough time to write one if it was really important), I ignored for now this problem and started the implementation of the original algorithm later that day. It is not ready yet, but I know more or less what to do.

October 01, 2004

Limsoon Wong and a squashed bug

I met Limsoon Wong, the creator of Kleisli. He presented shortly the basic principle of optimisation in Kleisli. It seems that from the dozens of potential optimisations described in his thesis, he actually only used four in the real implementation of Kleisli (at least for SQL optimisation): two to push row or column selection into the SQL query, one to use a SQL join for overlapping queries on the same database, and one to optimise joins on different databases. This is good news because there were really too many in the original thesis for my liking.

I then went to a talk by Wong about his current research. He is working on a technique to reduce errors in a particular biological experiment by applying data-mining techniques to the raw experimental data. Very interesting presentation, even though it doesn't really have much to do with my work here.

With all this, my day was well started already, but I finally squashed the naughty bug in my algorithm W implementation. Of course, the bug was not related to the implementation of the 'let' itself or the subsequent variable generalisation, but was merely revealed in this case. It was instead due to a faulty behaviour of the unify method which produced in some cases ugly substitutions like {α=β, α=γ}.