Thursday, July 07, 2005

Administrative Issues

I spent the day working on various administrative issues. Since allocation sites inside function templates are 'thrown' away after each CPA iteration, we need to record previously deduced type information for them, which have to be inserted again if similar templates are created in the next iteration. This becomes quite some administration, because these allocation may be located in templates created for argument types whose contours may be split or merged in between iterations.

First, each allocation site is given an (AST node, argument product) identifier. Based on this, a (currently global) table is maintained with currently deduced type information for each allocation site, and updated as contours change. The following more complicated example code is now analyzed correctly. The allocation sites inside 'dupl' templates are dependent on the 'y' argument, whose contour at some point is split. It goes too far to explain all this in detail, so here is the code:

def ident(x):                            # x: [list(A)]r
return x # [list(A)]
a = [] # [list(int)]
a = [] # [list(int)]
a = [] # [list(int)]
b = [] # [list(float)]

ident(a).append(1) # []
ident(b).append(1.0) # []

def dupl(y): # y: [list(A)]
k = [] # [list(float)]
k.append(1.0) # []
l = [] # [list(A)]
l.append(y[0]) # []
return l # [list(A)]

dupl(a) # [list(int)]
dupl(b) # [list(float)]

All lists in the example contain either integers or floats, so we would expect the analysis to end up with two list contours, one for each type. However, the compiler does not merge contours yet that end up with the same contents. Clearly, it can happen that when a contour becomes more precise because of CPA, there already is another contour available with the same precision. Of course then, these contours should be merged, because sharing of contours is what this whole iterative analysis is about..

In the previous situation, six object contours would have been created (one for each syntactic allocation site), potentially blowing up CPA (as it works with the cross-product of actual arguments). Additionally, the types for 'dupl(a)' and 'dupl(b)' would have been imprecise, because a single contour would be used for the different incarnations of 'l' in the different templates for 'dupl'.. It's already nice to see both improved precision and efficiency, although I still have much to do, and probably especially to debug.. :P

No comments: