I have been thinking a lot about this project lately. From the start I have wanted a database as the base for my new model compiler. The reasoning is that you need a metamodel for your modelling language. Since UML is my modelling language, the metamodel can itself be expressed easily in UML, and UML class diagrams map easily to a database. Once you have a metamodel for your modelling language you can easily navigate around it and generate the target code. Well, code in my case; if you look at the OMG MDA stuff, they tend to talk about model-to-model transformations and call code a model. Sure, it is one, but that turns too many heads around, so let's call it code.
So how about the code templates and the navigation? My current model compiler uses XSLT, which has both the templates and the navigation. They are mixed in the same file, which may not be optimal, but it works. The problem with XSLT is that it's built for transforming XML. I should correct myself again before I get complaints: it's built to transform tree structures. XMI and XMI-Light are XML and can be handled by XSLT, no problem. Output does not need to be XML. In UMT the output is XML, but only just enough that UMT knows what to emit to different files.
Here comes the problem. The UML metamodel is not a tree. Some UML models can occasionally be represented as a tree, but the UML metamodel is not one of them. In order to represent UML models in XMI (and XMI-Light), a mechanism that links nodes using id and idref attributes in XML has been chosen. This works, but it's awkward to navigate using XSLT. XSLT is quite a different language to learn, and normal querying is complicated enough without having to mess with the id/idref stuff as well.
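To make the awkwardness concrete, here is a minimal sketch in Python rather than XSLT. The XML fragment is a made-up, heavily simplified stand-in for XMI (the element and attribute names `class`, `association`, `source` and `target` are assumptions for illustration, not the real XMI vocabulary). The point is only that the tree itself cannot follow a reference: you have to build and consult an id index on the side.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XMI-like fragment; these names are NOT real XMI.
DOC = """
<model>
  <class id="c1" name="Order"/>
  <class id="c2" name="Customer"/>
  <association name="placedBy" source="c1" target="c2"/>
</model>
"""

def association_ends(xml_text):
    """Return (source class, association, target class) name triples."""
    root = ET.fromstring(xml_text)
    # The tree only knows parents and children, so the id index is built by hand.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    triples = []
    for assoc in root.findall("association"):
        # Every hop across an association is a lookup in the index,
        # not a step along the tree.
        source = by_id[assoc.get("source")]
        target = by_id[assoc.get("target")]
        triples.append((source.get("name"), assoc.get("name"), target.get("name")))
    return triples

print(association_ends(DOC))
```

In XSLT the same detour has to be written into every query that crosses an association, which is where it gets tiresome.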
This is where my brain has started ticking. Why not combine the best of both worlds?
How hard can it be? The XSLT engines I have come across all work on top of a DOM XML parser. DOM is hierarchical at its roots, but really it's just a matter of tweaking the interior a bit so that instead of being so XML-ish it becomes more DB-ish. It should be possible to keep the DOM interface to it as far as the XSLT engine cares.
The second problem is that XSLT uses axes to do its queries on the XML document it tries to transform. One axis is child, in which you navigate to the children of an element. In UML lingo this is kind of similar to navigating from one class to another. Other axes are parent and the sibling axes. When the structure is no longer hierarchical these just make no sense anymore. So the XSLT parser needs to be hacked to stumble on these.
Of course the database does not have to be a database in the traditional sense. It may well be an in-memory database. That would probably give us much faster model compilation.
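As a rough illustration of what an in-memory model database could look like, here is a sketch using Python's standard `sqlite3` module with the special `:memory:` database name. The two tables (`uml_class`, `uml_attribute`) and the sample rows are assumptions made up for this example, a tiny slice of what a real metamodel would need.

```python
import sqlite3

def build_model_db():
    """Create an in-memory database holding a tiny, made-up model."""
    db = sqlite3.connect(":memory:")  # lives only in RAM, gone when closed
    db.executescript("""
        CREATE TABLE uml_class (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE uml_attribute (
            id       INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            type     TEXT NOT NULL,
            owner_id INTEGER NOT NULL REFERENCES uml_class(id)
        );
    """)
    db.executemany(
        "INSERT INTO uml_class (id, name) VALUES (?, ?)",
        [(1, "Order"), (2, "Customer")],
    )
    db.executemany(
        "INSERT INTO uml_attribute (name, type, owner_id) VALUES (?, ?, ?)",
        [("total", "int", 1), ("name", "str", 2), ("email", "str", 2)],
    )
    return db

def attributes_of(db, class_name):
    """Navigate from a class to its attributes with an ordinary join."""
    rows = db.execute(
        "SELECT a.name, a.type FROM uml_attribute a "
        "JOIN uml_class c ON c.id = a.owner_id "
        "WHERE c.name = ? ORDER BY a.id",
        (class_name,),
    )
    return rows.fetchall()

print(attributes_of(build_model_db(), "Customer"))
```

Navigation here is a plain join on a foreign key, with no tree and no id/idref bookkeeping in the query itself.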
Another thing that needs to be hacked is the XML loading. We really need to resolve all this id/idref stuff and replace it with associations between objects instead. This is probably more complicated than the rest of the changes put together.
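The loading step could be sketched as a two-pass loader: first create an object per element that carries an id, then swap every id reference for a direct object reference. This is only a sketch under assumptions; the XML vocabulary and the `ModelClass` type are hypothetical, and real XMI has far more kinds of references to resolve.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical, simplified XMI-like input; these names are NOT real XMI.
SAMPLE = """
<model>
  <class id="c1" name="Order"/>
  <class id="c2" name="Customer"/>
  <association name="placedBy" source="c1" target="c2"/>
</model>
"""

@dataclass
class ModelClass:
    """Illustrative in-memory class object (not part of any real library)."""
    name: str
    # Maps association name -> the ModelClass object at the other end.
    associations: dict = field(default_factory=dict)

def load_model(xml_text):
    root = ET.fromstring(xml_text)
    # Pass 1: one object per class element, indexed by its XML id.
    by_id = {el.get("id"): ModelClass(el.get("name"))
             for el in root.findall("class")}
    # Pass 2: replace each id reference with a direct object reference.
    for assoc in root.findall("association"):
        source = by_id[assoc.get("source")]
        source.associations[assoc.get("name")] = by_id[assoc.get("target")]
    # The XML ids are no longer needed once the references are resolved.
    return {cls.name: cls for cls in by_id.values()}

model = load_model(SAMPLE)
print(model["Order"].associations["placedBy"].name)
```

After loading, following an association is just an attribute access on an object, and the ids have disappeared from the navigation entirely.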
This really does not sound too complicated, does it? I have done some internet searches to see what's already been done, but I have not yet found anything really interesting.