Soffit progress report #2

Github repository: https://github.com/mgritter/soffit

Previous entry: "My #Procjam Project: Soffit"

I was supposed to go out and have fun this weekend; instead I both got sick and handled two customer escalations. But I made a little progress on the next task for my graph rewrite engine: I implemented pattern matching!

The code below pretty closely follows the example from my prototype using or-tools and I'm happy at how it came out. The right-hand rule will require some additional constraints in the model, to avoid leaving dangling edges or simultaneously deleting and not deleting a vertex or edge.

A design choice to explore once I have actual users: do they expect that a rule A--B matches any edge, even one with a tag? As implemented, it only matches edges without a tag. I was thinking of putting in a wildcard match whose syntax would be something like A--B [$x], as shown in my design exploration document.

    def leftSide( self, leftGraph ):        
        """Specify the left side of a rule; that is, a graph to match."""
        self.checkCompatible( leftGraph )

        self.left = leftGraph
        maxVertex = max( self.graph.nodes )

        # Build a variable for each vertex that must be matched.
        # The convention is to allow non-injective matches, so
        # two variables could match to the same vertex.
        for n in leftGraph.nodes:
            # FIXME: how permissive is this on what may be in a string?
            v = self.model.NewIntVar( 0, maxVertex, n )
            self.variables[n] = v

            # Add a contraint to only assign to nodes with identical tag.
            # FIXME: no tag is *not* a wildcard, does that match expectations?
            tag = leftGraph.nodes[n].get( 'tag', None )
            matchingTag = [ i for i in self.graph.nodes
                            if self.graph.nodes[i].get( 'tag', None ) == tag ]
            if self.verbose:
                print( "node", n, "matchingTag", matchingTag )
            if len( matchingTag ) == 0:
                self.impossible = True
                return

            # If node choice is unconstrained, don't bother adding it as
            # a constraint!
            if len( matchingTag ) != len( self.graph.nodes ):
                self.model.AddAllowedAssignments(
                    [v],
                    [ (x,) for x in matchingTag ] )

        # Add an allowed assignment for each edge that must be matched,
        # again limiting to just exact matching tags.
        for (a,b) in leftGraph.edges:
            tag = leftGraph.edges[a,b].get( 'tag', None )
            
            matchingTag = [ i for i in self.graph.edges
                            if self.graph.edges[i].get( 'tag', None ) == tag ]
            if self.verbose:
                print( "edge", (a,b), "matchingTag", matchingTag )
            if len( matchingTag ) == 0:
                self.impossible = True
                return

            # Allow reverse matches if the graph is undirected
            if not nx.is_directed( leftGraph ):
                revEdges = [ (b,a) for (a,b) in matchingTag ]
                matchingTag += revEdges

            self.model.AddAllowedAssignments( [self.variables[a], self.variables[b]],
                                              matchingTag )

Unfortunately, I found two bugs in or-tools:

The solver returned the same solution multiple times. I didn't file an issue because I didn't know if this was expected, though some of the examples suggested it was not. But a maintainer confirmed it's a bug. This one I can easily work around by putting the matches found into a set to deduplicate them.

I also crashed the C++ library with an unsatisfiable instance. This will be harder to work around. If the library was pure Python I could just catch the error, but the C++ assert kills the process. It looks like the non-callback function Solve doesn't crash, so maybe I can try calling that first.

I also found bugs in my own code, of course, and reworked the parser to create a canonical version of a graph after merging has been done. The previous version had some nodes that were aliases as separate objects in the networkx graph, and I found myself writing a separate "make it canonical" step which made no sense (and would have made tag handling a lot more complicated.)

Next steps:

Workaround or-tools bug, or fix it in the original sources
Write the additional constraints imposed by the right-hand side rule
Perform the graph rewriting!