Three-Vertex Induced Subgraph Isomorphism in DAGs
Problem Setting
This blog post is about a simple observation I stumbled upon recently. It contrasts the time complexity of two, at first glance, very similar computational tasks on directed acyclic graphs (DAGs). On the one hand, the problem of checking whether a DAG is transitive, i.e., whether implies and, on the other hand, the problem of checking whether a DAG is moral1, that is implies that and are adjacent. The latter notion stems from the graphical models literature. For an example, consider the following graphs:
The graph on the left is not transitive, for example due to , while implies non-morality of the right graph, in both cases because and are non-adjacent.
As the example illustrates, the two problems can be expressed as induced subgraph isomorphism problems, that is detecting whether a graph contains another (in this case fixed) graph as an induced subgraph2: We have non-transitivity iff and non-morality iff is an induced subgraph. Surprisingly, and the main point of this article is that computationally these two problems behave differently:
Main Observation:
Checking morality of a graph is (easily) possible in linear-time in the size of the graph, while no linear-time algorithm is known for checking transitivity (and such an algorithm appears firmly out-of-reach).
In the following, I will give evidence for this statement. Before we start a short disclaimer: Most (if not all) results I present here are already known, though somewhat spread out over the literature. I give references whenever possible, in case I missed something relevant, feel free to contact me :)
Undirected Graphs
Before we dive into it, let’s briefly revisit the more widely studied problem of induced subgraph isomorphism for undirected graphs. Here, the phenomenon, that different choices of with the same number of vertices, lead to different time complexities is well-documented. Consider the following two graphs: Here, the subgraph isomorphism problem is hard in case of in the sense that no linear-time algorithm is known. In fact, the only known subcubic algorithms are based on fast matrix multiplication, which imply that, at least in theory, the problem is solvable in time (however these algorithms are not practical). Generally, it is conjectured that there can be no algorithm for any , which is known as the sparse triangle hypothesis (see Wikipedia for a more detailed overview). Conversely, for the problem is almost trivially solvable in linear-time due to the following lemma.
Lemma (Folklore)
Let be a connected undirected graph. Then, contains an induced subgraph if, and only if, it is not complete (i.e. fully connected/a clique).
Proof.
We show two directions. If contains an induced subgraph , then it is not complete. Conversely, assume is not complete. Hence, there exist and , which are non-adjacent. Note that there exists a shortest path from to by connectedness of . Clearly, this path is induced, else it would not be shortest. Hence, either we have for some or there exists such a triple on the path.
It immediately follows that, to detect whether a graph contains an induced subgraph , it suffices to find the connected components of and check whether each of them is a clique (this can be done by counting edges).
Indeed, this observation completes the picture for three-vertex undirected graphs as the remaining graphs are analogues of the discussed cases obtainable by changing edges into non-edges and vice versa (complement graphs). Interestingly, similar (more nuanced) effects also appear on four vertices as discussed in detail by Williams, Wang, Williams, Yu (2015), which studies all possible cases in this setting.
We will now get back to DAGs. The two induced subgraphs and , on the surface, both look similar to the case for undirected graphs, which would be obtained when ignoring edge directions. The following shows that directions matter.
Triangle Hardness of Checking Transitivity
We begin by showing that detecting induced subgraph is at least as hard as detecting a triangle in an undirected graph. This is possible by giving a reduction from the latter to the former problem. This reduction has been described e.g. in this StackExchange post.
We give a version of the reduction, which is a bit easier to illustrate, starting with the triangle detection problem on tripartite graphs. This is justified by the fact that it is as hard to find a triangle in a general graph, as it is to find one in a tripartite one (the reduction is straightforward and works by creating three copies of the initial vertex set).
Hence, let be an instance of the triangle detection problem with partitions . Then, construct a directed graph with the same vertex set and edges
- if ,
- if and
- if ,
where , respectively , are assumed to be from partition . Here is an illustration of this reduction:
Observe that the resulting graph is acyclic by definition. Moreover, if contains a triangle on vertices , and , then by construction contains with and non-adjacent. Conversely if contains an induced subgraph , then observe that are in different copies of (this is crucial). It follows that contains a triangle and thus:
Lemma
If there is an algorithm for checking DAG transitivity, then triangle detection in undirected graphs can be solved in time .
Proof. Construct graph in time . Then, run the algorithm for DAG transitivity to obtain the answer (as argued above).
Linear-Time Algorithm for Checking Morality
In contrast, checking whether a graph is moral is possible in linear-time. The algorithm given below is based on the one for checking whether an ordered graph has zero fill-in3 given here. Simplified and slightly adapted to our setting, it proceeds as follows:
- Compute a topological ordering of the vertices of (that is a linear ordering of the vertices such that for every edge it holds that precedes in the ordering).
- For each vertex , compute the vertex with such that is closest to in the topological ordering. Denote this vertex as .
- For each vertex and every , check that and are adjacent (for ). If this is not the case, terminate and return “not moral”.
- Return “moral”.
Lemma [Rose, Tarjan, Lueker (1976); Tarjan, Yannakakis (1984); Whitten (1978, unpublished)]
Let be a DAG. Testing whether is moral can be done in linear time in the size of (that is , when has vertices and edges).
Proof.
For the time complexity, observe that the algorithm performs at most one adjacency check per edge. We assume here that adjacency tests are possible in constant time4.
We now show correctness. Assume the algorithm returns not moral. Then, there exists with and non-adjacent in and thus the output is correct. Conversely, assume is not moral. Then, there exist non-adjacent vertices and (w.l.o.g. assume precedes in the topological ordering computed by the algorithm) such that there is a vertex with and . Choose a triple such that the vertices and are closest of all “non-moral” triples. Clearly, the algorithm outputs not moral if . For this not to be the case, there would need to be a different vertex . It would be adjacent to and by assumption that is closest to of all vertices participating in immoral triples. However, then would be a non-moral triple with closer to than to . A contradiction.
Another line of argument is that, when imagining that the algorithm iterates over the topological ordering from left to right, we can assume by induction, when handling vertex and assuming the algorithm has not terminated yet, that the parents of are moral. Then, it is clear that it suffices to check whether all other parents of are connected to to certify that the parents of are fully connected as, by induction hypothesis, the parents of are. This is illustrated in the following picture where , , are fully connected by induction as parents of .
Further Remarks
To reflect why checking morality is easier than checking transitivity, let us try to apply the same reduction from triangle detection. Hence, we construct, in similar spirit as above, for undirected graph , a DAG with the same vertex set and edges
- if ,
- if and
- if .
By definition the graph is acyclic and, as before, a triangle in will lead to a in . The issue lies in the reverse direction. If we find in , it might be that and are both in . Then, this does not imply a triangle in . See the following example as illustration:
Here, does not contain a triangle, however contains an immorality (marked in orange). As the algorithm above suggests, it is likely not possible to get such a reduction to work. Intuitively, the reason is that allows us to distinguish between and as one is the child and the other the parent of , whereas for vertices and cannot be distinguished which makes the problem combinatorially easier. It would be interesting to further formalize this argument, e.g., for larger graphs .
Another question is what happens when dropping the acyclicity requirement? The linear-time algorithm for checking morality explicitly uses the topological ordering of the vertices and thus cannot be used anymore. If the graph is cyclic (let’s say there is still only one edge between any pair of vertices, such graphs are known as oriented graphs), it is unclear whether it is possible to solve the problem in linear-time. I tend to think it has similar hardness to triangle-detection, but it is not directly clear how to prove this by reduction. I would be very interested if you have any ideas how to tackle this question :)
Citation
You can cite this blog post as:
@article{wienoebst2023subgraphisomorphism,
title = {Three-Vertex Induced Subgraph Isomorphism in DAGs},
author = {Wien{\"o}bst, Marcel},
journal = {Blog post at mwien.github.io},
year = {2023},
howpublished = {\url{mwien.github.io/blog/subgraph-isomorphism-dags/}}
}
References
Rose, D. J.; Tarjan, R. E.; and Lueker, G. S. 1976. Algorithmic Aspects of Vertex Elimination on Graphs. In SIAM Journal on Computing 5(2).
Tarjan, R.E.; and Yannakakis, M. 1984. Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. In SIAM Journal on computing, 13(3).
Whitten, G.. 1978. Presentation at the SIAM Symposium on Sparse Matrix Computations.
Williams, V. V.; Wang, J. R.; Williams, R.; and Yu, H. 2015. Finding Four-Node Subgraphs in Triangle Time. In Proceedings of the 2015 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA).
Footnotes
Footnotes
-
The name moral graph alludes to the fact that parents of a vertex need to be adjacent, that is married. As often, this historical name is less-than-ideal in hindsight. I still decided to keep it so that this blog post is google-able. ↩
-
Induced means that the subgraph of needs to have the same non-edges as as well. See this article as a reference for the considered problem. ↩
-
We will not go into more details here, but essentially, for a DAG , the undirected graph obtained by ignoring edge directions of has zero fill-in with regard to any topological ordering of if, and only if, does not contain as induced subgraph. Hence, any algorithm for testing zero fill-in directly translates to one for checking morality. ↩
-
For sparse graphs, adjacency checks can be performed in expected constant time using space when storing the neighbors of each vertex in a hash-table. Adjacency tests can also be avoided altogether, yielding worst-case time (see e.g. the “Test For Zero Fill-In” algorithm in Tarjan, Yannakakis [1984]). I chose the presentation above for clarity. ↩