Caterpillars were first studied in a series of papers by Harary and Schwenk. The name was suggested by A. Hobbs. As Harary & Schwenk (1973) colorfully write, "A caterpillar is a tree which metamorphoses into a path when its cocoon of endpoints is removed."
The following characterizations all describe the caterpillar trees:
- They are the trees for which removing the leaves and incident edges produces a path graph.
- They are the trees in which there exists a path that contains every vertex of degree two or more.
- They are the trees in which every vertex of degree at least three has at most two non-leaf neighbors.
- They are the trees that do not contain as a subgraph the graph formed by replacing every edge in the star graph K1,3 by a path of length two.
- They are the connected graphs that can be drawn with their vertices on two parallel lines, with edges represented as non-crossing line segments that have one endpoint on each line.
- They are the trees whose square is a Hamiltonian graph. That is, in a caterpillar, there exists a cyclic sequence of all the vertices in which each adjacent pair of vertices in the sequence is at distance one or two from each other, and trees that are not caterpillars do not have such a sequence. A cycle of this type may be obtained by drawing the caterpillar on two parallel lines and concatenating the sequence of vertices on one line with the reverse of the sequence on the other line.
- They are the trees whose line graphs contain a Hamiltonian path; such a path may be obtained by the ordering of the edges in a two-line drawing of the tree. More generally the number of edges that need to be added to the line graph of an arbitrary tree so that it contains a Hamiltonian path (the size of its Hamiltonian completion) equals the minimum number of edge-disjoint caterpillars that the edges of the tree can be decomposed into.
- They are the connected graphs of pathwidth one.
- They are the connected triangle-free interval graphs.
A k-tree is a chordal graph with exactly n − k maximal cliques, each containing k + 1 vertices; in a k-tree that is not itself a (k + 1)-clique, each maximal clique either separates the graph into two or more components, or it contains a single leaf vertex, a vertex that belongs to only a single maximal clique. A k-path is a k-tree with at most two leaves, and a k-caterpillar is a k-tree that can be partitioned into a k-path and some k-leaves, each adjacent to a separator k-clique of the k-path. In this terminology, a 1-caterpillar is the same thing as a caterpillar tree, and k-caterpillars are the edge-maximal graphs with pathwidth k.
For n = 1, 2, 3, ... the numbers of n-vertex caterpillars are
- 1, 1, 1, 2, 3, 6, 10, 20, 36, 72, 136, 272, 528, 1056, 2080, 4160, ... (sequence A005418 in the OEIS).
Finding a spanning caterpillar in a graph is NP-complete. A related optimization problem is the Minimum Spanning Caterpillar Problem (MSCP) , where a graph has dual costs over its edges and the goal is to ﬁnd a caterpillar tree that spans the input graph and has the smallest overall cost. Here the cost of the caterpillar is defined as the sum of the costs of its edges, where each edge takes one of the two costs based on its role as a leaf edge or an internal one. There is no f(n)-approximation algorithm for the MSCP unless P = NP. Here f(n) is any polynomial-time computable function of n, the number of vertices of a graph.
There is a parametrized algorithm that ﬁnds an optimal solution for the MSCP in bounded treewidth graphs. So both the Spanning Caterpillar Problem and the MSCP have linear time algorithms if a graph is an outerplanar, a series-parallel, or a Halin graph.
Caterpillar trees have been used in chemical graph theory to represent the structure of benzenoid hydrocarbon molecules. In this representation, one forms a caterpillar in which each edge corresponds to a 6-carbon ring in the molecular structure, and two edges are incident at a vertex whenever the corresponding rings belong to a sequence of rings connected end-to-end in the structure. El-Basil (1987) writes, "It is amazing that nearly all graphs that played an important role in what is now called "chemical graph theory" may be related to caterpillar trees." In this context, caterpillar trees are also known as benzenoid trees and Gutman trees, after the work of Ivan Gutman in this area.
- Harary, Frank; Schwenk, Allen J. (1973), "The number of caterpillars", Discrete Mathematics, 6 (4): 359–365, doi:10.1016/0012-365x(73)90067-8.
- El-Basil, Sherif (1987), "Applications of caterpillar trees in chemistry and physics", Journal of Mathematical Chemistry, 1 (2): 153–174, doi:10.1007/BF01205666.
- Harary, Frank; Schwenk, Allen J. (1971), "Trees with Hamiltonian square", Mathematika, 18: 138–140, doi:10.1112/S0025579300008494.
- Harary, Frank; Schwenk, Allen J. (1972), "A new crossing number for bipartite graphs", Utilitas Math., 1: 203–209.
- Raychaudhuri, Arundhati (1995), "The total interval number of a tree and the Hamiltonian completion number of its line graph", Information Processing Letters, 56 (6): 299–306, doi:10.1016/0020-0190(95)00163-8.
- Proskurowski, Andrzej; Telle, Jan Arne (1999), "Classes of graphs with restricted interval models" (PDF), Discrete Mathematics and Theoretical Computer Science, 3: 167–176.
- Eckhoff, Jürgen (1993), "Extremal interval graphs", Journal of Graph Theory, 17 (1): 117–127, doi:10.1002/jgt.3190170112.
- Weisstein, Eric W. "Lobster". MathWorld.
- Khosravani, Masoud (2011). Searching for optimal caterpillars in general and bounded treewidth graphs (Ph.D.). University of Auckland.
- Gutman, Ivan (1977), "Topological properties of benzenoid systems", Theoretica Chimica Acta, 45 (4): 309–315, doi:10.1007/BF00554539.
- El-Basil, Sherif (1990), "Caterpillar (Gutman) trees in chemical graph theory", in Gutman, I.; Cyvin, S. J., Advances in the Theory of Benzenoid Hydrocarbons, Topics in Current Chemistry, 153, pp. 273–289, doi:10.1007/3-540-51505-4_28.