Hierarchy is everywhere. How can we model it using symbolic regression?
Given a hierarchical system with N state variables (v1,v2, ..., vN) and K inputs (s1, s2, ..., sK)

figure 1

symbolic regression will return a set of flat expressions:
v3 = v2 * (s1 - s2)
v2 = s3 * s4
v1 = s1 - s2
The more complex the system is, the less intelligible the models will be. Here, we proposed to build functional dependency graphs based on symbolic regression.

From Data to Hierarchy: the Naive Approach

figure 2

The Hierarchical Approach

Instead of the naive approach, we pursue a bottom-up approach. At the beginning, only the stimuli (input) variables are included in the set of independent variables. Until all state variables are modeled:
  • at each epoch, the state variable that is best explained by a subset of the current independent variables is selected
  • these independent variables are removed from the set of independent variables and the selected state variable is added instead
  • the identifed functional relationship is marked on the adjacency matrix
We proposed two algorithms (h1 and h2) that differed only in the method of selection of the best modeled variable in each epoch (figure 1). Algorithm h1 selects the best model and then extracts the corresponding dependent and independent variables from that model. Algorithm h2 first identifies the best pareto set among the independent symbolic regression runs and then identifies the most frequently used variables for modeling the corresponding variable (more details can be found in our paper).
figure 3

Our hypotheses

  • the naive approach does not guarantee discovery of hierarchy
  • low (mean) prediction error does not necessarily mean that the correct topology is captured
  • if you want to capture hierarchy, you need to explicitly seek hierarchy
The following example shows a binary hierarchical system and two models generated using the naive and hierarchical approaches. The naive approach generates a model with high predictive power, however the topology of the graph is not even remotely similar to the hidden target system. On the other hand, the hierarchical approach enforces a hierarchical topology, however misplacing one edge has caused a larger prediction error.

figure 4

Experimental Results

The results are reported in terms of three measures:
  • The percentage of correct edges recovered (1)
  • Error on testing dataset (2)
  • Percentage of correct edges recovered versus error on testing dataset (3)
The results for a set of binary hierarchical systems are shown in figure 5. For the complete set of results, please see our paper.

figure 5


  • for binary systems, the hierarchical approaches continue to outperform the naive approach as the layers of hierarchy in the hidden target system increases
  • for three-layer hierarchical hidden target systems, the hierarchical approach outperforms the naive approach until arity=5
  • the diffculty of scaling (all three approaches) to hierarchical systems with higher arity is due to the inherent issue of feature selection in symbolic regression (being studied separately)


[1] Icke, I., and Bongard, J., Automatic identification of hierarchy in multivariate data, GECCO 2013, Amsterdam, Netherlands poster (pdf)
[2] Icke, I., and Bongard, J., Modeling Hierarchy Using Symbolic Regression, IEEE CEC 2013, Cancun, Mexico