Traversing Trees for Rubyists

# Learning to Traverse a Tree

For those of you who live in Colorado like I do, I'm sure that you most certainly understand one meaning of a 'Tree', but I would like to show you another meaning in the world of computer science and software development. Whether you realize it or not you've probably implemented some code that represents a tree structure.

I've made a decision to learn more about data structures and algorithms because I know that one day I will need to answer an interview question about tree traversal that will make or break the success of my career. I used to think that this knowledge is unnecessary and that its absurd to get asked questions about Trees in an interview. Getting turned down time and time again, I decided to stop complaining and just learn it. What better way to learn about something than to blog about it. Little did I know that it would actually have a positive impact on my current work.

# What is a Tree?

I would like to explain the meaning of various methods of traversing a tree such as in-order, pre-order, post-order, and breadth-first traversal. First, we need to understand what a tree is and more specifically what a binary tree is. A tree in computer science is much like a tree in nature. It's got a root node which is the trunk and expands out into branches. Each split in a tree branch is considered to be a node. A binary tree is one in which a node only has two children: a left node and a right node. Here is an example of a binary tree which we will use throughout as an example:

      I
     / \
    O   H
   / \ / \
  L  R T  M
 / \
A   G

# Creating the Tree

The class for a node is really simple in Ruby, but be sure that you first understand OOP concepts. The only attributes we need for a node are left and right which correspond to the left and right child of the node and the value at the node. Most algorithms that involve actually changing the structure of a tree do so by changing a nodes pointer, not its value. This is why we left the value attribute as a reader only. Some attributes that are often included in a node class may be the parent and the depth as well.

class Node
  attr_accessor :left, :right
  attr_reader :val
  def initialize(val)
    @val = val
  end
end

# Build the Tree

We can certainly use one of the algorithms we are going to discuss to build the tree, but I'm just going to manually set the left and right pointers manually starting at the root node and work my way down each level.

root = Node.new("I")

# first level
root.left = Node.new("O")
root.right = Node.new("H")

# second level
root.left.left = Node.new("L")
root.left.right = Node.new("R")
root.right.left = Node.new("T")
root.right.right = Node.new("M")

# third level
root.left.left.left = Node.new("A")
root.left.left.right = Node.new("G")

# Traversing the Tree

The idea behind tree traversal algorithms is to print or get all the values of the tree but differ by the order in which those values are retrieved. There are certainly others, but I'm going to talk about four in particular. Pre Order, In Order, and Post Order are depth first algorithms meaning that they start at the top and go all the way down to the bottom of the tree before visiting other nodes in higher levels. Breadth First traversal is a form of a level order traversal in which you start at the top and traverse the nodes at each level.

Enough talk, lets see what it looks like. I'll first show the result of each algorithm first so you can see what order of values is actually returned. Results are returned as an array, but typically you might use these algorithms to just print out values, do some filtering, or even some data processing.

# Pre Order

# Visual Aid w/ indices of results:
#      I            0
#     / \          / \
#    O   H        1   6
#   / \ / \      / \ / \
#  L  R T  M    2  5 7  8
# / \          / \
#A   G        3   4
# result for pre-order traversal
pre_order(root)
=> ["I", "O", "L", "A", "G", "R", "H", "T", "M"]

You can see that in pre order we first value returned is "I" which is the root node, then down the left hand side down to A. After this, the right node of "L" which is "G", the right node of "O" which is "R", and so on.

Let's take a look at the solution. The idea is that we first visit the root, then recursively visit the left nodes, then recursively search the right nodes.

def pre_order(node)
  return [] if node.nil?
  results = []
  results << node.val
  results.concat pre_order(node.left)
  results.concat pre_order(node.right)
  results
end

# In Order

# Visual Aid w/ indices of results:
#      I            5
#     / \          / \
#    O   H        3   7
#   / \ / \      / \ / \
#  L  R T  M    1  4 6  8
# / \          / \
#A   G        0   2
# what's that spell?
# result for in-order traversal
in_order(root)
=> ["A", "L", "G", "O", "R", "I", "T", "H", "M"]

Do you see a pattern in the results? You can see that the order in which the tree is traversed occurs from left to right and that for any node in the indices tree the left child is a value smaller than the right child. This is the definition of a Binary Search Tree, one in which the values are certain to be in this order. You can see how in-order traversal is particularly useful for binary search trees.

Now the solution, In Order follows the pattern of first recursively visiting the left node, then the node itself, then the right node.

def in_order(node)
  return [] if node.nil?
  results = []
  results.concat in_order(node.left)
  results << node.val
  results.concat in_order(node.right)
  results
end

# Post Order

# Visual Aid w/ indices of results:
#      I            8
#     / \          / \
#    O   H        4   7
#   / \ / \      / \ / \
#  L  R T  M    2  3 5  6
# / \          / \
#A   G        0   1
# result for post-order traversal
post_order(root)
=> ["A", "G", "L", "R", "O", "T", "M", "H", "I"]

In Post Order traversal we see that the root node is the last item in the array. Based on the first two algorithms I'm sure you can see where this is going. We first recursively visit the left node, then the right node, then the node itself.

def post_order(node)
  return [] if node.nil?
  results = []
  results.concat post_order(node.left)
  results.concat post_order(node.right)
  results << node.val
  results
end

# Breadth First

# Visual Aid w/ indices of results:
#      I            0
#     / \          / \
#    O   H        1   2
#   / \ / \      / \ / \
#  L  R T  M    3  4 5  6
# / \          / \
#A   G        7   8
# result for breadth-first traversal
breadth_first(root)
=> ["I", "O", "H", "L", "R", "T", "M", "A", "G"]

Breadth First search is just like the order in which we manually built the tree above. We visit each from left to right at each level before going down to the next level. Implementing this is a bit tricky. The method that I am familiar with is to use a queue. A queue is a data structure in which you add items onto the end of it and remove items from the beginning, also known as a FIRST-IN-FIRST-OUT (FIFO) structure. We can use a ruby array to act like a queue.

The solution involves first adding the root node to the queue and continue a loop while the queue is not empty. Call shift on the queue which returns the next node, then add that nodes left child if it exists and that nodes right child if it exists. The value of the next node is added to results immediately after it's polled from the queue.

def breadth_first(node)
  results = []
  queue = []
  return [] if node.nil?
  queue << node
  while !queue.empty?
    next_node = queue.shift
    results << next_node.val
    if !next_node.left.nil?
      queue << next_node.left
    end
    if !next_node.right.nil?
      queue << next_node.right
    end
  end
  results
end

# What the hell do I need to know this for?

Give yourself a pat on the back, you made through it and if you look at it, it's really not that hard. If you are a web developer, then you have most certainly come across a tree-like structure such as HTML, JSON, XML. The best example I can think of is a web application in which you need to dynamically build a navigation menu with a depth of 2 or more levels. In this case you are not working with a binary tree, but a tree in which nodes could have zero or more children. Now that you have a better understanding of Tree data structures, I'm sure that it will make your job much easier as a developer.

Source: https://github.com/rlafranchi/trees/blob/master/traversal.rb