Traversing Trees for Rubyists
Learning to Traverse a Tree
For those of you who live in Colorado like I do, I'm sure that you most certainly understand one meaning of a 'Tree', but I would like to show you another meaning in the world of computer science and software development. Whether you realize it or not you've probably implemented some code that represents a tree structure.
I've made a decision to learn more about data structures and algorithms because I know that one day I will need to answer an interview question about tree traversal that will make or break the success of my career. I used to think that this knowledge is unnecessary and that its absurd to get asked questions about Trees in an interview. Getting turned down time and time again, I decided to stop complaining and just learn it. What better way to learn about something than to blog about it. Little did I know that it would actually have a positive impact on my current work.
What is a Tree?
I would like to explain the meaning of various methods of traversing a tree such as in-order, pre-order, post-order, and breadth-first traversal. First, we need to understand what a tree is and more specifically what a binary tree is. A tree in computer science is much like a tree in nature. It's got a root node which is the trunk and expands out into branches. Each split in a tree branch is considered to be a node. A binary tree is one in which a node only has two children: a left node and a right node. Here is an example of a binary tree which we will use throughout as an example:
I / \ O H / \ / \ L R T M / \ A G
Creating the Tree
The class for a node is really simple in Ruby, but be sure that you first understand OOP concepts. The only attributes we need for a node are left and right which correspond to the left and right child of the node and the value at the node. Most algorithms that involve actually changing the structure of a tree do so by changing a nodes pointer, not its value. This is why we left the value attribute as a reader only. Some attributes that are often included in a node class may be the parent and the depth as well.
class Node attr_accessor :left, :right attr_reader :val def initialize(val) @val = val end end
Build the Tree
We can certainly use one of the algorithms we are going to discuss to build the tree, but I'm just going to manually set the left and right pointers manually starting at the root node and work my way down each level.
root = Node.new("I") # first level root.left = Node.new("O") root.right = Node.new("H") # second level root.left.left = Node.new("L") root.left.right = Node.new("R") root.right.left = Node.new("T") root.right.right = Node.new("M") # third level root.left.left.left = Node.new("A") root.left.left.right = Node.new("G")
Traversing the Tree
The idea behind tree traversal algorithms is to print or get all the values of the tree but differ by the order in which those values are retrieved. There are certainly others, but I'm going to talk about four in particular. Pre Order, In Order, and Post Order are depth first algorithms meaning that they start at the top and go all the way down to the bottom of the tree before visiting other nodes in higher levels. Breadth First traversal is a form of a level order traversal in which you start at the top and traverse the nodes at each level.
Enough talk, lets see what it looks like. I'll first show the result of each algorithm first so you can see what order of values is actually returned. Results are returned as an array, but typically you might use these algorithms to just print out values, do some filtering, or even some data processing.
# Visual Aid w/ indices of results: # I 0 # / \ / \ # O H 1 6 # / \ / \ / \ / \ # L R T M 2 5 7 8 # / \ / \ #A G 3 4 # result for pre-order traversal pre_order(root) => ["I", "O", "L", "A", "G", "R", "H", "T", "M"]
You can see that in pre order we first value returned is "I" which is the root node, then down the left hand side down to A. After this, the right node of "L" which is "G", the right node of "O" which is "R", and so on.
Let's take a look at the solution. The idea is that we first visit the root, then recursively visit the left nodes, then recursively search the right nodes.
def pre_order(node) return  if node.nil? results =  results << node.val results.concat pre_order(node.left) results.concat pre_order(node.right) results end
# Visual Aid w/ indices of results: # I 5 # / \ / \ # O H 3 7 # / \ / \ / \ / \ # L R T M 1 4 6 8 # / \ / \ #A G 0 2 # what's that spell? # result for in-order traversal in_order(root) => ["A", "L", "G", "O", "R", "I", "T", "H", "M"]
Do you see a pattern in the results? You can see that the order in which the tree is traversed occurs from left to right and that for any node in the indices tree the left child is a value smaller than the right child. This is the definition of a Binary Search Tree, one in which the values are certain to be in this order. You can see how in-order traversal is particularly useful for binary search trees.
Now the solution, In Order follows the pattern of first recursively visiting the left node, then the node itself, then the right node.
def in_order(node) return  if node.nil? results =  results.concat in_order(node.left) results << node.val results.concat in_order(node.right) results end
# Visual Aid w/ indices of results: # I 8 # / \ / \ # O H 4 7 # / \ / \ / \ / \ # L R T M 2 3 5 6 # / \ / \ #A G 0 1 # result for post-order traversal post_order(root) => ["A", "G", "L", "R", "O", "T", "M", "H", "I"]
In Post Order traversal we see that the root node is the last item in the array. Based on the first two algorithms I'm sure you can see where this is going. We first recursively visit the left node, then the right node, then the node itself.
def post_order(node) return  if node.nil? results =  results.concat post_order(node.left) results.concat post_order(node.right) results << node.val results end
# Visual Aid w/ indices of results: # I 0 # / \ / \ # O H 1 2 # / \ / \ / \ / \ # L R T M 3 4 5 6 # / \ / \ #A G 7 8 # result for breadth-first traversal breadth_first(root) => ["I", "O", "H", "L", "R", "T", "M", "A", "G"]
Breadth First search is just like the order in which we manually built the tree above. We visit each from left to right at each level before going down to the next level. Implementing this is a bit tricky. The method that I am familiar with is to use a queue. A queue is a data structure in which you add items onto the end of it and remove items from the beginning, also known as a FIRST-IN-FIRST-OUT (FIFO) structure. We can use a ruby array to act like a queue.
The solution involves first adding the root node to the queue and continue a loop while the queue is not empty. Call shift on the queue which returns the next node, then add that nodes left child if it exists and that nodes right child if it exists. The value of the next node is added to results immediately after it's polled from the queue.
def breadth_first(node) results =  queue =  return  if node.nil? queue << node while !queue.empty? next_node = queue.shift results << next_node.val if !next_node.left.nil? queue << next_node.left end if !next_node.right.nil? queue << next_node.right end end results end
What the hell do I need to know this for?
Give yourself a pat on the back, you made through it and if you look at it, it's really not that hard. If you are a web developer, then you have most certainly come across a tree-like structure such as HTML, JSON, XML. The best example I can think of is a web application in which you need to dynamically build a navigation menu with a depth of 2 or more levels. In this case you are not working with a binary tree, but a tree in which nodes could have zero or more children. Now that you have a better understanding of Tree data structures, I'm sure that it will make your job much easier as a developer.