Trees

4 minute read

In a Tree, the running time for most operations is O(logN)

4.1 Preliminaries

  • A tree can be defined in several ways. One natural way to define a tree is recursively.
  • A tree is a collection of nodes.
  • If not empty, a tree consists of a distinguished node r, called the root, and zero or more nonempty (sub)trees T1, T2, …. , Tk, each of whose roots are connected by a directed edge from r
  • The root of each subtree is said to be a child of r, and r is the parent of each subtree root.

tree

Figure 4.1 A tree

4.1.1 Implementation of Trees

The solution is simple: Keep the children of each node in a linked list of tree nodes.

#define ElementType int
typedef struct TreeNode* PtrToNode;

struct TreeNode
{
	ElementType Element;
	PtrToNode FirstChild;
	PtrToNode NextSibling;
};

FirstChild points to the head of its child NextSibling points to its brother in the linked list

4.1.2 Tree Traversals with an Application

The structure of a Unix file system. /usr Codes to imply this idea:

static void ListDir( DirectoryOrFile D, int Depth )
{
	if (D is a legitimate entry)
	{
		PrintName(D, Depth);
		if (D is an directory )
		{
			for each child, C of D
				ListDir(C, Depth+1);
		}
	}
}

void ListDirectory( DirectoryOrFile D )
{
	ListDir(D, 0);
}

static void SizeOfDirectory( DirectoryOrFile D )
{
    int TotalSize;
    
    TotalSize = 0;
    if( D is a legitimate entry )
    {
        TotalSize = FileSize( D );
        if( D is a directory )
            for each child C of D
                TotalSize += SizeOfDirectory( C );
    }
    return TotalSize;
}

4.2 Binary Trees

A binary tree is a tree in which no node can have more than two children.

4.2.1 Implementation

  • Because a binary tree has most two children, we can keep direct pointers to them.
  • In that a node is a structure consists of the Key information plus two pointers (left and right) to other nodes.
typedef struct TreeNode* PtrToNode;
typedef struct PtrToNode Tree;

struct TreeNode
{
	ElementType Element;
	Tree Left;
	Tree Right;
};

4.2.2 Expression Trees

tree We can produce an (overly parenthesized) infix expression by recursively producing a parenthesized left expression, then print out the operator at the root, and finally recursively producing a right parenthesized expression.

This general strategy is known as an inorder traversal; it is easy to remember because of the type of expression it produces.

There are two alternative strategies. See the book for more examples.

Constructing an Expression Tree

We now give an algorithm to convert a postfix expression into an expression tree. As an example, suppose the input is: a b + c d e + * * Codes can be found here.

4.3 AVL Tree

An AVL tree is a search tree with a balance condition.

  • Require the left and right subtrees have the same height
  • Every node must have left and right trees with the same height.

An AVL tree is identical to a binary search tree, except that for every node in the tree, the height of the left and right subtrees can differ at most 1. (The height of a empty tree is defined to be -1) AVL-Tree-1 Figure: A AVL tree

Note: Balanced factor (Wikipedia) In a binary tree the balance factor of a node N is defined to be the height difference

BalanceFactor(N) := Height(RightSubtree(N)) – Height(LeftSubtree(N))

of its two child subtrees. A binary tree is defined to be an AVL tree if the invariant

BalanceFactor(N) ∈ {–1,0,+1}

holds for every node N in the tree.

A node N with BalanceFactor(N) < 0 is called “left-heavy”, one with BalanceFactor(N) > 0 is called “right-heavy”, and one with BalanceFactor(N) = 0 is sometimes simply called “balanced”.

The minimum number of nodes, S(h) , in an AVL tree of height h is given by

S(h) = S(h-1) + S(h-2) + 1 
  • Thus, all the tree operations can be performed in O() time, except possibly insertion( we will assume lazy deletion ). When we do a insertion, we need to update all the balancing information for the nodes on the path back to the root, but the reason that insertion is potentially difficult is that inserting a node could violate the AVL Tree property.
  • If this is the case, then the property has to be resorted before the insertion step is considered over. It turns out that this can always be done with a simple modification to the tree, known as a rotation
  • After insertion, only nodes that are on the path from the insertion point to the root might have their balance altered because only those nodes have their subtrees altered.
  • As we follow the path up to the root and update the balancing information, we may find a node whose new balance violates the AVL condition.

Lets call the node that must be rebalanced . Since any node has at most two children, and a height imbalance requires that ’s two subtrees’ height differ by two, it is easy to see that a violation might occur in four cases:

  1. An insertion into the left subtree of the left child of ;
  2. An insertion into the right subtree of the left child of .
  3. An insertion into the left subtree of the right child of .
  4. An insertion into the left subtree of the right child of .

The first case (left-left right-right) is fixed by a single rotation of the tree and the second case (left-right right-left) is handled by a more complex double rotation.

AVL rotation