May 8, 1995 Introduction The use of multimedia tools in the classroom is becoming increasingly frequent as technology advances. Our motivation is to take advantage of this interest and help undergraduate students in Computer Science to understand better some basic concepts of Data Structures. Algorithm visualization is one way of presenting this relatively abstract material in very clear and concrete terms. The main goal is to present the concept of data structures using three different approaches: - Textual definition and explanation, providing the theoretical basis necessary for understanding such algorithms. - Pseudo-code presentation and trace control of the algorithms. - Algorithm visualization, linking symbolic representation of the data structure and the actual execution of the algorithm in the code window. Background Very often in Computer Science applications, there is a need to discover if a collection contains a specific element. These types of problems are called search problems, and specially for large data sets it is very important that search algorithms be as efficient as possible. Whereas binary search is an appropriate technique for the problem of static searching, where the data values are known in advance and it is not possible to add or delete elements from the collection (or where addition and removal of elements is very infrequent), the more general case involves dynamic searching schemes. In these cases a vector may not be an appropriate structure, since it requires O(n) operations in the worst case. Using alternative data structures, we can reduce this complexity significantly. One such data structure is the binary search tree. A binary search tree is organized, as the name suggests, in a binary tree, as shown in Figure 1.1. Such a tree can be represented by a linked data structure in which each node is an object. In addition to a key field, each node contains fields left and right that point to the nodes corresponding to its left and its right child respectively. If a child is missing, the appropriate field contains the value NIL.  Figure 1.1 : A very balanced tree Binary Search Tree Properties The keys in a binary search tree are always stored in such a way to satisfy the binary search tree property: "Let x be a node in a binary search tree. If y is a node in the left subtree of x, then value[y]<=value[x]. If y is a node in the right subtree of x, then value[x]<=value[y]."[2] Thus, in Figure 1.2 a, the value of the root is 5, the values 2, 3, and 4 in its left subtree are no larger than 5, and the values 7 and 8 in its right subtree are no smaller than 5. The same property holds for every node in the tree. For example, the value 3 in the Figure 1.2 b is no smaller than the value 2 in its left subtree and no larger than the value 5 in its right subtree.   (a) (b) Figure 1.2 Examples of Binary Trees Operations on Binary Search Trees The most common operation performed on a binary search tree is searching for a key (value) stored in the tree. Besides the FIND operation, binary search trees support dynamic set operations like insertion and deletion (clearing is just deleting all the nodes in the tree). In our examples, we are going to show animations of the insertion and deletion operations mentioned above. Besides FIND, INSERT, and DELETE, the CLEAR operation deletes all the nodes of a binary search tree. Further explanation about each algorithm can be found under the different menu options. It is important to note that these operations can all be supported in time O(h) on a binary tree of height h, which is a major improvement over the binary search using a vector implementation (which takes O(n) time) . Insertion To insert a new value "value" into a binary search tree T, we use the procedure TREE_INSERT. The procedure receives the value to insert and the tree pointer T. It traverses the tree recursively until it finds the right place of insertion, and updates the tree structure accordingly. The pseudo-code shown below illustrates how TREE_INSERT works. The procedure starts at the root of the tree and traces a recursive path downward. The if statement at line 3 tests if the value to be inserted is less than the actual value in the node pointed by T. If this is true, the procedure calls itself recursively, but this time it passes the left subtree of T as a parameter. The reason for that follows straight from the Binary Search Tree property, which states that smaller values should always be left-descendents of their respective ancestors. Conversely, if "value" is greater than the value of T, it passes the right subtree of T as a parameter for the next recursive call in line 5. TREE_INSERT terminates after it finds a NIL child (a new leaf is always created as a result of this procedure) and it backtracks from its recursive calls, updating the tree structure. TREE_INSERT (T, value) begin 1. if (T = NIL) then 2. T = alloc_node(NIL); 3. else if (val < T->value) then 4. TREE_INSERT (T->left, val); 5. else TREE_INSERT(T->right, val); 6. end. Deletion The procedure for deleting a given node with value "value" from a binary search tree takes a pointer to T and the value to be deleted. This procedure is somewhat more complex because it deals with three different possibilities, instead of just one. If the node pointed by "value" has no children, the conditions on lines 7., 8. and 9. are false, temp gets the value of T, and is deleted in line 13. In other words, if the node to be deleted has no children, we simply delete it. If the node has only one child, either one of the two conditions in line 7 or 8 will be true, and the statements t= t->right or t=t->left will have the effect of "splicing out" the node by making a link between its child and its parent. After the link is make, the node is deleted in statement 13. The most complex situation arises when the node has two children. In this case, the procedure slices out the node's successor DELETE_MIN (T->right) and replace the contents of T with the contents of this successor. This is represented in the code by the statements in lines 9-13. Notice that procedure DELETE_MIN, when passed the argument (T->right), finds and deletes the minimum value in the right subtree of T. Since this tree contains only contains values greater than T->value (and greater values cannot be anywhere else in the tree structure, ) finding its minimum corresponds to finding the successor of T. DELETE (T, value) NODE_PTR temp; begin 1. if (T = NIL) then print("Value not in tree") 2. else if (value < T -> value) then DELETE (T->left, value) 3. else if (value > T -> value) then DELETE (T->right,value) 4. else begin /* After finding the node then delete it */ 5. alloc_node(temp); 6. temp = T; 7. if (T->left = NIL) then T = T -> right 8. else if (T->right = NIL) then T = T -> left 9. else begin 10. temp = DELETE_MIN (T->right); 11. T -> value = temp -> value; 12. end else; 13. free_node(temp); 14 end else; 15.end begin. Clear Clearing the tree is a useful operation and it helps us to understand the recursive structure of the algorithms presented here. The procedure receives a node T that traverses the tree downward starting from left until it reaches its lowest level. It then traverses the subtree at the right, deleting a node whenever it reaches a NIL node in the bottom. This particular procedure performs what is knows as post-order traversal, because it first goes through both the left and right subtrees before reaching back the root. TREE_CLEAR (value) begin 1. if (x <> NIL) then 2. begin 3. TREE_CLEAR(x->left); 4. TREE_CLEAR(x->right); 5. DELETE(x); 6. endif 7. end. Find We use the following procedure to search for a node with a predefined value in a binary search tree. Given a pointer to the root of the tree and a number,"value", FIND returns a pointer to a node with the same value if one exists. Otherwise, it returns NIL. The procedure begins its search at the root and traces a path downward in the tree. For each node T encountered, it compares the value passed with T->value. If the two keys are equal, the search terminates. If T is smaller than value, the search continues in the left subtree of x, since the binary search tree property implies that k could not be stored in the right subtree. Symmetrically, if value is larger than T-> value, the search continues in the right subtree. The node encountered during the recursion forms a path downward from the root of the tree. FIND (T, value) 1. if (T = NIL) or (T->value = value) 2. then return T 3. if (k < T->value) 4. then return FIND(T->left, value) 5. else return FIND(T->right, value); Credits  Dr. Ronald D. Kriz is an associate professor of the Department of Engineering Science and Mechanics, and of the Department of Material Science and Engineering. His areas of research include solid mechanics, materials research, structural design, and nondestructive testing.  Gordon G. Miller, III is currently the director of the Multimedia Lab at Virginia Tech, a cross-platform multimedia development facility supported by the College of Engineering and the College of Architecture and Urban Studies. References [1] Bove, Tony; Rhodes, Cheryl. Official Macromedia Director Studio, Random House/New Media Series, 1994. [2] Cormen, Thomas H.; Leiserson Charles E.; Rivest Ronald L. Introduction to Algorithms., MIT Press, 1990. [3] Macromedia Director version 4, Using Lingo, Macromedia Inc., 1994. [4] Yang, Jun; Shaffer Clifford A.; Heath Lenwood S. The Swan User's Manual, version 0.1, Department of Computer Science, Virginia Tech, 1994.