Assignment 2: Binary trees

Deadline: Nov 18, 2024 14:00 CET

In this set of exercises you will practice working with binary trees. We will work with examples of binary phrase structure (syntactic analyses of natural language sentences).

Exercises

Implement the following methods in the BinaryTree class in the template, which represents a binary tree as a Python list (using it as an array). Please follow the (standard) procedure for implementing binary trees with arrays as described during the lectures.

Throughout the implementation, we use indices of the array where a node is stored as the ID of the node.

For empty slots in the array (the locations corresponding to non-existing nodes), use None as a placeholder. The size of the list should be the minimum size required by the tree represented.

As usual a set of test cases are provided.

Penn Treebank format

A popular format for storing parse trees is so-called Penn treebank notation. The nodes represent either a phrase label (like NP below), or a word, or terminal (like John below). The format includes a hierarchical organization of parenthetical expressions. The label of an internal node is indicated just after an opening parentheses. The leaf nodes, the words, are not placed inside parenthesis, and indicated next to their immediate parent. The unary rules (internal nodes with a single child) are only allowed if the child is a leaf node (word). Hence, every expression inside the parentheses are either two labels (a leaf node preceded by its parent, no parentheses) or a node label followed by its left and right subtree (both subtrees surrounded by parentheses). An example parse tree and its Penn treebank representation are provided below.

(S (NP John)
   (VP (V ate)
       (NP (Det a)
           (N pizza)
       )
   )
)

Note that we are only concerned with binary trees of the form above (actual Penn treebank format allows n-ary trees with arbitrary n). We also assume that the terminal nodes (the words) are always the left child of their parent.

Wrapping up

Do not forget to