Before You Code

Understanding Abstract Syntax Trees

Before building AST analyzers in OCaml, let's develop intuition for how code becomes trees, why tree structure matters, and the key operations (traversals, symbol tables, transformations) you'll implement in the exercises.

Why Do We Need ASTs?

Source code is just a string of characters. To analyze or transform it, we need a structured representation that captures the meaning — not just the syntax. Consider this expression:

The Precedence Problem
3 + 4 * 2

Does this equal 14 (evaluating left-to-right: 7*2) or 11 (respecting precedence: 3+8)? A flat string can't tell us. But a tree makes the structure explicit:

      *          ← evaluates last
     / \
    +   2        ← 3+4=7, then 7*2=14
   / \
  3   4

  Result: 14  ← WRONG!

The Problem

Reading left-to-right ignores precedence. Without tree structure, a flat scan of "3 + 4 * 2" would compute (3+4)*2 = 14 instead of 3+(4*2) = 11.

Structure over Syntax

ASTs strip away irrelevant details (whitespace, parentheses, semicolons) and keep the semantic structure.

Foundation for Analysis

Every analysis tool — linters, type checkers, optimizers — works on ASTs. It's the universal intermediate representation.

Enable Transformations

Refactoring, optimization, and code generation all work by transforming one AST into another.

From Code to Tree

Watch how a parser builds an AST from a single line of code, step by step. Notice how operator precedence is captured by tree depth.

Step 1/6

Source

let x = 3 + 4 * 2

AST

Program
└── LetStmt ???

Start: identify the statement

The parser sees `let x = ...` and creates a LetStmt node. But it hasn't parsed the right-hand side yet.

Tree Traversals

Once you have an AST, you need to walk it. The order you visit nodes determines what you can compute. Step through three traversal orders on the expression tree for 3 + 4 * 2.

Order:
+
/ \
3 *
/ \
4 2
1/5

pre-order

Visit the node BEFORE its children. Root first, then recurse left, then right. Used for: copying trees, prefix notation, serialization.

Visit order

Symbol Tables & Scope Lookup

A symbol table maps variable names to their declarations. When code has nested scopes, the same name can refer to different things. Click a lookup scenario to see how the scope chain resolves variables.

1let x = 10 // line 1: global scope
2let y = x + 1 // line 2: global scope
3{ // line 3: enter inner scope
4 let x = 20 // line 4: inner scope (shadows!)
5 let z = x + y // line 5: inner scope
6} // line 6: exit inner scope
7let w = x + 2 // line 7: back in global scope

Looking up x on line 2

1.globalFound!x = 10

Result

x = 10 (from global)

On line 2, `x` is referenced. The symbol table looks in the current scope (global) and immediately finds `x` declared on line 1 with value 10.

AST Transformations

Once you can build and traverse ASTs, you can transform them — rewriting the tree to optimize, simplify, or prepare code for further analysis. Here are three fundamental transforms you'll implement in the exercises.

Constant Folding

Evaluate compile-time constants

When both operands of an operator are literals, the compiler can compute the result at compile time instead of at runtime.

Before

1let x = 3 + 4 * 2
2let y = x + 1

After

1let x = 11
2let y = x + 1

How It Works

The compiler walks the AST bottom-up (post-order). At `4 * 2`, both children are IntLit — so it folds to IntLit(8). Then `3 + 8` folds to IntLit(11). This is why post-order traversal matters for evaluation!

Ready to Build Your Own?

You now understand how ASTs represent code, how to traverse them, how symbol tables resolve names, and how transformations rewrite trees. Time to implement these concepts in OCaml.