How Computers Math: AST – David Browning

The answer to how a computer solves a math problem is not obvious, hence why I’m writing this and you’re reading it. I assume you know about order-of-operations and have used a complex calculator. First, let’s look at why math is not so straight-forward:

Solve this expression: $8 + 2 / 2 - 3$

Hopefully you got 6. You cannot solve this by purely going left-to-right. Do the division first, then read left-to-right. Computers must do the same thing except: how does the computer know what parts to do first? And the expressions can get more complicated: $(1 + 3^2) / 2$

Some Words About Trees

Someone smarter than me came up with abstract syntax trees (AST). I’ll stop you now if you are thinking about the green and brown things that we use to make paper. A tree, in computer science, is a way of organizing pieces of information. It’s built using nodes, where each node can have any number of things attached to it but can only be attached to one thing. It looks like the branches of a tree. Its fall right now, so pause and look out your window at the naked trees.

Since my description is a little abstract, look at this picture.

Traversing a tree is when your code visits all the nodes. For example, to count the number of green ornaments, we must visit every node, check if it’s green, and if it is, make a tally mark. When traversing a tree, we start at the root node, but there are a few ways to go from there. We could look at the node and then look at its left child and then its right child. Following that path, here is the order in which we visit all the nodes for my example tree:

A, B, C, D, E, F, G, H

We could also traverse it by looking at the node’s left child, then the right, then the node itself. Here is the order in which we visit the nodes:

C, D, B, G, F, H, E, A

When you traverse a tree starting with a child, you end up starting at the bottom of the tree and working your way up.

How a Computer Solves the Tree

When a computer solves a simple math expression, it doesn’t read it the same way a person does. People read 1 + 2 but computers read + 1 2. (That's just how x86 CPUs work) Here is the assembly code for solving 1 + 2 and putting the answer in memory:

ADC RAX, 1, 2

I use Intel assembly, so the destination register comes after the opcode.

Here's the part that gets me out of bed in the morning to write code. If you traverse the tree differently, you get the code to solve the tree. Assembly code requires you to read parent node, left node, then right node. There is a catch, unlike my earlier example, you must start at the bottom-most nodes. Purely following parent, left, right, gives you this invalid code:

ADC RAX, IMUL RBX, 2, 3, RBX, 2

For that AST we solved earlier, starting at the bottom-most parent and using RAX as the memory where we put the answer, here is the code:

IMUL RBX, 2, 3,
ADC RAX, RBX, 2

The answer is in RAX now.

For another example, let’s look at Adobe PostScript. Instead of discreet pieces of memory like RAX, RBX, etc., PostScript stacks the numbers on top of each other. When you do an operation, the computer pops the top two numbers off, does the operation, and puts the answer on top of the stack. (It’s called a stack machine) You need to traverse the tree so that you get code like this:

1 2 add

Traverse the tree going left node, right node, parent node. Here is the PostScript code you get for our example tree:

2 3 mul
2 add

The answer is on top of the stack.

Some Words About Trees

How a Human Solves the Tree

How a Computer Solves the Tree