Parsing Tokens to Expressions and Statements
Learning Objectives
- You understand what expressions and statements are and what they do.
- You can define enums for operators, expressions, and statements.
- You know how to parse primary expressions (numbers, strings, variables).
- You can parse binary operations into expression trees.
- You can parse different statement types (PRINT, LET, IF, GOTO).
From Tokens to Expressions and Statements
Before we can parse tokens into expressions and statements, we need to define what those expressions and statements look like in our code. We’ll distinguish between expressions (which produce values) and statements (which perform actions). We’ll further need a set of operators to represent operations like addition and comparison.
Operators
Operators are symbols that represent operations. We’ll define enums for arithmetic and comparison operators. Modify the src/ast.rs file to match the following:
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
Add,
Sub,
Mul,
Div,
}
#[derive(Debug, Clone, Copy)]
pub enum ComparisonOp {
Equal,
Greater,
Less,
}
The above enums represent the basic arithmetic operations (+, -, *, /) and comparison operations (=, >, <) that we’ll need for our BASIC interpreter.
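To make the mapping between variants and BASIC symbols concrete, here is a minimal sketch. The symbol() methods are hypothetical helpers added for illustration; they are not part of the chapter’s ast.rs.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, Clone, Copy)]
pub enum ComparisonOp {
    Equal,
    Greater,
    Less,
}

impl ArithOp {
    /// Hypothetical helper: the BASIC symbol this variant stands for.
    pub fn symbol(self) -> char {
        match self {
            ArithOp::Add => '+',
            ArithOp::Sub => '-',
            ArithOp::Mul => '*',
            ArithOp::Div => '/',
        }
    }
}

impl ComparisonOp {
    /// Hypothetical helper: the BASIC symbol this variant stands for.
    pub fn symbol(self) -> char {
        match self {
            ComparisonOp::Equal => '=',
            ComparisonOp::Greater => '>',
            ComparisonOp::Less => '<',
        }
    }
}
```

Because both enums derive Copy, the methods can take self by value and the operator remains usable afterwards.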
Expressions
Expressions are constructs that produce values. In our BASIC interpreter, expressions can be numbers, strings, variables, or binary operations (like addition). We’ll define an Expr enum to represent these different kinds of expressions. Modify the src/ast.rs file to include the following:
#[derive(Debug, Clone)]
pub enum Expr {
Number(i32),
String(String),
Variable(String),
BinOp(Box<Expr>, ArithOp, Box<Expr>),
}
Note how we use Box<Expr> for the left and right operands of the BinOp variant. This is necessary because expressions can be nested: without the indirection, Expr would contain itself and have no finite size. Box stores the operand on the heap, so the enum only holds a fixed-size pointer (as we’ve discussed in the chapter on Structs, Enums, and Error Handling in the previous part).
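As a small illustration of working with the boxed operands, the sketch below walks an expression tree recursively. The depth() function is a hypothetical helper, not something the chapter defines; the enums are copied here only so that the block is self-contained.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, Clone)]
pub enum Expr {
    Number(i32),
    String(String),
    Variable(String),
    BinOp(Box<Expr>, ArithOp, Box<Expr>),
}

/// Hypothetical helper: number of levels in the expression tree.
/// A leaf (number, string, variable) counts as one level.
pub fn depth(expr: &Expr) -> usize {
    match expr {
        // `left` and `right` are `&Box<Expr>`; they coerce to `&Expr` here.
        Expr::BinOp(left, _, right) => 1 + depth(left).max(depth(right)),
        _ => 1,
    }
}
```

For (2 + 3) * 4, depth() returns 3: the multiplication, the addition beneath it, and the number leaves. However deep the tree gets, each Expr value itself stays the same size, because the children live behind Box pointers.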
Statements
Statements are constructs that perform actions. In our BASIC subset, statements cover printing values, assigning values to variables, conditional jumps, unconditional jumps, and program termination. We’ll define a Statement enum to represent these different kinds of statements. Modify the src/ast.rs file to include the following:
#[derive(Debug, Clone)]
pub enum Statement {
Print(Expr),
Let(String, Expr),
If {
left: Expr,
op: ComparisonOp,
right: Expr,
target: usize,
},
Goto(usize),
End,
}
A statement can include expressions as part of its structure. For example, the Print statement takes an Expr to print, and the Let statement assigns an Expr to a variable. The If statement uses expressions for both sides of the comparison. However, an expression cannot contain statements.
Full ast.rs File
The full src/ast.rs file should look similar to the following:
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
Add,
Sub,
Mul,
Div,
}
#[derive(Debug, Clone, Copy)]
pub enum ComparisonOp {
Equal,
Greater,
Less,
}
#[derive(Debug, Clone)]
pub enum Expr {
Number(i32),
String(String),
Variable(String),
BinOp(Box<Expr>, ArithOp, Box<Expr>),
}
#[derive(Debug, Clone)]
pub enum Statement {
Print(Expr),
Let(String, Expr),
If {
left: Expr,
op: ComparisonOp,
right: Expr,
target: usize,
},
Goto(usize),
End,
}
As an example, the statement IF X > 10 THEN 50 could be written as follows:
let stmt = Statement::If {
left: Expr::Variable("X".to_string()),
op: ComparisonOp::Greater,
right: Expr::Number(10),
target: 50,
};
and visualized as shown in Figure 1 below.
Similarly, the statement LET X = 5 + 3 could be written as follows:
let stmt = Statement::Let(
"X".to_string(),
Expr::BinOp(
Box::new(Expr::Number(5)),
ArithOp::Add,
Box::new(Expr::Number(3)),
),
);
and visualized as follows:
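A textual outline can serve the same purpose as a diagram. The sketch below uses a hypothetical helper, outline(), which is not part of the chapter’s ast.rs; it prints each operator on its own line with its operands indented beneath it.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, Clone)]
pub enum Expr {
    Number(i32),
    String(String),
    Variable(String),
    BinOp(Box<Expr>, ArithOp, Box<Expr>),
}

/// Hypothetical helper: renders the tree as an indented outline,
/// two spaces per nesting level, one node per line.
pub fn outline(expr: &Expr, level: usize) -> String {
    let pad = "  ".repeat(level);
    match expr {
        Expr::Number(n) => format!("{}Number {}\n", pad, n),
        Expr::String(s) => format!("{}String {:?}\n", pad, s),
        Expr::Variable(name) => format!("{}Variable {}\n", pad, name),
        Expr::BinOp(left, op, right) => format!(
            "{}{:?}\n{}{}",
            pad,
            op,
            outline(left, level + 1),
            outline(right, level + 1)
        ),
    }
}
```

For the expression 5 + 3, outline(&expr, 0) produces the line Add followed by Number 5 and Number 3, each indented by two spaces.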
In functional programming languages, statements are often replaced by expressions. Instead of having separate constructs for actions (statements) and values (expressions), everything is treated as an expression that produces a value. This can lead to a more uniform and composable way of structuring code.
Adding Tests
Next, append the following tests to the ast.rs file.
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_expr_number() {
let expr = Expr::Number(42);
let cloned = expr.clone();
println!("{:?}", cloned);
}
#[test]
fn test_addition() {
let expr = Expr::BinOp(
Box::new(Expr::Number(2)),
ArithOp::Add,
Box::new(Expr::Number(3)),
);
println!("{:#?}", expr);
}
#[test]
fn test_nested_expression() {
let expr = Expr::BinOp(
Box::new(Expr::BinOp(
Box::new(Expr::Number(2)),
ArithOp::Add,
Box::new(Expr::Number(3)),
)),
ArithOp::Mul,
Box::new(Expr::Number(4)),
);
println!("{:#?}", expr);
}
#[test]
fn test_variable_expression() {
let expr = Expr::Variable("X".to_string());
match expr {
Expr::Variable(name) => assert_eq!(name, "X"),
_ => panic!("Expected Variable"),
}
}
#[test]
fn test_string_expression() {
let expr = Expr::String("Hello".to_string());
match expr {
Expr::String(s) => assert_eq!(s, "Hello"),
_ => panic!("Expected String"),
}
}
#[test]
fn test_let_statement() {
let stmt = Statement::Let("X".to_string(), Expr::Number(5));
match stmt {
Statement::Let(var, _) => assert_eq!(var, "X"),
_ => panic!("Expected Let"),
}
}
#[test]
fn test_print_statement() {
let stmt = Statement::Print(Expr::Number(42));
match stmt {
Statement::Print(_expr) => {}
_ => panic!("Expected Print"),
}
}
#[test]
fn test_if_statement() {
let stmt = Statement::If {
left: Expr::Variable("X".to_string()),
op: ComparisonOp::Greater,
right: Expr::Number(10),
target: 50,
};
match stmt {
Statement::If { target, .. } => assert_eq!(target, 50),
_ => panic!("Expected If"),
}
}
#[test]
fn test_goto_statement() {
let stmt = Statement::Goto(20);
match stmt {
Statement::Goto(line) => assert_eq!(line, 20),
_ => panic!("Expected Goto"),
}
}
#[test]
fn test_end_statement() {
let stmt = Statement::End;
match stmt {
Statement::End => {}
_ => panic!("Expected End"),
}
}
#[test]
fn test_all_operators() {
// Test that all operators can be created
let _add = ArithOp::Add;
let _sub = ArithOp::Sub;
let _mul = ArithOp::Mul;
let _div = ArithOp::Div;
let _eq = ComparisonOp::Equal;
let _gt = ComparisonOp::Greater;
let _lt = ComparisonOp::Less;
}
#[test]
fn test_operator_copy() {
let op1 = ArithOp::Add;
let op2 = op1;
// Both should still be usable
println!("{:?} {:?}", op1, op2);
}
#[test]
fn test_expression_clone() {
let expr1 = Expr::BinOp(
Box::new(Expr::Number(2)),
ArithOp::Add,
Box::new(Expr::Number(3)),
);
let expr2 = expr1.clone();
// Both should still be usable
println!("{:?}", expr1);
println!("{:?}", expr2);
}
#[test]
fn test_deeply_nested() {
// ((1 + 2) * (3 + 4)) + ((5 + 6) * (7 + 8))
let expr = Expr::BinOp(
Box::new(Expr::BinOp(
Box::new(Expr::BinOp(
Box::new(Expr::Number(1)),
ArithOp::Add,
Box::new(Expr::Number(2)),
)),
ArithOp::Mul,
Box::new(Expr::BinOp(
Box::new(Expr::Number(3)),
ArithOp::Add,
Box::new(Expr::Number(4)),
)),
)),
ArithOp::Add,
Box::new(Expr::BinOp(
Box::new(Expr::BinOp(
Box::new(Expr::Number(5)),
ArithOp::Add,
Box::new(Expr::Number(6)),
)),
ArithOp::Mul,
Box::new(Expr::BinOp(
Box::new(Expr::Number(7)),
ArithOp::Add,
Box::new(Expr::Number(8)),
)),
)),
);
// Should compile
println!("{:#?}", expr);
}
}
When you run the tests with cargo test ast, you’ll notice that the tests pass.
When you study the above tests, you may notice that many tests don’t have assertions. This is intentional! The purpose of these tests is to ensure that we can create and manipulate our AST nodes without runtime errors. The println! statements help us visually inspect the structure of the nodes during development.
In Rust, because of the strong type system, if we can successfully create and clone these nodes, it gives us confidence that our AST definitions are correct.
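If you would rather have such tests check structure as well, one option is to derive PartialEq so that whole trees can be compared with assert_eq!. This is an assumption on our part: the chapter’s enums derive only Debug, Clone, and Copy, and the add() helper below is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArithOp {
    Add,
    Sub,
    Mul,
    Div,
}

// PartialEq is the only addition compared to the chapter's definition.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(i32),
    String(String),
    Variable(String),
    BinOp(Box<Expr>, ArithOp, Box<Expr>),
}

/// Hypothetical helper: builds the expression `left + right`.
pub fn add(left: Expr, right: Expr) -> Expr {
    Expr::BinOp(Box::new(left), ArithOp::Add, Box::new(right))
}
```

With this in place, a test such as test_expression_clone could assert that expr1 == expr2 instead of only printing both values.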
Parser Flow and Structure
Now that we have defined the statements and expressions, it’s time to create the parser. The parser is responsible for parsing, which is the process of transforming tokens to expressions and statements.
In effect, while the lexer provides us a list of tokens:
[Number(10), Print, Variable("X"), Plus, Number(5)]
the parser produces a structured representation of the statement that follows the line number:
Statement::Print(
Expr::BinOp(
Box::new(Expr::Variable("X".to_string())),
ArithOp::Add,
Box::new(Expr::Number(5)),
),
)
Parsing Flow
When we think of BASIC code, each line starts with a line number followed by a statement. The statements are identified by their first token; for example, a PRINT token indicates a print statement, while a LET token indicates a variable assignment.
To parse such code, we’ll start by looking at the token after the line number to determine how to parse the line. This is typically done using a function like parse_statement(), which examines the current token and dispatches to the appropriate parsing function based on that token. For example, if the current token is PRINT, it will call parse_print(), and if it’s LET, it will call parse_let().
Each of these functions will then handle the specifics of parsing that particular statement type, including any expressions that may be part of the statement.
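Since every line begins with a line number, it can help to separate that number from the statement tokens before dispatching. The sketch below is one way to do it. The Token enum is a cut-down stand-in (the real one lives in the lexer chapter’s token module), and split_line() is a hypothetical helper.

```rust
/// Stand-in token type for this sketch; an assumption, not the real enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Number(i32),
    String(String),
    Variable(String),
    Print,
    Let,
    Plus,
    Equal,
}

/// Hypothetical helper: splits a tokenized line into its line number
/// and the tokens of the statement that follows it.
pub fn split_line(tokens: &[Token]) -> Result<(usize, &[Token]), String> {
    match tokens.split_first() {
        // A non-negative number in first position is the line number.
        Some((Token::Number(n), rest)) if *n >= 0 => Ok((*n as usize, rest)),
        Some((other, _)) => Err(format!("Expected line number, found {:?}", other)),
        None => Err("Empty line".to_string()),
    }
}
```

The returned slice starts at the statement keyword, which is exactly what a dispatcher like parse_statement() expects to see first.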
Parser Structure
Let’s build the parser structure. The parser is given a set of tokens and it processes them to produce statements and expressions. When processing the tokens, the parser needs to keep track of its current position in the token stream.
Add the following code to parser.rs:
use crate::ast::*;
use crate::token::Token;
pub struct Parser {
tokens: Vec<Token>,
pos: usize,
}
impl Parser {
pub fn new(tokens: Vec<Token>) -> Self {
Parser { tokens, pos: 0 }
}
fn is_at_end(&self) -> bool {
self.pos >= self.tokens.len()
}
fn peek(&self) -> Option<&Token> {
self.tokens.get(self.pos)
}
fn advance(&mut self) -> Option<&Token> {
if !self.is_at_end() {
let token = &self.tokens[self.pos];
self.pos += 1;
Some(token)
} else {
None
}
}
}
This provides a basic structure for the parser, including methods to check if we’ve reached the end of the token stream (is_at_end()), peek at the current token without consuming it (peek()), and advance to the next token (advance()).
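The difference between peeking and advancing is easiest to see in isolation. The following sketch uses a hypothetical generic Cursor type that mirrors the three methods above; it is for illustration only and is not meant to replace Parser.

```rust
/// Hypothetical stand-in for the parser's position tracking,
/// generic over the item type so it can be tried without a lexer.
pub struct Cursor<T> {
    items: Vec<T>,
    pos: usize,
}

impl<T> Cursor<T> {
    pub fn new(items: Vec<T>) -> Self {
        Cursor { items, pos: 0 }
    }

    /// True once every item has been consumed.
    pub fn is_at_end(&self) -> bool {
        self.pos >= self.items.len()
    }

    /// Looks at the current item without consuming it.
    pub fn peek(&self) -> Option<&T> {
        self.items.get(self.pos)
    }

    /// Returns the current item and moves one position forward.
    pub fn advance(&mut self) -> Option<&T> {
        let item = self.items.get(self.pos);
        if item.is_some() {
            self.pos += 1;
        }
        item
    }
}
```

Calling peek() twice in a row returns the same item, whereas each call to advance() returns the next one until the cursor reaches the end and starts returning None.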
Parsers also need a helper function for expecting specific tokens. For example, when parsing a LET statement, we expect to see an = token after the variable name. If the expected token is not found, we should return an error. Add the following function to the Parser implementation:
fn expect(&mut self, expected: &Token) -> Result<(), String> {
match self.peek() {
Some(token) if token == expected => {
self.advance();
Ok(())
}
Some(token) => Err(format!(
"Expected {:?}, found {:?}",
expected, token
)),
None => Err(format!(
"Expected {:?}, found end of input",
expected
)),
}
}
This function checks if the current token matches the expected token. If it does, it advances the parser; if not, it returns an error message.
Parsing Expressions
Let’s next add the functionality for parsing expressions. Expressions are the building blocks of statements, and they can be nested. As an example, in the following BASIC line:
10 LET X = 5 + 3
The 5 + 3 is an expression. We’ll start with parsing primary expressions (numbers, strings, variables) and then binary operations (addition, subtraction, multiplication, division).
Parsing Primary Expressions
Primary expressions denote values: numbers, strings, and variables. If the token is a number, we’ll turn it into an Expr::Number. If it’s a string, we’ll create an Expr::String. If it’s a variable, we’ll create an Expr::Variable.
Add the following parse_primary_expr() function to the Parser implementation:
fn parse_primary_expr(&mut self) -> Result<Expr, String> {
match self.advance() {
Some(Token::Number(n)) => Ok(Expr::Number(*n)),
Some(Token::String(s)) => Ok(Expr::String(s.clone())),
Some(Token::Variable(v)) => Ok(Expr::Variable(v.clone())),
Some(token) => Err(format!(
"Unexpected token in primary expression: {:?}",
token
)),
None => Err(format!(
"Unexpected end of input in primary expression"
)),
}
}
The function is straightforward: we take a token and match on its type, creating the appropriate expression. If we encounter an unexpected token or reach the end of input, we return an error.
Parsing Binary Operations
Now the interesting part: handling operators. We need to check if there’s an operator after the primary expression. If there is, we take the operator type and the right-hand side expression, and create an Expr::BinOp.
fn parse_expr(&mut self) -> Result<Expr, String> {
let left = self.parse_primary_expr()?;
if let Some(op_token) = self.peek() {
let op = match op_token {
Token::Plus => Some(ArithOp::Add),
Token::Minus => Some(ArithOp::Sub),
Token::Star => Some(ArithOp::Mul),
Token::Slash => Some(ArithOp::Div),
_ => None,
};
if let Some(op) = op {
self.advance();
let right = self.parse_primary_expr()?;
return Ok(Expr::BinOp(Box::new(left), op, Box::new(right)));
}
}
Ok(left)
}
With this, the parser can now handle simple binary operations. If it sees an operator after a primary expression, it consumes the operator with advance() (peek() only looks at it), parses the right-hand side, and constructs a binary operation expression. If no operator is found, it simply returns the primary expression.
More Complex Expressions
What if the expression has more than one operator, like 2 + 3 * 4? Our current implementation only handles a single binary operation. To handle multiple operations, we need to make parse_expr() recursive: after consuming the operator, we call parse_expr() instead of parse_primary_expr() to parse the right-hand side. This way, we can parse nested binary operations.
Change the lines
if let Some(op) = op {
self.advance();
let right = self.parse_primary_expr()?;
return Ok(Expr::BinOp(Box::new(left), op, Box::new(right)));
}
to
if let Some(op) = op {
self.advance();
let right = self.parse_expr()?;
return Ok(Expr::BinOp(Box::new(left), op, Box::new(right)));
}
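Now, the parser can also handle more complex expressions with multiple operations!
Note: Our approach does not account for operator precedence or associativity. Because the right-hand side is always parsed recursively, every expression nests to the right: 2 * 3 + 4 is parsed as 2 * (3 + 4), and 10 - 2 - 3 as 10 - (2 - 3). We’ll look into that in future parts.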
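To see the grouping this produces, here is a self-contained sketch that mirrors the recursive parse_expr() on a plain token slice. The cut-down Token and Expr enums and the render() helper are assumptions made for this example; they are not the chapter’s definitions.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ArithOp {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, Clone)]
pub enum Expr {
    Number(i32),
    BinOp(Box<Expr>, ArithOp, Box<Expr>),
}

/// Stand-in token type limited to numbers and arithmetic operators.
#[derive(Debug, Clone)]
pub enum Token {
    Number(i32),
    Plus,
    Minus,
    Star,
    Slash,
}

/// Same strategy as the chapter's parse_expr(): parse one primary
/// expression, then recurse for everything after the operator.
pub fn parse_expr(tokens: &[Token]) -> Result<Expr, String> {
    let (first, rest) = tokens.split_first().ok_or("Unexpected end of input")?;
    let left = match first {
        Token::Number(n) => Expr::Number(*n),
        other => return Err(format!("Unexpected token: {:?}", other)),
    };
    let op = match rest.first() {
        Some(Token::Plus) => ArithOp::Add,
        Some(Token::Minus) => ArithOp::Sub,
        Some(Token::Star) => ArithOp::Mul,
        Some(Token::Slash) => ArithOp::Div,
        _ => return Ok(left), // no operator follows
    };
    let right = parse_expr(&rest[1..])?;
    Ok(Expr::BinOp(Box::new(left), op, Box::new(right)))
}

/// Hypothetical helper: fully parenthesized text form of the tree.
pub fn render(expr: &Expr) -> String {
    match expr {
        Expr::Number(n) => n.to_string(),
        Expr::BinOp(left, op, right) => {
            let symbol = match op {
                ArithOp::Add => '+',
                ArithOp::Sub => '-',
                ArithOp::Mul => '*',
                ArithOp::Div => '/',
            };
            format!("({} {} {})", render(left), symbol, render(right))
        }
    }
}
```

Parsing the tokens for 10 - 2 - 3 and rendering the result gives (10 - (2 - 3)), whereas conventional left-associative subtraction would group it as ((10 - 2) - 3).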
Parsing Statements
Now that we can parse expressions, let’s parse statements. We’ll first create the functions for parsing individual statement types, and then we’ll create a dispatcher function that calls the appropriate parsing function based on the first token of the statement.
In the individual parsing functions, we’ll assume that the first token (like PRINT, LET, etc.) has already been consumed by the dispatcher function.
Parsing PRINT Statements
A PRINT statement can print an expression (we’re omitting the comma-separated multiple expressions for simplicity).
PRINT X + 5
To parse the PRINT statement, we need to parse the expression following the PRINT token.
Here’s how to parse a PRINT statement, given that we have already consumed the PRINT token:
fn parse_print(&mut self) -> Result<Statement, String> {
let expr = self.parse_expr()?;
Ok(Statement::Print(expr))
}
Parsing LET Statements
A LET statement assigns a value to a variable.
LET X = 10 + 5
Given that we have already consumed the LET token, we need to expect a variable name, an equals sign, and then parse the expression to assign.
fn parse_let(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Variable(var)) => {
let var_name = var.clone();
self.expect(&Token::Equal)?;
let expr = self.parse_expr()?;
Ok(Statement::Let(var_name, expr))
}
_ => Err(format!(
"Expected variable name after LET"
)),
}
}
Parsing GOTO Statements
A GOTO statement jumps to a line number. Parsing it is straightforward: we just expect a number token after the GOTO token.
fn parse_goto(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Number(line)) => Ok(Statement::Goto(*line as usize)),
_ => Err(format!(
"Expected line number after GOTO"
)),
}
}
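One detail worth knowing about the `as usize` cast: applied to a negative i32, it silently wraps around to a very large number instead of failing. Whether the lexer can ever produce a negative number token depends on the previous chapter, so treat the following as an optional hardening sketch. The to_line_number() helper is hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts a number token's value into a line
/// number, rejecting negative values instead of wrapping them.
pub fn to_line_number(n: i32) -> Result<usize, String> {
    usize::try_from(n).map_err(|_| format!("Invalid line number: {}", n))
}
```

With such a helper, the GOTO arm could return an error for a negative target rather than storing a wrapped value.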
Parsing IF Statements
IF statements are a bit more complex as they involve a comparison between two expressions, followed by a THEN and a target line number.
IF X > 10 THEN 50
To parse this, we first identify the left-hand-side expression, then the comparison operator, followed by the right-hand-side expression, the THEN token, and finally the target line number.
This looks as follows:
fn parse_if(&mut self) -> Result<Statement, String> {
let left = self.parse_expr()?;
let op = match self.advance() {
Some(Token::Equal) => ComparisonOp::Equal,
Some(Token::Greater) => ComparisonOp::Greater,
Some(Token::Less) => ComparisonOp::Less,
Some(token) => {
return Err(format!(
"Expected comparison operator, found {:?}",
token
))
}
None => {
return Err(format!(
"Expected comparison operator, found end of input"
))
}
};
let right = self.parse_expr()?;
self.expect(&Token::Then)?;
match self.advance() {
Some(Token::Number(line)) => Ok(Statement::If {
left,
op,
right,
target: *line as usize,
}),
_ => Err(format!(
"Expected line number after THEN"
)),
}
}
Parsing Entry Point
Now, let’s tie all of the functionality together and create the main entry point for parsing statements. The entry point function will look at the first token of the statement and dispatch to the appropriate parsing function.
pub fn parse_statement(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Print) => self.parse_print(),
Some(Token::Let) => self.parse_let(),
Some(Token::If) => self.parse_if(),
Some(Token::Goto) => self.parse_goto(),
Some(Token::End) => Ok(Statement::End),
Some(token) => Err(format!(
"Unexpected token at start of statement: {:?}",
token
)),
None => Err("Empty statement".to_string()),
}
}
Complete Parser
The whole parser looks as follows:
use crate::ast::*;
use crate::token::Token;
pub struct Parser {
tokens: Vec<Token>,
pos: usize,
}
impl Parser {
pub fn new(tokens: Vec<Token>) -> Self {
Parser { tokens, pos: 0 }
}
fn is_at_end(&self) -> bool {
self.pos >= self.tokens.len()
}
fn peek(&self) -> Option<&Token> {
self.tokens.get(self.pos)
}
fn advance(&mut self) -> Option<&Token> {
if !self.is_at_end() {
let token = &self.tokens[self.pos];
self.pos += 1;
Some(token)
} else {
None
}
}
fn expect(&mut self, expected: &Token) -> Result<(), String> {
match self.peek() {
Some(token) if token == expected => {
self.advance();
Ok(())
}
Some(token) => Err(format!("Expected {:?}, found {:?}", expected, token)),
None => Err(format!("Expected {:?}, found end of input", expected)),
}
}
fn parse_primary_expr(&mut self) -> Result<Expr, String> {
match self.advance() {
Some(Token::Number(n)) => Ok(Expr::Number(*n)),
Some(Token::String(s)) => Ok(Expr::String(s.clone())),
Some(Token::Variable(v)) => Ok(Expr::Variable(v.clone())),
Some(token) => Err(format!(
"Unexpected token in primary expression: {:?}",
token
)),
None => Err(format!("Unexpected end of input in primary expression")),
}
}
fn parse_expr(&mut self) -> Result<Expr, String> {
let left = self.parse_primary_expr()?;
if let Some(op_token) = self.peek() {
let op = match op_token {
Token::Plus => Some(ArithOp::Add),
Token::Minus => Some(ArithOp::Sub),
Token::Star => Some(ArithOp::Mul),
Token::Slash => Some(ArithOp::Div),
_ => None,
};
if let Some(op) = op {
self.advance();
let right = self.parse_expr()?;
return Ok(Expr::BinOp(Box::new(left), op, Box::new(right)));
}
}
Ok(left)
}
pub fn parse_statement(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Print) => self.parse_print(),
Some(Token::Let) => self.parse_let(),
Some(Token::If) => self.parse_if(),
Some(Token::Goto) => self.parse_goto(),
Some(Token::End) => Ok(Statement::End),
Some(token) => Err(format!(
"Unexpected token at start of statement: {:?}",
token
)),
None => Err("Empty statement".to_string()),
}
}
fn parse_print(&mut self) -> Result<Statement, String> {
let expr = self.parse_expr()?;
Ok(Statement::Print(expr))
}
fn parse_let(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Variable(var)) => {
let var_name = var.clone();
self.expect(&Token::Equal)?;
let expr = self.parse_expr()?;
Ok(Statement::Let(var_name, expr))
}
_ => Err(format!("Expected variable name after LET")),
}
}
fn parse_if(&mut self) -> Result<Statement, String> {
let left = self.parse_expr()?;
let op = match self.advance() {
Some(Token::Equal) => ComparisonOp::Equal,
Some(Token::Greater) => ComparisonOp::Greater,
Some(Token::Less) => ComparisonOp::Less,
Some(token) => return Err(format!("Expected comparison operator, found {:?}", token)),
None => return Err(format!("Expected comparison operator, found end of input")),
};
let right = self.parse_expr()?;
self.expect(&Token::Then)?;
match self.advance() {
Some(Token::Number(line)) => Ok(Statement::If {
left,
op,
right,
target: *line as usize,
}),
_ => Err(format!("Expected line number after THEN")),
}
}
fn parse_goto(&mut self) -> Result<Statement, String> {
match self.advance() {
Some(Token::Number(line)) => Ok(Statement::Goto(*line as usize)),
_ => Err(format!("Expected line number after GOTO")),
}
}
}
Testing the Parser
Finally, add tests for the parser to ensure it works as expected. Append the following code to parser.rs:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_number() {
let tokens = vec![Token::Number(42)];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
assert!(matches!(expr, Expr::Number(42)));
}
#[test]
fn test_parse_string() {
let tokens = vec![Token::String("Hello".to_string())];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
assert!(matches!(expr, Expr::String(_)));
}
#[test]
fn test_parse_variable() {
let tokens = vec![Token::Variable("X".to_string())];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
assert!(matches!(expr, Expr::Variable(_)));
}
#[test]
fn test_parse_addition() {
let tokens = vec![Token::Number(2), Token::Plus, Token::Number(3)];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
assert!(matches!(expr, Expr::BinOp(_, ArithOp::Add, _)));
}
#[test]
fn test_parse_multiplication() {
let tokens = vec![Token::Number(3), Token::Star, Token::Number(4)];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
assert!(matches!(expr, Expr::BinOp(_, ArithOp::Mul, _)));
}
#[test]
fn test_parse_nested_expression() {
// 2 + 3 * 4
let tokens = vec![
Token::Number(2),
Token::Plus,
Token::Number(3),
Token::Star,
Token::Number(4),
];
let mut parser = Parser::new(tokens);
let expr = parser.parse_expr().unwrap();
// Should create BinOp(2, Add, BinOp(3, Mul, 4))
match expr {
Expr::BinOp(left, ArithOp::Add, right) => {
assert!(matches!(*left, Expr::Number(2)));
assert!(matches!(*right, Expr::BinOp(_, ArithOp::Mul, _)));
}
_ => panic!("Expected BinOp with Add"),
}
}
#[test]
fn test_parse_print() {
let tokens = vec![Token::Print, Token::String("Hello".to_string())];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::Print(expr) => assert!(matches!(expr, Expr::String(_))),
_ => panic!("Expected Print statement"),
}
}
#[test]
fn test_parse_let() {
let tokens = vec![
Token::Let,
Token::Variable("X".to_string()),
Token::Equal,
Token::Number(42),
];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::Let(var, _) => assert_eq!(var, "X"),
_ => panic!("Expected Let statement"),
}
}
#[test]
fn test_parse_let_with_expression() {
let tokens = vec![
Token::Let,
Token::Variable("X".to_string()),
Token::Equal,
Token::Number(5),
Token::Plus,
Token::Number(3),
];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::Let(var, expr) => {
assert_eq!(var, "X");
assert!(matches!(expr, Expr::BinOp(_, ArithOp::Add, _)));
}
_ => panic!("Expected Let statement"),
}
}
#[test]
fn test_parse_if() {
let tokens = vec![
Token::If,
Token::Variable("X".to_string()),
Token::Greater,
Token::Number(10),
Token::Then,
Token::Number(50),
];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::If { target, .. } => assert_eq!(target, 50),
_ => panic!("Expected If statement"),
}
}
#[test]
fn test_parse_if_with_expression() {
let tokens = vec![
Token::If,
Token::Variable("X".to_string()),
Token::Plus,
Token::Number(5),
Token::Greater,
Token::Number(10),
Token::Then,
Token::Number(50),
];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::If { left, .. } => {
assert!(matches!(left, Expr::BinOp(_, ArithOp::Add, _)));
}
_ => panic!("Expected If statement"),
}
}
#[test]
fn test_parse_goto() {
let tokens = vec![Token::Goto, Token::Number(20)];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
match stmt {
Statement::Goto(line) => assert_eq!(line, 20),
_ => panic!("Expected Goto statement"),
}
}
#[test]
fn test_parse_end() {
let tokens = vec![Token::End];
let mut parser = Parser::new(tokens);
let stmt = parser.parse_statement().unwrap();
assert!(matches!(stmt, Statement::End));
}
#[test]
fn test_parse_error_invalid_token() {
let tokens = vec![Token::Plus]; // Can't start statement with Plus
let mut parser = Parser::new(tokens);
let result = parser.parse_statement();
assert!(result.is_err());
}
#[test]
fn test_parse_error_empty() {
let tokens = vec![];
let mut parser = Parser::new(tokens);
let result = parser.parse_statement();
assert!(result.is_err());
}
#[test]
fn test_parse_error_missing_equal() {
let tokens = vec![
Token::Let,
Token::Variable("X".to_string()),
// Missing Equal token
Token::Number(5),
];
let mut parser = Parser::new(tokens);
let result = parser.parse_statement();
assert!(result.is_err());
}
#[test]
fn test_parse_error_missing_then() {
let tokens = vec![
Token::If,
Token::Variable("X".to_string()),
Token::Greater,
Token::Number(10),
// Missing Then token
Token::Number(50),
];
let mut parser = Parser::new(tokens);
let result = parser.parse_statement();
assert!(result.is_err());
}
}
Now, when you run the tests with cargo test parser, you should see all tests passing.
Testing the Pipeline
Finally, let’s test the full pipeline from text input to a statement. Adjust the main.rs to match the following:
use basic_interpreter::lexer::Lexer;
use basic_interpreter::parser::Parser;
fn main() {
let input = r#"10 PRINT "Hello, World!""#;
println!("Input: {}", input);
println!();
let mut lexer = Lexer::new(input);
let tokens = match lexer.tokenize() {
Ok(t) => {
println!("Tokens:");
for token in &t {
println!(" {:?}", token);
}
t
}
Err(e) => {
println!("Error: {}", e);
return;
}
};
println!();
// skipping line number token
let stmt_tokens = tokens[1..].to_vec();
let mut parser = Parser::new(stmt_tokens);
match parser.parse_statement() {
Ok(stmt) => {
println!("Statement:");
println!("{:#?}", stmt);
}
Err(e) => {
println!("Parser error: {}", e);
}
}
}
When you run the program with cargo run, you should see an output similar to:
Input: 10 PRINT "Hello, World!"
Tokens:
Number(10)
Print
String("HELLO, WORLD!")
Statement:
Print(
String(
"HELLO, WORLD!",
),
)
Similarly, when you change the input to:
let input = r#"20 LET X = 5 + 3 * 2"#;
you should see:
Input: 20 LET X = 5 + 3 * 2
Tokens:
Number(20)
Let
Variable("X")
Equal
Number(5)
Plus
Number(3)
Star
Number(2)
Statement:
Let(
"X",
BinOp(
Number(
5,
),
Add,
BinOp(
Number(
3,
),
Mul,
Number(
2,
),
),
),
)
Summary
In this chapter, we first designed the data types for expressions and statements, and then built a parser that transforms flat token sequences into them. To summarize:
- Expressions represent values and computations.
- Statements represent commands and control flow.
- A parser transforms a sequence of tokens into structured code representations.
Next, we’ll look into creating an interpreter that can execute statements.