History

Brainf**k hereafter referred to as BF, is a turing complete, esoteric programming language created by Urban Müller. For the uninitiated, esoteric languages are programming languages that are designed specifically to be unique, difficult to program in, or be just plain weird.

Crash Course in BF

OpCodeInstruction
>Increment the data pointer
<Decrement the data pointer
[If the byte at the data pointer is zero, jump instruction pointer forward to the command after the matching ] command
]If the byte at the data pointer is nonzero, move instruction pointer to the command after the matching [ command
,One byte Input at the data pointer
.Output the byte at the data pointer
+Increment the byte at the data pointer
-Decrement the byte at the data pointer

This is how we can print Hello World! in BF.

1
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

Interpreter

Creating an interpreter for any common programming language is a difficult task but, BF having only 8 commands makes interpreter much easier. We can break down the task of writing a BF interpreter into 3 tasks.

1️⃣ Lexer
2️⃣ Parser
3️⃣ Interpreter

0️⃣ Imports

1
2
3
4
5
use std::{env, io::Read, fs::File};
// Or
// use std::env;
// use std::io::Read;
// use std::fs::File;

1️⃣ Lexer

This is a relatively straightforward task. As we know, there are 8 commands in BF which we can represent as Enum.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
#[derive(Clone, Debug)]
enum OpCode {
    IncrementPointer,   // >
    DecrementPointer,   // <
    Increment,          // +
    Decrement,          // -
    Write,              // .
    Read,               // ,
    LoopBegin,          // [
    LoopEnd,            // ]
}

Let’s write the Lexer as function using the defined OpCode. Lexer takes in a String and returns a Vector of OpCode. I’ve considered all the characters other than OpCode to be comments for simplicity.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
fn lex(source: String) -> Vec<OpCode> {
    let mut operations = Vec::new();

    for symbol in source.chars() {
        let op = match symbol {
            '>' => Some(OpCode::IncrementPointer),
            '<' => Some(OpCode::DecrementPointer),
            '+' => Some(OpCode::Increment),
            '-' => Some(OpCode::Decrement),
            '.' => Some(OpCode::Write),
            ',' => Some(OpCode::Read),
            '[' => Some(OpCode::LoopBegin),
            ']' => Some(OpCode::LoopEnd),
            _ => None
        };
        // Removing Comments.
        match op {
            Some(op) => operations.push(op),
            None => ()
        }
    }
    operations
}

2️⃣ Parser

From the table above, we can associate the OpCode with their respective Instruction. We can represent Instruction as Enum.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
#[derive(Debug, Clone)]
enum Instruction {
    IncrementPointer,
    DecrementPointer,
    Increment,
    Decrement,
    Write,
    Read,
    Loop(Vec<Instruction>)
}

Parser takes in the Vector of OpCode generated by the lexer and returns Vector of Instruction.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
fn parse(opcodes: Vec<OpCode>) -> Vec<Instruction> {
    let mut program: Vec<Instruction> = Vec::new();
    let mut loop_stack = 0;
    let mut loop_start = 0;

    for (i, op) in opcodes.iter().enumerate() {
        if loop_stack == 0 {
            let instr = match op {
                OpCode::IncrementPointer => Some(Instruction::IncrementPointer),
                OpCode::DecrementPointer => Some(Instruction::DecrementPointer),
                OpCode::Increment => Some(Instruction::Increment),
                OpCode::Decrement => Some(Instruction::Decrement),
                OpCode::Write => Some(Instruction::Write),
                OpCode::Read => Some(Instruction::Read),

                OpCode::LoopBegin => {
                    loop_start = i;
                    loop_stack += 1;
                    None
                },

                OpCode::LoopEnd => panic!("LoopStartingError"),
            };

            match instr {
                Some(instr) => program.push(instr),
                None => ()
            }
        } else {
            match op {
                // Handling loop conditions
                OpCode::LoopBegin => {
                    loop_stack += 1;
                },
                OpCode::LoopEnd => {
                    loop_stack -= 1;

                    if loop_stack == 0 {
                        program.push(Instruction::Loop(parse(opcodes[loop_start+1..i].to_vec())));
                    }
                },
                _ => (),
            }
        }
    }

    if loop_stack != 0 {
        panic!("LoopEndingError");
    }
    program
}

Now, the only piece remaining is the logic that brings the parsed commands into an actual program.

3️⃣ Interpreter

Rather predictably, we would use the output of parser in the form of Vector of Instruction.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
fn interpret(instructions: &Vec<Instruction>, tape: &mut Vec<u8>, data_pointer: &mut usize) {
    for instr in instructions {
        match instr {
            Instruction::IncrementPointer => *data_pointer += 1,
            Instruction::DecrementPointer => *data_pointer -= 1,
            Instruction::Increment => tape[*data_pointer] += 1,
            Instruction::Decrement => tape[*data_pointer] -= 1,
            Instruction::Write => print!("{}", tape[*data_pointer] as char),
            Instruction::Read => {
                let mut input: [u8; 1] = [0];
                std::io::stdin().read_exact(&mut input).expect("failed to read stdin");
                tape[*data_pointer] = input[0];
            },
            Instruction::Loop(nested_instructions) => {
                while tape[*data_pointer] != 0 {
                    interpret(&nested_instructions, tape, data_pointer)
                }
            }
        }
    }
}

⭐ Main

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        println!("usage::\ncargo run <file.bf>\nbf <file.bf>");
        std::process::exit(1);
    }
    let filename = &args[1];

    // Read file into String
    let mut file = File::open(filename).expect("FileNotFoundError");
    let mut source = String::new();
    file.read_to_string(&mut source).expect("ReadFailureError");

    // File -> Lexer
    let opcodes = lex(source);

    // Lexer -> Parser
    let program = parse(opcodes);

    // Zero initialised 1kb tape
    let mut tape: Vec<u8> = vec![0; 1024];
    let mut data_pointer = 512;

    interpret(&program, &mut tape, &mut data_pointer);
}

That’s how easy it is to make BF interpreter in Rust. Make your own BF programs and share them with me !