
Build your own programming language with ANTLR

There are numerous programming languages available nowadays. Each one comes with its own coding style. In compiler design, the rules that describe the syntax of a particular language are called its grammar.

Compiler

A compiler is a program that translates source code from a high-level language into a lower-level language.

E.g.: C source code is translated into assembly.

Basic Components of a Compiler

Lexer: splits the source code into tokens, which are the keywords and other basic structures of the specific programming language.

Parser: identifies patterns in the sequence of tokens and builds an Abstract Syntax Tree (AST).

Generator: generates code in the target language.

When the grammar changes, the components above need to change as well. Therefore, writing a compiler from scratch is somewhat difficult.

ANTLR

ANother Tool for Language Recognition, or ANTLR, makes this task easier by providing a notation for describing grammars. The Lexer and Parser source code is then generated automatically from the grammar. Awesome, right? 😍

Motivation

We are going to create a very simple language called simpler 💪

a = 100
b = 150
show 10
show a
show b

The simpler language can only store and display integer variables 😋

output

10
100
150

Getting Started

  1. Install Java and download the ANTLR library
  2. Add the ANTLR library location to the path variable

First, you need to define your language's grammar. Create simplerlang.g4:

grammar simplerlang;
program : statement+ ;
statement : let | show ;
let : VAR '=' INT ;
show : 'show' (INT | VAR) ;
VAR : [a-z]+ ;
INT : [0-9]+ ;
WS : [ \r\n\t]+ -> skip ;

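let and show are the two statements of the language: let assigns a value to a variable, and show displays a value (or the value of a variable). INT means integer and VAR means variable. Next, generate the lexer and parser sources by running the ANTLR tool on the grammar: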
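To make the lexer rules above more concrete, here is a rough hand-written approximation of how the input is split into tokens. It is only a sketch: it uses java.util.regex rather than the lexer ANTLR generates, and the class name SimplerLexerSketch and the method tokenize are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the generated lexer; not part of ANTLR.
class SimplerLexerSketch {

    // One named group per token kind. SHOW is tried before VAR and refuses to
    // match when more letters follow, so "show" is a keyword but "shows" is a VAR.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<SHOW>show(?![a-z]))|(?<VAR>[a-z]+)|(?<INT>[0-9]+)|(?<EQ>=)|(?<WS>[ \\t\\r\\n]+)");

    // WS is deliberately missing here: whitespace is matched but never reported.
    private static final String[] KINDS = {"SHOW", "VAR", "INT", "EQ"};

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    tokens.add(kind + "(" + m.group(kind) + ")");
                }
            }
            pos = m.end(); // move past the token (or the skipped whitespace)
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [VAR(a), EQ(=), INT(100), SHOW(show), VAR(a)]
        System.out.println(tokenize("a = 100\nshow a"));
    }
}
```

Dropping whitespace without emitting a token plays the same role as "-> skip" on the WS rule, which is why the parser rules never have to mention spaces or newlines.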

java -cp antlr-4.7.1-complete.jar org.antlr.v4.Tool simplerlang.g4

If you need to set a custom package name, use the -package option 😀

java -cp antlr-4.7.1-complete.jar org.antlr.v4.Tool -package simplerlang simplerlang.g4

Now you can write the behaviour for each statement of your own language. The generated simplerlangBaseListener class has methods that are called as ANTLR enters and exits each rule of the tree. So we can go ahead and use those.

Create simplerlangCustomListener, extend simplerlangBaseListener, and override the methods as shown below.

We need a HashMap to store our variables 😎.

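// Maps each variable name to its integer value (needs java.util.HashMap).
HashMap<String, Integer> variableMap = new HashMap<>();

@Override
public void exitShow(simplerlangParser.ShowContext ctx) {
    if (ctx.INT() != null) {
        // "show 10": print the literal as written.
        System.out.println(ctx.INT().getText());
    } else if (ctx.VAR() != null) {
        // "show a": look the variable up and print its value.
        System.out.println(this.variableMap.get(ctx.VAR().getText()));
    }
}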

exitShow gives you ctx, which holds either an INT or a VAR. If there is an integer after show, we just print it. Otherwise, if there is a variable name, we fetch its value from the HashMap and print it.

@Override
public void exitLet(simplerlangParser.LetContext ctx) {
    this.variableMap.put(ctx.VAR().getText(),
            Integer.parseInt(ctx.INT().getText()));
}

Here ctx gives you both INT and VAR, so we put them in the HashMap 🤪

After that, create another Java class, Simplerlang, with a main method to work as the compiler of your own language.
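To see what the two callbacks do together without running ANTLR at all, the sketch below replays the same HashMap logic in plain Java. It is only an approximation: the class name SimplerSemanticsSketch and the method run are made up for illustration, and it assumes one statement per line, which the real grammar does not require. It also shows an edge case of the listener code: showing a variable that was never assigned prints null, because HashMap.get returns null for an unknown key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the listener logic; it does not use ANTLR and
// assumes one well-formed statement per line.
class SimplerSemanticsSketch {

    static List<String> run(String source) {
        Map<String, Integer> variableMap = new HashMap<>();
        List<String> output = new ArrayList<>();
        for (String line : source.split("\\R")) {
            String stmt = line.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            if (stmt.startsWith("show ")) {
                String operand = stmt.substring(5).trim();
                if (operand.matches("[0-9]+")) {
                    // Same as the INT branch of exitShow: echo the literal.
                    output.add(operand);
                } else {
                    // Same as the VAR branch: get() returns null for an unknown name.
                    output.add(String.valueOf(variableMap.get(operand)));
                }
            } else {
                // Same as exitLet: VAR '=' INT.
                String[] parts = stmt.split("=", 2);
                variableMap.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // prints 10, 100, 150 and null, one per line (c was never assigned)
        run("a = 100\nb = 150\nshow 10\nshow a\nshow b\nshow c").forEach(System.out::println);
    }
}
```

If printing null is not what you want, one option is to check variableMap.containsKey(...) in exitShow and report an error for undefined variables instead.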

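// Needs org.antlr.v4.runtime.{CharStream, CharStreams, CommonTokenStream},
// java.io.IOException and java.util.logging.{Level, Logger}.
public static void main(String[] args) {
    try {
        // CharStreams.fromFileName replaces ANTLRFileStream, which is deprecated since ANTLR 4.7.
        CharStream input = CharStreams.fromFileName("test.simpler");
        simplerlangLexer lexer = new simplerlangLexer(input);
        simplerlangParser parser = new simplerlangParser(new CommonTokenStream(lexer));
        // The listener methods run while the parser is matching each rule.
        parser.addParseListener(new simplerlangCustomListener());
        parser.program();
    } catch (IOException ex) {
        Logger.getLogger(Simplerlang.class.getName()).log(Level.SEVERE, null, ex);
    }
}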

Here test.simpler is your source file. parser.program() starts parsing at the program rule, and the listener executes your statements as they are matched.

Create test.simpler and write something in your own language:

a = 100
b = 150
show 10
show a
show b

Hey, you played with your own language. Congrats! See the output 🔥

10
100
150

This is my first Medium article. I hope you enjoyed reading it. You can also download the source code of this activity. Since it is on GitHub, your PRs will be appreciated too.

ANTLR is a very powerful tool. Ballerina uses ANTLR too.

Happy coding!

Programmer | Author of Neutralinojs | Technical Writer
