This page shows the implementation of a compiler which recognizes and translates part of the LISP programming language into the LLVM IR syntax (more information about LLVM can be found here).
The LLVM Intermediate Representation (IR) is placed between the source code in a given programming language and the machine code for a specific architecture. It is independent from both the programming language and the architecture.
It is important to know that LISP is a family of programming languages born in the early 60s and developed to date. The most common LISP dialect is the Common LISP, which is also the one studied below.
Some useful resources:
Tutorials:
An interesting article:
List of LISP parts supported by the compiler:
The compiler is made up of two parts: a scanner and a parser.
The Scanner, created using jflex, scans an input file and generates a specific token each time that a stream of characters matches with a given rule (that can be a regular expression). In that case, it recognizes all useful symbols of a LISP source code, generating tokens which are later used by the Parser. Symbols cover most important keywords of the LISP language.
nl = \r|\n|\r\n ws = [ \t] string = (\"[^0-9]*\") id = ([0-9]*[a-zA-Z\?\.\%&!\^\$£\+_\/\<\>\*\\\[\]@#\-]+[a-zA-Z0-9\?\.\%&!\^\$£\+_\/\<\>\*\\\[\]\-@#]*) integer = ([\+\-]*[1-9][0-9]*|0) double = (([0-9]+\.[0-9]*) | ([0-9]*\.[0-9]+)) (e|E('+'|'-')?[0-9]+)? not_evaluate = ('[a-zA-Z\-]+) %% "(" {return symbol(sym.RO);} ")" {return symbol(sym.RC);} "{" {return symbol(sym.BO);} "}" {return symbol(sym.BC);} "=" {return symbol(sym.EQ);} "+" {return symbol(sym.PLUS);} "-" {return symbol(sym.MINUS);} "*" {return symbol(sym.STAR);} "/" {return symbol(sym.DIV);} "<" {return symbol(sym.MIN);} ">" {return symbol(sym.MAJ);} "<=" {return symbol(sym.MIN_EQ);} "=<" {return symbol(sym.EQ_MIN);} ">=" {return symbol(sym.MAJ_EQ);} "=>" {return symbol(sym.EQ_MAJ);} "'" {return symbol(sym.APICE);} "[" {return symbol(sym.SO);} "]" {return symbol(sym.SC);} "int" {return symbol(sym.INT_TYPE);} "double" {return symbol(sym.DOUBLE_TYPE);} //Array make-array {return symbol(sym.MAKEARRAY);} :element-type {return symbol(sym.ELEMENTTYPE);} aref {return symbol(sym.AREF);} //Logical operators logand {return symbol(sym.LOGAND);} logior {return symbol(sym.LOGIOR);} logxor {return symbol(sym.LOGXOR);} //Output print {return symbol(sym.PRINT);} write {return symbol(sym.WRITE);} //Statements if {return symbol(sym.IF);} loop {return symbol(sym.LOOP);} for {return symbol(sym.FOR);} from {return symbol(sym.FROM);} to {return symbol(sym.TO);} do {return symbol(sym.DO);} in {return symbol(sym.IN);} when {return symbol(sym.WHEN);} then {return symbol(sym.THEN);} return-from {return symbol(sym.RETURNFROM);} return {return symbol(sym.RETURN);} //Functions defvar {return symbol(sym.DEFVAR);} defun {return symbol(sym.DEFUN);} setf {return symbol(sym.SETF);} setq {return symbol(sym.SETQ);} list {return symbol(sym.LIST);} , {return symbol(sym.CM);} {integer} {return symbol(sym.INT, new Integer(yytext())); } {double} {return symbol(sym.DOUBLE, new Double(yytext()));} {not_evaluate} {return symbol(sym.NOTEVAL, new String(yytext()));} {id} {return symbol(sym.ID, new String(yytext()));} {string} {return symbol(sym.STRING, new String(yytext()));} ";".* {;} {ws}|{nl} {;}
The parser is created using CUP. It takes as input the tokens provided by the scanner and recognize specific sequences of symbols. Every time a sequence is reduced, rule is executed. With the support of Java code for rules, the corrisponding LLVM IR code is produced.
Structure:
non_terminal ::= SYMBOLS {: /* rule code */:};
In the parser code is contained the Java code used to support the parser in memorizing variables and assigning the labels needed to the LLVM IR instructions.
//code buffer for llvm public static StringBuffer irCode = new StringBuffer(); //global code buffer for llvm //it is used to declare global variables like string //at the beginning of the program public static StringBuffer globalCode = new StringBuffer(); public BufferedWriter file_out; //flag used by addCode to understand in //what buffer code must be written public static boolean writeOnGlobal = false; //used for local scoping inside functions public static Integer tempGLobal = 0; //labelCounter for assigning registers public static Integer varLabelCounter = 1; //labelCounter for global variables public static Integer globalVarLabelCounter = 4; //labelCounter for decision statements public static Integer labelDecision = 0; public HashMap <String, varObject> vars; public HashMap<String, funcObject> functionsMap; public LinkedList<loopVar> loopList; /* addCode() add a string to the buffer code */ public void addCode(String var){ if(!writeOnGlobal){ irCode.append(var); } else{ globalCode.append(var); } } /* getVarLabel() Returns a unique label for a variable */ public String getVarLabel(){ return (varLabelCounter++).toString(); } /* getGlobalVarLabel() Returns a unique label for a global variable */ public String getGlobalVarLabel(){ return (globalVarLabelCounter++).toString(); } /* getDecisionLabel() Returns a unique label for a decision statement */ public String getDecisionLabel(){ return (labelDecision++).toString(); } public class varObject{ public String label; public String align; public String stringValue; public boolean global; /* void constructor is used to initialize values directly written in the code */ public varObject(){ this.label = null; this.stringValue = null; this.align = null; this.global = false; //All variables are declared at beginning as local } public varObject(String align, boolean setLabel){ this.align = align; this.global = false; if(setLabel){ if(align.equals("1")){ //If align == 1 is a string so it's global this.label = getGlobalVarLabel(); } else{ this.label = getVarLabel(); } } } } public class intVar extends varObject{ public Integer intValue; public intVar(){ this.intValue = null; } public intVar(Integer value, String align, boolean setLabel){ super(align, setLabel); this.intValue = value; this.stringValue = value.toString(); } } public class doubleVar extends varObject{ public Double doubleValue; public doubleVar(){ this.doubleValue = null; } public doubleVar(Double value, String align, boolean setLabel){ super(align, setLabel); this.doubleValue = value; this.stringValue = value.toString(); } } public class stringVar extends varObject{ public stringVar(String value, boolean setLabel){ super("1", setLabel); this.stringValue = value; } } public class loopVar extends varObject{ public loopVar(String name){ super(); this.stringValue = name; this.label = getDecisionLabel(); } } public class arrayVar extends varObject{ public Integer size; public String type; public arrayVar(Integer size, boolean setLabel){ super("", setLabel); this.size = size; } public void allocate(String type){ this.type = type; this.label = getVarLabel(); if(this.type.equals("double")){ this.align = "8"; addCode("%"+this.label+" = alloca ["+this.size+" x double], align 8\n"); } else if(this.type.equals("i32")){ this.align = "4"; addCode("%"+this.label+" = alloca ["+this.size+" x i32], align 4\n"); } } } public class funcObject{ public String name; public String returnType; public Integer size; public String[] arguments; public funcObject(String name, Integer size){ this.name = name; this.size = size; this.arguments = new String[size]; } }
varObject
: is a Java class used to model a generic object of the language Every object is used to memorize useful informations (like the Label) to write the correct LLVM IR code. On the following, the specializing classes:intVar
: represents the Integer variablesdoubleVar
: represents the Double variablesstringVar
: represents the String variablesloopVar
: represent a loop or loop for statement of the LISP programming languages. In this objects are stored labels used to perform correctly jumps in the LLVM IR codearrayVar
: represent a 1-dimension array variablefuncObject
: represent a function statement of the LISP programming languageloopList
is a list of loopVar
objects. It is taken always the last element of this list. In this way in the case there are nested loops, the code written will be always referenced to the last loop and label can not be confused.addCode
is a very simple but essential method. It adds the code to the buffer variables (on irCode if writeOnGlobal is false otherwise on globalCode)prog
is the very beginning of the grammar and the last symbol to be reduced. Moreover the reducing of this symbol means that no semantic errors where found in the input file.lists
and list
non terminal symbols highlight the structure of a LISP program: essentially a sequence of lists. A list, as in many thers programming languages, is an element inserted between the terminal symbols ( and ). In the next paragraphs some of the main instructions of the LISP are explained. All instructions, functions and variables in LISP are treated as lists.start with prog; prog ::= lists {: file_out.write("declare i32 @printf(i8*, ...)\n"); file_out.write("@.str.0 = private constant [4 x i8] c"+new String("\"%f\\0A\\00\"")+", align 1\n"); file_out.write("@.str.1 = private constant [4 x i8] c"+new String("\"%d\\0A\\00\"")+", align 1\n"); file_out.write("@.str.2 = private constant [4 x i8] c"+new String("\"%d\\20\\00\"")+", align 1\n"); file_out.write("@.str.3 = private constant [4 x i8] c"+new String("\"%f\\20\\00\"")+", align 1\n"); //Adding strings as global variables at the beginning of the LLVM program writeOnGlobal = true; for(varObject var : vars.values()){ if(var.align != null){ if(var.align.equals("1")){ var.stringValue = var.stringValue.replaceAll("\"", ""); addCode("@.str."+var.label+" = constant ["+new Integer(var.stringValue.length()+2).toString()+" x i8] c\""+var.stringValue+"\\0A\\00\", align 1\n"); } } } writeOnGlobal = false; //Writing global code file_out.write(globalCode.toString()); //Writing ir code inside the main function file_out.write("define void @main(){\n"); file_out.write(irCode.toString()); file_out.write("ret void\n}\n"); file_out.flush(); file_out.close(); System.out.println("Program terminated. File correclty written!"); :}; lists ::= lists list | list; list ::= RO instructions RC | RO glob_vars RC | RO if_stmt RC | RO loop_stmt RC | RO loop_for RC | RO num_list RC | expression | func;
DEFVAR
and SETQ
are the most important instructions in LISP to initialize global variables. In the following will be provided an example with only SETQ
because the implementation in the parser of DEFVAR
is very similar.
global_vars ::= SETQ ID:id RO MAKEARRAY INT:i ELEMENTTYPE NOTEVAL:typ RC{: //If variable does not exist create it if(parser.vars.get(id) == null){ //Inserting variable into map arrayVar var = new arrayVar(i, false); String type = typ.replace("'", ""); if(type.equals("double-float")){ var.allocate("double"); } else if(type.equals("integer")){ var.allocate("i32"); } else{ pSemError("Invalid type!"); } var.global = true; parser.vars.put(id, var); } else{ //else update it arrayVar var = new arrayVar(i, false); String type = typ.replace("'", ""); if(type.equals("double-float")){ var.allocate("double"); } else if(type.equals("integer")){ var.allocate("i32"); } else{ pSemError("Invalid type!"); } var.label = parser.vars.get(id).label; //Keeping old label parser.vars.put(id, var); } RESULT = parser.vars.get(id).label; :} | SETQ ID:id DOUBLE:num {: //If variable does not exist create it if(parser.vars.get(id) == null){ //Inserting variable into map doubleVar var = new doubleVar(num, "8", true); var.global = true; parser.vars.put(id, var); //Adding code to LLVM addCode("%"+var.label+" = alloca double, align 8\n"); addCode("store double "+var.stringValue+", double* %"+var.label+"\n"); } else{ //else update it doubleVar var = new doubleVar(num, "8", false); var.label = parser.vars.get(id).label; //Keeping old label parser.vars.put(id, var); //Adding code to LLVM - updating value addCode("store double "+var.stringValue+", double* %"+var.label+"\n"); } RESULT = num; :} | SETQ ID:id INT:num {: //If variable does not exist create it if(parser.vars.get(id) == null){ //Inserting variable into map intVar var = new intVar(num, "4", true); var.global = true; parser.vars.put(id, var); //Adding code to LLVM addCode("%"+var.label+" = alloca i32, align 4\n"); addCode("store i32 "+var.stringValue+", i32* %"+var.label+"\n"); } else{ //else update it intVar var = new intVar(num, "4", false); var.label = parser.vars.get(id).label; //Keeping old label parser.vars.put(id, var); //Adding code to LLVM - updating value addCode("store i32 "+var.stringValue+", i32* %"+var.label+"\n"); } RESULT = num; :}
SETQ
if the variable does not exist yet a new one is created. If it exists a new intVar
, arrayVar
or doubleVar
object is created but the old label is maintaned.With the following LISP code:
(setq a (make-array 3 :element-type 'integer)) (defvar b 4.0) (setq b 3.0)
The following LLVM Code:
declare i32 @printf(i8*, ...) @.str.0 = private constant [4 x i8] c"%f\0A\00", align 1 @.str.1 = private constant [4 x i8] c"%d\0A\00", align 1 @.str.2 = private constant [4 x i8] c"%d\20\00", align 1 @.str.3 = private constant [4 x i8] c"%f\20\00", align 1 define void @main(){ %1 = alloca [3 x i32], align 4 %2 = alloca double, align 8 store double 4.0, double* %2 store double 3.0, double* %2 ret void }
LISP uses prefix notation so, operators are written before their operands. For example, the expression:
a * ( b + c ) / d
in LISP is written as:
(/ (* a (+ b c) ) d)
Let's consider the following expression:
(setq a (+ 1 (- 3 4) (logand 3 (+ 2 3 4)) (- 4 5)))
The strategy used to evaluate this instruction is to start from the most nested expressions (like (+ 2 3 4)). When an expression
is reduced, in the RESULT variable is memorized the label of the LLVM instruction where the result value is stored. In this way, the upper level expression
is able to access the result and treating it as operand to calculate it's own result value and so the process repeats. If just one of the operands of an expression is a double all int operands are converted. For this purpose the sitofp
LLVM instruction is used.
expression ::= RO operator:x num_list:nl RC {: boolean doubleFlag = false; String previousLabel = ""; String localLabel = ""; String[] operands = new String[2]; String[] tempOperands = new String[2]; //Needed to manage conversions of operands for division sdiv Integer count = 0; String tempLabelConversion = ""; //Check if at least 1 element is double //in that case all other values need to //be converted if are int for(String s: nl){ if(vars.get(s) instanceof doubleVar){ doubleFlag = true; } } if(x.equals("and") || x.equals("or") || x.equals("xor")){ //with logical operators operands can be only integer if(doubleFlag){ pSemError("Logical operations require integer numbers"); } } ListIterator<String> litr = nl.listIterator(nl.size()); if(doubleFlag){ while(litr.hasPrevious()){ String s = litr.previous(); if(previousLabel.equals("")){ if(vars.get(s) instanceof intVar){ if(vars.get(s).label != null){ //It's an int stored variable localLabel = getVarLabel(); addCode("%"+localLabel+" = load i32, i32* %"+vars.get(s).label+", align 4\n"); localLabel = getVarLabel(); addCode("%"+localLabel+" = sitofp i32 %"+localLabel+" to double\n"); operands[count++] = "%"+localLabel; } else{ tempLabelConversion = getVarLabel(); addCode("%"+tempLabelConversion+" = sitofp i32 " + s + " to double\n" ); operands[count++] = "%"+tempLabelConversion; } } else{ if(vars.get(s).label != null){ //It's a double stored variable localLabel = getVarLabel(); addCode("%"+localLabel+" = load double, double* %"+vars.get(s).label+", align 8\n"); operands[count++] = "%"+localLabel; }else{ operands[count++]= s; } } if(count == 2){ previousLabel = "%"+getVarLabel(); if(x.equals("fdiv")){ addCode(previousLabel+" = fdiv double "+operands[0]+", "+operands[1]+"\n"); } else{ addCode(previousLabel+" = f"+x+" double "+operands[0]+", "+operands[1]+"\n"); } count=0; } } else{ operands[count++] = previousLabel; if(vars.get(s) instanceof intVar){ tempLabelConversion = getVarLabel(); addCode("%"+tempLabelConversion+" = sitofp i32 " + s + " to double\n" ); operands[count++] = "%"+tempLabelConversion; } else{ operands[count++]= s; } previousLabel = "%"+getVarLabel(); if(x.equals("fdiv")){ addCode(previousLabel+" = fdiv double "+operands[0]+", "+operands[1]+"\n"); } else{ addCode(previousLabel+" = f"+x+" double "+operands[0]+", "+operands[1]+"\n"); } count=0; } } vars.put(previousLabel, new doubleVar()); } else{ while(litr.hasPrevious()){ String s = litr.previous(); if(previousLabel.equals("")){ if(vars.get(s).label != null){ //It's an int stored variable localLabel = getVarLabel(); addCode("%"+localLabel+" = load i32, i32* %"+vars.get(s).label+", align 4\n"); operands[count++] = "%"+localLabel; } else{ operands[count++]= s; } if(count == 2){ //A division implies a floating value if(x.equals("fdiv")){ tempOperands[0] = "%"+getVarLabel(); tempOperands[1] = "%"+getVarLabel(); addCode(tempOperands[0]+" = sitofp i32 "+operands[0]+" to double\n"); addCode(tempOperands[1]+" = sitofp i32 "+operands[1]+" to double\n"); previousLabel = "%"+getVarLabel(); addCode(previousLabel+" = fdiv double "+tempOperands[0]+", "+tempOperands[1]+"\n"); } else{ previousLabel = "%"+getVarLabel(); addCode(previousLabel+" = "+x+" i32 "+operands[0]+", "+operands[1]+"\n"); } count=0; } } else{ operands[count++] = previousLabel; if(vars.get(s).label != null){ //It's an int stored variable localLabel = getVarLabel(); addCode("%"+localLabel+" = load i32, i32* %"+vars.get(s).label+", align 4\n"); operands[count++] = "%"+localLabel; } else{ operands[count++]= s; } if(x.equals("fdiv")){ tempOperands[0] = "%"+getVarLabel(); tempOperands[1] = "%"+getVarLabel(); addCode(tempOperands[0]+" = sitofp i32 "+operands[0]+" to double\n"); addCode(tempOperands[1]+" = sitofp i32 "+operands[1]+" to double\n"); previousLabel = "%"+getVarLabel(); addCode(previousLabel+" = fdiv double "+tempOperands[0]+", "+tempOperands[1]+"\n"); } else{ previousLabel = "%"+getVarLabel(); addCode(previousLabel+" = "+x+" i32 "+operands[0]+", "+operands[1]+"\n"); } count=0; } } //A division implies a floating value if(x.equals("fdiv")){ vars.put(previousLabel, new doubleVar()); } else{ vars.put(previousLabel, new intVar()); } } RESULT = new String(previousLabel); :};
The resulting LLVM code is:
declare i32 @printf(i8*, ...) @.str.0 = private constant [4 x i8] c"%f\0A\00", align 1 @.str.1 = private constant [4 x i8] c"%d\0A\00", align 1 @.str.2 = private constant [4 x i8] c"%d\20\00", align 1 @.str.3 = private constant [4 x i8] c"%f\20\00", align 1 define void @main(){ %1 = add i32 0,4 %2 = add i32 0,3 %3 = sub i32 %2, %1 %4 = add i32 0,4 %5 = add i32 0,3 %6 = add i32 0,2 %7 = add i32 %6, %5 %8 = add i32 %7, %4 %9 = add i32 0,3 %10 = and i32 %9, %8 %11 = add i32 0,5 %12 = add i32 0,4 %13 = sub i32 %12, %11 %14 = add i32 0,1 %15 = add i32 %14, %3 %16 = add i32 %15, %10 %17 = add i32 %16, %13 %18 = alloca i32, align 4 store i32 %17, i32* %18 %19 = load i32, i32* %18, align 4 %20 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0), i32 %19) ret void }
//Functions func ::= RO DEFUN ID:str RO num_list:nl NT2 RC lists RC {: funcObject func = new funcObject(str, nl.size()); func.returnType = "void"; //End of the function addCode("ret void \n}\n"); functionsMap.put(str, func); writeOnGlobal = false; varLabelCounter = tempGLobal; :}; NT2 ::= {: String name = (String) stack(-2); String arguments = new String(""); String label = ""; Integer functionCount = 3; //Functions' body in LLVM must declared on the global part writeOnGlobal = true; LinkedList<String> nl = (LinkedList<String>) stack(0); //Counting the arguments for(String s: nl){ arguments += "i32,"; vars.put(s, new intVar()); } StringBuilder tmp = new StringBuilder(arguments); tmp.setCharAt(arguments.length()-1, ')'); arguments = tmp.toString(); addCode("define void @"+ name+"("+arguments+"{\n"); int i = 0; //For each argument store a variable for(String s : nl){ label = (functionCount++).toString(); vars.get(s).label = label; addCode("%"+label+ " = alloca i32, align 4\n"); addCode("store i32 %"+ i+", i32* %"+label+"\n"); i++; } //LLVM Label counting inside functions is //indipendent from the main part tempGLobal = varLabelCounter; varLabelCounter = functionCount; :};
LISP functions with only integer parameters and void return type are supported.
NT2
. When the RO DEFUN ID:str RO num_list:nl
symbols are recognized, corresponding for example to (defun average(1, 2
it is quite clear that a new function has to be created, so the number of arguments is counted and the corresponding declare
LLVM instruction is written on the code
varLabelCounter
is temporary saved and initialized with the function counter valuefuncObject
is created and the final LLVM code is written to end the functionThe corresponding LISP code:
(defun division(n1 n2) (if (= n1 0) (print err)) (setq res (/ n2 n1)) (print res) )
is translated in the following LLVM code (simplified):
define void @division(i32,i32){ %3 = alloca i32, align 4 store i32 %0, i32* %3 %4 = alloca i32, align 4 store i32 %1, i32* %4 ... ... ... %16 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.0, i32 0, i32 0), double %15) ret void }
The following variants of the PRINT instruction are supported:
print a
- printing of declared variables (strings, integers, doubles)print arr
- printing of array variablesprint (+ 2 3 4)
- printing of direct expressionsprint (aref my-array 1)
- printing of array elementsFor semplicity only the parser code for declared variables is provided:
instructions ::= PRINT ID:id {: if(vars.get(id) != null){ //Printing a string if(vars.get(id) instanceof stringVar){ stringVar var = (stringVar) vars.get(id); String length = new Integer(var.stringValue.length()).toString(); addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds (["+length+" x i8], ["+length+" x i8]* @.str."+var.label+", i32 0, i32 0))\n"); } //Printing a double else if(vars.get(id) instanceof doubleVar){ doubleVar var = (doubleVar) vars.get(id); Integer localLabel = new Integer(getVarLabel()); addCode("%"+localLabel+" = load double, double* %"+var.label+", align 8\n"); addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.0, i32 0, i32 0), double %"+localLabel+")\n"); } //Printing an integer else if(vars.get(id) instanceof intVar){ intVar var = (intVar) vars.get(id); Integer localLabel = new Integer(getVarLabel()); addCode("%"+localLabel+" = load i32, i32* %"+var.label+", align 4\n"); addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0), i32 %"+localLabel+")\n"); } //Printing an array else if(vars.get(id) instanceof arrayVar){ arrayVar var = (arrayVar) vars.get(id); String label = ""; for(int i=0; i< var.size; i++){ label = getVarLabel(); addCode("%"+label+" = getelementptr inbounds ["+var.size+" x i32] , ["+var.size+" x i32]* %"+var.label+", i32 0, i32 "+i+"\n"); if(var.type.equals("i32")){ Integer localLabel = new Integer(getVarLabel()); addCode("%"+localLabel+" = load i32, i32* %"+label+", align 4\n"); if(i != var.size-1){ addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i32 0, i32 0), i32 %"+localLabel+")\n"); } else{ addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0), i32 %"+localLabel+")\n"); } } else{ Integer localLabel = new Integer(getVarLabel()); addCode("%"+localLabel+" = load double, double* %"+label+", align 8\n"); if(i != var.size-1){ addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.3, i32 0, i32 0), double %"+localLabel+")\n"); } else{ addCode("%"+getVarLabel()+" = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.0, i32 0, i32 0), double %"+localLabel+")\n"); } } } } } else{ pSemError("Variable not declared!"); } :}
stringVar
object and add the @printf
instruction to the LLVM code passing to this one the label (stored in localLabel
)intVar
and doubleVar
objects store the LLVM label of type double*
and i32*
that locate the memory area where the value is stored. With the load
instruction the value is red from the memory and then passed to the @printf
instructiongetelementptr
LLVM instruction is used. Because we want to print all the array (to print a specific cell AREF is used) the variable i is used to iterate over the positions from 0 to the size of the array minus 1 and passed to the instruction
In LISP the classical for statement present in other programming languages is implemented with the loop
or loop for
instructions. Here the parser implementation of the loop for
:
loop_for ::= LOOP FOR ID:i FROM INT:a TO INT:b DO {: String oldLabel = ""; String newLabel = ""; String cmpLabel = ""; String localLabel = ""; String localLabel1 = ""; loopList.addLast(new loopVar("for")); loopVar lv = loopList.getLast(); //Variable i is yet used in the code as global //Now becomes local if(vars.get(i) != null){ oldLabel = vars.get(i).label; newLabel = getVarLabel(); vars.get(i).label = newLabel; //To avoid use the global variable we need to use a new Label and store the old one addCode("%"+newLabel+" = alloca i32, align 4\n"); addCode("store i32 "+(a-1)+", i32* %"+newLabel+"\n"); } else{ intVar var = new intVar(a, "4", true); vars.put(i, var); newLabel = vars.get(i).label; addCode("%"+newLabel+" = alloca i32, align 4\n"); addCode("store i32 "+(a-1)+", i32* %"+newLabel+"\n"); } addCode("br label %for.body."+lv.label+"\n"); addCode("for.body."+lv.label+":\n"); localLabel = getVarLabel(); addCode("%"+localLabel+" = load i32, i32* %"+newLabel+", align 4\n"); localLabel1 = getVarLabel(); addCode("%"+localLabel1+" = add i32 %"+localLabel+", 1\n"); addCode("store i32 %"+localLabel1+", i32* %"+newLabel+", align 4\n"); localLabel = getVarLabel(); addCode("%"+localLabel+" = load i32, i32* %"+newLabel+", align 4\n"); cmpLabel = getVarLabel(); addCode("%"+cmpLabel+" = icmp slt i32 %"+localLabel+","+(b+1)+"\n"); addCode("br i1 %"+cmpLabel+", label %for.body2."+lv.label+", label %for.exit."+lv.label+"\n"); addCode("for.body2."+lv.label+":\n"); RESULT = oldLabel; :} lists{: loopVar lv = loopList.getLast(); addCode("br label %for.body."+lv.label+"\n"); addCode("for.exit."+lv.label+":\n"); String oldLabel = (String) stack(-1); //If oldLabel == "" means that ID:i was only a local vaiable in the loop, remove it if(oldLabel.equals("")){ vars.remove(i); } //else means that it had an oldLabel so was used as global -> restore the oldLabel else{ vars.get(i).label = oldLabel; } :};
loopList.addLast
add the current loop to the loop list. This allows us to 'select' the last loop inserted and correctly add to the code the labels. It is necessary when there are nested loops: we need to select always the last inserted. “for.body.”+lv.label+“:\n”
we enter in the loop body. The first thing to do is to load the loop variable, adding to it 1 and storing it. Then it is compared with the second INT to eventually terminate the loopThe compiler is able to recognize the following kind of errors:
loop for
and make-array
are supported only direct numerical values, not variablesAn example of a LISP program used to calculate the first 22 numbers of the Fibonacci serie:
(defvar a 0) (defvar b 1) (loop for i from 0 to 10 do (write a) (if (= i 10) (print b) (write b)) (setq a (+ a b)) (setq b (+ a b)) )
Execution of program will produce the following output:
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946
Compiler: lisp_compiler
Examples
Each example is provided separately and contains the LISP source code and the corresponding output in LLVM IR, generated by the compiler. For sure, each source code can be given as input of the compiler, which will produce the same output contained in the example. In general after
sudo apt install llvm
jflex lisp_scanner.jflex
java java_cup.Main lisp_parser.cup
javac *.java
java Main source_file.lisp
make bubble
make triagle
make fibo
make division
make
command will run bubble sort examplelli output.ll