compilers:julia_to_llvm
Return to Home page
If you found any error, or if you want to partecipate to the editing of this wiki, please contact: admin [at] skenz.it
You can reuse, distribute or modify the content of this page, but you must cite in any document (or webpage) this url: https://www.skenz.it/compilers/julia_to_llvm?do=diff
no way to compare when less than two revisions
Differences
This shows you the differences between two versions of the page.
— | compilers:julia_to_llvm [2024/04/08 22:34] (current) – created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== From Julia to LLVM ====== | ||
+ | ===== Background ===== | ||
+ | This page shows the implementation of a compiler which recognizes and translates part of the Julia programming language into the LLVM IR syntax (more information about LLVM can be found [[https:// | ||
+ | |||
+ | ==== Implemented parts ==== | ||
+ | List of Julia parts supported: | ||
+ | * **Data types:** | ||
+ | * Basic data types: | ||
+ | * int | ||
+ | * doubles | ||
+ | * 1-dimension array | ||
+ | * (Local scope for variables is implemented) | ||
+ | |||
+ | * **Operators: | ||
+ | * arithmetic: | ||
+ | * sum (**+**) | ||
+ | * difference (**-**) | ||
+ | * multiplication (*) | ||
+ | * division (**< | ||
+ | * compound: | ||
+ | * sum (**+=**) | ||
+ | * difference (**-=**) | ||
+ | * multiplication (***=**) | ||
+ | * division (**< | ||
+ | * comparison: | ||
+ | * equality(**< | ||
+ | * greater than (**>**) | ||
+ | * less than (**<**) | ||
+ | * greater than equal to (**< | ||
+ | * less than equal to (**< | ||
+ | * logic: | ||
+ | * AND (**&& | ||
+ | * OR (**||**) | ||
+ | * **Control flow instructions: | ||
+ | * if instruction | ||
+ | * if-then-else instruction | ||
+ | * for loop over a range | ||
+ | * while loop | ||
+ | * **Function details (partial implementation): | ||
+ | * double input parameters | ||
+ | * double return value | ||
+ | * **Output:** | ||
+ | * print/ | ||
+ | * with and without string interpolation (“x value is: $x”) | ||
+ | |||
+ | |||
+ | ==== Compiler ==== | ||
+ | The compiler is made up of two parts: a scanner and a parser. | ||
+ | |||
+ | ===== Scanner ===== | ||
+ | The scanner is able to recognize and retrieve tokens (terminal symbols) to the parser coupled with an object containing a value that the token represents. | ||
+ | It identifies integer, doubles and ids (that will be used for variables, function names, …) and other significant Julia keywords like: | ||
+ | * else | ||
+ | * end | ||
+ | * for | ||
+ | * function | ||
+ | * if | ||
+ | * in | ||
+ | |||
+ | * println | ||
+ | * return | ||
+ | * while | ||
+ | |||
+ | And other syntax elements like punctuation and other symbols. | ||
+ | |||
+ | ==== Snippet of Julia Scanner ==== | ||
+ | |||
+ | <code none[enable_line_numbers=" | ||
+ | d = [A-Za-z_][A-Za-z0-9_]* | ||
+ | integer = ([1-9][0-9]*|0) | ||
+ | double = (([0-9]+\.[0-9]*) | ([0-9]*\.[0-9]+)) (e|E(' | ||
+ | |||
+ | %% | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | "/" | ||
+ | |||
+ | ... | ||
+ | |||
+ | "&&" | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | |||
+ | |||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | " | ||
+ | |||
+ | ... | ||
+ | |||
+ | {id} {return symbol(sym.ID, | ||
+ | |||
+ | {integer} {return symbol(sym.INT, | ||
+ | |||
+ | {double} | ||
+ | </ | ||
+ | |||
+ | ===== Parser ===== | ||
+ | The parser is able to take as input the tokens that are provided by the scanner and recognize the main grammatical rules of Julia language. | ||
+ | Consequently, | ||
+ | |||
+ | ==== Data structures ==== | ||
+ | This snippet shows all variables and classes used to support the parser on the creation of the output program: | ||
+ | <code Java> | ||
+ | |||
+ | public HashMap <String, TypeVar> symbolTable; | ||
+ | public HashMap <String, TypeFun> functionTable; | ||
+ | public boolean isCorrect = true; | ||
+ | public StringBuffer printBuff; | ||
+ | public ArrayList< | ||
+ | |||
+ | public int var_label = 0; | ||
+ | public int str_label = 0; | ||
+ | |||
+ | public int flow_label = 0; | ||
+ | |||
+ | public int tot_flow_label = 0; | ||
+ | |||
+ | public BufferedWriter bwr; | ||
+ | |||
+ | public String retType; | ||
+ | public int genVarLabel(){ | ||
+ | |||
+ | var_label++; | ||
+ | |||
+ | return var_label; | ||
+ | }; | ||
+ | |||
+ | public int genStrLabel(){ | ||
+ | |||
+ | str_label++; | ||
+ | |||
+ | return str_label; | ||
+ | }; | ||
+ | |||
+ | public class TypeVar{ | ||
+ | |||
+ | public String reg_id; | ||
+ | public String type; | ||
+ | public String value; | ||
+ | public Integer align; | ||
+ | public Integer size1; | ||
+ | public Integer size2; | ||
+ | public Integer l; | ||
+ | |||
+ | public TypeVar() | ||
+ | { | ||
+ | reg_id = Integer.toString(genVarLabel()); | ||
+ | size1 = size2 = -1; | ||
+ | } | ||
+ | TypeVar(Integer value, String type, Integer align) | ||
+ | { | ||
+ | this(); | ||
+ | this.value = Integer.toString(value); | ||
+ | this.type = type; | ||
+ | this.align = align; | ||
+ | } | ||
+ | |||
+ | TypeVar(Double value, String type, Integer align) | ||
+ | { | ||
+ | this(); | ||
+ | this.value = Double.toString(value); | ||
+ | this.type = type; | ||
+ | this.align = align; | ||
+ | } | ||
+ | |||
+ | |||
+ | } | ||
+ | |||
+ | public class TypeFun{ | ||
+ | |||
+ | ArrayList< | ||
+ | Integer nPar; | ||
+ | String retT; | ||
+ | |||
+ | public TypeFun(ArrayList< | ||
+ | { | ||
+ | this.parT = parT; | ||
+ | this.nPar = parT.size(); | ||
+ | this.retT = retT; | ||
+ | } | ||
+ | |||
+ | } | ||
+ | |||
+ | |||
+ | </ | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | |||
+ | ==== Grammar start ==== | ||
+ | The grammar defined for the parser assumes that there are function definitions first and then there are the instructions that will be included in the '' | ||
+ | '' | ||
+ | '' | ||
+ | <code Java> | ||
+ | prog ::= function_defs {: | ||
+ | if(parser.isCorrect) | ||
+ | { | ||
+ | bwr.write(" | ||
+ | bwr.write(printBuff.toString()); | ||
+ | | ||
+ | } | ||
+ | else | ||
+ | System.out.println(" | ||
+ | var_label = 0; | ||
+ | printBuff.setLength(0); | ||
+ | | ||
+ | : | ||
+ | if(parser.isCorrect) | ||
+ | { | ||
+ | for(String s : stringDef) | ||
+ | { | ||
+ | bwr.write(s+" | ||
+ | } | ||
+ | |||
+ | bwr.write(" | ||
+ | | ||
+ | bwr.write(printBuff.toString()); | ||
+ | | ||
+ | bwr.write(" | ||
+ | bwr.flush(); | ||
+ | |||
+ | //close the stream | ||
+ | bwr.close(); | ||
+ | } | ||
+ | else | ||
+ | System.out.println(" | ||
+ | |||
+ | :}; | ||
+ | </ | ||
+ | |||
+ | ==== Some practical examples ==== | ||
+ | ==Recognition of scalars and variable IDs== | ||
+ | '' | ||
+ | Each time an '' | ||
+ | While when an '' | ||
+ | <code Java> | ||
+ | val ::= ID:x {: | ||
+ | if(!parser.symbolTable.containsKey(x)) | ||
+ | { | ||
+ | pSemError(" | ||
+ | }else{ | ||
+ | RESULT = parser.symbolTable.get(x); | ||
+ | |||
+ | } | ||
+ | |||
+ | :} | ||
+ | | INT:x {: | ||
+ | RESULT = new TypeVar(x, " | ||
+ | append(" | ||
+ | append(" | ||
+ | :} | ||
+ | | DOUBLE:x {: | ||
+ | RESULT = new TypeVar(x, " | ||
+ | append(" | ||
+ | append(" | ||
+ | :} | ||
+ | ; | ||
+ | </ | ||
+ | |||
+ | ==Array definition== | ||
+ | This rules uses '' | ||
+ | '' | ||
+ | |||
+ | <code Java> | ||
+ | assignment ::= ID:id EQU SO elem_list:x SC{: | ||
+ | TypeVar nTypeVar = new TypeVar(); | ||
+ | Integer arrReg = Integer.parseInt(nTypeVar.reg_id); | ||
+ | append(" | ||
+ | for(int i = 0; i< | ||
+ | { | ||
+ | TypeVar xTy = x.get(i); | ||
+ | append(" | ||
+ | append(" | ||
+ | append(" | ||
+ | } | ||
+ | | ||
+ | nTypeVar.type = x.get(0).type; | ||
+ | nTypeVar.align = x.get(0).align; | ||
+ | nTypeVar.size1 = x.size(); | ||
+ | addEntry(id, | ||
+ | |||
+ | :} | ||
+ | </ | ||
+ | For example, given the following code: | ||
+ | <code Julia> | ||
+ | a = [1,2,3] | ||
+ | </ | ||
+ | Here's the correspondent LLVM IR code: | ||
+ | <code ASM> | ||
+ | %1 = alloca i32, align 4 | ||
+ | store i32 1, i32* %1 | ||
+ | %2 = alloca i32, align 4 | ||
+ | store i32 2, i32* %2 | ||
+ | %3 = alloca i32, align 4 | ||
+ | store i32 3, i32* %3 | ||
+ | %4 = alloca [3 x i32], align 4 | ||
+ | %5 = getelementptr inbounds [3 x i32], [3 x i32]* %4, i32 0, i32 0 | ||
+ | %6 = load i32, i32* %1, align 4 | ||
+ | store i32 %6, i32* %5, align 4 | ||
+ | %7 = getelementptr inbounds [3 x i32], [3 x i32]* %4, i32 0, i32 1 | ||
+ | %8 = load i32, i32* %2, align 4 | ||
+ | store i32 %8, i32* %7, align 4 | ||
+ | %9 = getelementptr inbounds [3 x i32], [3 x i32]* %4, i32 0, i32 2 | ||
+ | %10 = load i32, i32* %3, align 4 | ||
+ | store i32 %10, i32* %9, align 4 | ||
+ | </ | ||
+ | ==== Function implementation ==== | ||
+ | The following code shows how LLVM IR code is produced for function definition. | ||
+ | //Remember: function definition implementation only supports doubles as input and return parameters.// | ||
+ | |||
+ | When the function definition is recognized, the parser produces the function definition in LLVM IR code. | ||
+ | Before parsing the function statements, the input parameters are stored in new registers. | ||
+ | <code Java> | ||
+ | function_def ::= FUNCT ID:f RO ID_list: | ||
+ | |||
+ | var_label = par.size()+1; | ||
+ | append(" | ||
+ | for(int i = 0; i< | ||
+ | { | ||
+ | append(" | ||
+ | if(i != (par.size()-1)) | ||
+ | append(", | ||
+ | else | ||
+ | append(" | ||
+ | |||
+ | } | ||
+ | for(int i = 0; i< | ||
+ | { | ||
+ | Integer reg; | ||
+ | if(i==0) | ||
+ | reg = var_label; | ||
+ | else | ||
+ | reg = genVarLabel(); | ||
+ | append(" | ||
+ | append(" | ||
+ | TypeVar p = new TypeVar(); | ||
+ | var_label--; | ||
+ | p.reg_id = Integer.toString(reg); | ||
+ | p.type = new String(" | ||
+ | p.align = new Integer(8); | ||
+ | addEntry(par.get(i), | ||
+ | |||
+ | } | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | :} RC statements END{: | ||
+ | append(" | ||
+ | String fName = parser.stack(-6).toString(); | ||
+ | ArrayList< | ||
+ | for(int i = 0; i< | ||
+ | { | ||
+ | pType.add(" | ||
+ | } | ||
+ | TypeFun funct = new TypeFun(pType, | ||
+ | functionTable.put(fName, | ||
+ | |||
+ | pType.set((pType.size()-1), | ||
+ | funct = new TypeFun(pType, | ||
+ | | ||
+ | Integer index = printBuff.indexOf(fName)-2; | ||
+ | printBuff.insert(index, | ||
+ | |||
+ | |||
+ | retType = " | ||
+ | var_label = 0; | ||
+ | symbolTable.clear(); | ||
+ | :}; | ||
+ | </ | ||
+ | |||
+ | After function statements are recognized, the parser defines the correspondent entry in the '' | ||
+ | |||
+ | Let's consider the following function definition in Julia: | ||
+ | <code Julia> | ||
+ | function division(a, | ||
+ | |||
+ | ... | ||
+ | |||
+ | end | ||
+ | </ | ||
+ | |||
+ | The correspondent LLVM IR code will be: | ||
+ | <code ASM> | ||
+ | define double @division(double, | ||
+ | %3 = alloca double, align 8 | ||
+ | store double %0, double* %3 | ||
+ | %4 = alloca double, align 8 | ||
+ | store double %1, double* %4 | ||
+ | ... | ||
+ | } | ||
+ | </ | ||
+ | ==== Print instruction ==== | ||
+ | When a string is recognized by the parser, it calls the '' | ||
+ | Then, for simplicity and for exploit this CreateString functionality, | ||
+ | If a numerical value is printed, the operation of pre-placing a $ symbol is obviously not done. If an array element is printed, '' | ||
+ | <code Java> | ||
+ | print_instr ::= print_keyw: | ||
+ | CreateString(nl, | ||
+ | :} | ||
+ | | print_keyw: | ||
+ | | ||
+ | if(!x.value.matches(" | ||
+ | { | ||
+ | x.value = " | ||
+ | } | ||
+ | CreateString(nl, | ||
+ | :} | ||
+ | | print_keyw: | ||
+ | {: | ||
+ | String s = new String(" | ||
+ | if(nl) | ||
+ | s = " | ||
+ | s = " | ||
+ | Integer length = s.length()-(nl? | ||
+ | append(" | ||
+ | parser.stringDef.add(" | ||
+ | append((" | ||
+ | |||
+ | |||
+ | :} | ||
+ | </ | ||
+ | |||
+ | ==== Error handling ==== | ||
+ | The compiler is able to recognize the following kind of errors: | ||
+ | * Variable not declared | ||
+ | * Variable is not an array | ||
+ | * Function not defined | ||
+ | * Generic error in assignment | ||
+ | * Missing ] in array definition | ||
+ | * Missing ] in matrix definition | ||
+ | * General error in while condition | ||
+ | * Missing ) in while condition | ||
+ | * General error in if condition | ||
+ | * General error in print instruction | ||
+ | |||
+ | ==== Missing functionalities, | ||
+ | * Matrix and touples grammar rules defined but their usage is not supported | ||
+ | * Functions work only with only doubles as input and return parameters | ||
+ | * String assignment to variable not supported | ||
+ | * Doubles are supported for arithmetic operations but not for other code functionalities (e.g. no support for array of double,…) | ||
+ | * Print instruction doesn' | ||
+ | * No support for global variables | ||
+ | |||
+ | ===== Download and installation ===== | ||
+ | **Compiler**: | ||
+ | |||
+ | **Examples** | ||
+ | * Division: [[https:// | ||
+ | * Bubble sort: [[https:// | ||
+ | * Floyd triangle: [[https:// | ||
+ | |||
+ | Examples contains a Julia source code and the correspondent LLVM IR code. | ||
+ | |||
+ | ==== How to run it ==== | ||
+ | - Install the //llvm// package: '' | ||
+ | - Download the // | ||
+ | - Open the terminal, go to the folder where the compiler is extracted and run the following commands: | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | - This will produce an '' | ||
+ | - Run output.ll file with //lli//: '' | ||
+ | |||
+ | |||
If you found any error, or if you want to partecipate to the editing of this wiki, please contact: admin [at] skenz.it
You can reuse, distribute or modify the content of this page, but you must cite in any document (or webpage) this url: https://www.skenz.it/compilers/julia_to_llvm?do=diff
/web/htdocs/www.skenz.it/home/data/pages/compilers/julia_to_llvm.txt · Last modified: 2024/04/08 22:34 by 127.0.0.1