This page shows the implementation of a compiler that recognizes and translates part of the Matlab programming language into the LLVM IR syntax (more information about LLVM can be found here).
List of the Matlab features implemented
Data types
Operators
Code sub-blocks
Function details (partial implementation):
Output:
The compiler consists of two parts: a scanner and a parser.
The scanner recognizes tokens (terminal symbols) and passes them to the parser, each coupled with an object holding the value that the token represents. It identifies integers, doubles and ids (used for variables, function names, etc.), as well as significant Matlab keywords such as:
if
else
end
for
while
function
fprintf
disp
It also recognizes other syntax elements such as punctuation and operator symbols.
nl = \r|\n|\r\n
ws = [ \t]
id = [A-Za-z][A-Za-z0-9_]*
integer = ([1-9][0-9]*|0)
double = (([0-9]+\.[0-9]*) | ([0-9]*\.[0-9]+)) (e|E('+'|'-')?[0-9]+)?
%%
"(" {return symbol(sym.RO);}
")" {return symbol(sym.RC);}
"=" {return symbol(sym.EQ);}
"+" {return symbol(sym.PLUS);}
"-" {return symbol(sym.MINUS);}
"*" {return symbol(sym.STAR);}
".*" {return symbol(sym.DOTSTAR);}
"/" {return symbol(sym.DIV);}
"./" {return symbol(sym.DOTDIV);}
"<" {return symbol(sym.MIN);}
">" {return symbol(sym.MAJ);}
"<=" {return symbol(sym.MIN_EQ);}
"=<" {return symbol(sym.EQ_MIN);}
">=" {return symbol(sym.MAJ_EQ);}
"=>" {return symbol(sym.EQ_MAJ);}
"&" {return symbol(sym.AND);}
"|" {return symbol(sym.OR);}
"~" {return symbol(sym.NOT);}
"[" {return symbol(sym.SO);}
"]" {return symbol(sym.SC);}
"function" {return symbol(sym.FUNCT);}
"end" {return symbol(sym.END);}
"disp" {return symbol(sym.DISP);}
"fprintf" {return symbol(sym.PRINT);}
"if" {return symbol(sym.IF);}
"while" {return symbol(sym.WHILE);}
"for" {return symbol(sym.FOR);}
"else" {return symbol(sym.ELSE);}
";" {return symbol(sym.S);}
"," {return symbol(sym.CM);}
":" {return symbol(sym.C);}
{id} {return symbol(sym.ID, yytext());}
{integer} {return symbol(sym.INT, new Integer(yytext()));}
{double} {return symbol(sym.DOUBLE, new Double(yytext()));}
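To try the id, integer and double patterns outside JFlex, they can be approximated with java.util.regex. This is only a sketch: the class TokenPatterns and the helper classify are hypothetical names, and the exponent part assumes the intended form is e or E followed by an optional sign and digits.

```java
import java.util.regex.Pattern;

class TokenPatterns {
    // Java approximations of the id, integer and double macros of the scanner.
    static final Pattern ID = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");
    static final Pattern INTEGER = Pattern.compile("[1-9][0-9]*|0");
    // Assumption: the exponent is e or E, an optional sign, then digits.
    static final Pattern DOUBLE =
        Pattern.compile("(([0-9]+\\.[0-9]*)|([0-9]*\\.[0-9]+))([eE][+-]?[0-9]+)?");

    // Hypothetical helper: names the token class that a complete lexeme matches.
    static String classify(String lexeme) {
        if (INTEGER.matcher(lexeme).matches()) return "INT";
        if (DOUBLE.matcher(lexeme).matches()) return "DOUBLE";
        if (ID.matcher(lexeme).matches()) return "ID";
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        for (String lexeme : new String[] {"42", "3.14", ".5", "1.5e-3", "x_1", "007"}) {
            System.out.println(lexeme + " -> " + classify(lexeme));
        }
    }
}
```

The sketch has no keyword handling, so a word such as while is classified as ID here. In the scanner the keyword rules are listed before the {id} rule, so they take priority over it.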
The parser takes the tokens provided by the scanner as input and recognizes the main grammatical rules of the Matlab language, producing LLVM IR code as a result.
This snippet shows all the variables and classes that support the parser in creating the output program:
public HashMap <String, InfoVar> symbolTable;
public HashMap <String, InfoFun> functionTable;
public boolean isCorrect = true;
public StringBuffer stamentsBuff;
public ArrayList<String> stringStatements;
public int var_count = 0;
public int str_label = 0;
public int sub_label = 0;
public int else_label = 1;
public int tot_sub_label = 0;
public int cmp_count=0;
public boolean activate_while = false;
public boolean desctivate_while = false;
public boolean activate_for = false;
public boolean desctivate_for = false;
public String ret_id = "";
public BufferedWriter bwr;

public int genVarCount(){
    var_count++;
    return var_count;
}

public int genStrCount(){
    str_label++;
    return str_label;
}

public class InfoVar{
    public String reg_id; //First label assigned to the variable
    public String load_to; //Reg id of the one who loaded an existing variable (default self reg_id)
    public String type; //i32, double
    public String value; //The real value of the variable (ex: 1 or 1.0)
    public Integer align; //alignment required: 4, 8...
    public Integer size1; //If the variable is an array then this is its size, otherwise size1 = -1
    public Integer size2; //If the variable is a matrix then this is its size, otherwise size2 = -1
    public boolean just_created; //It helps to know if an operation must use the load_to or the real value

    public InfoVar() {
        reg_id = Integer.toString(genVarCount());
        load_to = Integer.toString(var_count);
        size1 = size2 = -1;
    }

    InfoVar(Integer value, String type, Integer align) {
        this.just_created = true;
        this.value = Integer.toString(value);
        this.type = type;
        this.align = align;
    }

    InfoVar(Double value, String type, Integer align) {
        this.just_created = true;
        this.value = Double.toString(value);
        this.type = type;
        this.align = align;
    }
}

public class InfoFun{
    ArrayList<String> funParam;
    Integer numParam;
    String funRet;

    public InfoFun(ArrayList<String> funParam) {
        this.funParam = funParam;
        this.numParam = funParam.size();
        this.funRet = "i32";
    }
}
Class InfoVar: class that represents a variable, array or matrix
reg_id: the register in which the variable is stored
load_to: the register into which the variable is loaded
type: the type of the variable
value: the real value of the variable
align: the alignment of the variable
size1: size of the array
size2: number of columns of the matrix, if needed
just_created: helps to know whether an operation must use load_to or the real value
Class InfoFun: class used to represent function information
funParam: list of parameter types
numParam: number of parameters
funRet: return type
HashMap<String, InfoVar> symbolTable: hash map from a variable ID to its InfoVar
HashMap<String, InfoFun> functionTable: hash map from a function ID to its InfoFun
StringBuffer stamentsBuff: buffer that collects all the output before it is written to the output.ll file
ArrayList<String> stringStatements: list of the string definitions in LLVM IR, typically strings to be printed
var_count: counter used for register names in LLVM IR
str_label: counter for string label names
sub_label: counter for the current sub-section of code
else_label: counter for else labels
cmp_count: counter of the cmp registers used in the LLVM IR
tot_sub_label: counter for the total number of sub-sections of code
The grammar starts with the main symbol prog, which writes the contents of stamentsBuff to the output file output.ll. The non-terminal symbol function_defs is read first, so all function definitions are emitted at the beginning, before @main. At the end of each function definition var_count is reset, so that the main function can number its registers from the start. The string declarations are written between the function definitions and main.
prog ::= function_defs {:
        if(parser.isCorrect) {
            bwr.write("declare i32 @printf(i8*, ...)\n");
            bwr.write(stamentsBuff.toString());
        } else
            System.out.println("Program contains errors.");
        var_count = 0;
        stamentsBuff.setLength(0);
    :} statements {:
        if(parser.isCorrect) {
            for(String s : stringStatements) {
                bwr.write(s+"\n");
            }
            bwr.write("define void @main(){\n");
            bwr.write(stamentsBuff.toString());
            bwr.write("ret void\n}");
            bwr.flush();
            bwr.close();
        } else
            System.out.println("There are errors in the program");
    :};
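The order in which the prog rule writes the output file can be reproduced in isolation. The sketch below is not the parser code: ModuleLayout and assemble are hypothetical names, and the function, string and main bodies are passed in as ready-made text.

```java
import java.util.Arrays;
import java.util.List;

class ModuleLayout {
    // Concatenates the pieces in the order used by the prog rule:
    // printf declaration, function definitions, string constants, then @main.
    static String assemble(String functionDefs, List<String> stringConstants, String mainBody) {
        StringBuilder out = new StringBuilder();
        out.append("declare i32 @printf(i8*, ...)\n");
        out.append(functionDefs);
        for (String s : stringConstants) {
            out.append(s).append("\n");
        }
        out.append("define void @main(){\n");
        out.append(mainBody);
        out.append("ret void\n}");
        return out.toString();
    }

    public static void main(String[] args) {
        String functions = "define i32 @f(i32) {\nret i32 %0\n}\n";
        List<String> strings =
            Arrays.asList("@.str.1 = private constant [4 x i8] c\"hi\\0A\\00\", align 1");
        String mainBody = "%1 = alloca i32, align 4\nstore i32 5, i32* %1, align 4\n";
        System.out.println(assemble(functions, strings, mainBody));
    }
}
```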
In the following example, when a for or while construct is active, its corresponding labels are emitted before any register is loaded.
val ::= ID:x {:
        if(!parser.symbolTable.containsKey(x)) {
            pSemError("Error: Variable "+x+" is not declared.");
        }else{
            RESULT = parser.symbolTable.get(x);
            //To load the variables inside the "while" block
            if(activate_while){
                tot_sub_label++;
                sub_label = tot_sub_label;
                stamentsBuff.append("br label %while_cond." + sub_label+"\n");
                stamentsBuff.append("while_cond." + sub_label + ":"+"\n");
                activate_while = false;
                desctivate_while = true;
            }
            //To load the variables inside the "for" block
            if(activate_for){
                tot_sub_label++;
                sub_label = tot_sub_label;
                stamentsBuff.append("br label %for_cond." + sub_label+"\n");
                stamentsBuff.append("for_cond." + sub_label + ":"+"\n");
                activate_for = false;
            }
            stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+RESULT.reg_id+", align "+RESULT.align+"\n");
            RESULT.load_to = Integer.toString(var_count);
        }
    :}
    | ID:x RO arit_op:y RC {:
        if(!parser.symbolTable.containsKey(x)) {
            pSemError("Error: Variable "+x+" is not declared.");
        }else{
            RESULT = parser.symbolTable.get(x);
            if(!y.just_created)
                stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" %"+y.load_to+"\n");
            else
                stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(Integer.parseInt(y.value)-1)+"\n");
            stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
            RESULT.load_to = Integer.toString(var_count);
        }
    :}
    | ID:x RO arit_op:i CM arit_op:j RC {:
        if(!parser.symbolTable.containsKey(x)) {
            pSemError("Error: Variable "+x+" is not declared");
        }else{
            RESULT = parser.symbolTable.get(x);
            stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]], ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
            stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size2+" x "+RESULT.type+"], ["+RESULT.size2+" x "+RESULT.type+"]* %"+(var_count-1)+", "+RESULT.type+" 0, "+RESULT.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
            stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
            RESULT.load_to = Integer.toString(var_count);
        }
    :}
    | INT:x {:
        RESULT = new InfoVar(x, "i32", new Integer(4));
    :}
    | DOUBLE:x {:
        RESULT = new InfoVar(x, "double", new Integer(8));
    :}
    ;

//Elements of vectors of a matrix
matrix_elements ::= matrix_elements:x S vect_elements:y{:
        x.add(y);
        RESULT = x;
    :}
    | vect_elements:x{:
        RESULT = new ArrayList<ArrayList<InfoVar>>();
        RESULT.add(x);
    :}
    ;

//Elements of variables or constants of a vector
vect_elements ::= vect_elements:x elem:y{:
        x.add(y);
        RESULT = x;
    :}
    | elem:x {:
        RESULT = new ArrayList<InfoVar>();
        RESULT.add(x);
    :}
    ;
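The vector read with a constant index boils down to two instructions: a getelementptr whose index is the Matlab index minus one (Matlab counts from 1, LLVM from 0), followed by a load through the resulting pointer. A standalone sketch of that case follows. ElementLoad and its methods are hypothetical names, and unlike the rule above the sketch always uses i32 for the index operands.

```java
class ElementLoad {
    private final StringBuilder out = new StringBuilder();
    private int varCount; // last register number handed out so far

    ElementLoad(int lastUsedRegister) {
        this.varCount = lastUsedRegister;
    }

    private int nextReg() {
        return ++varCount;
    }

    // Emits the IR that reads v(index) for a constant, 1-based index and
    // returns the number of the register that holds the loaded value.
    int loadConstIndex(String arrayReg, int size, String type, int align, int oneBasedIndex) {
        String arrayTy = "[" + size + " x " + type + "]";
        int ptr = nextReg();
        out.append("%" + ptr + " = getelementptr inbounds " + arrayTy + ", " + arrayTy
            + "* %" + arrayReg + ", i32 0, i32 " + (oneBasedIndex - 1) + "\n");
        int value = nextReg();
        out.append("%" + value + " = load " + type + ", " + type + "* %" + ptr
            + ", align " + align + "\n");
        return value;
    }

    String text() {
        return out.toString();
    }

    public static void main(String[] args) {
        // v is a 3-element i32 vector allocated in %1; registers up to %3 are taken.
        ElementLoad gen = new ElementLoad(3);
        int reg = gen.loadConstIndex("1", 3, "i32", 4, 2); // v(2)
        System.out.print(gen.text());
        System.out.println("value is in %" + reg);
    }
}
```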
Matrix definitions reuse the array (vector) definitions, since a matrix is just a list of vectors. In the same way, an array definition is a list of InfoVar objects.
//Vector
| ID:id EQ SO vect_elements:x SC{:
    InfoVar nInfoVar = new InfoVar();
    Integer vector_Register = Integer.parseInt(nInfoVar.reg_id);
    stamentsBuff.append("%"+vector_Register+" = alloca ["+x.size()+" x "+x.get(0).type+"], align "+x.get(0).align+"\n");
    for(int i = 0; i<x.size(); i++) {
        InfoVar xTy = x.get(i);
        stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x "+x.get(i).type+"], ["+x.size()+" x "+x.get(i).type+"]* %"+vector_Register+", "+x.get(i).type+" 0, "+x.get(i).type+" "+i+"\n");
        stamentsBuff.append("store "+xTy.type+" "+(x.get(i).just_created?x.get(i).value:"%"+x.get(i).load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
    }
    nInfoVar.type = x.get(0).type;
    nInfoVar.align = x.get(0).align;
    nInfoVar.size1 = x.size();
    addSymbol(id, nInfoVar );
:}
//Vector element assignment
| ID:id RO arit_op:x RC EQ arit_op:y {:
    InfoVar idVar = parser.symbolTable.get(id);
    if(!x.just_created)
        stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" %"+x.load_to+"\n");
    else
        stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(Integer.parseInt(x.value)-1)+"\n");
    stamentsBuff.append("store "+idVar.type+" "+(y.just_created?y.value:"%"+y.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}
//Matrix
| ID:id EQ SO matrix_elements:x SC{:
    InfoVar nInfoVar = new InfoVar();
    Integer matrix_Register = Integer.parseInt(nInfoVar.reg_id);
    stamentsBuff.append("%"+matrix_Register+" = alloca ["+x.size()+" x ["+x.get(0).size()+" x "+x.get(0).get(0).type+"]], align "+x.get(0).get(0).align+"\n");
    for(int i = 0; i<x.size(); i++) {
        for(int j = 0; j<x.get(i).size(); j++) {
            InfoVar xTy = x.get(i).get(j);
            stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]], ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]]* %"+matrix_Register+", "+xTy.type+" 0, "+xTy.type+" "+i+"\n");
            stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.get(i).size()+" x "+xTy.type+"], ["+x.get(i).size()+" x "+xTy.type+"]* %"+(var_count-1)+", "+xTy.type+" 0, "+xTy.type+" "+j+"\n");
            stamentsBuff.append("store "+xTy.type+" "+(xTy.just_created?xTy.value:"%"+xTy.load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
        }
    }
    nInfoVar.type = x.get(0).get(0).type;
    nInfoVar.align = x.get(0).get(0).align;
    nInfoVar.size1 = x.size();
    nInfoVar.size2 = x.get(0).size();
    addSymbol(id, nInfoVar );
:}
//Matrix element assignment
| ID:id RO arit_op:i CM arit_op:j RC EQ arit_op:x {:
    InfoVar idVar = parser.symbolTable.get(id);
    stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]], ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
    stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size2+" x "+idVar.type+"], ["+idVar.size2+" x "+idVar.type+"]* %"+(var_count-1)+", "+idVar.type+" 0, "+idVar.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
    stamentsBuff.append("store "+idVar.type+" "+(x.just_created?x.value:"%"+x.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}
Here is an example:
d = [1 2 4 ; 5 6 7]
And here is the LLVM transformation:
%7 = alloca [2 x [3 x i32]], align 4
%8 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%9 = getelementptr inbounds [3 x i32], [3 x i32]* %8, i32 0, i32 0
store i32 1, i32* %9, align 4
%10 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%11 = getelementptr inbounds [3 x i32], [3 x i32]* %10, i32 0, i32 1
store i32 2, i32* %11, align 4
%12 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%13 = getelementptr inbounds [3 x i32], [3 x i32]* %12, i32 0, i32 2
store i32 4, i32* %13, align 4
%14 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%15 = getelementptr inbounds [3 x i32], [3 x i32]* %14, i32 0, i32 0
store i32 5, i32* %15, align 4
%16 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%17 = getelementptr inbounds [3 x i32], [3 x i32]* %16, i32 0, i32 1
store i32 6, i32* %17, align 4
%18 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%19 = getelementptr inbounds [3 x i32], [3 x i32]* %18, i32 0, i32 2
store i32 7, i32* %19, align 4
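The pattern in this output (one alloca, then for every element a row pointer, an element pointer and a store) can be reproduced with a small standalone generator. This is a simplified sketch rather than the parser action itself: MatrixLiteral and lower are hypothetical names, and it only handles integer constants.

```java
class MatrixLiteral {
    // Emits the IR for an integer matrix literal whose alloca lives in
    // register matrixReg; later registers are numbered consecutively.
    static String lower(int[][] rows, int matrixReg) {
        int rowCount = rows.length;
        int colCount = rows[0].length;
        String rowTy = "[" + colCount + " x i32]";
        String matTy = "[" + rowCount + " x " + rowTy + "]";
        StringBuilder out = new StringBuilder();
        int reg = matrixReg;
        out.append("%" + matrixReg + " = alloca " + matTy + ", align 4\n");
        for (int i = 0; i < rowCount; i++) {
            for (int j = 0; j < colCount; j++) {
                int rowPtr = ++reg;  // pointer to row i
                out.append("%" + rowPtr + " = getelementptr inbounds " + matTy + ", " + matTy
                    + "* %" + matrixReg + ", i32 0, i32 " + i + "\n");
                int elemPtr = ++reg; // pointer to element j of that row
                out.append("%" + elemPtr + " = getelementptr inbounds " + rowTy + ", " + rowTy
                    + "* %" + rowPtr + ", i32 0, i32 " + j + "\n");
                out.append("store i32 " + rows[i][j] + ", i32* %" + elemPtr + ", align 4\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // d = [1 2 4 ; 5 6 7], with the matrix allocated in %7 as in the example.
        System.out.print(lower(new int[][] {{1, 2, 4}, {5, 6, 7}}, 7));
    }
}
```

Each element uses two registers, so a 2 x 3 matrix allocated in %7 occupies registers %8 to %19.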
The following code generates the LLVM IR for function definitions; only integer parameters and integer return values are supported.
function_def ::= FUNCT ID:r EQ ID:f RO parameters:par{:
        stamentsBuff.append("define i32 @"+f+"(");
        for(int i = 0; i<par.size(); i++) {
            genVarCount();
            stamentsBuff.append("i32");
            if(i != (par.size()-1))
                stamentsBuff.append(", ");
            else
                stamentsBuff.append(") {"+"\n");
        }
        Integer currentReg;
        for(int i = 0; i<par.size(); i++) {
            currentReg = genVarCount() ;
            stamentsBuff.append("%"+currentReg+" = alloca i32, align 4"+"\n");
            stamentsBuff.append("store i32 %"+i+", i32* %"+currentReg+"\n");
            InfoVar newParam = new InfoVar();
            var_count--;
            newParam.reg_id = Integer.toString(currentReg);
            newParam.type = "i32";
            newParam.align = 4;
            addSymbol(par.get(i), newParam);
        }
        ArrayList<String> parametersType= new ArrayList<String>();
        for(int i = 0; i<par.size(); i++) {
            parametersType.add("i32");
        }
        InfoFun funct = new InfoFun(parametersType);
        functionTable.put(f,funct);
        ret_id = r;
    :} RC statements END{:
        stamentsBuff.append("}"+"\n");
        var_count = 0;
        symbolTable.clear();
    :};

param ::= ID:x {:RESULT = x;:}
    |
    ;

parameters ::= parameters:l CM param:x{:
        l.add(x);
        RESULT = l;
    :}
    | param:x{:
        RESULT = new ArrayList<String>();
        RESULT.add(x);
    :}
    ;
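The register numbering in the function header follows from how LLVM numbers unnamed values: the n parameters take %0 to %(n-1), the entry block implicitly takes %n, so the first stack slot is %(n+1). The sketch below shows only that numbering, under the assumption of at least one parameter. FunctionPrologue and emit are hypothetical names, and the symbol table handling is left out.

```java
class FunctionPrologue {
    // Emits the header of an i32 function with paramCount i32 parameters and
    // copies every parameter into its own stack slot.
    static String emit(String name, int paramCount) {
        StringBuilder out = new StringBuilder("define i32 @" + name + "(");
        for (int i = 0; i < paramCount; i++) {
            out.append("i32");
            if (i != paramCount - 1) {
                out.append(", ");
            }
        }
        out.append(") {\n");
        // Parameters are %0..%(n-1) and the entry block takes %n,
        // so the first free register is %(n+1).
        int reg = paramCount;
        for (int i = 0; i < paramCount; i++) {
            int slot = ++reg;
            out.append("%" + slot + " = alloca i32, align 4\n");
            out.append("store i32 %" + i + ", i32* %" + slot + "\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(emit("add", 2));
    }
}
```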
Two print instructions are implemented. The first is “disp”, which displays either a string (through the function ManageString) or a variable (an ID for a simple variable, array or matrix, through the function ManageStringID); if the ID refers to a vector or matrix, the whole structure is printed. The Matlab instruction “fprintf”, in this implementation, displays a string together with the values of the referenced variables (simple variables only).
//Print instruction
print_instr ::= DISP RO STRING:x RC{:
        ManageString(x);
    :}
    | DISP RO ID:x RC{:
        ManageStringID(x);
    :}
    | PRINT RO STRING:s CM id_list:x RC{:
        ManageString(s,x);
    :}
    | print_keyw error {:pSynWarning("Error in print instruction.");:}
    ;

id_list ::= id_list:x CM ID:i{:
        x.add(i);
        RESULT = x;
    :}
    | ID:x{:
        RESULT = new ArrayList<String>();
        RESULT.add(x);
    :}
    ;
Here are the three ManageString functions
public void ManageString(String x){
    int label = genStrCount();
    String s = x;
    s = s.replace("\"","");
    s = s + "\\0A\\00";
    Integer length = s.length()-4;
    parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
    stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0))\n"));
}

public void ManageStringID(String x){
    InfoVar infoVar = parser.symbolTable.get(x);
    if(!parser.symbolTable.containsKey(x)) {
        pSemError("Variable "+x+" not declared.");
    }else{
        if(infoVar.size1==-1){
            //Simple variable
            int label = genStrCount();
            String s = "%"+(infoVar.type.equals("i32")?"d":"f")+"\\0A\\00";
            Integer length = s.length()-4;
            stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+", "+infoVar.type+"* %"+infoVar.reg_id+", align "+infoVar.align+"\n");
            infoVar.load_to = var_count+"";
            parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
            stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0), "+infoVar.type+ " %"+infoVar.load_to+")\n"));
        }else if(infoVar.size1!=-1 && infoVar.size2==-1){
            //Vector (the original test was size1!=1, which sent one-element vectors to the matrix branch)
            int label = genStrCount();
            String s = "";
            ArrayList<Integer> loads_reg = new ArrayList<>();
            for(int i = 0;i < infoVar.size1-1; i++){
                s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
                stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
                stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
                loads_reg.add(var_count);
            }
            s = s+" %"+(infoVar.type.equals("i32")?"d":"f") + "\\0A\\00";
            stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+(infoVar.size1-1)+"\n");
            stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
            loads_reg.add(var_count);
            Integer length = s.length()-4;
            parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
            stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
            stamentsBuff.append(", ");
            for (int i = 0; i < loads_reg.size(); i ++) {
                if(i==0)
                    stamentsBuff.append(infoVar.type+" %"+loads_reg.get(i));
                else
                    stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(i));
            }
            stamentsBuff.append(")"+"\n");
        }else{
            //Matrix: one printf call per row
            for(int i = 0;i < infoVar.size1; i++){
                int label = genStrCount();
                String s = "";
                ArrayList<Integer> loads_reg = new ArrayList<>();
                for(int j = 0;j < infoVar.size2; j++){
                    s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
                    if(j== infoVar.size2-1)
                        s = s+"\\0A\\00";
                    stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]], ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
                    stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size2+" x "+infoVar.type+"], ["+infoVar.size2+" x "+infoVar.type+"]* %"+(var_count-1)+", "+infoVar.type+" 0, "+infoVar.type+" "+j+"\n");
                    stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
                    loads_reg.add(var_count);
                }
                Integer length = s.length()-4;
                parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
                stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
                stamentsBuff.append(", ");
                for (int j = 0; j < loads_reg.size(); j ++) {
                    if(j==0)
                        stamentsBuff.append(infoVar.type+" %"+loads_reg.get(j));
                    else
                        stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(j));
                }
                stamentsBuff.append(")"+"\n");
            }
        }
    }
}

public void ManageString(String x, ArrayList<String> variables) {
    ArrayList <InfoVar> regList = new ArrayList<InfoVar>();
    int label = genStrCount();
    InfoVar t = null;
    String s = x;
    s = s.replace("\"", "");
    s = s.replace("%i", "%d");
    for(String var : variables) {
        t = parser.symbolTable.get(var);
        if(!parser.symbolTable.containsKey(var)) {
            pSemError("Variable "+var+" not declared.");
        }else if(parser.symbolTable.get(var).size1==-1){
            stamentsBuff.append("%"+genVarCount()+" = load "+t.type+", "+t.type+"* %"+t.reg_id+", align "+t.align+"\n");
            t.load_to = var_count+"";
            regList.add(t);
        }
    }
    s = s + "\\0A\\00";
    Integer length = s.length()-4;
    parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
    stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
    stamentsBuff.append(", ");
    for (int i = 0; i < regList.size(); i ++) {
        InfoVar infoVar = regList.get(i);
        if(i==0)
            stamentsBuff.append(infoVar.type+" %"+infoVar.load_to);
        else
            stamentsBuff.append(", "+infoVar.type+" %"+infoVar.load_to);
    }
    stamentsBuff.append(")"+"\n");
}
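All three functions compute the array length as s.length()-4. The appended suffix \0A\00 is six characters in the Java string but stands for only two bytes in the LLVM constant (a newline and the terminating zero), so four characters have to be subtracted. A minimal sketch of that calculation follows; StringConstant and declare are hypothetical names, and the string is assumed to contain no other escape sequences.

```java
class StringConstant {
    // Builds the global declaration for a string that ends with a newline
    // and a terminating zero byte.
    static String declare(int label, String literal) {
        String s = literal.replace("\"", "") + "\\0A\\00";
        // "\0A\00" is 6 characters of text but only 2 bytes of data.
        int length = s.length() - 4;
        return "@.str." + label + " = private constant [" + length + " x i8] c\""
            + s + "\", align 1";
    }

    public static void main(String[] args) {
        // Prints: @.str.1 = private constant [7 x i8] c"hello\0A\00", align 1
        System.out.println(declare(1, "\"hello\""));
    }
}
```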
The compiler is able to recognize the following kinds of errors:
The disp() function only displays strings or a single variable, not strings combined with variables nor multiple variables
fprintf() only prints IDs of simple variables, not IDs of arrays or matrices, whereas the disp() function does
ID(arit_op, arit_op)), function calls have an additional “()” so that function_call can be recognized properly
Compiler matlab_compiler.zip
Examples
sudo apt install llvm
jflex matlab_scanner.jflex
java java_cup.Main -expect 3 matlab_parser.cup
javac *.java
java Main source.mlx
The generated output.ll file can then be run with:
lli output.ll