From Matlab to LLVM

Background

This page shows the implementation of a compiler that recognizes a subset of the Matlab programming language and translates it into LLVM IR.

Implemented features

The following Matlab features are implemented.

Data types

  • Basic data types
    • int
    • double
    • arrays
    • matrix

Operators

  • arithmetic:
    • addition (+)
    • subtraction (-)
    • multiplication (*)
    • multiplication element by element (.*)
    • division (/)
    • division element by element (./)
  • comparison:
    • equality ( == )
    • greater than (>)
    • less than (<)
    • greater than or equal to (>= or =>)
    • less than or equal to (<= or =<)
  • logical:
    • AND (&)
    • OR (|)

Code sub-blocks

  • if instruction
  • else instruction
  • for loop
  • while loop

Function details (partial implementation):

  • int type parameters
  • int return type

Output:

  • disp/fprintf
  • fprintf prints text together with variables; disp prints a string or a single variable

Compiler

The compiler consists of two parts: a scanner and a parser.

Scanner

The scanner recognizes tokens (terminal symbols) and returns them to the parser, each coupled with an object containing the value that the token represents. It identifies integers, doubles and IDs (used for variables, function names, etc.), as well as significant Matlab keywords such as:

  • if
  • else
  • end
  • for
  • while
  • function
  • fprintf
  • disp

It also recognizes other syntax elements, such as punctuation and operator symbols.

Snippet of Matlab Scanner

nl = \r|\n|\r\n
ws = [ \t]
id = [A-Za-z][A-Za-z0-9_]*
integer =  ([1-9][0-9]*|0)
double = (([0-9]+\.[0-9]*) | ([0-9]*\.[0-9]+)) ((e|E)("+"|"-")?[0-9]+)?
 
%%
 
"("     {return symbol(sym.RO);}
")"     {return symbol(sym.RC);}
"="     {return symbol(sym.EQ);}
"+"     {return symbol(sym.PLUS);}
"-"     {return symbol(sym.MINUS);}
"*"     {return symbol(sym.STAR);}
".*"    {return symbol(sym.DOTSTAR);}
"/"     {return symbol(sym.DIV);}
"./"    {return symbol(sym.DOTDIV);}
"<"     {return symbol(sym.MIN);}
">"     {return symbol(sym.MAJ);}
"<="    {return symbol(sym.MIN_EQ);}
"=<"    {return symbol(sym.EQ_MIN);}
">="    {return symbol(sym.MAJ_EQ);}
"=>"    {return symbol(sym.EQ_MAJ);}
"&"     {return symbol(sym.AND);}
"|"     {return symbol(sym.OR);}
"~"     {return symbol(sym.NOT);}
 
"["     {return symbol(sym.SO);}
"]"     {return symbol(sym.SC);}
 
"function" {return symbol(sym.FUNCT);}
"end"     {return symbol(sym.END);}
"disp"    {return symbol(sym.DISP);}
"fprintf" {return symbol(sym.PRINT);}
"if"      {return symbol(sym.IF);}
"while"   {return symbol(sym.WHILE);}
"for"   {return symbol(sym.FOR);}
"else"    {return symbol(sym.ELSE);}
";"       {return symbol(sym.S);}
","       {return symbol(sym.CM);}
":"       {return symbol(sym.C);}
 
{id}      {return symbol(sym.ID, yytext());}
{integer} {return symbol(sym.INT, new Integer(yytext()));}
{double}  {return symbol(sym.DOUBLE, new Double(yytext()));}
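
The three macros at the top of the scanner can be tried out in plain Java. The following is only a sketch, assuming java.util.regex instead of JFlex; the class TokenKindSketch and the method classify are hypothetical names that are not part of the compiler. It classifies a whole lexeme, whereas the real scanner works on a character stream and matches keywords such as if or end before the id rule.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the compiler: mirrors the id, integer and
// double macros of the scanner using java.util.regex.
final class TokenKindSketch {
    private static final Pattern ID = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");
    private static final Pattern INTEGER = Pattern.compile("[1-9][0-9]*|0");
    private static final Pattern DOUBLE =
        Pattern.compile("(([0-9]+\\.[0-9]*)|([0-9]*\\.[0-9]+))([eE][+-]?[0-9]+)?");

    // Returns the token kind of a complete lexeme, or "OTHER" if none matches.
    static String classify(String lexeme) {
        if (INTEGER.matcher(lexeme).matches()) return "INT";
        if (DOUBLE.matcher(lexeme).matches()) return "DOUBLE";
        if (ID.matcher(lexeme).matches()) return "ID";
        return "OTHER";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.14", "1.5e-3", "counter_1"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Note that a double needs a decimal point in this definition, so a lexeme such as 1e5 is not a single DOUBLE token.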
 

Parser

The parser takes the tokens provided by the scanner as input and recognizes the main grammar rules of the Matlab language, producing LLVM IR code as output.

Data structures

This snippet shows the variables and classes that support the parser in generating the output program:

	public HashMap <String, InfoVar> symbolTable;
 
	public HashMap <String, InfoFun> functionTable;
 
	public boolean isCorrect = true; 
 
	public StringBuffer stamentsBuff;
 
	public ArrayList<String> stringStatements;
 
	public int var_count = 0;
 
	public int str_label = 0; 
 
	public int sub_label = 0;
 
	public int else_label = 1;
 
	public int tot_sub_label = 0;
 
	public int cmp_count=0;
 
	public boolean activate_while = false;
 
	public boolean desctivate_while = false;
 
	public boolean activate_for = false;
 
	public boolean desctivate_for = false;
 
	public String ret_id = "";
 
	public BufferedWriter bwr;
 
	public int genVarCount(){
 
		var_count++;
 
		return var_count; 
	};
 
	public int genStrCount(){
 
		str_label++;
 
		return str_label; 
	};
 
	public class InfoVar{
 
		public String reg_id; //First label assigned to the variable
		public String load_to; //Reg id of the one who loaded an existing variable (default self reg_id)
		public String type; //i32, double
		public String value; //The real value of the variable (ex: 1 or 1.0)
		public Integer align;  //alignment required: 4, 8...
		public Integer size1; //If the variable is an array then this is its size, otherwise size1 = -1 
		public Integer size2; //If the variable is a matrix then this is its number of columns, otherwise size2 = -1 
		public boolean just_created; //It helps to know if an operation must use the load_to or the real value
 
		public InfoVar()
		{
			reg_id = Integer.toString(genVarCount());
			load_to = Integer.toString(var_count);
			size1 = size2 = -1;
		}
		InfoVar(Integer value, String type, Integer align)
		{
			this.just_created = true;
			this.value = Integer.toString(value);
			this.type = type;
			this.align = align;
		}
 
		InfoVar(Double value, String type, Integer align)
		{
			this.just_created = true;
			this.value = Double.toString(value);
			this.type = type;
			this.align = align;
		}
 
 
	}
 
	public class InfoFun{
 
		ArrayList<String> funParam;
		Integer numParam;
		String funRet;
 
		public InfoFun(ArrayList<String> funParam)
		{
			this.funParam = funParam;
			this.numParam = funParam.size(); 
			this.funRet = "i32";
		}
	}
 
  • Class InfoVar: represents a variable, array or matrix
    • reg_id: the register in which the variable is stored
    • load_to: the register into which the variable was most recently loaded
    • type: the type of the variable (i32 or double)
    • value: the literal value of the variable
    • align: the alignment of the variable
    • size1: size of the array (number of rows for a matrix); -1 for a simple variable
    • size2: number of columns of a matrix; -1 otherwise
    • just_created: tells whether an operation must use load_to or the literal value
  • Class InfoFun: represents function information
    • funParam: list of parameter types
    • numParam: number of parameters
    • funRet: return type
  • HashMap<String, InfoVar> symbolTable: maps a variable ID to its InfoVar
  • HashMap<String, InfoFun> functionTable: maps a function ID to its InfoFun
  • StringBuffer stamentsBuff: buffer that collects the generated code before it is written to the output.ll file
  • ArrayList<String> stringStatements: list of the string definitions in LLVM IR, typically strings to be printed
  • var_count: counter used for register names in LLVM IR
  • str_label: counter for string label names
  • sub_label: counter for code sub-sections
  • else_label: counter for the labels of else instructions
  • cmp_count: counter for the cmp registers used in the LLVM IR
  • tot_sub_label: counter for the total number of code sub-sections
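
How the symbol table and the register counter cooperate can be sketched in a few lines of Java. This is a simplified illustration, not the compiler's code: the class SymbolTableSketch, its nested Var class and the methods declare and load are hypothetical, and the alloca/store pair emitted for a scalar declaration is an assumption, since that grammar rule is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, reduced version of the parser's bookkeeping.
final class SymbolTableSketch {
    // Stand-in for InfoVar with only the fields needed to emit a load.
    static final class Var {
        final String regId;
        final String type;
        final int align;
        Var(String regId, String type, int align) {
            this.regId = regId;
            this.type = type;
            this.align = align;
        }
    }

    private final Map<String, Var> symbolTable = new HashMap<>();
    private final StringBuilder out = new StringBuilder();
    private int varCount = 0;

    // Same idea as genVarCount(): every call hands out the next register number.
    int genVarCount() { return ++varCount; }

    // Assumed shape of a scalar declaration: reserve a register, store the value.
    void declare(String name, String type, int align, String value) {
        Var v = new Var(Integer.toString(genVarCount()), type, align);
        out.append("%" + v.regId + " = alloca " + type + ", align " + align + "\n");
        out.append("store " + type + " " + value + ", " + type + "* %" + v.regId
                + ", align " + align + "\n");
        symbolTable.put(name, v);
    }

    // Emits a load and returns the register that now holds the value (the load_to role).
    String load(String name) {
        Var v = symbolTable.get(name);
        if (v == null) throw new IllegalArgumentException("Variable " + name + " is not declared.");
        int reg = genVarCount();
        out.append("%" + reg + " = load " + v.type + ", " + v.type + "* %" + v.regId
                + ", align " + v.align + "\n");
        return Integer.toString(reg);
    }

    String output() { return out.toString(); }

    public static void main(String[] args) {
        SymbolTableSketch t = new SymbolTableSketch();
        t.declare("a", "i32", 4, "5");
        t.load("a");
        System.out.print(t.output());
    }
}
```

The register stored in the table (reg_id) always names the memory slot, while every read produces a fresh register, which is why InfoVar keeps both reg_id and load_to.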

Grammar start

The grammar starts from the start symbol prog. Generated code is accumulated in stamentsBuff and then written to the output file output.ll. The non-terminal symbol function_defs is reduced first, so all function definitions are written at the beginning of the file, before @main. At the end of each function definition var_count is reset, so that the following function, and then main, numbers its registers from the start. The string declarations used by the print instructions are written between the function definitions and main.

prog ::= function_defs {:
	if(parser.isCorrect)
	{
		bwr.write("declare i32 @printf(i8*, ...)\n");
 
		bwr.write(stamentsBuff.toString());
 
	}
	else
		System.out.println("Program contains errors.");
	var_count = 0; 
	stamentsBuff.setLength(0);
 
:}statements {:
	if(parser.isCorrect)
	{
		for(String s : stringStatements)
		{
			bwr.write(s+"\n");
		}
 
		bwr.write("define void @main(){\n");
 
		bwr.write(stamentsBuff.toString());
 
		bwr.write("ret void\n}");
		bwr.flush();
 
		bwr.close();
	}
	else
		System.out.println("There are errors in the program");
 
:};
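
The order in which the pieces end up in output.ll can be summarized with a small Java sketch. The class ModuleLayoutSketch and its method assemble are hypothetical and only reproduce the sequence of write calls of the prog rule above, returning a string instead of writing to a file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the write order of the prog rule.
final class ModuleLayoutSketch {
    static String assemble(String functionDefs, List<String> stringConstants, String mainBody) {
        StringBuilder sb = new StringBuilder();
        sb.append("declare i32 @printf(i8*, ...)\n"); // external declaration first
        sb.append(functionDefs);                       // then the user-defined functions
        for (String s : stringConstants) {             // then the string constants
            sb.append(s).append("\n");
        }
        sb.append("define void @main(){\n");           // finally the main body
        sb.append(mainBody);
        sb.append("ret void\n}");
        return sb.toString();
    }

    public static void main(String[] args) {
        String ir = assemble(
            "define i32 @f() {\nret i32 0\n}\n",
            Arrays.asList("@.str.1 = private constant [4 x i8] c\"hi\\0A\\00\", align 1"),
            "");
        System.out.println(ir);
    }
}
```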

Practical examples

Recognition of constants, variable, arrays and matrices ID

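The val rules below recognize variables, array elements, matrix elements and numeric constants. Note that when the activate_while or activate_for flag is set, the corresponding condition label (while_cond.N or for_cond.N) is emitted before the first variable of the condition is loaded into a register.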

val ::= ID:x {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared.");	
	}else{
		RESULT = parser.symbolTable.get(x);
		//To load the variables inside the "while" block
		if(activate_while){
			tot_sub_label++;
			sub_label = tot_sub_label; 
			stamentsBuff.append("br label %while_cond." + sub_label+"\n");
			stamentsBuff.append("while_cond." + sub_label + ":"+"\n");
			activate_while = false;
			desctivate_while = true;
		}
		//To load the variables inside the "for" block
		if(activate_for){
			tot_sub_label++;
			sub_label = tot_sub_label; 
			stamentsBuff.append("br label %for_cond." + sub_label+"\n");
			stamentsBuff.append("for_cond." + sub_label + ":"+"\n");
			activate_for = false;
		}
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+RESULT.reg_id+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
 
:}
| ID:x RO arit_op:y RC {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared.");	
	}else{
		RESULT = parser.symbolTable.get(x);
		if(!y.just_created)
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" %"+y.load_to+"\n");
		else
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(Integer.parseInt(y.value)-1)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
:}
| ID:x RO arit_op:i CM arit_op:j RC {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared");	
	}else{
		RESULT = parser.symbolTable.get(x);
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]], ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size2+" x "+RESULT.type+"], ["+RESULT.size2+" x "+RESULT.type+"]* %"+(var_count-1)+", "+RESULT.type+" 0, "+RESULT.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
:}
| INT:x {:
	RESULT = new InfoVar(x, "i32", new Integer(4));
:}
| DOUBLE:x {:
	RESULT = new InfoVar(x, "double", new Integer(8));
:}
;
//Elements of vectors of a matrix
matrix_elements ::= matrix_elements:x S vect_elements:y{:
	x.add(y);
	RESULT = x;
:}
| vect_elements:x{:
	RESULT = new ArrayList<ArrayList<InfoVar>>();
	RESULT.add(x);
:}
;
//Elements of variables or constants of a vector
vect_elements ::= vect_elements:x elem:y{:
	x.add(y);
	RESULT = x;
:}
| elem:x {:
	RESULT = new ArrayList<InfoVar>();
	RESULT.add(x);
:}
;
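
The array element rule above turns a 1-based Matlab index into a 0-based LLVM index before emitting the getelementptr and load pair. A minimal Java sketch of that step for a constant index follows; the class ElementAccessSketch and the method loadElement are hypothetical. As a deliberate simplification it always uses i32 for the two index operands, because LLVM expects integer indices there.

```java
// Hypothetical helper: emits the IR that reads element matlabIndex of a
// one-dimensional array stored at register arrayReg.
final class ElementAccessSketch {
    static String loadElement(int nextReg, int arrayReg, int size,
                              String type, int align, int matlabIndex) {
        String arrayTy = "[" + size + " x " + type + "]";
        int zeroBased = matlabIndex - 1; // Matlab counts from 1, LLVM from 0
        String gep = "%" + nextReg + " = getelementptr inbounds " + arrayTy + ", "
                + arrayTy + "* %" + arrayReg + ", i32 0, i32 " + zeroBased + "\n";
        String load = "%" + (nextReg + 1) + " = load " + type + ", " + type
                + "* %" + nextReg + ", align " + align + "\n";
        return gep + load;
    }

    public static void main(String[] args) {
        // v(3) on a 4-element i32 array held in %1, with %5 as the next free register
        System.out.print(loadElement(5, 1, 4, "i32", 4, 3));
    }
}
```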

Matrix and array definition

A matrix definition reuses the array (vector) definition, since a matrix is simply a list of vectors. In the same way, an array is defined as a list of InfoVar objects.

//Vector
| ID:id EQ SO vect_elements:x SC{:
	InfoVar nInfoVar = new InfoVar(); 
	Integer vector_Register = Integer.parseInt(nInfoVar.reg_id);
	stamentsBuff.append("%"+vector_Register+" = alloca ["+x.size()+" x "+x.get(0).type+"], align "+x.get(0).align+"\n");
	for(int i = 0; i<x.size(); i++)
	{
		InfoVar xTy = x.get(i); 	
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x "+x.get(i).type+"], ["+x.size()+" x "+x.get(i).type+"]* %"+vector_Register+", "+x.get(i).type+" 0, "+x.get(i).type+" "+i+"\n");
		stamentsBuff.append("store "+xTy.type+" "+(x.get(i).just_created?x.get(i).value:"%"+x.get(i).load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
	} 
 
	nInfoVar.type = x.get(0).type;
	nInfoVar.align = x.get(0).align;
	nInfoVar.size1 = x.size();
	addSymbol(id, nInfoVar );
 
:}
//Vector element assignment
| ID:id RO arit_op:x RC EQ arit_op:y {:
	InfoVar idVar = parser.symbolTable.get(id);
	if(!x.just_created)
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" %"+x.load_to+"\n");
	else
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(Integer.parseInt(x.value)-1)+"\n");
	stamentsBuff.append("store "+idVar.type+" "+(y.just_created?y.value:"%"+y.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}
//Matrix
| ID:id EQ SO matrix_elements:x SC{:
 
	InfoVar nInfoVar = new InfoVar(); 
	Integer matrix_Register = Integer.parseInt(nInfoVar.reg_id);
	stamentsBuff.append("%"+matrix_Register+" = alloca ["+x.size()+" x ["+x.get(0).size()+" x "+x.get(0).get(0).type+"]], align "+x.get(0).get(0).align+"\n");
	for(int i = 0; i<x.size(); i++)
	{
		for(int j = 0; j<x.get(i).size(); j++)
		{
			InfoVar xTy = x.get(i).get(j); 	
			stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]], ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]]* %"+matrix_Register+", "+xTy.type+" 0, "+xTy.type+" "+i+"\n");
			stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.get(i).size()+" x "+xTy.type+"], ["+x.get(i).size()+" x "+xTy.type+"]* %"+(var_count-1)+", "+xTy.type+" 0, "+xTy.type+" "+j+"\n");
			stamentsBuff.append("store "+xTy.type+" "+(xTy.just_created?xTy.value:"%"+xTy.load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
		}
	} 
 
	nInfoVar.type = x.get(0).get(0).type;
	nInfoVar.align = x.get(0).get(0).align;
	nInfoVar.size1 = x.size();
	nInfoVar.size2 = x.get(0).size();
	addSymbol(id, nInfoVar );
 
:}
//Matrix element assignment
| ID:id RO arit_op:i CM arit_op:j RC EQ arit_op:x {:
	InfoVar idVar = parser.symbolTable.get(id);
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]], ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size2+" x "+idVar.type+"], ["+idVar.size2+" x "+idVar.type+"]* %"+(var_count-1)+", "+idVar.type+" 0, "+idVar.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
	stamentsBuff.append("store "+idVar.type+" "+(x.just_created?x.value:"%"+x.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}

Here is an example:

d = [1 2 4 ; 5 6 7]

And here is the resulting LLVM IR:

%7 = alloca [2 x [3 x i32]], align 4
%8 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%9 = getelementptr inbounds [3 x i32], [3 x i32]* %8, i32 0, i32 0
store i32 1, i32* %9, align 4
%10 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%11 = getelementptr inbounds [3 x i32], [3 x i32]* %10, i32 0, i32 1
store i32 2, i32* %11, align 4
%12 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%13 = getelementptr inbounds [3 x i32], [3 x i32]* %12, i32 0, i32 2
store i32 4, i32* %13, align 4
%14 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%15 = getelementptr inbounds [3 x i32], [3 x i32]* %14, i32 0, i32 0
store i32 5, i32* %15, align 4
%16 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%17 = getelementptr inbounds [3 x i32], [3 x i32]* %16, i32 0, i32 1
store i32 6, i32* %17, align 4
%18 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%19 = getelementptr inbounds [3 x i32], [3 x i32]* %18, i32 0, i32 2
store i32 7, i32* %19, align 4
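
The listing above follows a regular pattern: one alloca for the whole matrix, then two getelementptr instructions and one store per element. The following Java sketch reproduces that pattern for a matrix of integer constants. The class MatrixInitSketch and the method initMatrix are hypothetical, and the starting register is passed in explicitly instead of coming from var_count.

```java
// Hypothetical generator for the initialization of an i32 matrix of constants.
final class MatrixInitSketch {
    static String initMatrix(int[][] m, int firstReg) {
        int rows = m.length;
        int cols = m[0].length;
        String rowTy = "[" + cols + " x i32]";
        String matTy = "[" + rows + " x " + rowTy + "]";
        StringBuilder sb = new StringBuilder();
        int reg = firstReg;
        int matReg = reg; // register that holds the whole matrix
        sb.append("%" + matReg + " = alloca " + matTy + ", align 4\n");
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                int rowPtr = ++reg;  // pointer to row i
                sb.append("%" + rowPtr + " = getelementptr inbounds " + matTy + ", "
                        + matTy + "* %" + matReg + ", i32 0, i32 " + i + "\n");
                int elemPtr = ++reg; // pointer to element (i, j)
                sb.append("%" + elemPtr + " = getelementptr inbounds " + rowTy + ", "
                        + rowTy + "* %" + rowPtr + ", i32 0, i32 " + j + "\n");
                sb.append("store i32 " + m[i][j] + ", i32* %" + elemPtr + ", align 4\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // d = [1 2 4 ; 5 6 7], with %7 as the first free register
        System.out.print(initMatrix(new int[][] {{1, 2, 4}, {5, 6, 7}}, 7));
    }
}
```

A 2 x 3 matrix therefore consumes 1 + 2 * 6 = 13 registers, which is why the listing runs from %7 to %19.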

Function implementation

The following rules generate the LLVM IR for function definitions. Only integer (i32) parameters and an integer return value are supported.

function_def ::= FUNCT ID:r EQ ID:f RO parameters:par{:
 
	stamentsBuff.append("define i32 @"+f+"("); 
	for(int i = 0; i<par.size(); i++)
	{
		genVarCount();
		stamentsBuff.append("i32"); 
		if(i != (par.size()-1))
			stamentsBuff.append(", "); 
		else 
			stamentsBuff.append(") {"+"\n"); 			
	}
	Integer currentReg; 
	for(int i = 0; i<par.size(); i++)
	{
		currentReg = genVarCount() ;
		stamentsBuff.append("%"+currentReg+" = alloca i32, align 4"+"\n");
		stamentsBuff.append("store i32 %"+i+", i32* %"+currentReg+"\n"); 
		InfoVar newParam = new InfoVar();
		var_count--;
		newParam.reg_id = Integer.toString(currentReg); 
		newParam.type = "i32";
		newParam.align = 4;  
		addSymbol(par.get(i), newParam);		
	}
	ArrayList<String> parametersType= new ArrayList<String>();
	for(int i = 0; i<par.size(); i++)
	{
		parametersType.add("i32");
	}
	InfoFun funct = new InfoFun(parametersType);
	functionTable.put(f,funct);
 
	ret_id = r;
 
:} RC statements END{:
	stamentsBuff.append("}"+"\n");
	var_count = 0;
	symbolTable.clear(); 
:};
 
param ::= ID:x {:RESULT = x;:} | ;
 
parameters ::= parameters:l CM param:x{:
	l.add(x);
	RESULT = l;
:}
| param:x{:
	RESULT = new ArrayList<String>();
	RESULT.add(x);
 
:} 
;
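
The register numbering in the function prologue deserves a short illustration. In LLVM IR, unnamed parameters are numbered %0, %1, and so on, and the unnamed entry block takes the next number, so the first instruction of a function with n parameters is numbered n + 1. The Java sketch below shows the resulting prologue; the class FunctionPrologueSketch and the method prologue are hypothetical and cover only the header and the parameter copies.

```java
// Hypothetical generator for the header and parameter copies of a function
// that takes paramCount i32 parameters and returns i32.
final class FunctionPrologueSketch {
    static String prologue(String name, int paramCount) {
        StringBuilder sb = new StringBuilder("define i32 @" + name + "(");
        for (int i = 0; i < paramCount; i++) {
            sb.append("i32");
            if (i != paramCount - 1) sb.append(", ");
        }
        sb.append(") {\n");
        // %0 .. %(n-1) are the parameters and %n is the implicit entry block,
        // so the first stack slot is %(n+1).
        int reg = paramCount;
        for (int i = 0; i < paramCount; i++) {
            reg++;
            sb.append("%" + reg + " = alloca i32, align 4\n");
            sb.append("store i32 %" + i + ", i32* %" + reg + "\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(prologue("add", 2));
    }
}
```

This matches the grammar action above, where the first loop calls genVarCount() once per parameter before any alloca is emitted.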

Two print instructions are implemented. The first one, “disp”, displays either a string (through the function ManageString) or a variable (through the function ManageStringID); the variable can be a simple variable, an array or a matrix, and if it is an array or a matrix the whole structure is printed. The Matlab instruction “fprintf”, in this implementation, displays a string together with the values of the referenced variables (simple variables only).

//Print instruction
print_instr ::= DISP RO STRING:x RC{:
	ManageString(x);
:} 
|DISP RO ID:x RC{:
	ManageStringID(x);
:} 
| PRINT RO STRING:s CM id_list:x RC{:
	ManageString(s,x);
:}
| print_keyw error {:pSynWarning("Error in print instruction.");:}
;
 
id_list ::= id_list:x CM ID:i{:
	x.add(i);
	RESULT = x;
:}
|ID:x{:
	RESULT = new ArrayList<String>();
	RESULT.add(x);
:}
;

Here are the three ManageString functions:

public void ManageString(String x){
	int label = genStrCount();
	String s = x;
	s = s.replace("\"","");
	s = s + "\\0A\\00";
	Integer length = s.length()-4;
	parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
	stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0))\n"));
}
 
public void ManageStringID(String x){
 
	InfoVar infoVar = parser.symbolTable.get(x);
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Variable "+x+" not declared.");
	}else{
		if(infoVar.size1==-1){ 
			int label = genStrCount();
			String s = "%"+(infoVar.type.equals("i32")?"d":"f")+"\\0A\\00";
			Integer length = s.length()-4;
			stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+", "+infoVar.type+"* %"+infoVar.reg_id+", align "+infoVar.align+"\n");
			infoVar.load_to = var_count+"";
			parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
			stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0), "+infoVar.type+ " %"+infoVar.load_to+")\n"));
		}else if(infoVar.size1!=-1 && infoVar.size2==-1){
				int label = genStrCount();
				String s = "";
				ArrayList<Integer> loads_reg = new ArrayList<>();
				for(int i = 0;i < infoVar.size1-1; i++){
					s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
					stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
					loads_reg.add(var_count);
				}
				s = s+" %"+(infoVar.type.equals("i32")?"d":"f") + "\\0A\\00";
				stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+(infoVar.size1-1)+"\n");
				stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
				loads_reg.add(var_count);
				Integer length = s.length()-4;
				parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
				stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
				stamentsBuff.append(", ");
				for (int i = 0; i < loads_reg.size(); i ++)
				{
					if(i==0)
						stamentsBuff.append(infoVar.type+" %"+loads_reg.get(i));
						else
						stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(i));						         
				}
				stamentsBuff.append(")"+"\n");
		}else{
			for(int i = 0;i < infoVar.size1; i++){
				int label = genStrCount();
				String s = "";
				ArrayList<Integer> loads_reg = new ArrayList<>();
				for(int j = 0;j < infoVar.size2; j++){
					s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
					if(j== infoVar.size2-1)
						s = s+"\\0A\\00";
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]], ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size2+" x "+infoVar.type+"], ["+infoVar.size2+" x "+infoVar.type+"]* %"+(var_count-1)+", "+infoVar.type+" 0, "+infoVar.type+" "+j+"\n");
					stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
					loads_reg.add(var_count);
				}
				Integer length = s.length()-4;
				parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
				stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
				stamentsBuff.append(", ");
				for (int j = 0; j < loads_reg.size(); j ++)
				{
					if(j==0)
						stamentsBuff.append(infoVar.type+" %"+loads_reg.get(j));
					else
						stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(j));						         
				}
				stamentsBuff.append(")"+"\n");
			}
		}
	}
}
 
public void ManageString(String x, ArrayList<String> variables)
{
	ArrayList <InfoVar> regList = new ArrayList<InfoVar>();
	int label = genStrCount();
	InfoVar t = null;
	String s = x;
	s = s.replace("\"", "");
	s = s.replace("%i", "%d");
 
	for(String var : variables)
	{
		t = parser.symbolTable.get(var);
		if(!parser.symbolTable.containsKey(var))
		{
			pSemError("Variable "+var+" not declared.");
		}else if(parser.symbolTable.get(var).size1==-1){ 
			stamentsBuff.append("%"+genVarCount()+" = load "+t.type+", "+t.type+"* %"+t.reg_id+", align "+t.align+"\n");
			t.load_to = var_count+"";
			regList.add(t);
		}
	}
	s = s + "\\0A\\00";
	Integer length = s.length()-4;
 
	parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
	stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
	stamentsBuff.append(", ");
	for (int i = 0; i < regList.size(); i ++)
	{
		InfoVar infoVar = regList.get(i);
		if(i==0)
			stamentsBuff.append(infoVar.type+" %"+infoVar.load_to);
		else
			stamentsBuff.append(", "+infoVar.type+" %"+infoVar.load_to);						         
	}
	stamentsBuff.append(")"+"\n");
}
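
All three functions compute the length of the string constant as s.length()-4. The reason is that the suffix \0A\00 occupies six characters in the source text of the IR but only two bytes (newline and terminator) in the resulting array. A minimal Java sketch of that calculation follows; the class StringConstantSketch and the method constant are hypothetical.

```java
// Hypothetical helper: builds the global string definition for a quoted literal.
final class StringConstantSketch {
    static String constant(int label, String quotedText) {
        // Strip the quotes, then append newline (\0A) and terminator (\00) escapes.
        String body = quotedText.replace("\"", "") + "\\0A\\00";
        // The two escapes take 6 characters of text but 2 bytes of data.
        int length = body.length() - 4;
        return "@.str." + label + " = private constant [" + length + " x i8] c\""
                + body + "\", align 1";
    }

    public static void main(String[] args) {
        System.out.println(constant(1, "\"Hello\""));
    }
}
```

The calculation assumes that the text itself contains no further escape sequences, since each of those would also be counted as more characters than bytes.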

Error handling

The compiler is able to recognize the following kinds of errors:

  • Variable not declared
  • Variable is not an array
  • Function not defined
  • Generic error in assignment
  • Missing ] in array definition
  • General error in while condition
  • Missing ) in while condition
  • General error in if condition

Missing functionalities, partial implementations

  • disp() displays only a string or a single variable, not strings combined with variables or multiple variables
  • fprintf() prints only simple variables, not arrays or matrices; disp() does print arrays and matrices
  • functions accept parameters and return values of type i32 (integer) only
  • if and while conditions support only AND; OR conditions are not generated properly
  • No support for global variables
  • Strings cannot be assigned to a variable
  • Due to a shift/reduce conflict between a matrix element reference and a function call (both with the syntax ID(arit_op, arit_op)), function calls require an additional “()” so that function_call can be recognized properly

Download and Parser

Compiler matlab_compiler.zip

Examples

How to run it

  1. Install the llvm package: sudo apt install llvm
  2. Download matlab_compiler.zip and unzip it
  3. Open a new terminal inside the source folder and run the following commands:
    • jflex matlab_scanner.jflex
    • java java_cup.Main -expect 3 matlab_parser.cup
    • javac *.java
    • java Main source.mlx
  4. This produces an output.ll file
  5. Run the output.ll file with: lli output.ll


If you find any errors, or if you want to participate in the editing of this wiki, please contact: admin [at] skenz.it

You can reuse, distribute or modify the content of this page, but you must cite in any document (or webpage) this url: https://www.skenz.it/compilers/matlab_to_llvm
/web/htdocs/www.skenz.it/home/data/pages/compilers/matlab_to_llvm.txt · Last modified: 2022/09/24 23:05 by sebastian