From Matlab to LLVM

Background

This page describes the implementation of a compiler that recognizes a subset of the Matlab programming language and translates it into LLVM IR (more information about LLVM can be found on the LLVM project website).

Implemented features

List of the implemented Matlab features:

Data types

Operators

Code sub-blocks

Function details (partial implementation):

Output:

Compiler

The compiler consists of two parts: a scanner and a parser.

Scanner

The scanner recognizes tokens (terminal symbols) and passes them to the parser, each coupled with an object holding the value of the token. It identifies integers, doubles and ids (used for variables, function names, etc.), as well as significant Matlab keywords such as:

It also recognizes other syntax elements such as punctuation and operator symbols.

Snippet of Matlab Scanner

nl = \r|\n|\r\n
ws = [ \t]
id = [A-Za-z][A-Za-z0-9_]*
integer =  ([1-9][0-9]*|0)
double = (([0-9]+\.[0-9]*) | ([0-9]*\.[0-9]+)) ([eE][+-]?[0-9]+)?
 
%%
 
"("     {return symbol(sym.RO);}
")"     {return symbol(sym.RC);}
"="     {return symbol(sym.EQ);}
"+"     {return symbol(sym.PLUS);}
"-"     {return symbol(sym.MINUS);}
"*"     {return symbol(sym.STAR);}
".*"    {return symbol(sym.DOTSTAR);}
"/"     {return symbol(sym.DIV);}
"./"    {return symbol(sym.DOTDIV);}
"<"     {return symbol(sym.MIN);}
">"     {return symbol(sym.MAJ);}
"<="    {return symbol(sym.MIN_EQ);}
"=<"    {return symbol(sym.EQ_MIN);}
">="    {return symbol(sym.MAJ_EQ);}
"=>"    {return symbol(sym.EQ_MAJ);}
"&"     {return symbol(sym.AND);}
"|"     {return symbol(sym.OR);}
"~"     {return symbol(sym.NOT);}
 
"["     {return symbol(sym.SO);}
"]"     {return symbol(sym.SC);}
 
"function" {return symbol(sym.FUNCT);}
"end"     {return symbol(sym.END);}
"disp"    {return symbol(sym.DISP);}
"fprintf" {return symbol(sym.PRINT);}
"if"      {return symbol(sym.IF);}
"while"   {return symbol(sym.WHILE);}
"for"   {return symbol(sym.FOR);}
"else"    {return symbol(sym.ELSE);}
";"       {return symbol(sym.S);}
","       {return symbol(sym.CM);}
":"       {return symbol(sym.C);}
 
{id}      {return symbol(sym.ID, yytext());}
{integer} {return symbol(sym.INT, Integer.valueOf(yytext()));}
{double}  {return symbol(sym.DOUBLE, Double.valueOf(yytext()));}
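
The macro definitions above can be tried out with plain java.util.regex before regenerating the scanner. The following is only a sketch and not part of the compiler: the class TokenPatternSketch and its classify helper are hypothetical names, and the double pattern assumes that an exponent is written as e or E, an optional sign and at least one digit.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the compiler: mirrors the scanner macros
// with java.util.regex so that sample lexemes can be checked quickly.
class TokenPatternSketch {
    static final Pattern ID = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");
    static final Pattern INTEGER = Pattern.compile("[1-9][0-9]*|0");
    // Assumed exponent form: e or E, optional sign, at least one digit.
    static final Pattern DOUBLE =
        Pattern.compile("(([0-9]+\\.[0-9]*)|([0-9]*\\.[0-9]+))([eE][+-]?[0-9]+)?");

    // Returns the token class of a complete lexeme, or "UNKNOWN".
    static String classify(String lexeme) {
        if (INTEGER.matcher(lexeme).matches()) return "INT";
        if (DOUBLE.matcher(lexeme).matches()) return "DOUBLE";
        if (ID.matcher(lexeme).matches()) return "ID";
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.14", "1.5e-3", "x_1", "007"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Unlike the generated scanner, this helper classifies one complete lexeme at a time and does not know about keywords, so a word like if is reported as an ID here.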
 

Parser

The parser takes the tokens provided by the scanner as input and recognizes the main grammatical rules of the Matlab language. As a result, it produces the LLVM IR code.

Data structures

This snippet shows the variables and classes that support the parser while it builds the output program:

	public HashMap <String, InfoVar> symbolTable;
 
	public HashMap <String, InfoFun> functionTable;
 
	public boolean isCorrect = true; 
 
	public StringBuffer stamentsBuff;
 
	public ArrayList<String> stringStatements;
 
	public int var_count = 0;
 
	public int str_label = 0; 
 
	public int sub_label = 0;
 
	public int else_label = 1;
 
	public int tot_sub_label = 0;
 
	public int cmp_count=0;
 
	public boolean activate_while = false;
 
	public boolean desctivate_while = false;
 
	public boolean activate_for = false;
 
	public boolean desctivate_for = false;
 
	public String ret_id = "";
 
	public BufferedWriter bwr;
 
	public int genVarCount(){
 
		var_count++;
 
		return var_count; 
	};
 
	public int genStrCount(){
 
		str_label++;
 
		return str_label; 
	};
 
	public class InfoVar{
 
		public String reg_id; //Register assigned to the variable when it is created
		public String load_to; //Register holding the most recently loaded value (defaults to reg_id)
		public String type; //LLVM type: i32 or double
		public String value; //Literal value of a constant (ex: 1 or 1.0)
		public Integer align;  //Required alignment: 4, 8...
		public Integer size1; //Number of elements of an array (number of rows of a matrix), otherwise size1 = -1
		public Integer size2; //Number of columns of a matrix, otherwise size2 = -1
		public boolean just_created; //Tells whether an operation must use the real value (true) or load_to (false)
 
		public InfoVar()
		{
			reg_id = Integer.toString(genVarCount());
			load_to = Integer.toString(var_count);
			size1 = size2 = -1;
		}
		InfoVar(Integer value, String type, Integer align)
		{
			this.just_created = true;
			this.value = Integer.toString(value);
			this.type = type;
			this.align = align;
		}
 
		InfoVar(Double value, String type, Integer align)
		{
			this.just_created = true;
			this.value = Double.toString(value);
			this.type = type;
			this.align = align;
		}
 
 
	}
 
	public class InfoFun{
 
		ArrayList<String> funParam;
		Integer numParam;
		String funRet;
 
		public InfoFun(ArrayList<String> funParam)
		{
			this.funParam = funParam;
			this.numParam = funParam.size(); 
			this.funRet = "i32";
		}
	}
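
To make the role of var_count and symbolTable more concrete, here is a minimal, self-contained sketch of how a register counter and a symbol table can cooperate when a scalar is stored and later read. It is not the code of the compiler: the class SymbolTableSketch, its reduced Var class and the declare and load methods are hypothetical, and the alloca/store sequence for a scalar assignment is an assumption, since that rule is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the parser state: a register counter, a symbol
// table and an output buffer. Names are illustrative, not the compiler's own.
class SymbolTableSketch {
    // Reduced version of InfoVar: only the fields needed for a scalar.
    static class Var {
        final String regId;
        final String type;
        final int align;
        Var(String regId, String type, int align) {
            this.regId = regId;
            this.type = type;
            this.align = align;
        }
    }

    final Map<String, Var> symbolTable = new HashMap<>();
    final StringBuilder out = new StringBuilder();
    int varCount = 0;

    // Same idea as genVarCount(): hand out the next register number.
    int genVarCount() { return ++varCount; }

    // Assumed handling of "name = constant": reserve a stack slot and store into it.
    void declare(String name, String type, int align, String value) {
        Var v = new Var(Integer.toString(genVarCount()), type, align);
        out.append("%" + v.regId + " = alloca " + type + ", align " + align + "\n");
        out.append("store " + type + " " + value + ", " + type + "* %" + v.regId
                + ", align " + align + "\n");
        symbolTable.put(name, v);
    }

    // Reading a variable loads it into a fresh register, which is returned.
    String load(String name) {
        Var v = symbolTable.get(name);
        if (v == null) {
            throw new IllegalArgumentException("Variable " + name + " is not declared.");
        }
        int reg = genVarCount();
        out.append("%" + reg + " = load " + v.type + ", " + v.type + "* %" + v.regId
                + ", align " + v.align + "\n");
        return "%" + reg;
    }

    public static void main(String[] args) {
        SymbolTableSketch s = new SymbolTableSketch();
        s.declare("a", "i32", 4, "5");
        s.load("a");
        System.out.print(s.out);
    }
}
```

The symbol table always keeps the register of the stack slot, while every read produces a new numbered register; this is the reason InfoVar stores both reg_id and load_to.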
 

Grammar start

The grammar starts with the symbol prog. The generated statements are accumulated in stamentsBuff and then written to the output file output.ll. The nonterminal symbol function_defs is parsed first, so all function definitions appear at the beginning of the output, before @main. At the end of each function definition var_count is reset, so the main function numbers its registers from the start again. The string declarations are written between the function definitions and @main.

prog ::= function_defs {:
	if(parser.isCorrect)
	{
		bwr.write("declare i32 @printf(i8*, ...)\n");
 
		bwr.write(stamentsBuff.toString());
 
	}
	else
		System.out.println("Program contains errors.");
	var_count = 0; 
	stamentsBuff.setLength(0);
 
:}statements {:
	if(parser.isCorrect)
	{
		for(String s : stringStatements)
		{
			bwr.write(s+"\n");
		}
 
		bwr.write("define void @main(){\n");
 
		bwr.write(stamentsBuff.toString());
 
		bwr.write("ret void\n}");
		bwr.flush();
 
		bwr.close();
	}
	else
		System.out.println("There are errors in the program");
 
:};
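
The order in which the two actions above write the file can be summarised with a small stand-alone sketch. The class ModuleLayoutSketch and its assemble method are hypothetical; they only reproduce the layout of output.ll (printf declaration, function definitions, string constants, @main) from text that has already been generated.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the output layout; it does not parse anything.
class ModuleLayoutSketch {
    static String assemble(String functionDefs, List<String> stringConstants, String mainBody) {
        StringBuilder sb = new StringBuilder();
        sb.append("declare i32 @printf(i8*, ...)\n"); // written by the first action
        sb.append(functionDefs);                      // content of the buffer after function_defs
        for (String s : stringConstants) {            // written by the second action
            sb.append(s).append("\n");
        }
        sb.append("define void @main(){\n");
        sb.append(mainBody);                          // content of the buffer after statements
        sb.append("ret void\n}");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList(
            "@.str.1 = private constant [4 x i8] c\"hi\\0A\\00\", align 1");
        String mainBody = "%1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds "
            + "([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0))\n";
        System.out.println(assemble("", strings, mainBody));
    }
}
```

The string constants can be emitted after the functions that use them because LLVM IR allows a global to be referenced before its definition appears in the module.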

Practical examples

Recognition of constants, variables, arrays and matrices by ID

In this example, when a for or while block has just been activated, the corresponding labels are emitted before any register is loaded.

val ::= ID:x {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared.");	
	}else{
		RESULT = parser.symbolTable.get(x);
		//To load the variables inside the "while" block
		if(activate_while){
			tot_sub_label++;
			sub_label = tot_sub_label; 
			stamentsBuff.append("br label %while_cond." + sub_label+"\n");
			stamentsBuff.append("while_cond." + sub_label + ":"+"\n");
			activate_while = false;
			desctivate_while = true;
		}
		//To load the variables inside the "for" block
		if(activate_for){
			tot_sub_label++;
			sub_label = tot_sub_label; 
			stamentsBuff.append("br label %for_cond." + sub_label+"\n");
			stamentsBuff.append("for_cond." + sub_label + ":"+"\n");
			activate_for = false;
		}
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+RESULT.reg_id+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
 
:}
| ID:x RO arit_op:y RC {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared.");	
	}else{
		RESULT = parser.symbolTable.get(x);
		if(!y.just_created)
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" %"+y.load_to+"\n");
		else
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x "+RESULT.type+"], ["+RESULT.size1+" x "+RESULT.type+"]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(Integer.parseInt(y.value)-1)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
:}
| ID:x RO arit_op:i CM arit_op:j RC {:
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Error: Variable "+x+"  is not declared");	
	}else{
		RESULT = parser.symbolTable.get(x);
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]], ["+RESULT.size1+" x ["+RESULT.size2+" x "+RESULT.type+"]]* %"+RESULT.reg_id+", "+RESULT.type+" 0, "+RESULT.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+RESULT.size2+" x "+RESULT.type+"], ["+RESULT.size2+" x "+RESULT.type+"]* %"+(var_count-1)+", "+RESULT.type+" 0, "+RESULT.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
		stamentsBuff.append("%"+genVarCount()+" = load "+RESULT.type+" , "+RESULT.type+"* %"+(var_count-1)+", align "+RESULT.align+"\n");
		RESULT.load_to = Integer.toString(var_count);
	}
:}
| INT:x {:
	RESULT = new InfoVar(x, "i32", new Integer(4));
:}
| DOUBLE:x {:
	RESULT = new InfoVar(x, "double", new Integer(8));
:}
;
//Elements of vectors of a matrix
matrix_elements ::= matrix_elements:x S vect_elements:y{:
	x.add(y);
	RESULT = x;
:}
| vect_elements:x{:
	RESULT = new ArrayList<ArrayList<InfoVar>>();
	RESULT.add(x);
:}
;
//Elements of variables or constants of a vector
vect_elements ::= vect_elements:x elem:y{:
	x.add(y);
	RESULT = x;
:}
| elem:x {:
	RESULT = new ArrayList<InfoVar>();
	RESULT.add(x);
:}
;
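
The index handling in the rules above relies on the fact that Matlab indices start at 1 while getelementptr indices start at 0, so a constant index is decremented before it is emitted. The sketch below shows this for a vector element. The class VectorIndexSketch and the method gepConstant are hypothetical, and the sketch always uses i32 for the two indices (getelementptr indices must be integers), which is an assumption that differs from the rules above, where the element type is reused.

```java
// Hypothetical helper showing the 1-based to 0-based index conversion.
class VectorIndexSketch {
    // Builds the getelementptr instruction for element matlabIndex (1-based
    // constant) of a vector of the given size and type stored at %baseReg.
    static String gepConstant(int resultReg, int size, String type,
                              String baseReg, int matlabIndex) {
        String arrayType = "[" + size + " x " + type + "]";
        return "%" + resultReg + " = getelementptr inbounds " + arrayType + ", "
             + arrayType + "* %" + baseReg + ", i32 0, i32 " + (matlabIndex - 1);
    }

    public static void main(String[] args) {
        // Matlab v(2) on a vector of three doubles allocated in %1
        System.out.println(gepConstant(2, 3, "double", "1", 2));
        // prints: %2 = getelementptr inbounds [3 x double], [3 x double]* %1, i32 0, i32 1
    }
}
```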

Matrix and array definition

A matrix definition reuses the array (vector) definition, since a matrix is just a list of vectors. In the same way, an array is defined as a list of InfoVar objects.

//Vector
| ID:id EQ SO vect_elements:x SC{:
	InfoVar nInfoVar = new InfoVar(); 
	Integer vector_Register = Integer.parseInt(nInfoVar.reg_id);
	stamentsBuff.append("%"+vector_Register+" = alloca ["+x.size()+" x "+x.get(0).type+"], align "+x.get(0).align+"\n");
	for(int i = 0; i<x.size(); i++)
	{
		InfoVar xTy = x.get(i); 	
		stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x "+x.get(i).type+"], ["+x.size()+" x "+x.get(i).type+"]* %"+vector_Register+", "+x.get(i).type+" 0, "+x.get(i).type+" "+i+"\n");
		stamentsBuff.append("store "+xTy.type+" "+(x.get(i).just_created?x.get(i).value:"%"+x.get(i).load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
	} 
 
	nInfoVar.type = x.get(0).type;
	nInfoVar.align = x.get(0).align;
	nInfoVar.size1 = x.size();
	addSymbol(id, nInfoVar );
 
:}
//Vector element assignment
| ID:id RO arit_op:x RC EQ arit_op:y {:
	InfoVar idVar = parser.symbolTable.get(id);
	if(!x.just_created)
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" %"+x.load_to+"\n");
	else
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x "+idVar.type+"], ["+idVar.size1+" x "+idVar.type+"]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(Integer.parseInt(x.value)-1)+"\n");
	stamentsBuff.append("store "+idVar.type+" "+(y.just_created?y.value:"%"+y.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}
//Matrix
| ID:id EQ SO matrix_elements:x SC{:
 
	InfoVar nInfoVar = new InfoVar(); 
	Integer matrix_Register = Integer.parseInt(nInfoVar.reg_id);
	stamentsBuff.append("%"+matrix_Register+" = alloca ["+x.size()+" x ["+x.get(0).size()+" x "+x.get(0).get(0).type+"]], align "+x.get(0).get(0).align+"\n");
	for(int i = 0; i<x.size(); i++)
	{
		for(int j = 0; j<x.get(i).size(); j++)
		{
			InfoVar xTy = x.get(i).get(j); 	
			stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]], ["+x.size()+" x ["+x.get(i).size()+" x "+xTy.type+"]]* %"+matrix_Register+", "+xTy.type+" 0, "+xTy.type+" "+i+"\n");
			stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+x.get(i).size()+" x "+xTy.type+"], ["+x.get(i).size()+" x "+xTy.type+"]* %"+(var_count-1)+", "+xTy.type+" 0, "+xTy.type+" "+j+"\n");
			stamentsBuff.append("store "+xTy.type+" "+(xTy.just_created?xTy.value:"%"+xTy.load_to)+", "+xTy.type+"* %"+var_count+", align "+xTy.align+"\n");
		}
	} 
 
	nInfoVar.type = x.get(0).get(0).type;
	nInfoVar.align = x.get(0).get(0).align;
	nInfoVar.size1 = x.size();
	nInfoVar.size2 = x.get(0).size();
	addSymbol(id, nInfoVar );
 
:}
//Matrix element assignment
| ID:id RO arit_op:i CM arit_op:j RC EQ arit_op:x {:
	InfoVar idVar = parser.symbolTable.get(id);
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]], ["+idVar.size1+" x ["+idVar.size2+" x "+idVar.type+"]]* %"+idVar.reg_id+", "+idVar.type+" 0, "+idVar.type+" "+(i.just_created?Integer.parseInt(i.value)-1:"%"+i.load_to)+"\n");
	stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+idVar.size2+" x "+idVar.type+"], ["+idVar.size2+" x "+idVar.type+"]* %"+(var_count-1)+", "+idVar.type+" 0, "+idVar.type+" "+(j.just_created?Integer.parseInt(j.value)-1:"%"+j.load_to)+"\n");
	stamentsBuff.append("store "+idVar.type+" "+(x.just_created?x.value:"%"+x.load_to)+", "+idVar.type+"* %"+var_count+", align "+idVar.align+"\n");
:}

Here is an example:

d = [1 2 4 ; 5 6 7]

And here is the LLVM transformation:

%7 = alloca [2 x [3 x i32]], align 4
%8 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%9 = getelementptr inbounds [3 x i32], [3 x i32]* %8, i32 0, i32 0
store i32 1, i32* %9, align 4
%10 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%11 = getelementptr inbounds [3 x i32], [3 x i32]* %10, i32 0, i32 1
store i32 2, i32* %11, align 4
%12 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 0
%13 = getelementptr inbounds [3 x i32], [3 x i32]* %12, i32 0, i32 2
store i32 4, i32* %13, align 4
%14 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%15 = getelementptr inbounds [3 x i32], [3 x i32]* %14, i32 0, i32 0
store i32 5, i32* %15, align 4
%16 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%17 = getelementptr inbounds [3 x i32], [3 x i32]* %16, i32 0, i32 1
store i32 6, i32* %17, align 4
%18 = getelementptr inbounds [2 x [3 x i32]], [2 x [3 x i32]]* %7, i32 0, i32 1
%19 = getelementptr inbounds [3 x i32], [3 x i32]* %18, i32 0, i32 2
store i32 7, i32* %19, align 4
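
The listing above can be reproduced outside the parser with a few lines of Java. This is only a sketch under stated assumptions: the class MatrixStoreSketch and its emit method are hypothetical, the matrix is rectangular and contains only integer constants, and the last register used before the definition is passed in explicitly (6 in the example, so that the allocation lands in %7).

```java
// Hypothetical re-creation of the matrix definition rule for constant i32 elements.
class MatrixStoreSketch {
    static String emit(int[][] m, int lastUsedReg) {
        StringBuilder out = new StringBuilder();
        int reg = lastUsedReg;
        int rows = m.length;
        int cols = m[0].length; // assumes every row has the same length
        String rowType = "[" + cols + " x i32]";
        String matrixType = "[" + rows + " x " + rowType + "]";

        int base = ++reg;
        out.append("%" + base + " = alloca " + matrixType + ", align 4\n");
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                int rowPtr = ++reg;  // pointer to row i
                out.append("%" + rowPtr + " = getelementptr inbounds " + matrixType + ", "
                        + matrixType + "* %" + base + ", i32 0, i32 " + i + "\n");
                int elemPtr = ++reg; // pointer to element j of that row
                out.append("%" + elemPtr + " = getelementptr inbounds " + rowType + ", "
                        + rowType + "* %" + rowPtr + ", i32 0, i32 " + j + "\n");
                out.append("store i32 " + m[i][j] + ", i32* %" + elemPtr + ", align 4\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // d = [1 2 4 ; 5 6 7]
        System.out.print(emit(new int[][] {{1, 2, 4}, {5, 6, 7}}, 6));
    }
}
```

Each element costs two registers (one per getelementptr), so a 2 x 3 matrix uses one register for the alloca plus twelve for the element pointers, which is why the listing runs from %7 to %19.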

Function implementation

The following code generates the LLVM IR for function definitions. Only integer parameters and integer return values are supported.

function_def ::= FUNCT ID:r EQ ID:f RO parameters:par{:
 
	stamentsBuff.append("define i32 @"+f+"("); 
	for(int i = 0; i<par.size(); i++)
	{
		genVarCount();
		stamentsBuff.append("i32"); 
		if(i != (par.size()-1))
			stamentsBuff.append(", "); 
		else 
			stamentsBuff.append(") {"+"\n"); 			
	}
	Integer currentReg; 
	for(int i = 0; i<par.size(); i++)
	{
		currentReg = genVarCount() ;
		stamentsBuff.append("%"+currentReg+" = alloca i32, align 4"+"\n");
		stamentsBuff.append("store i32 %"+i+", i32* %"+currentReg+"\n"); 
		InfoVar newParam = new InfoVar();
		var_count--;
		newParam.reg_id = Integer.toString(currentReg); 
		newParam.type = "i32";
		newParam.align = 4;  
		addSymbol(par.get(i), newParam);		
	}
	ArrayList<String> parametersType= new ArrayList<String>();
	for(int i = 0; i<par.size(); i++)
	{
		parametersType.add("i32");
	}
	InfoFun funct = new InfoFun(parametersType);
	functionTable.put(f,funct);
 
	ret_id = r;
 
:} RC statements END{:
	stamentsBuff.append("}"+"\n");
	var_count = 0;
	symbolTable.clear(); 
:};
 
param ::= ID:x {:RESULT = x;:} | ;
 
parameters ::= parameters:l CM param:x{:
	l.add(x);
	RESULT = l;
:}
| param:x{:
	RESULT = new ArrayList<String>();
	RESULT.add(x);
 
:} 
;
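
As a small illustration of the first part of the rule, the following sketch builds only the header line of a function definition. The class FunctionHeaderSketch and the method header are hypothetical; as in the rule above, every parameter and the return value are assumed to be i32.

```java
import java.util.Collections;

// Hypothetical helper that builds the "define" line for an integer-only function.
class FunctionHeaderSketch {
    static String header(String name, int paramCount) {
        return "define i32 @" + name + "("
             + String.join(", ", Collections.nCopies(paramCount, "i32"))
             + ") {";
    }

    public static void main(String[] args) {
        // Matlab: function r = add(a, b)
        System.out.println(header("add", 2)); // prints: define i32 @add(i32, i32) {
    }
}
```

Joining the parameter types with String.join keeps the closing parenthesis outside the loop, so the same code also covers a function without parameters.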

Two print instructions are implemented. The first one, “disp”, displays either a string (through the function ManageString) or a variable (through the function ManageStringID); the variable can be a simple variable, an array or a matrix, and if it is a vector or a matrix the whole structure is printed. The second one, the Matlab instruction “fprintf”, displays (in this implementation) a format string together with the values of the referenced variables, which must be simple variables.

//Print instruction
print_instr ::= DISP RO STRING:x RC{:
	ManageString(x);
:} 
|DISP RO ID:x RC{:
	ManageStringID(x);
:} 
| PRINT RO STRING:s CM id_list:x RC{:
	ManageString(s,x);
:}
| print_keyw error {:pSynWarning("Error in print instruction.");:}
;
 
id_list ::= id_list:x CM ID:i{:
	x.add(i);
	RESULT = x;
:}
|ID:x{:
	RESULT = new ArrayList<String>();
	RESULT.add(x);
:}
;

Here are the three ManageString functions:

public void ManageString(String x){
	int label = genStrCount();
	String s = x;
	s = s.replace("\"","");
	s = s + "\\0A\\00";
	Integer length = s.length()-4;
	parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
	stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0))\n"));
}
 
public void ManageStringID(String x){
 
	InfoVar infoVar = parser.symbolTable.get(x);
	if(!parser.symbolTable.containsKey(x))
	{
		pSemError("Variable "+x+" not declared.");
	}else{
		if(infoVar.size1==-1){ 
			int label = genStrCount();
			String s = "%"+(infoVar.type.equals("i32")?"d":"f")+"\\0A\\00";
			Integer length = s.length()-4;
			stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+", "+infoVar.type+"* %"+infoVar.reg_id+", align "+infoVar.align+"\n");
			infoVar.load_to = var_count+"";
			parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
			stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0), "+infoVar.type+ " %"+infoVar.load_to+")\n"));
		}else if(infoVar.size1!=-1 && infoVar.size2==-1){
				int label = genStrCount();
				String s = "";
				ArrayList<Integer> loads_reg = new ArrayList<>();
				for(int i = 0;i < infoVar.size1-1; i++){
					s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
					stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
					loads_reg.add(var_count);
				}
				s = s+" %"+(infoVar.type.equals("i32")?"d":"f") + "\\0A\\00";
				stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x "+infoVar.type+"], ["+infoVar.size1+" x "+infoVar.type+"]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+(infoVar.size1-1)+"\n");
				stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
				loads_reg.add(var_count);
				Integer length = s.length()-4;
				parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
				stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
				stamentsBuff.append(", ");
				for (int i = 0; i < loads_reg.size(); i ++)
				{
					if(i==0)
						stamentsBuff.append(infoVar.type+" %"+loads_reg.get(i));
						else
						stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(i));						         
				}
				stamentsBuff.append(")"+"\n");
		}else{
			for(int i = 0;i < infoVar.size1; i++){
				int label = genStrCount();
				String s = "";
				ArrayList<Integer> loads_reg = new ArrayList<>();
				for(int j = 0;j < infoVar.size2; j++){
					s = s+" %"+(infoVar.type.equals("i32")?"d":"f");
					if(j== infoVar.size2-1)
						s = s+"\\0A\\00";
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]], ["+infoVar.size1+" x ["+infoVar.size2+" x "+infoVar.type+"]]* %"+infoVar.reg_id+", "+infoVar.type+" 0, "+infoVar.type+" "+i+"\n");
					stamentsBuff.append("%"+genVarCount()+" = getelementptr inbounds ["+infoVar.size2+" x "+infoVar.type+"], ["+infoVar.size2+" x "+infoVar.type+"]* %"+(var_count-1)+", "+infoVar.type+" 0, "+infoVar.type+" "+j+"\n");
					stamentsBuff.append("%"+genVarCount()+" = load "+infoVar.type+" , "+infoVar.type+"* %"+(var_count-1)+", align "+infoVar.align+"\n");
					loads_reg.add(var_count);
				}
				Integer length = s.length()-4;
				parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
				stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
				stamentsBuff.append(", ");
				for (int j = 0; j < loads_reg.size(); j ++)
				{
					if(j==0)
						stamentsBuff.append(infoVar.type+" %"+loads_reg.get(j));
					else
						stamentsBuff.append(", "+infoVar.type+" %"+loads_reg.get(j));						         
				}
				stamentsBuff.append(")"+"\n");
			}
		}
	}
}
 
public void ManageString(String x, ArrayList<String> variables)
{
	ArrayList <InfoVar> regList = new ArrayList<InfoVar>();
	int label = genStrCount();
	InfoVar t = null;
	String s = x;
	s = s.replace("\"", "");
	s = s.replace("%i", "%d");
 
	for(String var : variables)
	{
		t = parser.symbolTable.get(var);
		if(!parser.symbolTable.containsKey(var))
		{
			pSemError("Variable "+var+" not declared.");
		}else if(parser.symbolTable.get(var).size1==-1){ 
			stamentsBuff.append("%"+genVarCount()+" = load "+t.type+", "+t.type+"* %"+t.reg_id+", align "+t.align+"\n");
			t.load_to = var_count+"";
			regList.add(t);
		}
	}
	s = s + "\\0A\\00";
	Integer length = s.length()-4;
 
	parser.stringStatements.add("@.str." + label + " = private constant [" + length + " x i8] c\"" + s + "\", align 1");
	stamentsBuff.append(("%" + genVarCount() + " = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([" + length + " x i8], [" + length + " x i8]* @.str." + label + ", i32 0, i32 0)"));
	stamentsBuff.append(", ");
	for (int i = 0; i < regList.size(); i ++)
	{
		InfoVar infoVar = regList.get(i);
		if(i==0)
			stamentsBuff.append(infoVar.type+" %"+infoVar.load_to);
		else
			stamentsBuff.append(", "+infoVar.type+" %"+infoVar.load_to);						         
	}
	stamentsBuff.append(")"+"\n");
}
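
The expression s.length()-4 used in all three functions deserves a short explanation. The escapes \0A (newline) and \00 (terminator) take three characters each in the text of the constant but only one byte each in the array, so the array is 4 elements shorter than the Java string. The sketch below isolates this step; the class StringConstantSketch and the method constant are hypothetical, and the text is assumed to be plain ASCII without further escapes.

```java
// Hypothetical helper that isolates the string constant declaration.
class StringConstantSketch {
    static String constant(int label, String text) {
        // Drop the quotes of the Matlab literal, then append newline and terminator.
        String s = text.replace("\"", "") + "\\0A\\00";
        // "\0A" and "\00" are 6 characters in the text but 2 bytes in the array.
        int length = s.length() - 4;
        return "@.str." + label + " = private constant [" + length + " x i8] c\""
             + s + "\", align 1";
    }

    public static void main(String[] args) {
        System.out.println(constant(1, "\"hello\""));
        // prints: @.str.1 = private constant [7 x i8] c"hello\0A\00", align 1
    }
}
```

For the word hello the array has 7 elements: five letters, the newline and the terminating zero byte.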

Error handling

The compiler is able to recognize the following kinds of errors:

Missing functionalities, partial implementations

Download and Parser

Compiler matlab_compiler.zip

Examples

How to run it

  1. Install the llvm package: sudo apt install llvm
  2. Download the matlab_compiler and unzip it
  3. Start a new terminal inside the source folder and run the following commands:
    • jflex matlab_scanner.jflex
    • java java_cup.Main -expect 3 matlab_parser.cup
    • javac *.java
    • java Main source.mlx
  4. This will produce an output.ll file
  5. Run the output.ll file with: lli output.ll

References