Return to Home page

Jflex & Cup: Exam 2015/09/03

Text

Using the Jflex lexer generator and the Cup parser generator, realize a java program capable of recognizing and executing the language described here:20150903.pdf

Commented Solution

Scanner

scanner.jflex

import java_cup.runtime.*;
 
%%
 
%unicode
%cup
%line
%column
 
%{
	private Symbol sym(int type) {
		return new Symbol(type, yyline, yycolumn);
	}
 
	private Symbol sym(int type, Object value) {
		return new Symbol(type, yyline, yycolumn, value);
	}
 
%}
 
/* TOKEN1 regular expression */
token1     = (("%%%%%"("%%")*) | (("**" | "???"){2,3})) {odd_number}
odd_number = "-"(3[135] | [12][13579] | [13579]) | [13579] | [1-9][13579] | [12][0-9][13579] | 3([0-2][13579]|3[13])
 
/* TOKEN2 regular expression */
token2	= {date}("-" | "+"){date}
date	= 2015"/"12"/"(1[2-9] | 2[0-9] | 3[01]) | 2016"/"(01"/"(0[1-46-9] | [12][0-9] | 3[01]) | 02"/"(0[1-9] | [12][0-9]) | 03"/"(0[1-9] | 1[0-3]))
 
/* TOKEN3 regular expression */                        
token3	= "$"(101 | 110 | 111 | 1(0|1){3} | 1(0|1){4} | 10(1000|0(0|1){3}))
 
q_string = \" ~ \"
 
uint	= 0 | [1-9][0-9]*
 
sep		= "##"("##")+
 
nl 		= \r | \n | \r\n
cpp_comment     = "//" .*
 
%%
 
{token1}			{ return sym(sym.TOKEN1);}
{token2}			{ return sym(sym.TOKEN2);}
{token3}			{ return sym(sym.TOKEN3);}
 
{q_string}			{ return sym(sym.QSTRING, yytext());}
{uint}				{ return sym(sym.UINT, new Integer(yytext()));}
 
{sep}				{ return sym(sym.SEP);}
 
"PRINT_MIN_MAX"	        	{ return sym(sym.MINMAX);}
"PART"				{ return sym(sym.PART);}
"m"				{ return sym(sym.M);}
"m/s"				{ return sym(sym.MS);}
"->"				{ return sym(sym.ARROW);}
"=" 				{ return sym(sym.EQ);}
"|" 				{ return sym(sym.PIPE);}
"," 				{ return sym(sym.CM);}
";" 				{ return sym(sym.S);}
":" 				{ return sym(sym.COL);}
"(" 				{ return sym(sym.RO);}
")" 				{ return sym(sym.RC);}
"{" 				{ return sym(sym.SO);}
"}" 				{ return sym(sym.SC);}
 
{cpp_comment}	 		{;}
\r | \n | \r\n | " " | \t	{;}
 
.				{ System.out.println("Scanner Error: " + yytext()); }

The scanner is relatively simple. Most of the attentions should be given to regular expressions regarding token1, token2 and token3.

token1:

it starts with the character % repeated an odd number of times, at least 5: "%%%%%"("%%")*. Another possible solution could be "%%%"("%%")+
or it starts with 2 o 3 repetitions of the words "**" and ??? in any combination (???**, ??????, **??????, ******,…): in this case the operator {n,m} should be used to manage the repetition of ** or ??? for a number of times between 2 and 3: ("**" | "???"){2,3}
The first part of the token is optionally followed by an odd number between −35 and 333: in this context the regular expression has to recognize all the numbers, but only those, in the range. Remember that numbers have to be odd. As a consequence the regular expression of the less significant digit has to recognize only 1, 3, 5, 7, 9 digits: "-"(3[135] | [12][13579] | [13579]) | [13579] | [1-9][13579] | [12][0-9][13579] | 3([0-2][13579]|3[13])

token2:

it is composed of two dates separated by the character - or by the character +: {date}("-" | "+"){date}
A date has the format YYYY/MM/DD and it is in the range between 2015/12/12 and 2016/03/13, with the exclusion of the day 2016/01/05. Remember that the month of February has 29 days. Regular expression has to recognize only the date that matches the requirements regarding days ranges: 2015"/"12"/"(1[2-9] | 2[0-9] | 3[01]) | 2016"/"(01"/"(0[1-46-9] | [12][0-9] | 3[01]) | 02"/"(0[1-9] | [12][0-9]) | 03"/"(0[1-9] | 1[0-3]))

token3:

it is the character $ followed by a binary number between 101 and 101000. As in the previous regular expression, you have to take into account numbers range, but in this case numbers are binaries: "$"(101 | 110 | 111 | 1(0|1){3} | 1(0|1){4} | 10(1000|0(0|1){3}))

Parser

parser.cup

import java_cup.runtime.*;
import java.util.*;
import java.io.*;
 
init with {:
    table = new HashMap<String, HashMap<String, Integer>>();
:};
 
parser code 	{:
 
    public HashMap<String, HashMap<String, Integer>> table;
 
    public void report_error(String message, Object info) {
        StringBuffer m = new StringBuffer(message);
        if (info instanceof Symbol) {
            if (((Symbol)info).left != 1 && ((Symbol)info).right != 1) {
                if (((Symbol)info).left != -1 && ((Symbol)info).right != -1) {
                    int line = (((Symbol)info).left) + 1;
                    int column = (((Symbol)info).right) + 1;
                    m.append(" (line " + line + " column " + column + ")");
                }
            }
            System.err.println(m);
        }
    }
 
    public Object stack(int position) {
        return (((Symbol)stack.elementAt(tos + position)).value);
    }
:};
 
 
 
//////////////////////////////////
///// SYMBOLS DECLARATION
/////////////////////////////////
 
terminal TOKEN1, TOKEN2, TOKEN3, M, MS, MINMAX, PART, ARROW, SEP;
terminal EQ, PIPE, CM, S, COL, RO, RC, SO, SC;
terminal String QSTRING;
terminal Integer UINT;
 
non terminal prog, header, token1_l, cars, car, race, print_min_max_l, min_max;
non terminal Object[] section_names, performances;
non terminal HashMap speeds;
non terminal Float drive_stats, parts, part;
non terminal String NT0, NT1;
 
 
 
//////////////////////////////////
///// GRAMMAR
/////////////////////////////////
 
start with prog;
 
prog ::= header SEP cars SEP race;
 
 
/******************/
/* Header section */
/******************/
header ::= token1_l TOKEN2 S token1_l TOKEN3 S token1_l
         | token1_l TOKEN3 S token1_l TOKEN2 S token1_l
;
 
token1_l ::= token1_l TOKEN1 S | /* epsilon */;
 
 
/****************/
/* Cars section */
/****************/
cars ::= car car | cars car car;
 
car ::= QSTRING:s SO speeds:tab SC {:
	     parser.table.put(s, tab);
	:}
;
 
speeds ::= QSTRING:s EQ UINT:u MS {:
                RESULT = new HashMap<String, Integer>();
                RESULT.put(s, u);
           :}
	   | speeds:tab CM QSTRING:s EQ UINT:u MS {:
	        tab.put(s, u);
		RESULT = tab;
           :}
;
 
/****************/
/* Race section */
/****************/
race ::= print_min_max_l performances:s {: 
              System.out.println("WINNER: " + s[0] + " " + s[1] + " s"); 
	 :}
;
 
/* Management of PRINT_MIN_MAX function */
print_min_max_l ::= | print_min_max_l min_max;
 
min_max ::= MINMAX RO QSTRING:s RC RO section_names:m RC S {:
                 System.out.println("MIN: " + m[0] + " MAX: " + m[1]);
	    :}
;
 
section_names ::= QSTRING:s {:
		       String car = (String)parser.stack(-3);
                       HashMap<String, Integer> speeds = parser.table.get(car);
                       Integer speed = (Integer)speeds.get(s);
                       RESULT = new Object[2];
                       RESULT[0] = speed; // Current min value
                       RESULT[1] = speed; // Current max value
		  :}
		  | section_names:m CM QSTRING:s {:
                       String car = (String)parser.stack(-5);
                       HashMap<String, Integer> speeds = parser.table.get(car);
                       Integer speed = (Integer)speeds.get(s);
                       RESULT = new Object[2];
                       // Update current min and max values
                       if (speed < (Integer)m[0]) {
                           // New min
                           RESULT[0] = speed;
                           RESULT[1] = m[1];
                       } else if (speed > (Integer)m[1]) {
                           // New max
                           RESULT[0] = m[0];
                           RESULT[1] = speed;
                       } else {
                           // No change in min and max
                           RESULT[0] = m[0];
                           RESULT[1] = m[1];
                       }
                  :}
;
 
/* Part regarding performances */
performances ::= QSTRING:s {: System.out.println(s); :} ARROW parts:x S {:
                      System.out.println("TOTAL: " + x + " s");
                      // To detect the winner car
                      RESULT = new Object[2];
                      RESULT[0] = s; // car name
                      RESULT[1] = x; // result
		 :}
	         | performances:perf QSTRING:s {: System.out.println(s); :} ARROW parts:x S {:
                      System.out.println("TOTAL: " + x + " s");
                      RESULT = new Object[2];
                      // Check if this car is the current winner
                      if ((Float)perf[1] < x) {
                          // Current winner is an old car
                          RESULT[0] = perf[0];
                          RESULT[1] = perf[1];
                      } else {
                          // Current winner is this car
                          RESULT[0] = s;
                          RESULT[1] = x;
                      }
                 :}
;
 
// The two markers are used to transfer the car name semantic value in the symbol that preceeds the non terminal "part"
parts ::= NT0 part:x {: RESULT = x; :}
	  | parts:res PIPE NT1 part:x {: RESULT = res + x; :}
;
 
NT0 ::= {: RESULT = (String)parser.stack(-2); :};
 
NT1 ::= {: RESULT = (String)parser.stack(-4); :};
 
part ::= PART UINT:x COL drive_stats:stat {: 
              RESULT = stat;
              System.out.println("PART" + x + ": " + stat + " s");
:};
 
drive_stats ::= QSTRING:s UINT:u M {:
                     String car = (String)parser.stack(-6);
                     HashMap<String, Integer> speeds = parser.table.get(car);
                     Integer speed = (Integer)speeds.get(s);
                     float result = (float)u.intValue() / (float)speed.intValue();
                     RESULT = new Float(result);
		:}
		| drive_stats:stat CM QSTRING:s UINT:u M {:
                     String car = (String)parser.stack(-8);
                     HashMap<String, Integer> speeds = parser.table.get(car);
                     Integer speed = (Integer)speeds.get(s);
                     float result = (float)u.intValue() / (float)speed.intValue();
                     RESULT = new Float(result);
                     RESULT += stat; /* Accumulate the time in result */
                :}
;

Grammar

Regarding the grammar, the most complex part is the header section. It requires exactly one token2 and one token3, while any number of token1 can be found in the header. Since any order is possible, token2 and token3 can appear in the order TOKEN2 TOKEN3 or TOKEN3 TOKEN2. For this reason two header rules are present (see the | operator). Before the two tokens, between them, and after them any number (even 0) of token1 may be present. For this reason, in any position of the header grammar, adjacent to the two tokens, there is a list that may be empty. Consequently, the grammar of the header is the following:

header ::= token1_l TOKEN2 S token1_l TOKEN3 S token1_l
         | token1_l TOKEN3 S token1_l TOKEN2 S token1_l
;
 
token1_l ::= token1_l TOKEN1 S | /* epsilon */;

Another interesting part of the grammar is that concerning the list of cars composed of at least 2 elements and in even number:

cars ::= car car | cars car car;

the first part of the rule recognizes the first two elements, while the second part recognizes other two elements for any further reduction.

Semantic

Semantic regarding the car section is very easy and it is aimed to fill the data structure containing statistics regarding cars speed.

For the PRINT_MIN_MAX function, the idea is to access the <car_name> through inherited attributes. For the first element of the list (section_names ::= QSTRING) <car_name> is in position top-3 of the stack, because the stack content is MINMAX RO QSTRING RC RO QSTRING and the first QSTRING is the <car_name>. The array RESULT[] is initialized with the speed of the first element of the list. RESULT[0] is the current minimum value, while RESULT[1] is the current maximum value.
For the other elements of the list (section_names ::= section_names CM QSTRING) <car_name> is in position top-5 of the stack, because the stack content is MINMAX RO QSTRING RC RO section_names CM QSTRING. In this case, the rule checks the current speed, and if it is lower then the current minimum it saves the speed in RESULT[0]. Conversely, if the speed is greater than the current maximum, the speed is saved in RESULT[1]. At the end, in rule (min_max ::= MINMAX RO QSTRING RC RO section_names RC S) section_names contains the minimum and the maximum values.

Finally, rules drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M have to access the <car_name> in order to obtain the speed of a given <section_name>. Two markers (NT0 and NT1) have been added to move the <car_name> just before the non terminal part. Markers are needed, because in this grammar there are two nested lists, and the inner list (drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M) makes use of inherited attributes. It requires that the semantic value to be accessed has always the same offset in the stack. For rule drive_stats ::= QSTRING UINT M the <car_name> is in position top-6 of the stack (i.e., stack state is NT01 PART UINT COL QSTRING UINT M). NT01 means that the stack can contain NT0 or NT1. For rule drive_stats ::= drive_stats CM QSTRING UINT M the <car_name> is in position top-8 of the stack (i.e., stack state is NT01 PART UINT COL drive_stats CM QSTRING UINT M). At this point, the computation of the times in rules (drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M) and the computation of the winner in rules (performances ::= QSTRING ARROW parts S | performances QSTRING ARROW parts S) is relatively easy java code.

Note: to print the car name before the reduction of rule performances ::= QSTRING ARROW parts S, intermediate actions can be used:

performances ::= QSTRING:s {: System.out.println(s); :} ARROW parts:x S;

an alternative solution is:

performances ::= c_name:s ARROW parts:x S;
c_name ::= QSTRING:s {: System.out.println(s); RESULT=s; :}

Whole solution

DOWNLOAD THE SOLUTION: sol_20150903.zip

The solution contains:

scanner.jflex: the scanner
parser.cup: the parser
Main.java: the main file
exam_20150903.txt: an example file
Makefile: the make file

To compile:

jflex scanner.jflex
java java_cup.Main paser.cup
javac *.java

or just use the makefile by typing:

make

To run the program on the example:

java Main exam_20150903.txt

The output is:

MIN: 10 MAX: 50
"CAR_A"
PART1: 2.0 s
PART2: 19.0 s
PART3: 10.0 s
TOTAL: 31.0 s
"CAR_B"
PART1: 4.0 s
PART2: 14.2 s
PART3: 11.0 s
TOTAL: 29.2 s
WINNER: "CAR_B" 29.2 s

If you found any error, or if you want to partecipate to the editing of this wiki, please contact: admin [at] skenz.it

You can reuse, distribute or modify the content of this page, but you must cite in any document (or webpage) this url: https://www.skenz.it/compilers/exam_20150903?do=

Skenz - How To Wiki

Table of Contents