Table of Contents
Jflex & Cup: Exam 2015/09/03
Text
Using the Jflex
lexer generator and the Cup
parser generator, realize a java
program capable of recognizing and executing the language described here:20150903.pdf
Commented Solution
Scanner
- scanner.jflex
import java_cup.runtime.*; %% %unicode %cup %line %column %{ private Symbol sym(int type) { return new Symbol(type, yyline, yycolumn); } private Symbol sym(int type, Object value) { return new Symbol(type, yyline, yycolumn, value); } %} /* TOKEN1 regular expression */ token1 = (("%%%%%"("%%")*) | (("**" | "???"){2,3})) {odd_number} odd_number = "-"(3[135] | [12][13579] | [13579]) | [13579] | [1-9][13579] | [12][0-9][13579] | 3([0-2][13579]|3[13]) /* TOKEN2 regular expression */ token2 = {date}("-" | "+"){date} date = 2015"/"12"/"(1[2-9] | 2[0-9] | 3[01]) | 2016"/"(01"/"(0[1-46-9] | [12][0-9] | 3[01]) | 02"/"(0[1-9] | [12][0-9]) | 03"/"(0[1-9] | 1[0-3])) /* TOKEN3 regular expression */ token3 = "$"(101 | 110 | 111 | 1(0|1){3} | 1(0|1){4} | 10(1000|0(0|1){3})) q_string = \" ~ \" uint = 0 | [1-9][0-9]* sep = "##"("##")+ nl = \r | \n | \r\n cpp_comment = "//" .* %% {token1} { return sym(sym.TOKEN1);} {token2} { return sym(sym.TOKEN2);} {token3} { return sym(sym.TOKEN3);} {q_string} { return sym(sym.QSTRING, yytext());} {uint} { return sym(sym.UINT, new Integer(yytext()));} {sep} { return sym(sym.SEP);} "PRINT_MIN_MAX" { return sym(sym.MINMAX);} "PART" { return sym(sym.PART);} "m" { return sym(sym.M);} "m/s" { return sym(sym.MS);} "->" { return sym(sym.ARROW);} "=" { return sym(sym.EQ);} "|" { return sym(sym.PIPE);} "," { return sym(sym.CM);} ";" { return sym(sym.S);} ":" { return sym(sym.COL);} "(" { return sym(sym.RO);} ")" { return sym(sym.RC);} "{" { return sym(sym.SO);} "}" { return sym(sym.SC);} {cpp_comment} {;} \r | \n | \r\n | " " | \t {;} . { System.out.println("Scanner Error: " + yytext()); }
The scanner is relatively simple. Most of the attentions should be given to regular expressions regarding token1
, token2
and token3
.
token1:
- it starts with the character
%
repeated an odd number of times, at least 5:"%%%%%"("%%")*
. Another possible solution could be"%%%"("%%")+
- or it starts with 2 o 3 repetitions of the words
"**"
and???
in any combination (???**
,??????
,**??????
,******
,…): in this case the operator{n,m}
should be used to manage the repetition of**
or???
for a number of times between 2 and 3:("**" | "???"){2,3}
- The first part of the token is optionally followed by an odd number between
−35
and333
: in this context the regular expression has to recognize all the numbers, but only those, in the range. Remember that numbers have to be odd. As a consequence the regular expression of the less significant digit has to recognize only 1, 3, 5, 7, 9 digits:"-"(3[135] | [12][13579] | [13579]) | [13579] | [1-9][13579] | [12][0-9][13579] | 3([0-2][13579]|3[13])
token2:
- it is composed of two dates separated by the character
-
or by the character+
:{date}("-" | "+"){date}
- A date has the format
YYYY/MM/DD
and it is in the range between2015/12/12
and2016/03/13
, with the exclusion of the day2016/01/05
. Remember that the month of February has29
days. Regular expression has to recognize only the date that matches the requirements regarding days ranges:2015"/"12"/"(1[2-9] | 2[0-9] | 3[01]) | 2016"/"(01"/"(0[1-46-9] | [12][0-9] | 3[01]) | 02"/"(0[1-9] | [12][0-9]) | 03"/"(0[1-9] | 1[0-3]))
token3:
- it is the character
$
followed by a binary number between101
and101000
. As in the previous regular expression, you have to take into account numbers range, but in this case numbers are binaries:"$"(101 | 110 | 111 | 1(0|1){3} | 1(0|1){4} | 10(1000|0(0|1){3}))
Parser
- parser.cup
import java_cup.runtime.*; import java.util.*; import java.io.*; init with {: table = new HashMap<String, HashMap<String, Integer>>(); :}; parser code {: public HashMap<String, HashMap<String, Integer>> table; public void report_error(String message, Object info) { StringBuffer m = new StringBuffer(message); if (info instanceof Symbol) { if (((Symbol)info).left != 1 && ((Symbol)info).right != 1) { if (((Symbol)info).left != -1 && ((Symbol)info).right != -1) { int line = (((Symbol)info).left) + 1; int column = (((Symbol)info).right) + 1; m.append(" (line " + line + " column " + column + ")"); } } System.err.println(m); } } public Object stack(int position) { return (((Symbol)stack.elementAt(tos + position)).value); } :}; ////////////////////////////////// ///// SYMBOLS DECLARATION ///////////////////////////////// terminal TOKEN1, TOKEN2, TOKEN3, M, MS, MINMAX, PART, ARROW, SEP; terminal EQ, PIPE, CM, S, COL, RO, RC, SO, SC; terminal String QSTRING; terminal Integer UINT; non terminal prog, header, token1_l, cars, car, race, print_min_max_l, min_max; non terminal Object[] section_names, performances; non terminal HashMap speeds; non terminal Float drive_stats, parts, part; non terminal String NT0, NT1; ////////////////////////////////// ///// GRAMMAR ///////////////////////////////// start with prog; prog ::= header SEP cars SEP race; /******************/ /* Header section */ /******************/ header ::= token1_l TOKEN2 S token1_l TOKEN3 S token1_l | token1_l TOKEN3 S token1_l TOKEN2 S token1_l ; token1_l ::= token1_l TOKEN1 S | /* epsilon */; /****************/ /* Cars section */ /****************/ cars ::= car car | cars car car; car ::= QSTRING:s SO speeds:tab SC {: parser.table.put(s, tab); :} ; speeds ::= QSTRING:s EQ UINT:u MS {: RESULT = new HashMap<String, Integer>(); RESULT.put(s, u); :} | speeds:tab CM QSTRING:s EQ UINT:u MS {: tab.put(s, u); RESULT = tab; :} ; /****************/ /* Race section */ /****************/ race ::= print_min_max_l performances:s {: System.out.println("WINNER: " + s[0] + " " + s[1] + " s"); :} ; /* Management of PRINT_MIN_MAX function */ print_min_max_l ::= | print_min_max_l min_max; min_max ::= MINMAX RO QSTRING:s RC RO section_names:m RC S {: System.out.println("MIN: " + m[0] + " MAX: " + m[1]); :} ; section_names ::= QSTRING:s {: String car = (String)parser.stack(-3); HashMap<String, Integer> speeds = parser.table.get(car); Integer speed = (Integer)speeds.get(s); RESULT = new Object[2]; RESULT[0] = speed; // Current min value RESULT[1] = speed; // Current max value :} | section_names:m CM QSTRING:s {: String car = (String)parser.stack(-5); HashMap<String, Integer> speeds = parser.table.get(car); Integer speed = (Integer)speeds.get(s); RESULT = new Object[2]; // Update current min and max values if (speed < (Integer)m[0]) { // New min RESULT[0] = speed; RESULT[1] = m[1]; } else if (speed > (Integer)m[1]) { // New max RESULT[0] = m[0]; RESULT[1] = speed; } else { // No change in min and max RESULT[0] = m[0]; RESULT[1] = m[1]; } :} ; /* Part regarding performances */ performances ::= QSTRING:s {: System.out.println(s); :} ARROW parts:x S {: System.out.println("TOTAL: " + x + " s"); // To detect the winner car RESULT = new Object[2]; RESULT[0] = s; // car name RESULT[1] = x; // result :} | performances:perf QSTRING:s {: System.out.println(s); :} ARROW parts:x S {: System.out.println("TOTAL: " + x + " s"); RESULT = new Object[2]; // Check if this car is the current winner if ((Float)perf[1] < x) { // Current winner is an old car RESULT[0] = perf[0]; RESULT[1] = perf[1]; } else { // Current winner is this car RESULT[0] = s; RESULT[1] = x; } :} ; // The two markers are used to transfer the car name semantic value in the symbol that preceeds the non terminal "part" parts ::= NT0 part:x {: RESULT = x; :} | parts:res PIPE NT1 part:x {: RESULT = res + x; :} ; NT0 ::= {: RESULT = (String)parser.stack(-2); :}; NT1 ::= {: RESULT = (String)parser.stack(-4); :}; part ::= PART UINT:x COL drive_stats:stat {: RESULT = stat; System.out.println("PART" + x + ": " + stat + " s"); :}; drive_stats ::= QSTRING:s UINT:u M {: String car = (String)parser.stack(-6); HashMap<String, Integer> speeds = parser.table.get(car); Integer speed = (Integer)speeds.get(s); float result = (float)u.intValue() / (float)speed.intValue(); RESULT = new Float(result); :} | drive_stats:stat CM QSTRING:s UINT:u M {: String car = (String)parser.stack(-8); HashMap<String, Integer> speeds = parser.table.get(car); Integer speed = (Integer)speeds.get(s); float result = (float)u.intValue() / (float)speed.intValue(); RESULT = new Float(result); RESULT += stat; /* Accumulate the time in result */ :} ;
Grammar
Regarding the grammar, the most complex part is the header section. It requires exactly one token2 and one token3, while any number of token1 can be found in the header. Since any order is possible, token2 and token3 can appear in the order TOKEN2 TOKEN3
or TOKEN3 TOKEN2
. For this reason two header rules are present (see the | operator). Before the two tokens, between them, and after them any number (even 0) of token1 may be present. For this reason, in any position of the header grammar, adjacent to the two tokens, there is a list that may be empty. Consequently, the grammar of the header is the following:
header ::= token1_l TOKEN2 S token1_l TOKEN3 S token1_l | token1_l TOKEN3 S token1_l TOKEN2 S token1_l ; token1_l ::= token1_l TOKEN1 S | /* epsilon */;
Another interesting part of the grammar is that concerning the list of cars composed of at least 2 elements and in even number:
cars ::= car car | cars car car;
the first part of the rule recognizes the first two elements, while the second part recognizes other two elements for any further reduction.
Semantic
Semantic regarding the car section is very easy and it is aimed to fill the data structure containing statistics regarding cars speed.
For the PRINT_MIN_MAX
function, the idea is to access the <car_name>
through inherited attributes. For the first element of the list (section_names ::= QSTRING
) <car_name>
is in position top-3
of the stack, because the stack content is MINMAX RO QSTRING RC RO QSTRING
and the first QSTRING
is the
<car_name>
. The array RESULT[]
is initialized with the speed of the first element of the list. RESULT[0]
is the current minimum value, while RESULT[1]
is the current maximum value.
For the other elements of the list (section_names ::= section_names CM QSTRING
) <car_name>
is in position top-5
of the stack, because the stack content is MINMAX RO QSTRING RC RO section_names CM QSTRING
. In this case, the rule checks the current speed, and if it is lower then the current minimum it saves the speed in RESULT[0]
. Conversely, if the speed is greater than the current maximum, the speed is saved in RESULT[1]
. At the end, in rule (min_max ::= MINMAX RO QSTRING RC RO section_names RC S
) section_names
contains the minimum and the maximum values.
Finally, rules drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M
have to access the <car_name>
in order to obtain the speed of a given <section_name>
. Two markers (NT0
and NT1
) have been added to move the <car_name>
just before the non terminal part
. Markers are needed, because in this grammar there are two nested lists, and the inner list (drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M
) makes use of inherited attributes. It requires that the semantic value to be accessed has always the same offset in the stack. For rule drive_stats ::= QSTRING UINT M
the <car_name>
is in position top-6
of the stack (i.e., stack state is NT01 PART UINT COL QSTRING UINT M
). NT01
means that the stack can contain NT0
or NT1
. For rule drive_stats ::= drive_stats CM QSTRING UINT M
the <car_name>
is in position top-8
of the stack (i.e., stack state is NT01 PART UINT COL drive_stats CM QSTRING UINT M
). At this point, the computation of the times in rules (drive_stats ::= QSTRING UINT M | drive_stats CM QSTRING UINT M
) and the computation of the winner in rules (performances ::= QSTRING ARROW parts S | performances QSTRING ARROW parts S
) is relatively easy java code.
Note: to print the car name before the reduction of rule performances ::= QSTRING ARROW parts S
, intermediate actions can be used:
performances ::= QSTRING:s {: System.out.println(s); :} ARROW parts:x S;
an alternative solution is:
performances ::= c_name:s ARROW parts:x S; c_name ::= QSTRING:s {: System.out.println(s); RESULT=s; :}
Whole solution
DOWNLOAD THE SOLUTION: sol_20150903.zip
The solution contains:
scanner.jflex
: the scannerparser.cup
: the parserMain.java
: the main fileexam_20150903.txt
: an example fileMakefile
: the make file
To compile:
jflex scanner.jflex java java_cup.Main paser.cup javac *.java
or just use the makefile by typing:
make
To run the program on the example:
java Main exam_20150903.txt
The output is:
MIN: 10 MAX: 50 "CAR_A" PART1: 2.0 s PART2: 19.0 s PART3: 10.0 s TOTAL: 31.0 s "CAR_B" PART1: 4.0 s PART2: 14.2 s PART3: 11.0 s TOTAL: 29.2 s WINNER: "CAR_B" 29.2 s
If you found any error, or if you want to partecipate to the editing of this wiki, please contact: admin [at] skenz.it
You can reuse, distribute or modify the content of this page, but you must cite in any document (or webpage) this url: https://www.skenz.it/compilers/exam_20150903?do=