Homework #2: CMPT-379
Distributed on Sep 30; due on Oct 18
Anoop Sarkar – [email protected]
Reading for this homework includes Chp 4 of the Dragon book. If needed, refer to:
http://tldp.org/HOWTO/Lex-YACC-HOWTO.html
Only submit answers for questions marked with y. You must use a makefile such that make compiles all your
programs, and make test runs each program, and supplies the necessary input files. You have been provided a
file makefile.answer which you should use as your makefile. The stdout when you run make test must match
makefile.answer.out, and stderr must match makefile.answer.err.
(1) The following program is yacc code for a very simple (and incomplete) expression interpreter.
%{
#include
%}
%token NAME NUMBER
%%
statement: NAME '=' expression { printf(“%c = %dn”, $1, $3); }
| expression { printf(“%dn”, $1); }
;
expression: expression '+' NUMBER {
= $1 – $3; }
| NUMBER {
= $1 + $3; }
| expression '-' NUMBER {
= $1; }
;
%%
2
The %union declaration can include complex datatypes. The yacc code defines a type not just for the
tokens, but also for nonterminals, which is specified in the %type definition above. This allows yacc to
check that the type of the non-terminal expression is rvalue, an integer type.
Use the above code fragments, and add the necessary lex and yacc code in order to handle the following
input and output:
Input:
a = 2 + 3
a
b = a + a – 2 + 3 + a
b
Output:
5
16
Extend your expression interpreter to include constants of type double, and variables that can hold either
integer or double types. Finally, add the functions: exp, sqrt, log so that you can interpret the input:
a = 2.0
b = exp(a)
b
(3) y Write a Decaf program that implements the quicksort algorithm to sort a list. Your program should
print the sorted list by iteratively calling the print int library function. Submit the program as the file
quicksort.decaf
(4) Write down a context-free grammar for the structure of Decaf programs based on the reference grammar
in the Decaf language definition (make sure that the non-terminal and terminal symbols used in the CFG
correspond as much as possible to the symbols used in the reference grammar). Submit a file called
decafGrammar.txt which contains the CFG in the following text format: For the CFG:
hstarti ! A hstarti B j the text format you should use is:
start A start B
start
Follow the convention that the non-terminals in this text format are written in the same format as
identifiers in Decaf but are in lowercase (e.g. start, and for hyphenated non-terminals like
method-name replace the hyphen with an underscore, e.g. method name) and write the terminal symbols
in the same format as identifiers but entirely in uppercase (e.g. A).
You should verify the correctness of your CFG either by examining it closely or you can verify aspects of
the CFG by writing some simple code for checking whether non-terminals are used in the right-hand side
of rules but not defined on the left-hand side anywhere else in the CFG, whether the terminal symbols are
valid tokens, etc.
(5) y Write down a yacc parser for the following context-free grammar:
e ! e PLUS t
e ! t
t ! t TIMES f
t ! f
f ! LPAREN e RPAREN
f ! ID
The tokens PLUS, TIMES, LPAREN, RPAREN are defined to be +, *, (, ) respectively. And the token ID
is defined to be an identifier as in the Decaf specification. These tokens should be defined using a lexical
analyzer produced using lex.
3
For the input string x + y * ( z ) the output produced by the yacc parser should be the parse tree for
the input string produced in the format shown in the left column below. Note the backslash preceding
each instance of a literal parenthesis to avoid confusion with the parentheses used to denote the tree
structure. Note that you may need to augment the grammar to produce the right output. Do not bother to
indent your tree, just print out your parse tree in a single line of output text. A graphical view, using the
Tcl/Tk program viewtree, is shown in the right column below:
(e (e (t (f (ID x))))
(PLUS +)
(t (t (f (ID y)))
(TIMES *)
(f (LPAREN ()
(e (t (f (ID z))))
(RPAREN )))))
x
ID
f
t
e
+
PLUS
y
ID
f
t
*
TIMES
(
LPAREN
z
ID
f
t
e
)
RPAREN
f
t
e
(6) Grammar Conversion: Consider the following fragment of a Decaf program:
class foo {
int bar
Note that we could continue the above fragment with a field declaration, or a method declaration. This
issue will not be a problem for a LR parser if the CFG for Decaf can be written as an LR grammar.
Convert the following CFG, which represents a small fragment of Decaf syntax, into an LR grammar. As
a result, the LR parsing table for such a grammar will have no shift/reduce or reduce/reduce conflicts.
program ! CLASS ID LCB field decl list method decl list RCB
field decl list ! field decl field decl list
field decl list !
method decl list ! method decl method decl list
method decl list !
field decl ! type ID ASSIGN INTCONSTANT SEMICOLON
method decl ! return type ID LPAREN RPAREN
return type ! type
return type ! VOID
type ! INT
type ! BOOL
4
You can test if your revised grammar is an LR grammar by writing down the grammar as yacc code with
no actions and simply check if you get any shift/reduce conflicts. Note that in this skeleton yacc code, the
token definitions do not need to be linked with a lexical analyzer in order to check for shift/reduce
conflicts.
(7) y Decaf Parse Trees: Write a yacc program that prints out a parse tree for any input Decaf program.
You will need to make sure that the context-free grammar you use is an LR grammar. The parse tree
format should be the same as in Q. (5). Use the lexical analyzer you have already built based on the
Decaf specification.