The assignment is about implementing a parser/evaluator for a simple programming language. The language allows users to write code that computes natural numbers. We shall call it NAT.
The assignment consists of two tasks, worth 50% and 30% respectively. The remaining 20% can be earned for readable and maintainable code. Comments and indentation improve readability. Maintainable source code can easily have new features added to it.
- Language Description
- NAT program consists of several lines of code, each of which defines a single function. Figure 1 contains a few samples of NAT For instance, the function that returns its argument incremented by 4 can be defined using the following syntax.
FUN addfour X { X+4 } ;
addfour will be referred to as the function name, X will be called the parameter name and X+4 is the function body.
A function definition must contain the seven elements specified below. They must occur in a single line, in the same order as below, and have to be separated from each other by exactly one space character.
- keyword FUN
- function name
- parameter name
- left brace
- function body
- right brace
- semicolon
The character F from FUN must be the first character on the line. The final semicolon has to be followed by the end-of-line character.
Additionally, NAT code satisfies the conditions described below.
- Function names must consist of lower-case letters.
- Parameter names must consist of upper-case letters.
- If the function name is main, no parameter is allowed. In this exceptional case, main must be followed by one space and then the left brace.
- There can be no whitespace inside the function body, but remember that the body must be separated from the enclosing braces by one space on each side.
- The function body is an arithmetic expression. It can contain natural numbers, the associated parameter name as well as function calls. The only operations allowed are addition and multiplication. Parentheses are not allowed, except in function calls (see next point).
- Function calls have to refer to functions that have been defined as part of the same program. Function definitions are listed in no particular order. It is possible for a function body to make calls to functions defined later in the program. Functions can also call themselves.
- Any function di?erent from main can be called. Function calls are made by men-tioning the relevant function name along with an argument enclosed in a pair of parentheses, e.g. addfour(3+5*6). No whitespace can occur between the parenthe-ses. The argument must satisfy the same constraints as function bodies.
- Every program must define the main No function can be defined twice.
Tasks
Task One
Implement a parser/lexer that recognises NAT programs.
NAT programs can be executed by running the function body of main. This may result in calls to other functions, which may in turn call further functions and so on. Sometimes this will produce a result: the first two programs from Figure 1 return 14 and 10 respectively. In cases when there are circular dependencies between functions (Example 3 in Figure 1), the program will diverge. During execution we assume that multiplication takes precedence over addition.
Task Two
Extend the parser so that it can determine whether the input program returns a result or not. If the former is the case, the result (a natural number) should be printed out.
- Input/Output Specification
The parser should read the input from System.in. Parsing errors should be reported on System.err and error messages should give one reason why the input cannot be regarded as a NAT program. System.out should be used for output.
- If the input represents a NAT program, two lines must be printed out: one with the word PASS and another containing information about the result, as explained in the next sentence. If the program evaluates to a number, the number should be printed out. Otherwise, the line should read DIVERGENCE.
- If the input is not a NAT program, only a single line with FAIL should be printed out.
Here is the expected output for the examples from Figure 1.
- Example 1
PASS 14
- Example 2
PASS 10
- Example 3
PASS DIVERGENCE
- Non-example
FAIL
- Further Information
Submit a specification file called Assignment.jj which causes javacc to generate a parser Assignment.java. No marks will be awarded to parsers that are named di?erently. Com-pilation should work on DCS machines using the commands:
javacc Assignment.jj
javac *.java
For reading an input file called test.txt, the command
java Assignment < test.txt
should produce the required output. The output of the parser should just appear on screen, it should not be sent to another file.
- Suggestions
Since partial credit will be given for partially working solutions, you may find it helpful to build your solution incrementally. You may discuss with fellow students the general workings of javacc or parsing, but you are not allowed to collaborate on the solution. The University of Warwick takes plagiarism seriously, and penalties will be incurred if any form of plagiarism is detected. Copying, or basing your work on, solutions written by people who have not taken this course is also counted as plagiarism. This includes material that has been downloaded from the internet.
- Final Instructions
SUBMISSION DEADLINE: Noon on Thursday 2nd May 2013.
ONLINE SUBMISSION: Use the BOSS system to submit coursework. It will be set up to receive submissions for this assignment from the end of April. Please ensure that your programs work on DCS machines. There will be some basic automatic testing of submissions, followed by manual testing.
BACKUP: Please keep a copy of everything you submit.
- Example 1
FUN main { 1+addfour(2+addfour(3)) } ; FUN addfour X { X+4 } ;
- Example 2
FUN abcd XYZ { bcd(XYZ) } ; FUN bcd XY { 2*cd(XY) } ; FUN cd X { X+4 } ;
FUN main { abcd(1) } ;
- Example 3
FUN qq YY { 2*pp(YY)+3*qq(YY) } ; FUN pp XX { qq(XX)+3 } ;
FUN main { pp(0)+3 } ;
- Non-example
FAN p2p XxX { q(3) };
This is not a NAT program for a number of reasons: FAN instead FUN, p2p contains a digit (only lower-case letters are acceptable here), XxX contains a lower-case letter (here upper case is expected), the function body contains a call to an undefined function, there are two spaces after ), the main function is missing and there is no space between } and ;.