The MicroMentor Learning Server

The OPUS Interpreter



Mark J. Norton
Tom Nguyen





Introduction

As work on the MicroMentor Learning Server progresses, there is a growing need for generic CGI-type applications, especially for processing forms. Tom Nguyen proposed that we create a multi-purpose CGI application which interprets commands embedded in the input arguments and templates to generate HTML forms and process the results. This document describes the Omega Performance Universal Script language (OPUS) and the built-in functions which come with it.


Script Language Syntax

The OPUS language uses expressions based on prefix parenthetical notation. This means that each expression begins and ends with a delimiter (square brackets) and the first token in the expression indicates a function to be performed. Arguments to the function follow the function identifier and may be sub-expressions.

Script expressions are designated by square brackets, [ and ]. At the top level of a template document, characters are echoed to stdout until a '[' is encountered. All subsequent characters are processed by the OPUS interpreter until the matching closing ']' is encountered.
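
The top-level scan described above can be sketched in Python. This is only an illustration, not the interpreter's actual code: the function name split_template and the segment labels are hypothetical, and quoted strings containing brackets are not handled.

```python
def split_template(text):
    """Split template text into ("text", s) and ("expr", s) segments.

    Bracket depth is tracked so that a nested sub-expression does not
    end the enclosing expression early. Quoted strings are not treated
    specially, which is a simplification made for this sketch.
    """
    segments = []
    buf = []
    depth = 0
    for ch in text:
        if ch == "[":
            if depth == 0 and buf:
                # Literal text collected so far would be echoed as-is.
                segments.append(("text", "".join(buf)))
                buf = []
            depth += 1
            buf.append(ch)
        elif ch == "]" and depth > 0:
            depth -= 1
            buf.append(ch)
            if depth == 0:
                # A complete top-level expression for the interpreter.
                segments.append(("expr", "".join(buf)))
                buf = []
        else:
            buf.append(ch)
    if buf:
        # Trailing literal text, or an expression that was never closed.
        kind = "text" if depth == 0 else "unterminated"
        segments.append((kind, "".join(buf)))
    return segments


parts = split_template("<B>[TOUPPER [EVAL name]]</B>")
```

Here parts holds three segments: the literal text before the expression, the whole nested expression as one unit, and the literal text after it.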

The script language is defined as:

hex            0 - 9 or A - F
char           a - z or A - Z or 0 - 9
spec           ~``!@#^&*()_-+={}|\:;'<,>.?/
sp             space or tab or return
esc            %

identifier     one or more char or underbar (_)

string         one or more char or spec
               esc hex hex

quote-string   "one or more char, spec, or sp"

expr           identifier
               $identifier
               string
               quote-string
               [one or more expr]
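
To illustrate how the expr rule nests, here is a minimal Python sketch that reads one bracketed expression into nested lists. It is an assumption-laden illustration rather than the OPUS parser itself: the names tokenize and parse are hypothetical, escape sequences are not decoded, and error handling for unbalanced brackets is left out.

```python
def tokenize(src):
    """Yield '[', ']', quoted strings (quotes kept) and bare words."""
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch in "[]":
            yield ch
            i += 1
        elif ch == '"':
            # str.index raises ValueError if the quote is never closed.
            j = src.index('"', i + 1)
            yield src[i:j + 1]
            i = j + 1
        else:
            j = i
            while j < n and not src[j].isspace() and src[j] not in '[]"':
                j += 1
            yield src[i:j]
            i = j


def parse(src):
    """Parse one bracketed expression into nested Python lists."""
    tokens = list(tokenize(src))
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "[":
            items = []
            while tokens[pos] != "]":
                items.append(read())
            pos += 1  # consume the closing bracket
            return items
        return tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree


tree = parse('[IF [LT $i 10] "small" "large"]')
```

The first element of each list is the function identifier and the remaining elements are its arguments, so tree is ['IF', ['LT', '$i', '10'], '"small"', '"large"'].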


Pre-defined Functions

A number of pre-defined functions are built into the OPUS interpreter to handle primitive operations. The most important of these allow functions, variables, and constants to be defined. Pre-defined functions are broken down into groups which deal with evaluation, variables, conditionals, flow control, files, user defined functions, name-value databases, strings, numbers, time, lists, structures, error checking, ODBC databases, and miscellaneous.

Names to the right of the function definition refer to regression test templates.

Functions that deal with Evaluation

[EVAL expr] regr_eval
Evaluate the expression, which is usually a variable name, and return the results. This is commonly used with the VAR function (below) to compute a complex string and then use it in multiple places in a template. Note that the expression [EVAL [sub-expr]] is legal, but does nothing more than [sub-expr] alone would accomplish. The $ operator is not supported in this function.

[SIDE expr] regr_eval
Evaluate the expression and throw the results away. This is largely used to group multiple expressions together which are executed for their side-effects, hence the name.

[EXEC str]
Parse, evaluate and return the results of the string expression. This is used when a piece of script code is saved in a variable or database and needs to be executed. The $ operator is not supported in this function.


Functions that deal with Variables

Global variables may be declared and modified using these functions. The following variable types are supported at this time: STR, INT, BOOL, LIST. Variables declared within a user-defined function have local scope to that function.

[VAR id type expr] regr_var
Create a new variable and initialize it to expr. The variable identifier may be composed of letters, numbers, and underbar, and is case sensitive. Variable declarations stay in effect for the evaluation of an entire web template.

[ISVAR id]
Return t if id is a valid OPUS variable, f if not.

[ASSIGN id expr] regr_var
Set the value of a variable to expr. ASSIGN is used to change the value of an existing variable to the results of the expression.





Conditionals and Logical Functions

In order to provide full support for boolean expressions, a set of comparison and logic functions has been provided. These all evaluate to either true (t) or false (f). Expressions may include sub-expressions, so these comparisons and logic statements can be arbitrarily complex.

[EQ expr expr] regr_cond
Compare the results from expressions and return t if equal. Expressions must be of the same type or an error will result.

[NE expr expr] regr_cond
Compare the results from expressions and return t if unequal. Expressions must be of the same type or an error will result.

[LT expr expr] regr_cond
Compare the integers from the expressions and return t if the first is less than the second. Both expressions must evaluate to integers or an error will result.

[GT expr expr] regr_cond
Compare the integers from the expressions and return t if the first is greater than the second. Both expressions must evaluate to integers or an error will result.

[OR expr expr] regr_cond
Perform a logical OR on expression results, thus t is returned if the first expression is true or the second expression is true. Both expressions must evaluate to a boolean (t or f).

[AND expr expr] regr_cond
Perform a logical AND on expression results, thus t is returned only if both the first and second expressions are true. Both expressions must evaluate to a boolean (t or f).

[NOT expr] regr_cond
Invert the results of expression, thus if expression is true, f is returned. Expression must evaluate to a boolean (t or f).


Flow Control Functions

The flow control functions provide a means of controlling execution of script expressions. Both selection (IF and SWITCH) and looping (FOR and WHILE) are supported.

[IF conditional expr1 expr2] regr_if
Return the results of expr1 if the conditional expression evaluates to true, otherwise return the results of expr2.

[FOR initial conditional iterate body] regr_for
Concatenate the results of body while the conditional expression evaluates to true. Perform the iterate expression each time through the loop. The $ operator is not supported in this function. A typical use of this function would be something like [FOR [VAR i INT 0] [LT $i 10] [INC i] [.... body .....]], which iterates from zero to nine.

[WHILE conditional body] regr_while
Concatenate the results of body while the conditional expression is true. The $ operator is not supported in this function.

[SWITCH expr result-1 result-2 result-3 ... result-n] regr_switch
Evaluate expr and return the evaluation of the nth result. The expression must resolve to an integer in the range of 0 to n, which must be contiguous. Holes in the list are not supported.


Functions that deal with Files

These functions allow operations to be performed on text files and directories of files. In general, the contents of a file are treated as a single, continuous string of text. No support is provided for binary files. File names used by these functions should include the full path to the file in question. The only exception to this rule is files that live in the same directory as the OPUS Interpreter.

[LOAD file_name]
Load the contents of file_name and evaluate it. No value is returned.

[FREAD file_name] regr_file
Read the contents of the file named and return it as a string.

[FWRITE file_name append str] regr_file
Write the string to the file named. If the append flag is t, append the string to the existing file, otherwise, the string replaces existing contents. Returns nothing.

[FDELETE file_name] regr_file
Delete the file named. Returns nothing.

[FCOPY from to] regr_file
Copy the contents of the file named from to the file named to. Returns nothing.

[FEXISTS file_name] regr_file
Returns t if the file named exists, f otherwise.

[DIRFILES dir_path_name list_var_name] regr_file
Create a list of the files in the directory named and append to list_var_name. The dir_path_name must be a full path and end in a wild card specification, such as "*.*". Nothing is returned.

[DIRDIR dir_path_name list_var_name] regr_file
Create a list of the sub-directories in the directory named and append to list_var_name. Nothing is returned.

[DIRMAKE dir_path_name] regr_file
Make a sub-directory in the directory named. Nothing is returned.

[DIRDELETE dir_path_name] regr_file
Delete the directory named. Nothing is returned.


Functions that deal with User Defined Functions

In addition to the pre-defined functions built into the OPUS interpreter, user defined functions may be created and manipulated with the functions provided here. Function parameters are defined as variables with scope restricted to that function and are bound on invocation.

[FTN name arg1 arg2 ... argn expr] regr_ufunc
Define a function called name which returns data of the type specified. The function's parameters are specified by arg1, arg2, ... argn. Each of these arguments is a string of the form "name type", where name is the name of the argument and type is its data type. When the function is invoked, expr is evaluated with the arguments bound to the argument names as local variables.

[ISFTN id]
Returns t if the function id is currently defined, f otherwise.

[REMOVEFTN id]
Remove the function id specified from the list of available functions. This allows the id to be re-used.


Functions that deal with Name-Value Databases

A Name-Value Database is a simple data management system based on formatted text files (see additional documentation). Each line in the database contains a name and a value associated with it. These are loaded into memory where they can be accessed on demand.

[GET db_name key_name] regr_nv
Return the contents of the key_name record in the db_name database. The contents of the database are cached into memory.

[PUT db_name key_name expr] regr_nv
Put results of expr into the key_name record in the db_name database.

[GETVAR db_name key_name] regr_nv
Copy the key name into the variable list. This makes a particular record available as a variable to script expressions.

[PUTVAR db_name key_name] regr_nv
Copy the variable named into the database list. Forces the current value back into the cached database image.

[FLUSH db_name] regr_nv
Save the list named out to the database associated with it.

[LOADVARS db_name]
Load the entries of db_name into the global variables list.


Functions that deal with Strings

HTML files are built from text formatted using tags. String functions are commonly used to build this text and format it appropriately. Strings created by OPUS expressions can be pure text, HTML text with tags, JavaScript expressions, even other OPUS script expressions.


[CONCAT expr-1 expr-2 ... expr-n] regr_concat
Concatenate the results of all expressions and return it.

[EXPLODE id str] regr_explode
Break up the string given and save a list of sub-strings in the list variable id. White space is used to determine where to break up text. Note carefully that multiple spaces, returns, or tabs may be lost in the EXPLODE - IMPLODE process.

[IMPLODE id spaces] regr_explode
Create and return a string composed of the elements of the list variable id. If spaces is t, then add one space between entries, no spaces otherwise.
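
The loss of white space in an EXPLODE - IMPLODE round trip can be illustrated with a short Python sketch. The function names below are hypothetical stand-ins that mimic the behavior described above, assuming that any run of spaces, tabs, or returns counts as a single break.

```python
def explode(text):
    """Break text into sub-strings at white space."""
    # With no argument, str.split treats any run of white space as one
    # separator, so the original spacing is not recorded anywhere.
    return text.split()


def implode(items, spaces):
    """Join list elements, with one space between them if spaces is true."""
    return (" " if spaces else "").join(items)


original = "one  two\tthree\nfour"
round_trip = implode(explode(original), True)
```

After the round trip the double space, the tab, and the return have each become a single space, giving "one two three four".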

[TOUPPER str] regr_string
Return a new string where the characters in str are forced to upper case.

[TOLOWER str] regr_string
Return a new string where the characters in str are forced to lower case.

[REMOVE str sub] regr_string
Remove all instances of the string sub from the string str and return it.

[REPLACE str find_sub replace_sub] regr_string
Find all instances of find_sub in str and replace them with replace_sub. Return the modified string.

[STRLEN string]
Returns the length of the string.

[INSTR string substring]
If the substring exists in the string, this function returns the numeric position of the first character of the substring. Otherwise, zero is returned.

[LEFT string n]
Returns the first n characters of the string.

[RIGHT string n]
Returns the last n characters of the string.

[MID string offset n]
Returns the middle n characters of the string based on the offset given.

[SCRIPTPARAM string]
Convert the string passed to a form which can be passed as a script parameter and return it. Spaces, ampersands, commas, quotes, etc. are converted to the appropriate escape sequences.
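
The position conventions of INSTR, LEFT, and RIGHT can be sketched in Python as follows. This sketch takes the zero return value for "not found" to imply that positions count from 1, which the text does not state outright; the function names are hypothetical. MID is omitted because the base of its offset is not specified.

```python
def instr(string, substring):
    """Return the 1-based position of substring, or 0 if it is absent."""
    # str.find returns a 0-based index, or -1 when nothing is found,
    # so adding one gives 1-based positions and 0 for "not found".
    return string.find(substring) + 1


def left(string, n):
    """Return the first n characters of string."""
    return string[:n] if n > 0 else ""


def right(string, n):
    """Return the last n characters of string."""
    # string[-0:] would return the whole string, so n == 0 is special-cased.
    return string[-n:] if n > 0 else ""


position = instr("template", "plate")
```

With these definitions position is 4, while a search for a substring that does not occur returns 0.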


Functions that deal with Numbers

Integers are provided as a numerical data type. These numbers are signed 32-bit numbers. No provision is made for overflow or underflow of mathematical functions.

[INC id] regr_math
Increment the variable by one and return results. The $ operator is not supported in this function.

[DEC id] regr_math
Decrement the variable by one and return results. The $ operator is not supported in this function.

[ADD x y] regr_math
Add x to y and return results.

[SUB x y] regr_math
Subtract y from x and return results.

[MULT x y] regr_math
Multiply x by y and return the results.

[DIV x y] regr_math
Divide x by y and return the results.

[MOD x y] regr_math
Find the modulo (remainder) of x over y and return the results.

[BITAND x y]
Return the bitwise AND of x and y.

[BITOR x y]
Return the bitwise OR of x and y.

[BITNOT x]
Return the bitwise negation of x.

[BITLEFT x y]
Return x shifted left by y bits. Zero is inserted at right.

[BITRIGHT x y]
Return x shifted right by y bits. Zero is inserted at left.

[RAND]
Return a random integer in the range of 0 to 32767 (2^15-1).
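
The zero-fill behavior of BITLEFT and BITRIGHT on signed 32-bit values can be sketched in Python. The helper names are hypothetical, shift counts are assumed to lie between 0 and 31, and discarding bits shifted past bit 31 is an assumption, since the text only says that no provision is made for overflow.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def bitleft(x, y):
    """Shift x left by y bits, inserting zeros at the right."""
    return to_signed32(x << y)


def bitright(x, y):
    """Shift x right by y bits, inserting zeros at the left."""
    # Masking first makes this a logical shift: the sign bit is not copied.
    return to_signed32((x & MASK32) >> y)


shifted = bitright(-8, 1)
```

Because zeros are inserted at the left, shifting a negative number right by at least one bit yields a positive result: shifted is 2147483644 rather than -4.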


Functions that deal with Time

[TIME] regr_time
Return a string containing the current time and date. The $ operator is not supported in this function.

[GETDATESTAMP]
Returns a string which is a time/date stamp.

[GETDATEPART id part]
Extracts part of a date from id defined by part. Part can be 'h' for hour, 'm' for minute, 'M' for month, 'D' for day, 'Y' for year, and 'W' for day of the week.

[SETDATEPART id part expr]
Set the date part in id specified by part to the results of the expression provided. The expression must evaluate to an integer value in the range supported by that part, i.e., 1 - 12 for month, etc.

[DATEARITH id part expr]
Add the results of expr to the part of the date identified by part and return it.

[DATEDIF id1 id2 part]
Return the difference between the two dates given, measured in the part specified.


Functions that deal with Lists

Uniform linear, single dimension lists are supported by these functions. Elements of the list must all be the same data type (STR, INT or BOOL).

[FIRST ident] regr_list
Return the first element in ident.

[LAST ident] regr_list
Return the last element in ident.

[NTH ident n] regr_list
Return the nth element in ident.

[APPEND id value] regr_list
Append value as an element of the list id.

[LENGTH ident] regr_list
Return the length of the list ident.

[INSERT id expr n] regr_list
Insert expr in id at offset n, or after the given string.

[DELETE id n] regr_list
Delete the element at offset n.

[FIND id value case-flag] regr_list
Find the value in the list variable named by id and return its offset. The match is case sensitive if case-flag is true, insensitive otherwise.

[SORT id case-flag] regr_list
Sort the entries in the list named (bubble sort is n-squared). If case-flag is 's', the sort is case sensitive; if 'i', the sort is case insensitive; if 'n', a numerical sort is done.


Functions that deal with Structures

Structures are intended to be objects with named value fields. Structures are completely unimplemented at this time.

[GETPROP id prop] unimplemented
Return the value of the property named.

[SETPROP id prop value] unimplemented
Set the value of the property named to the value given.

[FIRSTPROP id] unimplemented
Return the name of the first property in ident.

[LASTPROP id] unimplemented
Return the name of the last property in ident.

[NTHPROP id n] unimplemented
Return the name of the nth property in ident.

[APPENDPROP id prop] unimplemented
Append prop as a property of the structure.

[LENGTH id] unimplemented
Return the number of properties in the structure.

[DELPROP id n] unimplemented
Delete the nth property in id.

[DELPROP id prop] unimplemented
Delete the property named from the variable id.


Functions that deal with Errors and Regression Testing

These functions are defined to aid in automated regression testing. Although a comprehensive set of regression tests exists, they are not set up to be run as a whole.

[RESETERR] unimplemented
Reset the regression error count to zero.

[ERRCT] unimplemented
Return the number of regression test errors.

[ERRDETECT flag] unimplemented
Toggle regression test error detection on or off.




Miscellaneous Functions

[VERSION] unimplemented
Returns the current version number of the OPUS interpreter.

[COM comment-text] unimplemented
This function can be used to insert a comment into a script expression. Nothing is returned from it.

[CGIHEADER type url]
Emit a CGI Header of the given type (HTML, PLAIN, URL). If type is specified as URL, control is transferred to the url specified. The $ operator is not supported in this function.

[GETENV variable]
Get and return the current value of the environment variable passed.

Typed Variables

Variables which contain only strings handle much of what is needed to process HTML templates. To extend this functionality further, typed variables are needed. The following variable types are proposed; of these, all but STRUCT are implemented:

STR       An array of characters, white space, special characters, etc.
INT       A signed 32-bit whole number.
BOOL      Values of 't' or 'f' only.
LIST      A uniform array of string, int, bool, list, or struct.
STRUCT    A set of named fields of string, int, bool, list, or struct.

String variables are saved in a name/value pair as:

S [string value]

Integer variables are saved in a name/value pair as:

I [integer value]

Boolean variables are saved in a name/value pair as:

B [boolean value]

List variables are saved as:

L [Length of List]
elements are named by appending a dash and number to root name.
Elements must be uniformly typed in the list.
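
As a rough illustration of this naming scheme, the Python sketch below flattens a list into name/value pairs. Several details are assumptions not confirmed by the text: element numbering starts at 0, the square brackets in the formats above are placeholders rather than literal characters, and a single space separates the type letter from the value. The function name store_list is hypothetical.

```python
def store_list(table, root, items, type_code="S", first_index=0):
    """Store a uniformly typed list as name/value pairs in table.

    Assumptions for this sketch: numbering starts at first_index (0 by
    default) and each value is written as "<type letter> <value>".
    """
    # The root entry records the list type letter and its length.
    table[root] = "L {}".format(len(items))
    for number, item in enumerate(items, start=first_index):
        # Each element is named by appending a dash and number to the root.
        table["{}-{}".format(root, number)] = "{} {}".format(type_code, item)
    return table


pairs = store_list({}, "colors", ["red", "blue"])
```

Under these assumptions the list "colors" occupies three entries: colors, colors-0, and colors-1.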

Structure variables are saved as:

X [Property count] [List of field names]
properties are accessed by appending a dot and property name to the root name.

(structures are not implemented at this time)

Variable names must start with a letter and may contain any combination of letters, numbers, and the underbar character. While there is no actual limit on the length of a variable name, it is recommended that names be less than 32 characters long.


The OPUS Interpreter

The script language defined above is used to interpret and finalize the formatting of HTML pages based on template files. The CGI application which does this is called:

script1.exe

OPUS is a CGI application conforming to accepted notations and conventions. As such, it takes query arguments via command line arguments, or POST variables from stdin. Query arguments must be passed as:

variable_name,value

and are separated by an ampersand character. POST variables are similarly associated with a variable. All CGI arguments are bound to variables on application start-up and passed to the template processor, where they are available during template processing. The following characters are reserved and may not be used in values of arguments passed:

?    Question mark    Start of CGI arguments
+    Plus sign        CGI argument separator
&    Ampersand        Script engine variable separator
,    Comma            Name / value separator

These characters can be included in value strings using the following escape codes:

?    Question mark    %3F
+    Plus sign        %2B
&    Ampersand        %26
,    Comma            %2C

Other useful escape codes include:

RET    Carriage Return    %0D
NL     New Line           %0A
HT     Tab                %09
SP     Space              %20
%      Percent            %25
"      Double Quote       %22
$      Dollar Sign        %24
#      Hash Mark          %23
[      Open Bracket       %5B
]      Close Bracket      %5D
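
The escape codes and the name,value argument format can be sketched together in Python. This is an illustration only: the function names are hypothetical, the set of characters escaped is simply the set listed in the tables above, and arguments are assumed to be separated by ampersands as described earlier in this section.

```python
# Characters listed in the tables above, all written as % plus two hex digits.
ESCAPED = '?+&,%"$#[]\r\n\t '
HEX_DIGITS = "0123456789ABCDEFabcdef"  # lower case accepted for leniency


def escape_value(value):
    """Replace reserved characters with their escape codes."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in ESCAPED else ch for ch in value
    )


def unescape_value(value):
    """Turn each % followed by two hex digits back into one character."""
    out = []
    i = 0
    while i < len(value):
        if (value[i] == "%" and i + 2 < len(value)
                and value[i + 1] in HEX_DIGITS and value[i + 2] in HEX_DIGITS):
            out.append(chr(int(value[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


def parse_query(query):
    """Split name,value pairs joined by '&' into a dictionary."""
    variables = {}
    for pair in query.split("&"):
        if not pair:
            continue
        # Only the first comma separates the name from the value.
        name, _, value = pair.partition(",")
        variables[name] = unescape_value(value)
    return variables


query = "USER,jsmith&NOTE," + escape_value("a,b & c?")
variables = parse_query(query)
```

Escaping the value first keeps its comma and ampersand from being mistaken for separators, so variables["NOTE"] comes back as the original string "a,b & c?".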


The following arguments must be passed in every invocation of the template engine:

CONFIG      name of the system configuration file to use.
USER        user ID.
TEMPLATE    template file name.

The CONFIG variable specifies the name of the system configuration file, which is required to be in the same directory as the script engine. This file contains directory paths, and other configuration information to define how templates, data, and other files are to be accessed. If a CONFIG variable is not defined, it defaults to "ls_config.txt". Variables defined by CONFIG include:

ls_host         the host name of the server.
ls_cgibin       name of the scripts directory.
ls_root         root path name to this learning server environment.
ls_base         root name.
ls_bin          scripts directory.
ls_user         path to the user profiles directory.
ls_content      path to the content directory.
ls_templates    path to the template directory.
ls_scripteng    name of the script engine.

The USER variable defines the current user. This variable is required when security is in effect. See the Login and Security document for more information. The TEMPLATE variable names the template file to be used. In some cases, the location of this file is well-known to the script engine. If not, the location is further refined by passing the following arguments:

ROOT
STRUCTURE
MODULE
SUBMODULE

The following argument names are reserved:

ACTION


Using Templates

The template processor opens the template file specified and echoes the text in it to stdout. Script expressions are recursively evaluated as they are encountered. This approach to template processing is very powerful and can be used in a variety of ways.

Flat Templates

Basic templates are used to format a particular HTML page, or a general one customized by a few bits of data. Flat templates do not depend on other template files and can be resolved using only the information they contain.

Conditional Templates

Occasionally it is necessary to use one template in one situation and a different template in another. This can be accomplished using conditional loading of templates based on some selection criteria.

Boilerplate Templates

The LOAD function can be used to access sections of HTML which are commonly used across many different pages. Headers are a typical use. Note that this use of template files is similar to cascading style sheets.

Commonly Used Functions

Commonly used functions can be grouped into function files which are loaded as needed. This is similar to the program library modularization technique that many programming languages use to group similar functions together.


Debugging

The following debug flags are available:

DUMP_PARSE     Dump the parse tree.
DUMP_VARS      Dump the variable list.
TRACE_PARSE    Trace the parser.
TRACE_EVAL     Trace the evaluator.

The following are not implemented at this time:

DUMP_DB         Dump all database lists.
CHECK_PARENS    Check parentheses for balance.
REGRESSION      Turn on regression test error counting.


Examples

Here is a form template which uses this script engine:

[VAR data_file STR "/ls/cont/db.dat"]
[VAR name_val STR [GET $data_file Name]]

<FORM METHOD=POST ACTION="cgi.exe?assign+[EVAL data_file]">
Enter your name:<BR>
<INPUT TYPE="text" NAME="Name" SIZE=30 VALUE="[EVAL name_val]"><BR>
<INPUT TYPE="submit" VALUE="Submit">
</FORM>

Here is an example of constructing a file name for a background graphic:

<BODY BACKGROUND="
[CONCAT $ROOT $STRUCTURE $MODULE "graphics"
[GET Description_DB Background]]">

Here is a boilerplate template for an HTML header:

<HTML>
<HEAD>
<TITLE>
Template file [LOAD $TEMPLATE_NAME]
</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black LINK=blue VLINK=blue>

Here is an example of calling the script engine with a new template:

[VAR sample_url STR
[CONCAT "http://"
$ls_host
"/scripts/"
$SCRIPTAPP
"?"
"USER," $USER
"&CONFIG," $CONFIG
"&TEMPLATE," "new_template_name.txt"
]
]

Variable and function declarations can be grouped together:

[SIDE
[VAR aaa STR "foo"]
[VAR bbb STR "bar"]
[VAR ccc INT 666]
]