Matrix arrays

From PROSE Programming Language - Wiki
Revision as of 20:32, 7 June 2016 by Cambridge (Talk | contribs)

Jump to: navigation, search

PAL Tutorial - Matrix arrays

In release 0.10.x, only the PROSE Assembly Language (PAL) is available, and then only a subset of those instructions. So be aware, it's very low-level programming at this time. To learn more about the PROSE Programming Language, visit http://prose.sourceforge.net.

It is suggested you begin with the following articles before attempting this tutorial.

A full list of available tutorials for the PROSE Assembly Language can be found on the tutorials index page.

This tutorial works through an example PAL program that demonstrates some of the capabilities of matrix arrays. Note that this tutorial requires a minimum of version 0.8.1.

Objectives

The objective is to write a PAL program that will populate a two-dimensional matrix array with tabular data from standard input, create a second array which is a reversed version of the first, then display to standard output in tabular form the elements that match between the two arrays, or a dot '.' in those positions where the elements do not match.

Given the following input:

BIWFMMXAPSSKEDKJCDOZ
YEIRYEUPWOCTERQUWUPP
XNELGARBUIQHANDELRBE
ADFTQRSFXVSFUEWFDFES
HBACHOMLUHJPBBOPHUYU
FJKASOLASDHJSRELKASD
POU ZLVHKJPLIAYRIOHN
BNUAYIPEEPCXTHOWTXBO
VWRFEZSCNASFDMARWFCA
FTAKJSLDHDHSFSJSDKAL
EUWCBYSWUPOC WWBUETQ
GHLKMUMJGLFNGKJNPRUI
FLOTJSHAJD DEGASAHJD
FTWXRSADCGASCVCZSFAD
YOWREERWEIUEOTOYTURP
XRECSGBQRSEDVMEHCABE
NUOHPKIUNIIGJIKHTMNH
SSLLEDNAHALDSRAGLEGA
ADLJSKJACCJHFDKJSFEA
LUEHWMVNXESYBUKLTUWB

The top half of the resulting output should resemble a completed word search game containing the names of famous classical composers.

Step 1 - Reading input into a matrix array

The first step is to read from standard input and populate a matrix array. This will make the assumption that the input is a table of characters of fixed width. Each character will have its own row and column position within the resulting array.

We will implement this as a function called getmtx() which we set-up and call as follows:

%
% Simple test of matrix arrays, where tabular data on standard input is
% reversed, then compared, with only identical cells remaining in output
%
._init
func/def	[main], &[.main]
func/def	[getmtx], &[.func_getmtx], [psMatrix]
local/rtn

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

.main
%
% Read stdin into matrix array via getmtx() function
%
func/call	P0, [getmtx]
func/rtn

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

.func_getmtx
func/rtn

We defined getmtx above with the func/def instruction, informing the execution engine that this function is to return array type psMatrix, and is to take no parameters.

A psMatrix object is a single or multi-dimensioned array of variable type. It may be an array of bytes, integers, strings, or indeed any of the PROSE variable types.

A matrix array is usually initialised by setting individual elements to the required values. However, when an array is of type psByte, indicating that it is a byte array, it may also be initialised either partially or entirely via a byte string. It is this latter approach that we will be taking in our <cde>getmtx</code> function.

Copy the following code underneath the .func_getmtx label.

%
% Copy standard input to string
%
reg/load        P10, ![.prose.sys.io]
attr/load       P11, [psStreamIn]

.stdin_loop
attr/copy       P1, P10, P11
reg/jmpeq       &[.stdin_eof], P1, NULL

%
% Strip newline character
%
reg/jmpneq      &[.A], A, #0
reg/scan        A, P1, [\n]

.A
%
% Force all lines to have the same number of rows as the first
%
reg/save        P1, (A)

reg/copy        P0, P0, P1
reg/clr         P1
local/jmp       &[.stdin_loop]

.stdin_eof

We process one line at a time from standard input by copying data from the psStreamIn attribute attached to the .prose.sys.io object and appending to the end of a new byte string we store in register P0 using reg/copy.

We use reg/scan to locate the newline character on the first line only, storing the character position in the Accumulator (A). Then every line is forced to the same length using reg/save. This ensures that we can have a fixed number of columns in the matrix array.

Next we need to calculate the number of rows we'll require in the matrix array. We can work this out by dividing the length of our new byte string (P0) by the number of columns in the Accumulator (A), as follows:

%
% Calculate array dimensions
%       A  - already has correct number of columns
%       P2 - to contain number of rows
%
reg/load        P2, (P0)
op/div          P2, P2, A

Now we have everything we need to create the matrix array:

%
% Convert string to byte array, copy to XVALUE and return
%
mtx/local       P0, [psByte], [mtx], @[P2, A], P0
attr/xcopy      P0, P0, [psMatrix]
func/rtn        P0

We use the mtx/local instruction to create a matrix array as a local variable object. This array will be of type psByte and will be called mtx. The array dimensions are provided in a list sequence @[P2, A]. As the array is psByte it can be initialised via a byte string, which we do with the string we created in P0.

To return a matrix array, we need to use attr/xcopy to take a copy of the psMatrix attribute value, which returns a new encoded value (XVALUE) that can be passed to func/rtn.

In a future release it is expected to become easier and more efficient to return variables and arrays without having to duplicate data. In the current release, func/rtn can only return a data type that can be directly represented in a register. As there is no register type that can represent an entire array, the array must be duplicated using attr/xcopy in order to return it.

One alternative approach would have been to create an empty array in the parent function and pass a pointer to getmtx instead. This would have been more efficient, but wouldn't have served as a useful example for how to return an array. We use pointers later on in this tutorial with some of our other functions, so that you get to see both methods.

Step 2 - Inspect the array

We now take a little time out to inspect the array using a couple of methods. We don't need this to achieve our program objectives, however they are useful techniques to know for debugging.

At the end of the getmtx function, immediately before the attr/xcopy instruction, add:

obj/dump	P0

At this point in the code, register P0 contains a pointer to a node in the nexus - an object of class psMatrix. When assembled and executed, the program will wait for standard input - try typing 1<ENTER> 2<ENTER> 3<ENTER> then press <Ctrl>+D. The output will be as follows:

1
2
3
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psArray
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psContainer
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=top
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psMatrix
.prose.code.default.getmtx._i0#0.var.mtx:pn=[mtx]
.prose.code.default.getmtx._i0#0.var.mtx:psMatrix=[3:0 $ 1:0]
.prose.code.default.getmtx._i0#0.var.mtx:psMatrixType=[psByte]
* prose.error.sys.RegisterViolation: Pointer overwrite without clear register
* Can't return from function: Register P0 contains PSUNIT_TYPE_XVALUE
*    at default.main()                      [mb.pro, addr 0x0044]
*    in prose.code._tid.0

We are going to ignore this error for now, because we haven't finished writing our main function (you could add reg/clr P0 immediately before the func/rtn in the main function temporarily if this error bothers you).

The output of obj/dump above shows the location of the mtx array as .prose.code.default.getmtx.i0#0.var.mtx, which is contained underneath var in the getmtx function's instance container. This is where all variable objects, including arrays, are stored when they are defined with local scope.

The object has the class psMatrix which is a subclass of psArray telling us it is an array object. It has two attributes: psMatrix which contains the array structure itself, and psMatrixType which tells us the type of data stored in the array.

The psMatrix attribute displays in dump output as a list of array dimensions and base dimensions in the format [x1:y1 $ x2:y2], where x1 is the number of elements in the first dimension, y1 is the base index number for the first dimension, x2 is the number of elements in the second dimension, y2 is the base index number for the second dimension, and so on for as many dimensions as exist for the array.

As the array is created in the same location as scalar variables, this tells us that arrays share the same namespace as scalar variables. One could deduce that scalar variables cannot therefore have the same name as an array, but this would be an incorrect deduction. Just to prove this point, let's see what happens to the object if we also attempt to assign a string variable with the same name:

Immediately before the obj/dump instruction we added above, insert:

var/local	P0, [psString], [mtx], [test string]

The output of the following obj/dump now looks like this:

.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psArray
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psContainer
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=top
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psMatrix
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psVariable
.prose.code.default.getmtx._i0#0.var.mtx:objectClass=psString
.prose.code.default.getmtx._i0#0.var.mtx:pn=[mtx]
.prose.code.default.getmtx._i0#0.var.mtx:psMatrix=[3:0 $ 1:0]
.prose.code.default.getmtx._i0#0.var.mtx:psMatrixType=[psByte]
.prose.code.default.getmtx._i0#0.var.mtx:psString=[test string]

The object is now both a psString class as well as a psMatrix class, and possesses an additional attribute psString. Now it may be treated as an array or a byte string. Note that the array structure exposed via the psMatrix attribute is entirely independent of the byte string value exposed via the psString attribute.

This completes our inspection of a matrix array object. Remove the var/local and obj/dump instructions we just added before proceeding to the next step in this tutorial.

Step 3 - Create a reverse array

The next step is to create a new array that is the exact reverse of the first. Before we can do this, let's save the matrix array that getmtx returned, as well as the number of rows and columns in the array, into local variables in the main function for convenience. The following code goes in the main function immediately following our call to getmtx and before the func/rtn:

%
% Store the array as a local variable by creating an empty array
% and then replacing by the array returned by getmtx()
%
mtx/local       P1, [psByte], [mtx]
attr/mod        P1, [psMatrix], P0

%
% Record size of matrix array in two separate local variables
%
mtx/size        @[P10, P11], P1
var/local       P2, [psInteger], [rows], P10
var/local       P3, [psInteger], [cols], P11

The array returned by getmtx was an encoded value (XVALUE) and we stored this in register P0. Now we use attr/mod to modify the psMatrix attribute to contain this data.

We obtain the number of rows and columns from our two-dimensional array using the mtx/size instruction. This can take a list sequence @[P10, P11] that requests that the number of elements in the first and second dimensions are returned as encoded attribute values of type psInteger in registers P10 and P11 respectively. We can then save these values to two new local variables called rows and cols using two var/local instructions.

Next we define the reverse() function that will create a reverse array. This first requires the function definition in the ._init section:

func/def        [reverse], &[.func_reverse], [psMatrix], [psMatrixRef], [px],
							 [psInteger], [rows],
							 [psInteger], [cols]

The function has three parameters: a pointer to a matrix array (psMatrixRef) and the number of rows and columns (psInteger). It returns a new array (psMatrix).

We therefore need to provide these parameters when we call it, which we can do with the following code snippet in the main function immediately after we set-up the rows and cols local variables, as follows:

%
% Copy rows and cols as XVALUEs ready for call to reverse()
%
attr/xcopy      P4, P2, [psInteger]
attr/xcopy      P5, P3, [psInteger]

%
% Create a reversed version of the array
%
func/bcall      P6, [reverse], P1, P4, P5

Note that we are passing rows and cols by value, which is why we need the attr/xcopy instructions to take a copy of their values before we pass this through to func/bcall.

The psMatrixRef type is worth a special mention. It is a subclass of psPointer so behaves in the same way as an ordinary pointer variable, with the additional benefit that it allows for iteration through an array and for pointer arithmetic.

To visualise its capabilities, let's dump out the px variable object when it is passed to the reverse() function:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

.func_reverse
%
% Read args
%
var/addr        P0, [px], P1, [rows], P2, [cols]

obj/dump        P0
func/rtn

If we assemble and execute the program now, and type the input 1<ENTER> 2<ENTER> 3<ENTER> then press <Ctrl>+D as before, the output will be as follows:

1
2
3
.prose.code.default.reverse._i0#0.var.px:objectClass=psPointer
.prose.code.default.reverse._i0#0.var.px:objectClass=psVariable
.prose.code.default.reverse._i0#0.var.px:objectClass=psContainer
.prose.code.default.reverse._i0#0.var.px:objectClass=top
.prose.code.default.reverse._i0#0.var.px:objectClass=psMatrixRef
.prose.code.default.reverse._i0#0.var.px:psMatrixIdx=[#0 $ #0]
.prose.code.default.reverse._i0#0.var.px:psMatrixVal=[1]
.prose.code.default.reverse._i0#0.var.px:pn=[px]
.prose.code.default.reverse._i0#0.var.px:psMatrixRef=[.prose.code.default.main._i0#0.var.mtx]
* prose.error.sys.BadReturn: Return type does not match function definition
* Can't return from function: reverse() expects a return value
*    at default.reverse()                   [mb.pro, addr 0x00e7]
*    at default.main()                      [mb.pro, addr 0x006e]
*    in prose.code._tid.0

We can ignore the error because we have deliberately not returned any value at this time.

The output from obj/dump shows us that the available attributes include psMatrixIdx which allows us to set the pointer to an absolute element position within the array or read its current absolute position, psMatrixVal which exposes the value of the currently selected array element, and psMatrixRef that points to the array object itself. We can perform pointer arithmetic on the psMatrixRef attribute by addition or subtraction, which can be used for changing the relative position of the array pointer.

Remove the obj/dump and func/rtn instructions we just added for test purposes before proceeding.

Now we need to de-reference the pointer briefly in order to copy the array contents into a single byte string representation:

%
% Read array as byte string
%
attr/copy       P3, P0, [psMatrixRef]
obj/addr        PUSH, P3
reg/clr         P3
attr/copy       P3, PULL, [psMatrix]

Once the array is stored as a single byte string, we can use reg/load, op/swap and reg/save to reverse the string 32 bits at a time, storing the result in a new byte string:

%
% Reverse the byte string 4 bytes at a time
%
reg/load        P4, (P3)
reg/copy        P5, []
reg/save        P5, (P4)

.reverse_loop
reg/cmp         P4, #4
reg/jmpeq       &[.reverse_end], SCMP, #2

op/sub          P4, P4, #4
reg/load        P6, (P3, A)
op/swap         P6, P6, #0x4321
reg/save        P5, (P6, P4)
opa/add         #4
local/jmp       &[.reverse_loop]

.reverse_end

This process may overflow 8, 16 or 24 bits which need to be cleaned up to complete the process:

%
% Store last 1, 2 or 3 bytes
%
reg/jmpeq       &[.reverse_rtn], P4, #0
reg/load        P6, (P3, A)
op/swap         P6, P6, #0x4321
op/mask         A, P4
opa/not
reg/load        P7, (P5, #0)
opa/and         P7
op/sub          P8, #4, P4
op/mult         P8, P8, #8
op/shl          P6, P6, P8
opa/or          P6
reg/save        P5, (A, #0)

.reverse_rtn

Finally we use mtx/local to create a new local matrix array called mtxrev with our reversed byte string as initialisation data, use attr/xcopy to take a copy of this array as an encoded attribute value, and return:

%
% Convert string to byte array, copy to XVALUE and return
%
attr/index      P1, P1, [psInteger]
attr/index      P2, P2, [psInteger]
mtx/local       P5, [psByte], [mtxrev], @[P1, P2], P5
attr/xcopy      P5, P5, [psMatrix]
reg/clr         P3
func/rtn        P5

Back in the main() function we need to store the returned XVALUE as a local array. Insert the following code immediately following the call to reverse():

%
% Store the array as a local variable by creating an empty array
% and then replacing by the array returned by reverse()
%
mtx/local       P7, [psByte], [mtx2]
attr/mod        P7, [psMatrix], P6

Step 4 - Compare arrays and mask non-matching elements

The next step involves iterating through both the forward and reverse arrays, comparing element values, and masking out elements that don't match. First we will need a function to be defined in the ._init section called mask() that accepts two parameters, both pointers to matrix arrays, and which returns nothing:

func/def        [mask], &[.func_mask], NULL, [psMatrixRef], [px1],
					     [psMatrixRef], [px2]

Then we call the function from main():

%
% Walk through the first array, comparing with the second,
% and removing cells that do not match
%
func/bcall      NULL, [mask], P1, P7

The function code itself is quite simple to implement, first we prepare our registers:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

.func_mask
%
% Read args
%
var/addr        P0, [px1], P1, [px2]
attr/load       P2, [psMatrixVal]

We're going to increment our array pointers one element at a time, comparing the psMatrixVal attribute as we go, until an OutOfBounds error occurs. This error will be generated when we try to increment the pointer beyond the end of the array. We will catch the error using error/jmp as follows:

%
% We'll loop through the array until we get an OutOfBounds error
%
error/jmp       &[.mask_end], ![.prose.error.sys.OutOfBounds]

Then the main loop can be closed, with no obvious way of completing. In reality, it will complete when op/incr errors:

%
% Loop through arrays px1 and px2 comparing cells
%
.mask_loop
attr/cmp        P0, P2, P1, P2
reg/jmpeq       &[.mask_next], SCMP, #1

%
% Cells don't match, replace cell in px1 with '.'
%
attr/mod        P0, P2, [.]

.mask_next
%
% Continue until OutOfBounds
%
op/incr         P0, P1
local/jmp       &[.mask_loop]

.mask_end
error/clr
func/rtn

We use attr/cmp to compare the values from both arrays. If they don't match, then the element in the first array is replaced by a single dot.

Step 5 - Display the result

The last step is the display the result, once the mask() function has completed altering the primary array. We will define a function called display() in the ._init section for this purpose:

func/def        [display], &[.func_display], NULL, [psMatrixRef], [px],
						   [psInteger], [rows],
						   [psInteger], [cols]

Our function will take an array pointer as well as the number of rows and columns as parameters.

Following the call to mask() in the main() function, we'll make the call to display():

%
% Copy rows and cols as XVALUEs ready for call to display()
%
attr/xcopy      P4, P2, [psInteger]
attr/xcopy      P5, P3, [psInteger]

%
% Display the contents of the array via the display() function
%
func/call       NULL, [display], P1, P4, P5

The function begins by setting up a few local variables and registers:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

.func_display
%
% Read args
%
var/addr        P0, [px], P1, [rows], P2, [cols]

%
% Set-up local reference variables
%
reg/load        P10, ![.prose.sys.io]
attr/load       P11, [psStreamOut], P12, [psMatrixVal], P13, [psInteger]

%
% Set-up local counter variables
%
var/local       P3, P13, [R], #0
var/local       P4, P13, [C], #0

Next we use attr/direct to copy the psMatrixVal attribute from the array pointer directly to the psStreamOut attribute on the output stream:

%
% Scan array cell-by-cell
%
.tab_loop
attr/direct     P10, P11, P0, P12

Increment counters:

%
% Increment column number
%
op/incr         P4
obj/cmp         P4, P2, P13
reg/jmpneq      &[.tab_next], SCMP, #2

%
% Advance matrix pointer
%
op/incr         P0
local/jmp       &[.tab_loop]

At the end of each row, send a newline character and reset column counter:

.tab_next
attr/mod        P10, P11, [\n]

%
% Increment row number
%
op/incr         P3
obj/cmp         P3, P1, P13
reg/jmpneq      &[.tab_end], SCMP, #2

%
% Reset column number
%
attr/mod        P4, P13, #0

%
% Advance matrix pointer
%
op/incr         P0
local/jmp       &[.tab_loop]

.tab_end
func/rtn

We should now have a fully functional program.

Step 6 - Testing the program

Save the example input from the objectives at the top of this tutorial into a file called matrix.txt. Assuming you have assembled the tutorial example using prism and svaed the resulting bytecode in a file called matrix.pro, then you can test the program with the following on the command-line:

prose matrix < matrix.txt

The top half of the resulting output should resemble a completed word search game containing the names of famous classical composers.

Resources from this tutorial

Further reading

See the other tutorials available for the PROSE Assembly Language on the tutorials index page.

PROSE is released with detailed manual pages that describe how PAL operates, and how each instruction is used. These manual pages can be read using the man command, for example man pal_intro or man pal_commands, or from the Reference Manual Pages online.