Short tour of PAL

From PROSE Programming Language - Wiki
Revision as of 12:13, 4 August 2010 by Cambridge (Talk | contribs)

Short tour of PAL in release 0.6.x

In release 0.6.x, only the PROSE Assembly Language (PAL) is available, and only a subset of its instructions is implemented. Be aware that programming at this stage is very low-level. To learn more about the PROSE Programming Language, visit http://prose.sourceforge.net.

It is suggested you begin with the following articles before attempting this tour.

This article gives a brief tour of the following features in the PROSE Assembly Language.

Registers and indices

There are 93 unique instruction codes (opcodes) in PAL, 15 registers, 8 special register keywords and 5 data entry macros. Run prism -l and you'll see a list of them all:

    $ prism -l
    ------------------------------------------------------------
    INSTRUCTIONS

      0x80 noop
      0x81 stack/push
      0x82 stack/pull
      0x83 stack/peek
      0x84 stack/lock
      0x85 stack/unlock
      0x86 stack/flush
      0x89 obj/def
      0x8a obj/clone
      0x8b obj/edit
      0x8c obj/commit
      0x8d obj/del
      0x8e obj/addr
      0x8f obj/pa
      0x90 obj/child
      0x91 obj/dump
      0x92 class/add
      0x93 class/del
      0x94 class/load
      0x95 class/load()
      0x96 class/test
      0x97 attr/add
      0x98 attr/del
      0x99 attr/mod
      0x9a attr/mvadd
      0x9b attr/mvdel
      0x9c attr/mvmod
      0x9d attr/load
      0x9e attr/load()
      0x9f attr/test
      0xa0 attr/mvtest
      0xa1 attr/copy
      0xa2 attr/copy()
      0xa3 attr/direct
      0xa6 op/incr
      0xa7 op/decr
      0xa8 op/add
      0xa9 op/sub
      0xaa op/mult
      0xab op/div
      0xac op/mod
      0xad op/not
      0xae op/and
      0xaf op/or
      0xb0 op/xor
      0xb1 op/shl
      0xb2 op/shr
      0xb3 op/mask
      0xb4 op/swap
      0xbc error/def
      0xbd error/now
      0xbe error/jmp
      0xbf error/clr
      0xc0 func/def
      0xc1 func/call
      0xc2 func/bcall
      0xc3 func/rtn
      0xc4 local/jmp()
      0xc5 local/jsr()
      0xc6 local/jmp
      0xc7 local/jsr
      0xc8 local/rtn
      0xc9 reg/move
      0xca reg/load
      0xcb reg/jmpeq
      0xcc reg/jmpneq
      0xcd reg/jsreq
      0xce reg/jsrneq
      0xcf reg/dump
      0xd0 reg/load()
      0xd1 reg/clr
      0xd2 reg/index
      0xd3 reg/cmp
      0xd4 reg/save()
      0xd5 reg/copy
      0xd6 reg/copy()
      0xd7 reg/conv
      0xd8 reg/lcmp
      0xd9 reg/rcmp
      0xda reg/scan
      0xe8 opa/add
      0xe9 opa/sub
      0xea opa/mult
      0xeb opa/div
      0xec opa/mod
      0xed opa/not
      0xee opa/and
      0xef opa/or
      0xf0 opa/xor
      0xf1 opa/shl
      0xf2 opa/shr
      0xfb debug/source
      0xfc debug/level
    ------------------------------------------------------------
    INDICES (with Accumulator)

      0x60 to 0x63 A, LABEL
      0x64 to 0x67 A, DLABEL
      0x68 to 0x6b A, OBJREF
      0x6c to 0x6f A, TEXT
      0x70 to 0x73 A, RAW
      0x7c ------- A, PTRPOS
      0x7d ------- A, PTRNEG
    ------------------------------------------------------------
    INDICES (without Accumulator)

      0x40 to 0x43 LABEL
      0x44 to 0x47 DLABEL
      0x48 to 0x4b OBJREF
      0x4c to 0x4f TEXT
      0x50 to 0x53 RAW
      0x5c ------- PTRPOS
      0x5d ------- PTRNEG
    ------------------------------------------------------------
    REGISTERS (with Accumulator)

      0x20 A, P0
      0x21 A, P1
      0x22 A, P2
      0x23 A, P3
      0x24 A, P4
      0x25 A, P5
      0x26 A, P6
      0x27 A, P7
      0x28 A, P8
      0x29 A, P9
      0x2a A, P10
      0x2b A, P11
      0x2c A, P12
      0x2d A, P13
      0x37 A, A
      0x38 A, SFLG
      0x39 A, SCMP
      0x3a A, LOCK
      0x3b A, PCTX
      0x3c A, PEEK
      0x3d A, PULL
      0x3e A, PUSH
      0x3f A, NULL
    ------------------------------------------------------------
    REGISTERS (without Accumulator)

      0x00 P0
      0x01 P1
      0x02 P2
      0x03 P3
      0x04 P4
      0x05 P5
      0x06 P6
      0x07 P7
      0x08 P8
      0x09 P9
      0x0a P10
      0x0b P11
      0x0c P12
      0x0d P13
      0x17 A
      0x18 SFLG
      0x19 SCMP
      0x1a LOCK
      0x1b PCTX
      0x1c PEEK
      0x1d PULL
      0x1e PUSH
      0x1f NULL
    ------------------------------------------------------------
    MACROS

      0x20 EQUB
      0x40 EQUW
      0x60 EQUD
      0x80 EQUS
      0xa0 EQUP

Registers are pointers to units of data. They can point to text strings, nodes (objects in the nexus), program code, and a variety of other data types. The type of data is always associated with the register, so that bytecode instructions can validate, to a certain degree, that the arguments they have been given are correct.

Besides registers, other valid arguments that may be passed to PAL instructions are called indices. An index parameter is so called because when it is compiled into bytecode, it is saved to a table, and the argument itself converted to a numeric index. There are 7 types of index, although only 5 of these can be used explicitly within a PAL source file. Indices allow for data such as arbitrary text, code references and object references to be input.
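As a rough illustration of the index idea, here is a minimal Python sketch (Python is used purely for illustration and says nothing about how prism is actually implemented). It assumes that a repeated value reuses its existing table entry, which is what the text table in the prism -v output later in this tour suggests. The names IndexTable and intern are hypothetical.

```python
class IndexTable:
    """Toy model of a bytecode table that maps arguments to numeric indices."""

    def __init__(self):
        self.entries = []      # index -> value
        self.positions = {}    # value -> index

    def intern(self, value):
        """Return the index for value, adding it to the table if it is new."""
        if value not in self.positions:
            self.positions[value] = len(self.entries)
            self.entries.append(value)
        return self.positions[value]


text_table = IndexTable()
# The same three text arguments used by the reg/load example in this section.
indices = [text_table.intern(s) for s in ["A violin concerto", "Brahms", "Brahms"]]
print(indices)             # [0, 1, 1]
print(text_table.entries)  # ['A violin concerto', 'Brahms']
```

In this model the instruction stream only needs to carry the small integers, while the table holds each distinct value once.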

The following code example demonstrates the use of registers and indices. It loads text and code pointers into a number of registers, and then dumps the contents to the screen.

    % This program will only have a ._init section for now
    ._init

    % Load registers P0, P1, P10 and P11
    reg/load     P0, [A violin concerto], P1, [Brahms]
    reg/load     P10, &[._init], P11, [Brahms]

    % Report contents of registers
    reg/dump     P0, P1, P10, P11
    obj/dump     P0, P1, P10, P11

Compile and run the code:

    $ prism test
    $ prose test
    register: P0
    type: PSUNIT_TYPE_PALTEXT
    0x2b82b788504b (len 0x000011)

    register: P1
    type: PSUNIT_TYPE_PALTEXT
    0x2b82b788505e (len 0x000006)

    register: P10
    type: PSUNIT_TYPE_PALCODE
    0x0000

    register: P11
    type: PSUNIT_TYPE_PALTEXT
    0x2b82b788505e (len 0x000006)

    0x2b82b788504b (len 0x000011)
    A violin concerto
    0x2b82b788505e (len 0x000006)
    Brahms
    0x2b82b788505e (len 0x000006)
    Brahms
    prose: ERROR: no main() function found, nothing to do

The error 'no main() function found' means that the PROSE engine ran the _init sections but found no main function from which to begin full program execution. For the purposes of this exercise the error can be ignored: the main function has been omitted for simplicity, as functions are described later in this tour.

If you run prism -v on the binary, you'll see that the text 'A violin concerto' and 'Brahms' were given the index numbers 0 and 1 respectively. Note that even though the text 'Brahms' appeared twice in the source file, it only has one entry in the text table.

    $ prism -v test
    ------------------------------------------------------------
    PROSE HEADER
      File: test.pro
      Size: 112 bytes
      Compiled by: prism
      Compiler version code: 0.6.0
      Compile date: Wed Sep 30 11:51:25 2009
    ------------------------------------------------------------
    INSTRUCTION CODE

      000000 :  ca 00 4c 00 01 4c 01 ca   0a 5d 09 0b 4c 01 cf 00
      000010 :  01 0a 0b 91 00 01 0a 0b

      Size: 24 bytes
    ------------------------------------------------------------
    CODE LABELS

      idx 000000 len 000005 [_init]

      Size: 7 bytes
    ------------------------------------------------------------
    CODE ADDRESSES

      idx 000000 ref 000000

      Size: 1 bytes
    ------------------------------------------------------------
    TEXT DATA

      idx 000000 len 000011 [A violin concerto]
      idx 000001 len 000006 [Brahms]

      Size: 27 bytes
    ------------------------------------------------------------
    DATA LABELS


      Size: 0 bytes
    ------------------------------------------------------------
    DATA XREF TABLE


      Size: 0 bytes
    ------------------------------------------------------------
    DATA SEGMENTS


      Size: 0 bytes
    ------------------------------------------------------------
    END OF FILE
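The 24 bytes of instruction code in this listing can be read back by hand using the codes printed by prism -l. The Python sketch below does this for this one program only. The opcode, register and index codes come from the listing; the operand layout is an assumption inferred from the dump, namely that the 0x4c (TEXT) and 0x5d (PTRNEG) codes are each followed by a single byte, and that a PTRNEG offset counts backwards from the position of the 0x5d byte itself. The function name disassemble is hypothetical.

```python
# Opcode and register codes copied from the `prism -l` listing.
OPCODES = {0xCA: "reg/load", 0xCF: "reg/dump", 0x91: "obj/dump"}
REGISTERS = {0x00: "P0", 0x01: "P1", 0x0A: "P10", 0x0B: "P11"}

# Assumption: each of these index codes is followed by exactly one byte.
TEXT, PTRNEG = 0x4C, 0x5D

CODE = bytes.fromhex(
    "ca 00 4c 00 01 4c 01 ca 0a 5d 09 0b 4c 01 cf 00 "
    "01 0a 0b 91 00 01 0a 0b"
)
TEXT_TABLE = ["A violin concerto", "Brahms"]


def disassemble(code, text_table):
    """Turn the byte string back into one readable line per instruction."""
    lines, pos = [], 0
    while pos < len(code):
        byte = code[pos]
        if byte in OPCODES:
            lines.append([OPCODES[byte]])      # start a new instruction
            pos += 1
        elif byte in REGISTERS:
            lines[-1].append(REGISTERS[byte])
            pos += 1
        elif byte == TEXT:
            lines[-1].append("[" + text_table[code[pos + 1]] + "]")
            pos += 2
        elif byte == PTRNEG:
            # Assumed: relative code pointer, measured back from this byte.
            lines[-1].append("&code@%04x" % (pos - code[pos + 1]))
            pos += 2
        else:
            raise ValueError("unhandled byte 0x%02x at offset %d" % (byte, pos))
    return [parts[0] + " " + ", ".join(parts[1:]) for parts in lines]


listing = disassemble(CODE, TEXT_TABLE)
for line in listing:
    print(line)
```

Under these assumptions the second instruction decodes to a pointer at code address 0000, which agrees with the CODE ADDRESSES entry for _init.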

How to use the stack

The program stack is sized dynamically, growing and shrinking as items are pushed onto it and pulled off it. There is no theoretical limit to the number of items that can be pushed onto the stack, as long as system memory is available.

Data is moved onto and off the stack either by using the specific stack instructions, or by using the special register keywords PUSH, PULL and PEEK as instruction arguments.

The stack may also be locked. This doesn't prevent data from being pushed onto the stack, but it does protect the data already on the stack from being pulled off it. A lock may only be removed using the stack/unlock instruction, when the topmost item on the stack is the lock to be removed.

The following demonstrates stack operations:

    % This program will only have a ._init section for now
    ._init

    % Load registers P0 and P1 and push onto stack
    reg/load     P0, [A violin concerto], P1, [Brahms]
    stack/push   P0, P1

    % Push code pointer onto stack
    reg/load     PUSH, &[._init]

    % Pull items off stack
    reg/load     P10, PULL
    reg/dump     PULL, PULL
    reg/dump     P10

When run, the above code yields the following output:

    register: PULL
    type: PSUNIT_TYPE_PALTEXT
    0x2b8eb7de505c (len 0x000006)

    register: PULL
    type: PSUNIT_TYPE_PALTEXT
    0x2b8eb7de5049 (len 0x000011)

    register: P10
    type: PSUNIT_TYPE_PALCODE
    0x0000

    prose: ERROR: no main() function found, nothing to do
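The pull order seen in this output, and the locking rule described earlier in this section, can be modelled with a short Python sketch. This is only a model of the behaviour as described in the text: the class and method names are hypothetical, and what PROSE actually does when a pull meets a lock (raising an error here) is an assumption.

```python
class LockedStackError(Exception):
    """Raised in this model when an operation is blocked by a stack lock."""


_LOCK = object()  # sentinel standing in for a lock placed on the stack


class Stack:
    def __init__(self):
        self.items = []

    def push(self, *values):
        self.items.extend(values)          # last argument ends up on top

    def pull(self):
        if not self.items:
            raise IndexError("stack is empty")
        if self.items[-1] is _LOCK:
            # Assumed behaviour: items below a lock cannot be pulled.
            raise LockedStackError("items below the lock are protected")
        return self.items.pop()

    def lock(self):
        self.items.append(_LOCK)

    def unlock(self):
        # A lock can only be removed while it is the topmost item.
        if not self.items or self.items[-1] is not _LOCK:
            raise LockedStackError("topmost item is not a lock")
        self.items.pop()


# Same sequence as the PAL example: two text items, then a code pointer.
stack = Stack()
stack.push("A violin concerto", "Brahms")
stack.push("&[._init]")
p10 = stack.pull()
pulled = [stack.pull(), stack.pull()]
print(p10, pulled)

# Locking: pushing still works, but the item under the lock is protected.
stack.push("kept")
stack.lock()
stack.push("temp")
top = stack.pull()
try:
    stack.pull()
    blocked = False
except LockedStackError:
    blocked = True
stack.unlock()
kept = stack.pull()
```

The first part reproduces the last-in, first-out order of the output above: the code pointer comes off first, then 'Brahms', then 'A violin concerto'.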

Local branching

Without code sections or the ability to branch, the program pointer could only move from one instruction to the next during bytecode execution, so instructions would always be executed in the order in which they appear in the source file.

PAL provides several mechanisms for modifying the program pointer. Branching means jumping to a different section of the local file. To jump to a section of code in a different file you need to use functions, which are described a little later in this tour.

To branch to a code label, use either local/jsr, which calls the label as a subroutine that returns to the caller with local/rtn, or local/jmp, from which there is no return. Both are demonstrated below:

    % This program will only have a ._init section for now
    ._init
    local/jsr    &[.routine1]
    local/jmp    &[.routine2]

    % This code section will return to the caller
    .routine1
    obj/dump     [routine1]
    local/rtn

    % This code section will exit
    .routine2
    obj/dump     [routine2]
    local/rtn
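To make the difference between local/jsr and local/jmp concrete, the following Python sketch interprets the program above using a return stack. It is a toy model and not the PROSE engine: the tuple encoding of instructions is hypothetical, and treating a local/rtn with nothing to return to as the end of the section is an assumption based on the comments in the example.

```python
def run(program, entry="_init"):
    """Run a list of (op, arg) pairs; return the text passed to obj/dump."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    output, return_stack = [], []
    pc = labels[entry]
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "local/jsr":
            return_stack.append(pc)    # remember where to resume
            pc = labels[arg]
        elif op == "local/jmp":
            pc = labels[arg]           # no return address is saved
        elif op == "local/rtn":
            if not return_stack:
                break                  # assumed: nothing to return to, so stop
            pc = return_stack.pop()
        elif op == "obj/dump":
            output.append(arg)
        # "label" entries are markers only and fall through
    return output


PROGRAM = [
    ("label", "_init"),
    ("local/jsr", "routine1"),
    ("local/jmp", "routine2"),
    ("label", "routine1"),
    ("obj/dump", "routine1"),
    ("local/rtn", None),
    ("label", "routine2"),
    ("obj/dump", "routine2"),
    ("local/rtn", None),
]

result = run(PROGRAM)
print(result)  # ['routine1', 'routine2']
```

The printed list is the output of this model, not output captured from prose. It shows routine1 returning to the instruction after local/jsr, and routine2 ending the section because local/jmp left no return address behind.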

Conditional branching allows sections of code within the local file to be called only if a certain condition is met. In this version of PROSE, the only type of test that can be performed is on registers, and it is achieved using the reg/jmpeq, reg/jsreq, reg/jmpneq and reg/jsrneq instructions. These instructions operate like their local/jmp and local/jsr cousins, but only if the two arguments are equal (eq) or not equal (neq).

    % This program will only have a ._init section for now
    ._init
    reg/load     P0, [Mozart]
    reg/load     P1, [Beethoven]

    reg/jsreq    &[.equal], P0, [Mozart]
    reg/jmpneq   &[.notequal], P1, [Mozart]

    .equal
    obj/dump     [register comparison equal]
    local/rtn

    .notequal
    obj/dump     [register comparison not equal]
    local/rtn

Register comparison

Other types of comparison may be performed by using specific comparison instructions such as reg/cmp. These affect the value of the SCMP and SFLG registers.

The SCMP register holds the result of the last test, which can be one of:

0 Not equal
1 Equal
2 Less than
4 Greater than

The SFLG register holds the equality status of the last 32 tests in successive bits. It is a 32-bit integer where the lowest bit (bit 0) is set if the last test was equal, the next bit (bit 1) is set if the test before that was equal and so on.

The reg/clr instruction can be used for clearing the SFLG and SCMP registers.

The following code demonstrates using the SCMP and SFLG registers:

    % This program will only have a ._init section for now
    ._init
    reg/load     P0, #59, P1, #72
    reg/cmp      P0, P1
    reg/jsreq    &[.lt], SCMP, #2
    reg/jsreq    &[.gt], SCMP, #4

    reg/clr
    reg/load     P2, P0, P4, P2, P3, P1, P5, P3
    reg/cmp      P0, P2, P1, P3, P0, P4, P1, P5
    reg/jmpeq    &[.equal], SFLG, #15
    obj/dump     [second test - not all comparisons were equal]
    local/rtn

    .lt
    obj/dump     [first test - result: less than]
    local/rtn

    .gt
    obj/dump     [first test - result: greater than]
    local/rtn

    .equal
    obj/dump     [second test - all comparisons equal]
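The way SCMP and SFLG change across the two tests above can be traced with a small Python model. It is a sketch under stated assumptions: the class name CompareFlags is hypothetical, only integer comparisons are modelled (so the SCMP value 0, 'not equal', never arises here), and any effect that the reg/jsreq and reg/jmpeq instructions themselves may have on the flags is ignored.

```python
# SCMP result codes taken from the list in this section.
EQUAL, LESS, GREATER = 1, 2, 4


class CompareFlags:
    def __init__(self):
        self.scmp = 0
        self.sflg = 0

    def clr(self):
        """Model of reg/clr with no arguments: reset both registers."""
        self.scmp = 0
        self.sflg = 0

    def cmp(self, *values):
        """Compare values pairwise: cmp(a, b, c, d) tests (a, b) then (c, d)."""
        if len(values) % 2:
            raise ValueError("cmp needs an even number of arguments")
        for left, right in zip(values[0::2], values[1::2]):
            if left == right:
                self.scmp = EQUAL
            elif left < right:
                self.scmp = LESS
            else:
                self.scmp = GREATER
            # Shift the history up one bit; bit 0 records the newest test.
            equal_bit = 1 if self.scmp == EQUAL else 0
            self.sflg = ((self.sflg << 1) | equal_bit) & 0xFFFFFFFF


flags = CompareFlags()

# First test: 59 against 72.
p0, p1 = 59, 72
flags.cmp(p0, p1)
first_scmp = flags.scmp

# Second test: four comparisons of registers against copies of themselves.
flags.clr()
p2 = p0
p4 = p2
p3 = p1
p5 = p3
flags.cmp(p0, p2, p1, p3, p0, p4, p1, p5)
second_sflg = flags.sflg
print(first_scmp, second_sflg)
```

In this model the first test leaves 2 (less than) in SCMP, and four equal comparisons in a row set the lowest four bits of SFLG, giving 15, which is the value the PAL example tests for.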

Data segments

Arbitrary data can be entered into a PAL source file using data entry macros. This data is stored within sections of the bytecode file called data segments, which can be processed using the reg/load instruction in one of its indirect addressing modes. Indirect addressing is enabled if one or more of the instruction arguments are enclosed in parentheses. Different argument types enable different functionality with some of the PAL instructions when indirect addressing is used. It is a powerful feature that is described at length in the online manual pages.

Assemble the following example:

    ~strings
    EQUS {[first violins], [second violins]}
    EQUS {[violas], [cellos], [bass]}

    ~woodwind
    EQUS {[piccolos], [flutes], [clarinets], [oboes]}

    ~odd_numbers
    EQUB {1, 3, 5, 7, 9, 11, 13, 15, 17, 19}
    EQUB {21, 23, 25, 27, 29, 31, 33, 35, 37, 39}
    EQUD {0x292b2d2f, 0x31333537}

    ~pointers
    EQUP {&[.jump1], &[.jump2], &[.jump3], &[.jump4]}

    ~segments
    EQUP {&[~strings], &[~woodwind]}
    EQUP {&[~odd_numbers], &[~pointers]}

    ._init
    % Process each data segment listed in ~segments
    reg/load     P0, (&[~segments])

    .loop
    reg/load     P1, (P0)
    reg/jmpeq    &[.stop], P1, NULL

    % Now read each item from the selected data segment
    reg/load     P2, (P1)

    .loop2
    reg/load     P3, (P2)
    reg/jmpeq    &[.next], P3, NULL
    obj/dump     P3

    % This might be a code pointer, so try calling it
    % If it isn't a code pointer, we discard the error
    error/jmp    &[.trap]
    local/jsr    P3

    .trap
    error/clr
    error/jmp
    local/jmp    &[.loop2]

    .next
    reg/clr      P2
    local/jmp    &[.loop]

    .stop
    local/rtn

    .jump1
    obj/dump     [jump1 called]
    local/rtn

    .jump2
    obj/dump     [jump2 called]
    local/rtn

    .jump3
    obj/dump     [jump3 called]
    local/rtn

    .jump4
    obj/dump     [jump4 called]
    local/rtn

When executed, the output should look like this:

    0x2b961ac240b0 (len 0x00000d)
    first violins
    0x2b961ac240bf (len 0x00000e)
    second violins
    0x2b961ac240cf (len 0x000006)
    violas
    0x2b961ac240d7 (len 0x000006)
    cellos
    0x2b961ac240df (len 0x000004)
    bass
    0x2b961ac240e5 (len 0x000008)
    piccolos
    0x2b961ac240ef (len 0x000006)
    flutes
    0x2b961ac240f7 (len 0x000009)
    clarinets
    0x2b961ac24102 (len 0x000005)
    oboes
    0x1
    0x3
    0x5
    0x7
    0x9
    0xb
    0xd
    0xf
    0x11
    0x13
    0x15
    0x17
    0x19
    0x1b
    0x1d
    0x1f
    0x21
    0x23
    0x25
    0x27
    0x292b2d2f
    0x31333537
    0x2b961ac24109 (len 0x00000c)
    jump1 called
    0x2b961ac24117 (len 0x00000c)
    jump2 called
    0x2b961ac24125 (len 0x00000c)
    jump3 called
    0x2b961ac24133 (len 0x00000c)
    jump4 called
    prose: ERROR: no main() function found, nothing to do

You can see how the data segments are stored in the bytecode when you run prism -v, as in the following example:

    $ prism -v test
    ------------------------------------------------------------
    PROSE HEADER
      File: test.pro
      Size: 455 bytes
      Compiled by: prism
      Compiler version code: 0.6.0
      Compile date: Fri Sep 11 17:45:39 2009
    ------------------------------------------------------------
    INSTRUCTION CODE

      000000 :  d0 00 44 04 d0 01 00 cb   5c 20 01 1f d0 02 01 d0
      000010 :  03 02 cb 5c 10 03 1f 91   03 be 5c 04 c7 03 bf be
      000020 :  c6 5d 12 d1 02 c6 5d 22   c8 91 4c 09 c8 91 4c 0a
      000030 :  c8 91 4c 0b c8 91 4c 0c   c8

      Size: 57 bytes
    ------------------------------------------------------------
    CODE LABELS

      idx 000000 len 000005 [jump1]
      idx 000001 len 000005 [jump2]
      idx 000002 len 000005 [jump3]
      idx 000003 len 000005 [jump4]
      idx 000004 len 000005 [_init]
      idx 000005 len 000004 [loop]
      idx 000006 len 000004 [stop]
      idx 000007 len 000005 [loop2]
      idx 000008 len 000004 [next]
      idx 000009 len 000004 [trap]

      Size: 66 bytes
    ------------------------------------------------------------
    CODE ADDRESSES

      idx 000000 ref 000029
      idx 000001 ref 00002d
      idx 000002 ref 000031
      idx 000003 ref 000035
      idx 000004 ref 000000
      idx 000005 ref 000004
      idx 000006 ref 000028
      idx 000007 ref 00000f
      idx 000008 ref 000023
      idx 000009 ref 00001e

      Size: 10 bytes
    ------------------------------------------------------------
    TEXT DATA

      idx 000000 len 00000d [first violins]
      idx 000001 len 00000e [second violins]
      idx 000002 len 000006 [violas]
      idx 000003 len 000006 [cellos]
      idx 000004 len 000004 [bass]
      idx 000005 len 000008 [piccolos]
      idx 000006 len 000006 [flutes]
      idx 000007 len 000009 [clarinets]
      idx 000008 len 000005 [oboes]
      idx 000009 len 00000c [jump1 called]
      idx 00000a len 00000c [jump2 called]
      idx 00000b len 00000c [jump3 called]
      idx 00000c len 00000c [jump4 called]

      Size: 145 bytes
    ------------------------------------------------------------
    DATA LABELS

      idx 000000 len 000007 [strings]
      idx 000001 len 000008 [woodwind]
      idx 000002 len 00000b [odd_numbers]
      idx 000003 len 000008 [pointers]
      idx 000004 len 000008 [segments]

      Size: 52 bytes
    ------------------------------------------------------------
    DATA XREF TABLE

      idx 000000 ref 000000
      idx 000001 ref 000001
      idx 000002 ref 000002
      idx 000003 ref 000003
      idx 000004 ref 000004

      Size: 5 bytes
    ------------------------------------------------------------
    DATA SEGMENTS

      idx 000000 len 000007 {
            81 80 81 82 82 83 84                              .......
      }
      idx 000001 len 000005 {
            83 85 86 87 88                                    .....
      }
      idx 000002 len 00001f {
            29 01 03 05 07 09 0b 0d 0f 11 13 29 15 17 19 1b   )..........)....
            1d 1f 21 23 25 27 61 29 2b 2d 2f 31 33 35 37      ..!#%'a)+-/1357
      }
      idx 000003 len 000009 {
            a3 40 00 40 01 40 02 40 03                        .@.@.@.@.
      }
      idx 000004 len 00000a {
            a1 44 00 44 01 a1 44 02 44 03                     .D.D..D.D.
      }

      Size: 67 bytes
    ------------------------------------------------------------
    END OF FILE

Note that the bytecode records which data entry macros were used to create each data segment. This means the data segments return the same data items to reg/load(), and look the same as the source file when disassembled with prism -d.
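The macro records can be picked out of the DATA SEGMENTS dump by hand. The Python sketch below decodes them under a layout inferred from the dump above rather than taken from any specification: each record is assumed to start with a header byte whose top three bits hold the macro code from prism -l and whose low five bits hold the item count minus one, with item widths of 1 byte (EQUB, EQUS), 2 bytes (EQUP) and 4 bytes (EQUD). EQUW does not appear in the dump, so its 2-byte width is a guess. The function name decode_segment is hypothetical.

```python
# Macro codes from `prism -l`; item widths in bytes are inferred from the dump.
MACROS = {
    0x20: ("EQUB", 1),
    0x40: ("EQUW", 2),   # not present in the dump: width is a guess
    0x60: ("EQUD", 4),
    0x80: ("EQUS", 1),   # items look like text indices with the top bit set
    0xA0: ("EQUP", 2),   # items look like an index code plus an index byte
}


def decode_segment(data):
    """Split a data segment into (macro name, [item values]) records."""
    records, pos = [], 0
    while pos < len(data):
        header = data[pos]
        name, width = MACROS[header & 0xE0]   # top three bits: macro code
        count = (header & 0x1F) + 1           # low five bits: count - 1
        pos += 1
        items = []
        for _ in range(count):
            items.append(int.from_bytes(data[pos:pos + width], "big"))
            pos += width
        records.append((name, items))
    return records


# Segment idx 000002 (~odd_numbers) copied from the dump above.
ODD_NUMBERS = bytes.fromhex(
    "29 01 03 05 07 09 0b 0d 0f 11 13 29 15 17 19 1b "
    "1d 1f 21 23 25 27 61 29 2b 2d 2f 31 33 35 37"
)
# Segment idx 000003 (~pointers) copied from the dump above.
POINTERS = bytes.fromhex("a3 40 00 40 01 40 02 40 03")

odd_records = decode_segment(ODD_NUMBERS)
pointer_records = decode_segment(POINTERS)
for name, items in odd_records + pointer_records:
    print(name, [hex(item) for item in items])
```

With these assumptions the ~odd_numbers segment splits into two EQUB records of ten items each and one EQUD record of two items, matching the three macro lines in the source file.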

Functions

Loading multiple programs

Generating and handling errors

Navigating the nexus

Further reading