• BAJA OpCode suggestion

    From Angus Netfoot@VERT to Digital Man on Thu Dec 16 11:56:00 1999
    RE: BAJA OpCode suggestion
    BY: Digital Man to Angus Netfoot on Thu Dec 16 1999 01:00 pm

    I'd like to suggest the introduction of a new BAJA opcode. I don't know what you would call it, but in effect it would allow run-time modification of BAJA code-space in a way similar to the POKE opcode known to BASIC programmers.

    Do you have a suggested syntax? Classic poke functions take an address and value, but how would the programmer know what address to poke? Perhaps an offset from the start of the current module's image? An absolute address would probably be pretty useless.

    I've been thinking about it some more.

    The idea is to allow the programmer to generate code on-the-fly for immediate execution without the overhead of writing a .BIN file to disk and then executing it. I figure that means either a POKE of some sort allowing the code space to be changed, or an EXEC_STR that allows a sequence of BAJA/SBBS instructions in a string to be executed. I'm not sure which one is the most suitable; there are probably good reasons for and against both.

    If we could do this, we could look at BAJA operation codes as low-level building blocks, and construct higher-level operations out of them. These higher-level operations would only need to be known to the BAJA compiler,
    which would translate any high-level concept into the lower-level codes that SBBS can understand and interpret.

    If you are going to POKE stuff into the current code-space you probably need
    to say WHAT to poke, WHERE to poke it, and HOW MUCH of it to poke. You might want to poke the least significant 1, 2 or 4 bytes of an integer (or integer constant?) or maybe the first 18 bytes of a string (string constant?). You would want to poke relative to your own position, or absolute to the code-space (POKER / POKEA maybe? or with a flag?). So maybe something like:

    POKE <location> <value> <size> <flags>

    <flags> might select RELATIVE or ABSOLUTE location, or even something weird like RELATIVE_TO_ABSOLUTE_LABEL so you could POKE at Label+2 or something.
    You might introduce a Pseudo-Label meaning "Here" so you could poke to $+6
    or whatever.

    With <size> defaulted to 4 and <flags> defaulted to maybe RELATIVE then the last two would disappear in a lot of cases and the last one in almost all cases, so the normal usage would be:

    POKE <location> <value>
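
    For illustration only, here is a minimal Python sketch of the semantics described above, simulating the module image as a bytearray. The flag values, the little-endian byte order, and the bounds check are assumptions; nothing here is an actual SBBS opcode.

```python
RELATIVE, ABSOLUTE = 0, 1   # hypothetical flag values


def poke(code, here, location, value, size=4, flags=RELATIVE):
    """Write `value` into the bytearray `code`.

    `here` is the offset of the POKE instruction itself. With RELATIVE the
    target is here + location; with ABSOLUTE it is location.
    An int contributes its least significant `size` bytes (little-endian is
    assumed); a bytes value contributes its first `size` bytes.
    """
    target = here + location if flags == RELATIVE else location
    if isinstance(value, int):
        mask = (1 << (8 * size)) - 1
        data = (value & mask).to_bytes(size, "little")
    else:
        data = bytes(value[:size])
    if target < 0 or target + len(data) > len(code):
        raise IndexError("poke outside the module image")
    code[target:target + len(data)] = data


image = bytearray(16)
# Default size (4) and default flags (RELATIVE): the short form of POKE.
poke(image, here=4, location=2, value=0x11223344)
# Explicit size and flags: the low byte of 0xABCD at absolute offset 0.
poke(image, here=4, location=0, value=0xABCD, size=1, flags=ABSOLUTE)
```

    The bounds check is one possible precaution; whether a real POKE should refuse to write outside the module image is left open in the discussion.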

    As an alternative to the POKE idea, there is EXEC_STR as in something like:

    EXEC_STR <string to execute> <start offset> <flags>

    The <flags> parameter could allow you to select whether to execute in the current scope or its own local scope, perhaps? And default to the current scope. The <start offset> would default to zero mostly, I suppose, but would allow you to jump into the code-string somewhere in the middle.

    This looks like it could be good. EXEC_STR would allow you to construct essentially mini-routines of a few dozen instructions in a string, without worrying about where each part of each instruction was to be poked to. But there are probably lots of worrisome complications. Recursive EXEC_STR operations embedded in a string that you are already EXEC_STR-ing?
    Performing loops and other branch-type operations within the string being EXEC_STR-ed? Needs thought.

    Might as well throw a peek function in there too, eh? :-)

    Couldn't hurt, although no immediate use comes to mind... Are you thinking of anything specific?

    To make it even more useful, pre-allocated data blocks (in the module code space, so they could be peeked/poked) would be nice.

    I'm not sure I get what you are aiming at.

    Probably could use a WORD and DWORD variant of peek/poke too.

    I think that you _do_ need to allow for variable length POKE operations, like poking a single or two-byte op-code or a 4-byte operand CRC32 "address". I rather like the idea of being able to poke arbitrary strings of bytes so that you could construct a sequence of instructions using a single SPRINTF or something (or even read a string from disk), POKE the entire thing in one go, and then CALL it, GOTO it, or fall through into it.

    Hmmm. Some other instructions might be useful to support this, like a way to declare a block of space with a label on it maybe? Oh, this is what you meant earlier by pre-allocated data blocks. Yes. Good plan.

    #Hello World the HARD way
    SPRINTF Some_Str "\x51Hello, World!" # x51 = PRINT
    POKE PokeHere Some_Str 13
    :PokeHere
    DEF_BYTES 13

    sort of thing.
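
    A small Python sketch of the byte counting behind this example. It assumes that PRINT's string argument is stored NUL-terminated (as the "TEST\0" buffer in the EXEC_BUF example elsewhere in this thread suggests); under that assumption the reserved block would need 15 bytes rather than 13. The 4 bytes of "other code" in front of the block are made up.

```python
PRINT = 0x51                      # opcode value taken from the comment in the post

# One opcode byte, the 13 characters of the text, and an assumed NUL terminator.
payload = bytes([PRINT]) + b"Hello, World!" + b"\x00"

# Module image: 4 bytes standing in for earlier code, then the reserved block.
poke_here = 4
image = bytearray(b"\xff" * poke_here) + bytearray(len(payload))

# The POKE itself: copy the whole payload over the reserved block.
image[poke_here:poke_here + len(payload)] = payload

print(len(payload))               # size the DEF_BYTES block would need
```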

    That could really open up possibilities to the enterprising programmer.

    Actually, it's sorta scary! :) Suppose we had the POKE op-code and we agreed on a notational equivalence for array syntax. For example, suppose we agreed that

    ADD MyArray[Dest] YourArray[Source]

    is only a notational shorthand for

    ADD MyArray.32 YourArray.Wednesday

    assuming that Dest = 32 and Source = "Wednesday". Then we could do something like:

    STR SAddr # we need these
    INT IAddr1 IAddr2

    SPRINTF SAddr "MyArray.%ld" Dest # compute CRC\Address of Dest
    STRUPR SAddr
    CRC32 IAddr1 SAddr

    SPRINTF SAddr "YourArray.%s" Source # compute CRC\Address of Source
    STRUPR SAddr
    CRC32 IAddr2 SAddr

    POKE Modify+2 IAddr1 # Modify code-space Dest
    POKE Modify+6 IAddr2 # Modify code-space Source

    :Modify
    ADD Dummy Dummy # This is modified by the POKE

    We have just added two array elements together, computing the elements at run-time and we have used numeric and non-numeric indexes to boot. And the only new instruction introduced is the POKE itself.
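
    The address computation can be sketched in Python as follows. The helper name var_addr is hypothetical, and using zlib.crc32 assumes that the CRC32 opcode uses the same standard CRC-32 as zlib, which this thread does not confirm.

```python
import zlib


def var_addr(name):
    """Return the CRC-32 of the upper-cased name as an unsigned 32-bit value."""
    return zlib.crc32(name.upper().encode("ascii")) & 0xFFFFFFFF


dest, source = 32, "Wednesday"

# Mirrors the SPRINTF / STRUPR / CRC32 sequence for each operand.
iaddr1 = var_addr("MyArray.%d" % dest)
iaddr2 = var_addr("YourArray.%s" % source)

# The two 4-byte operands that the POKEs would write after the ADD opcode
# (little-endian byte order is an assumption).
operands = iaddr1.to_bytes(4, "little") + iaddr2.to_bytes(4, "little")
print(hex(iaddr1), hex(iaddr2))
```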

    Or instead of poke, you could do

    SPRINTF CodeStr "\x7F\x38%ld%ld" IAddr1 IAddr2 # neat!
    EXEC_STR CodeStr
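
    One detail worth checking in a sketch like this: a printf-style "%ld" produces decimal text, not four raw bytes. The Python comparison below uses made-up address values and assumes the operands are meant to be raw little-endian 32-bit values; the two opcode bytes are copied from the post.

```python
import struct

iaddr1, iaddr2 = 0x12345678, 0x9ABCDEF0    # made-up example addresses

# What a printf-style "%ld%ld" yields: the numbers spelled out as ASCII digits.
as_text = b"\x7f\x38" + b"%d%d" % (iaddr1, iaddr2)

# What an interpreter reading raw operands would presumably expect instead:
# two unsigned 32-bit values, little-endian (an assumption).
as_binary = b"\x7f\x38" + struct.pack("<II", iaddr1, iaddr2)

print(len(as_text), len(as_binary))
```

    If raw operands are wanted, the buffer would have to be built byte by byte (for example with "%c" per byte) or with some binary-packing helper.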

    The coding looks awful, but it could be simplified by a few new support operations. Maybe something similar to an SPRINTF that combined its
    arguments in some standard way, did a strupr() on the entire thing and
    computed the CRC all in one operation. But this is only icing, because
    you could simply change BAJA.EXE to read

    ADD MyArray[Dest] YourArray[Source]

    and generate the necessary long-hand code invisibly, "behind the scenes" as
    it were. Or you could leave the BAJA compiler alone and let programmers develop their own pre-processor.

    To declare an array,

    INT Array[10..16] Yours[RED, GREEN, BLUE] X[1..3][A, B, C]

    could be a notational convenience meaning

    INT Array.10 Array.11 Array.12 Array.13 Array.14 Array.15 Array.16
    INT Yours.RED Yours.GREEN Yours.BLUE
    INT X.1.A X.1.B X.1.C X.2.A X.2.B X.2.C X.3.A X.3.B X.3.C

    And either left to the programmer to hand-translate or handled by some sort of preprocessor or by BAJA.EXE itself.
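
    A rough Python sketch of how a preprocessor might perform that expansion. The function name expand_decl and the parsing rules are assumptions, and it emits a single declaration line rather than one line per array.

```python
import itertools
import re


def expand_decl(line):
    """Expand 'INT A[1..3] B[X, Y]' into one dotted name per element."""
    keyword, rest = line.split(None, 1)
    names = []
    # A variable is a bare name followed by zero or more [...] index groups.
    for base, groups in re.findall(r"(\w+)((?:\[[^\]]*\])*)", rest):
        dims = []
        for spec in re.findall(r"\[([^\]]*)\]", groups):
            m = re.fullmatch(r"\s*(\d+)\s*\.\.\s*(\d+)\s*", spec)
            if m:
                # Numeric range such as 10..16 (inclusive at both ends).
                lo, hi = int(m.group(1)), int(m.group(2))
                dims.append([str(i) for i in range(lo, hi + 1)])
            else:
                # Comma-separated list of symbolic indexes.
                dims.append([s.strip() for s in spec.split(",")])
        # Cartesian product of all dimensions; a plain variable has none.
        for combo in itertools.product(*dims):
            names.append(".".join((base,) + combo))
    return keyword + " " + " ".join(names)


print(expand_decl("INT Array[10..16] Yours[RED, GREEN, BLUE] X[1..3][A, B, C]"))
```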

    You could do multiple indexes (multi-dimensional arrays) by a similar process where the "address" of the element could be computed by:

    SPRINTF SAddr "Array.%ld.%ld.%ld" X Y Z # three dimensions

    And structs/records could be done in some similar fashion.

    I have not worked out the details, but you could probably do a computed CALL
    or GOTO if you poked the destination part of a regular CALL or GOTO.
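
    A toy Python model of that idea. The opcode values, the 2-byte little-endian destination, and the tiny interpreter are all invented for illustration and are not the SBBS instruction encoding.

```python
# Toy opcodes (hypothetical values).
GOTO, PRINT_A, PRINT_B, END = 1, 2, 3, 4


def run(image):
    """Interpret the toy image and return the list of printed letters."""
    out, pc = [], 0
    while image[pc] != END:
        op = image[pc]
        if op == GOTO:
            # Destination is the 2-byte value stored right after the opcode.
            pc = int.from_bytes(image[pc + 1:pc + 3], "little")
        elif op == PRINT_A:
            out.append("A")
            pc += 1
        elif op == PRINT_B:
            out.append("B")
            pc += 1
        else:
            raise ValueError("unknown opcode %d at offset %d" % (op, pc))
    return out


# Offset 0: GOTO <dest>; offset 3: PRINT_A, END; offset 5: PRINT_B, END.
image = bytearray([GOTO, 0, 0, PRINT_A, END, PRINT_B, END])

image[1:3] = (5).to_bytes(2, "little")    # "POKE" the destination: go to offset 5
first = run(image)

image[1:3] = (3).to_bytes(2, "little")    # re-poke: the same GOTO now goes to offset 3
second = run(image)
print(first, second)
```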

    I think there is a lot of potential here with the addition of a POKE or maybe EXEC_STR opcode, and possibly (but not necessarily) a couple other opcodes to simplify generation of CRC32 "addresses" of variables. The nice thing is, once these building-block op-codes are in, people can come up with clever ways to use them _afterwards_, without the SBBS executable having to be changed. Only the BAJA compiler or a pre-processor would need to be updated to handle the
    new ideas, whatever they were, or the programmer could hand-translate at a pinch.

    Enough talk from me. What do you think?

    ---
    Synchronet telnet://talamasca-bbs.com http://www.talamasca-bbs.com
  • From Digital Man@VERT to Angus Netfoot on Fri Dec 17 06:17:24 1999
    RE: BAJA OpCode suggestion
    BY: Angus Netfoot to Digital Man on Thu Dec 16 1999 07:56 pm

    I moved the message to DOVE-Net/Baja Programming where it's more "on topic". :-)

    I've been thinking about it some more.

    The idea is to allow the programmer to generate code on-the-fly for immediate execution without the overhead of writing a .BIN file to disk and then executing it. I figure that means either a POKE of some sort allowing the code space to be changed, or an EXEC_STR that allows a sequence of BAJA/SBBS instructions in a string to be executed. I'm not sure which one is the most suitable; there are probably good reasons for and against both.

    If we could do this, we could look at BAJA operation codes as low-level building blocks, and construct higher-level operations out of them. These higher-level operations would only need to be known to the BAJA compiler, which would translate any high-level concept into the lower-level codes that SBBS can understand and interpret.

    If you are going to POKE stuff into the current code-space you probably need to say WHAT to poke, WHERE to poke it, and HOW MUCH of it to poke. You might want to poke the least significant 1, 2 or 4 bytes of an integer (or integer constant?) or maybe the first 18 bytes of a string (string constant?). You would want to poke relative to your own position, or absolute to the code-space (POKER / POKEA maybe? or with a flag?). So maybe something like:

    POKE <location> <value> <size> <flags>

    <flags> might select RELATIVE or ABSOLUTE location, or even something weird like RELATIVE_TO_ABSOLUTE_LABEL so you could POKE at Label+2 or something. You might introduce a Pseudo-Label meaning "Here" so you could poke to $+6 or whatever.

    With <size> defaulted to 4 and <flags> defaulted to maybe RELATIVE then the last two would disappear in a lot of cases and the last one in almost all cases, so the normal usage would be:

    POKE <location> <value>

    As an alternative to the POKE idea, there is EXEC_STR as in something like:

    EXEC_STR <string to execute> <start offset> <flags>

    The <flags> parameter could allow you to select whether to execute in the current scope or its own local scope, perhaps? And default to the current scope. The <start offset> would default to zero mostly, I suppose, but would allow you to jump into the code-string somewhere in the middle.

    This looks like it could be good. EXEC_STR would allow you to construct essentially mini-routines of a few dozen instructions in a string, without worrying about where each part of each instruction was to be poked to. But there are probably lots of worrisome complications. Recursive EXEC_STR operations embedded in a string that you are already EXEC_STR-ing? Performing loops and other branch-type operations within the string being EXEC_STR-ed? Needs thought.

    The first problem I see with EXEC_STR is the NUL-terminated string issue. Wouldn't be able to execute an IF_TRUE DO_SOMETHING END_IF, for example. So, perhaps an EXEC_BUF with a length argument. sprintf() could still be used to build the buffer however.

    If I do something like this, creating a CMDSHELL.INC (Baja equivalent for cmdshell.h) would improve the readability considerably:

    sprintf str "%c%cTEST\0%c" CS_IF_TRUE CS_PRINT CS_ENDIF
    exec_buf str 8
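
    The NUL-terminator problem and the length argument can be illustrated in Python. The three opcode values below are placeholders (any non-zero bytes would do); the real ones would come from cmdshell.h or the proposed CMDSHELL.INC.

```python
# Placeholder opcode values, NOT the real cmdshell.h constants.
CS_IF_TRUE, CS_PRINT, CS_ENDIF = 0x01, 0x02, 0x03

# Same layout as the sprintf example: IF_TRUE, PRINT, "TEST", NUL, ENDIF.
buf = bytes([CS_IF_TRUE, CS_PRINT]) + b"TEST\x00" + bytes([CS_ENDIF])

# What a NUL-terminated EXEC_STR would see: everything up to the first NUL.
as_c_string = buf.split(b"\x00", 1)[0]

print(len(buf), len(as_c_string))
```

    The full buffer is 8 bytes, matching the length passed to exec_buf, while the NUL-terminated view stops after "TEST" and never reaches the ENDIF byte.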

    Might as well throw a peek function in there too, eh? :-)

    Couldn't hurt, although no immediate use comes to mind... Are you thinking of anything specific?

    Nope. Just thinking if you can poke somewhere, it might be useful to peek at it later (like a data block). Or peek to see if the data hasn't been initialized or something.

    To make it even more useful, pre-allocated data blocks (in the module code space, so they could be peeked/poked) would be nice.

    I'm not sure I get what you are aiming at.

    To poke data (rather than code), you'll need blocks of code space that can be safely written over. There is currently no way to do this other than creating a block of RETURNS (or some other unreachable code) at the end of your module (or some other place safe).

    Probably could use a WORD and DWORD variant of peek/poke too.

    I think that you _do_ need to allow for variable length POKE operations, like poking a single or two-byte op-code or a 4-byte operand CRC32 "address". I rather like the idea of being able to poke arbitrary strings of bytes so that you could construct a sequence of instructions using a single SPRINTF or something (or even read a string from disk), POKE the entire thing in one go, and then CALL it, GOTO it, or fall through into it.

    Yeah, I see the need for a NOP instruction. :-)

    Hmmm. Some other instructions might be useful to support this, like a way to declare a block of space with a label on it maybe? Oh, this is what you meant earlier by pre-allocated data blocks. Yes. Good plan.

    #Hello World the HARD way
    SPRINTF Some_Str "\x51Hello, World!" # x51 = PRINT
    POKE PokeHere Some_Str 13
    :PokeHere
    DEF_BYTES 13

    sort of thing.

    That could really open up possibilities to the enterprising programmer.

    Actually, it's sorta scary! :) Suppose we had the POKE op-code and we agreed on a notational equivalence for array syntax. For example, suppose we agreed that

    ADD MyArray[Dest] YourArray[Source]

    is only a notational shorthand for

    ADD MyArray.32 YourArray.Wednesday

    assuming that Dest = 32 and Source = "Wednesday". Then we could do something like:

    STR SAddr # we need these
    INT IAddr1 IAddr2

    SPRINTF SAddr "MyArray.%ld" Dest # compute CRC\Address of Dest
    STRUPR SAddr
    CRC32 IAddr1 SAddr

    SPRINTF SAddr "YourArray.%s" Source # compute CRC\Address of Source
    STRUPR SAddr
    CRC32 IAddr2 SAddr

    POKE Modify+2 IAddr1 # Modify code-space Dest
    POKE Modify+6 IAddr2 # Modify code-space Source

    :Modify
    ADD Dummy Dummy # This is modified by the POKE

    We have just added two array elements together, computing the elements at run-time and we have used numeric and non-numeric indexes to boot. And the only new instruction introduced is the POKE itself.

    Or instead of poke, you could do

    SPRINTF CodeStr "\x7F\x38%ld%ld" IAddr1 IAddr2 # neat!
    EXEC_STR CodeStr

    The coding looks awful, but it could be simplified by a few new support operations. Maybe something similar to an SPRINTF that combined its arguments in some standard way, did a strupr() on the entire thing and computed the CRC all in one operation. But this is only icing, because
    you could simply change BAJA.EXE to read

    ADD MyArray[Dest] YourArray[Source]

    and generate the necessary long-hand code invisibly, "behind the scenes" as it were. Or you could leave the BAJA compiler alone and let programmers develop their own pre-processor.

    To declare an array,

    INT Array[10..16] Yours[RED, GREEN, BLUE] X[1..3][A, B, C]

    could be a notational convenience meaning

    INT Array.10 Array.11 Array.12 Array.13 Array.14 Array.15 Array.16
    INT Yours.RED Yours.GREEN Yours.BLUE
    INT X.1.A X.1.B X.1.C X.2.A X.2.B X.2.C X.3.A X.3.B X.3.C

    And either left to the programmer to hand-translate or handled by some sort of preprocessor or by BAJA.EXE itself.

    You could do multiple indexes (multi-dimensional arrays) by a similar process where the "address" of the element could be computed by:

    SPRINTF SAddr "Array.%ld.%ld.%ld" X Y Z # three dimensions

    And structs/records could be done in some similar fashion.

    I have not worked out the details, but you could probably do a computed CALL or GOTO if you poked the destination part of a regular CALL or GOTO.

    I think there is a lot of potential here with the addition of a POKE or maybe EXEC_STR opcode, and possibly (but not necessarily) a couple other opcodes to simplify generation of CRC32 "addresses" of variables. The nice thing is, once these building-block op-codes are in, people can come up with clever ways to use them _afterwards_, without the SBBS executable having to be changed. Only the BAJA compiler or a pre-processor would need to be updated to handle the new ideas, whatever they were, or the programmer could hand-translate at a pinch.

    Enough talk from me. What do you think?

    NOP, POKE and PEEK could do a lot. A data buffer is probably not necessary if you could just label a group of NOPs (perhaps a Baja function to simplify the syntax, however), but it should be made clear to programmers that a block of bytes defined in that manner are CODE bytes and if they're to be used as data, proper precautions should be taken to avoid inadvertent execution.
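
    As a rough Python illustration of using a labelled run of NOPs as a data area. The NOP byte value, the helper names, and the little-endian layout are assumptions, not the eventual SBBS implementation.

```python
NOP = 0x90   # placeholder opcode value, not Synchronet's


def peek(image, offset, size=1):
    """Read `size` bytes at `offset` as an unsigned little-endian integer."""
    return int.from_bytes(image[offset:offset + size], "little")


def poke(image, offset, value, size=1):
    """Write the low `size` bytes of `value` at `offset`, little-endian."""
    mask = (1 << (8 * size)) - 1
    image[offset:offset + size] = (value & mask).to_bytes(size, "little")


# A labelled group of 8 NOPs; bytes 2..5 are used as a DWORD data slot.
image = bytearray([NOP] * 8)
data_at = 2

# PEEK can tell whether the slot still holds its original NOP filler.
untouched = peek(image, data_at, 4) == int.from_bytes(bytes([NOP] * 4), "little")

poke(image, data_at, 1999, 4)          # DWORD-sized poke
stored = peek(image, data_at, 4)       # DWORD-sized peek
low_word = peek(image, data_at, 2)     # WORD-sized peek of the same slot
print(untouched, stored, low_word)
```

    Once poked, those four bytes are no longer NOPs, so falling through into the block would execute whatever the data happens to encode; that is the inadvertent-execution risk mentioned above.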

    EXEC_BUF could be pretty cool too, but would require more work in SBBS.EXE. Since you can't "insert" opcodes into the current module image, it would have to create a temporary module image to execute the buffer. While it would be faster and more convenient than creating .BIN files on the fly, it would in reality be a very similar implementation.

    Since NOP/POKE/PEEK have the most flexibility, I'll code them first and leave EXEC_BUF for a later effort.

    Combining SPRINTF, STRUPR, and CRC32 into a single function is probably not very realistic (at least at the PCMS level). A combined STRUPR/CRC32 is a no-brainer and is a good idea (CRC32UPR?). It would leave the original variable case intact too.

    Thanks for the suggestions.

    Rob
    ---
    Synchronet Vertrauen Home of Synchronet telnet://vert.synchro.net
  • From Angus Netfoot@VERT to Digital Man on Mon Jan 3 00:24:00 2000
    RE: BAJA OpCode suggestion
    BY: Digital Man to Angus Netfoot on Fri Dec 17 1999 02:17 pm

    As an alternative to the POKE idea, there is EXEC_STR as in something like:

    EXEC_STR <string to execute> <start offset> <flags>

    The first problem I see with EXEC_STR is the NUL-terminated string issue. Wouldn't be able to execute an IF_TRUE DO_SOMETHING END_IF, for example.
    So, perhaps an EXEC_BUF with a length argument. sprintf() could still be used to build the buffer however.

    :) I never thought of the null-byte problem. EXEC_BUF sounds like an ideal way to circumvent it. There may be other problems that have not come to mind as yet. I have been thinking about POKE for a while, but the EXEC_BUF idea is a relatively new one here.

    Some things will have to be decided. For example, if, within your buffer, you code something like

    GOTO My_Label

    Would that refer to My_Label _within_ the buffer, or to a My_Label outside
    the buffer and within the calling module? Declaring My_Label within a
    buffer itself is probably tricky, so maybe I should have used

    GOTO 45 # offset 45

    instead of a named label. Anyway, you get my point. If you have an END_CMD
    in a buffer that you EXEC_BUF, should there also be a CMD_HOME in that buffer or will it refer to a CMD_HOME in the calling module?

    If I do something like this, creating a CMDSHELL.INC (Baja equivalent for cmdshell.h) would improve the readability considerably:

    sprintf str "%c%cTEST\0%c" CS_IF_TRUE CS_PRINT CS_ENDIF
    exec_buf str 8

    That would certainly be easier than having to look up the hex for each opcode you planned to use. I used to program the Z80 in hex from memory, but that
    was a loonnng time ago!

    NOP, POKE and PEEK could do a lot. A data buffer is probably not necessary if you could just label a group of NOPs (perhaps a Baja function to
    simplify the syntax, however), but it should be made clear to programmers that a block of bytes defined in that manner are CODE bytes and if they're to be used as data, proper precautions should be taken to avoid inadvertent execution.

    Yes, this would allow a lot of extra flexibility with a relative minimum of
    new opcodes for you to implement. BAJA Programmers would then have the
    ability to very forcefully shoot themselves in the foot, and will have to
    take all the usual precautions against doing so. :)

    EXEC_BUF could be pretty cool too, but would require more work in SBBS.EXE. Since you can't "insert" opcodes into the current module image, it would have to create a temporary module image to execute the buffer. While it would be faster and more convenient than creating .BIN files on the fly,
    it would in reality be a very similar implementation.

    I imagined that this was the more difficult approach. If you set up a
    separate module image, can it still resolve the variables in the calling
    module as local? Or would it have to refer to them as GLOBAL INT/STR as per
    a separate .BIN file? I suspect that a module calling EXEC_BUF to manipulate its own (IOW the caller's) variables is likely to be a fairly common theme.

    If you use a separate module image, that answers my questions WRT the use
    of CMD_HOME and END_CMD and similar matters.

    Since NOP/POKE/PEEK have the most flexibility, I'll code them first and leave EXEC_BUF for a later effort.

    I am keen to see it in action! Err... What were you thinking of doing WRT
    the addressing of the PEEKed/POKEd data? At or relative to a label?

    A combined STRUPR/CRC32 is a no-brainer and is a good idea (CRC32UPR?).
    It would leave the original variable case intact too.

    Not essential at all, but might ease things for the BAJA programmer a bit.
    You could call it VAR_ADDR or something like that, since it essentially computes the "address" of a variable, but CRC32UPR is probably better,
    since there is no reason not to use it for other purposes that are
    unrelated to addressing variables.

    ---
    Synchronet telnet://talamasca-bbs.com http://www.talamasca-bbs.com