About Encoding And Decoding Base-64 In FORTH

In a previous posting, I wrote that I had written some source-code in the language FORTH, which decodes standard Base-64 into a binary array of data, in output sizes that are multiples of 36 Bytes. For my own purposes, there might be no need to output Base-64, because I can use command-line utilities to prepare Base-64 strings, and then use those only as a means to enter the data, and to embed it in future, hypothetical source code.

But the purposes of other, hypothetical software-developers are not met by this exercise, because those people may need to be able to output Base-64, which means they’d need a matching encoder.

Unfortunately, the language does not lend itself to that easily, if a standard Base-64 radix is being implied, because 6-bit output-numerals would need to be bit-aligned, and trying to align fields of bits in FORTH is difficult.
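
For what it is worth, a single 6-bit field can be isolated with a shift and a mask, once a 24-bit group sits in one cell; the awkward part is assembling those groups across word boundaries. The following is only a minimal sketch, in which the word ‘sextet’ is a hypothetical helper and not part of the listing further down:

```forth
\ Hypothetical helper: extract one 6-bit field from a 24-bit group.
\ n = 0 selects the most significant field, n = 3 the least.
: sextet ( u24 n -- u6 )
  3 swap - 6 *      \ number of bits to shift right: 18, 12, 6 or 0
  rshift 63 and ;   \ keep only the low 6 bits

\ The three bytes of "Man" ($4D $61 $6E) taken as one 24-bit group:
$4D616E 0 sextet .   \ 19
$4D616E 1 sextet .   \ 22
$4D616E 2 sextet .   \ 5
$4D616E 3 sextet .   \ 46
```

The indices 19, 22, 5 and 46 select ‘T’, ‘W’, ‘F’ and ‘u’ in the standard Base-64 alphabet.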

(Edit 07/25/2017 : )

One subject which I have now investigated more completely, is the fact that the numeral-to-text conversion utilities built in to FORTH seem to continue to produce output, even if a Base of 64 has been set. In theory, the FORTH developers could have adopted a custom radix, in order to be able to state that their binary-to-FB64 conversion is computed faster than standard Base-64 could be. But OTOH, the characters output could just become garbage, by the time 24-bit numerals are to be streamed:

 


dirk@Klystron:~$ gforth
Gforth 0.7.2, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
: list-forth-b64 [ base @ decimal ] 64 base ! &255 &0 do i . space loop [ base ! ] ;  ok
list-forth-b64 0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  10  11  12  13  14  15  16  17  18  19  1A  1B  1C  1D  1E  1F  1G  1H  1I  1J  1K  1L  1M  1N  1O  1P  1Q  1R  1S  1T  1U  1V  1W  1X  1Y  1Z  1[  1\  1]  1^  1_  1`  1a  1b  1c  1d  1e  1f  1g  1h  1i  1j  1k  1l  1m  1n  1o  1p  1q  1r  1s  1t  1u  1v  20  21  22  23  24  25  26  27  28  29  2A  2B  2C  2D  2E  2F  2G  2H  2I  2J  2K  2L  2M  2N  2O  2P  2Q  2R  2S  2T  2U  2V  2W  2X  2Y  2Z  2[  2\  2]  2^  2_  2`  2a  2b  2c  2d  2e  2f  2g  2h  2i  2j  2k  2l  2m  2n  2o  2p  2q  2r  2s  2t  2u  2v  30  31  32  33  34  35  36  37  38  39  3A  3B  3C  3D  3E  3F  3G  3H  3I  3J  3K  3L  3M  3N  3O  3P  3Q  3R  3S  3T  3U  3V  3W  3X  3Y  3Z  3[  3\  3]  3^  3_  3`  3a  3b  3c  3d  3e  3f  3g  3h  3i  3j  3k  3l  3m  3n  3o  3p  3q  3r  3s  3t  3u  3v   ok
bye 
dirk@Klystron:~$ 


 

My conclusion is that this pseudo-Base-64 streaming remains usable, even when 24-bit numerals are given.
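
The same digit set can be reached through pictured numeric output, which is how four digits of fixed width can be obtained from a 24-bit value. This is a sketch under the assumption that the FORTH in use extends its digits past ‘Z’ the way the ‘gforth’ session above shows; ‘fb64-digits’ is a hypothetical name:

```forth
\ Hypothetical word: format the low 24 bits of u as four base-64 digits.
: fb64-digits ( u -- c-addr len )
  base @ >r              \ remember the caller's base
  64 base !
  $FFFFFF and 0          \ keep 24 bits, widen to a double for <# ... #>
  <# # # # # #>          \ always four digits, leading zeroes included
  r> base ! ;            \ restore the base

$FFFFFF fb64-digits type   \ vvvv
64 fb64-digits type        \ 0010
```

The string returned lives in the transient number-conversion buffer, so it needs to be copied or printed before the next numeric output takes place.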

This conclusion reverses a negative, tentative conclusion, which I had only given yesterday.

I have by now coded both the encoder and decoder for standard Base-64, which I’ve named ‘b64-stream’ and ‘b64-parse’ respectively, as well as the encoder and decoder for the pseudo-Base-64, which I call ‘fb64-stream’ and ‘fb64-parse’. At this point, Base-64 has been implemented in both directions, once with the standard alphabet and once with the non-standard one. This is what the code ultimately does:

 


dirk@Klystron:~$ cd ~/Programs
dirk@Klystron:~/Programs$ gforth fb64-parse-6.fs
Gforth 0.7.2, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
S" generate1234TERRIBLE.,?-" 4 / b64-stream b64-parse 0 4 * type generate1234TERRIBLE.,?- ok
S" generate1234TERRIBLE.,?-" 4 / fb64-stream fb64-parse 0 4 * type generate1234TERRIBLE.,?- ok
bye 
dirk@Klystron:~/Programs$ 


 

My custom semantics assume that on the stack, the address of a binary array exists, with a numeric value placed on top of it, which tells each encoder how many 32-bit words the array holds. OTOH, each decoder expects as input the standard, full string which the corresponding encoder outputs, and which also exists as two items on the stack each time, where the top numeral states how many characters long the string is, as per standard FORTH.

And below is the source-code (Updated 08/02/2017 : )

 


variable fb64-array-buffsize

: infix-number 0. bl word count >number 2drop drop ;

: b64-array
  create swap dup fb64-array-buffsize ! 3 * 1+ 4 * *  allot  ( buffer-size depth )
  does> rot rot fb64-array-buffsize @ 3 * 1+ * + 4 * + ; immediate ( word buffer )


create b64-ref-array
  'A , 'B , 'C , 'D , 'E , 'F , 'G , 'H , 'I , 'J , 'K , 'L , 'M , 'N , 'O , 'P ,
  'Q , 'R , 'S , 'T , 'U , 'V , 'W , 'X , 'Y , 'Z , 'a , 'b , 'c , 'd , 'e , 'f , 'g ,
  'h , 'i , 'j , 'k , 'l , 'm , 'n , 'o , 'p , 'q , 'r , 's , 't , 'u , 'v , 'w , 'x , 'y , 'z ,
  '0 , '1 , '2 , '3 , '4 , '5 , '6 , '7 , '8 , '9 , '+ , '/ ,

create fb64-ref-array
  '0 , '1 , '2 , '3 , '4 , '5 , '6 , '7 , '8 , '9 , 'A , 'B , 'C , 'D , 'E , 'F ,
  'G , 'H , 'I , 'J , 'K , 'L , 'M , 'N , 'O , 'P , 'Q , 'R , 'S , 'T , 'U , 'V , 'W ,
  'X , 'Y , 'Z , 91 , '\ , 93 , '^ , '_ , '` , 'a , 'b , 'c , 'd , 'e , 'f , 'g , 'h , 'i , 'j ,
  'k , 'l , 'm , 'n , 'o , 'p , 'q , 'r , 's , 't , 'u , 'v ,

3 2 b64-array b64-output
3 2 b64-array fb64-output

variable b64-sum
variable fb64-sum
variable b64-out-temp
variable fb64-out-temp

create pad2 5 allot

create b64-pad 240 allot
create fb64-pad 240 allot

: test-ref-array
  64 0 do
    b64-ref-array i cells + c@ emit
  loop ;


: b64-field ( string -- 32-bit-number )
  0 b64-sum !
  dup 0 do
    over i + c@ ( string in-char )
    b64-sum @ 64 * b64-sum !
    64 0 do
      dup ( string in-char in-char )
      b64-ref-array i cells + c@ ( string in-char in-char test-char )
      = if
	b64-sum @ i + b64-sum !
      then
    loop
    drop
  loop
  drop drop

  b64-sum @ $00FF0000 and 16 rshift
  b64-sum @ $0000FF00 and +
  b64-sum @ $000000FF and 16 lshift +

\  b64-sum @
 ;

: fb64-field ( string -- 32-bit-number )
  0 fb64-sum !
  dup 0 do
    over i + c@ ( string in-char )
    fb64-sum @ 64 * fb64-sum !
    64 0 do
      dup ( string in-char in-char )
      fb64-ref-array i cells + c@ ( string in-char in-char test-char )
      = if
	fb64-sum @ i + fb64-sum !
      then
    loop
    drop
  loop
  drop drop

  fb64-sum @ $00FF0000 and 16 rshift
  fb64-sum @ $0000FF00 and +
  fb64-sum @ $000000FF and 16 lshift +

\  fb64-sum @
 ;

: fb64-out ( 32-bit number -- string-out )
  base @ swap ( prev-base 32-bit-number )
  64 base !
  $00FFFFFF and ( prev-base 24-bit-number )
  dup $00FF0000 and 16 rshift fb64-out-temp !
  dup $0000FF00 and fb64-out-temp @ + fb64-out-temp !
  $000000FF and 16 lshift fb64-out-temp @ + ( prev-base 24-bit-big-endian )
  0 ( prev-base dnumber )
  <# # # # # #> ( prev-base string-out )
  rot base ! ( string-out )
 ;

: b64-out ( 32-bit-number -- string-out )
  $00FFFFFF and ( 24-bit-number )
  dup $00FF0000 and 16 rshift b64-out-temp !
  dup $0000FF00 and b64-out-temp @ + b64-out-temp !
  $000000FF and 16 lshift b64-out-temp @ + ( 24-bit-big-endian )

  s" AAAA" pad2 place

  262144 /mod ( rem q )
    cells b64-ref-array + c@ pad2 1+ c!
  4096 /mod ( rem q )
    cells b64-ref-array + c@ pad2 2 + c!
  64 /mod ( rem q )
    cells b64-ref-array + c@ pad2 3 + c!

    cells b64-ref-array + c@ pad2 4 + c!

  pad2 count ( string-out ) ;

: b64-parse ( string -- array size )
  dup 16 / ( string job-size )

  0. bl word count >number 2drop drop ( string job-size buffer-number )
  swap 2swap ( buffer-number job-size string )

2 pick 0 do

    2dup ( size string string )

    drop 4 ( size string string/4 )
    b64-field ( size string number )

    rot rot ( size subv1 string )

    swap 4 + swap ( size subv1 +string )
    2dup ( size subv1 +string +string )
    drop 4 ( size subv1 +string +string/4 )
    b64-field ( size subv1 +string number )

    $100 /mod ( size subv1 +string mod q ) swap 24 lshift

    4 roll ( size +string q mod subv1 ) +

    i 3 * 6 pick b64-output ! ( size +string q )

    rot rot ( size subv2 +string )

    swap 4 + swap ( size subv2 ++string )
    2dup drop 4 ( size subv2 ++string ++string/4 )
    b64-field ( size subv2 ++string number )

    $10000 /mod ( size subv2 +string mod q ) swap 16 lshift

    4 roll ( size ++string q mod-aug subv2 ) +

    i 3 * 1+ 6 pick b64-output ! ( size ++string q )

    rot rot ( size subv3 ++string )

    swap 4 + swap ( size subv3 +++string )
    2dup drop 4 ( size subv3 +++string +++string/4 )

    b64-field ( size subv3 +++string number ) 8 lshift
    3 roll ( size +++string number subv3 ) +

    i 3 * 2 + 5 pick b64-output ! ( size +++string )


    swap 4 + swap ( size ++++string )

  loop ( buffer-number job-size garbage garbage ) drop drop



  swap 0 swap b64-output swap
  3 * ( array word-size ) ;


: fb64-parse ( string -- array size )
  dup 16 / ( string job-size )

  0. bl word count >number 2drop drop ( string job-size buffer-number )
  swap 2swap ( buffer-number job-size string )

2 pick 0 do

    2dup ( size string string )

    drop 4 ( size string string/4 )
    fb64-field ( size string number )

    rot rot ( size subv1 string )

    swap 4 + swap ( size subv1 +string )
    2dup ( size subv1 +string +string )
    drop 4 ( size subv1 +string +string/4 )
    fb64-field ( size subv1 +string number )

    $100 /mod ( size subv1 +string mod q ) swap 24 lshift

    4 roll ( size +string q mod subv1 ) +

    i 3 * 6 pick fb64-output ! ( size +string q )

    rot rot ( size subv2 +string )

    swap 4 + swap ( size subv2 ++string )
    2dup drop 4 ( size subv2 ++string ++string/4 )
    fb64-field ( size subv2 ++string number )

    $10000 /mod ( size subv2 +string mod q ) swap 16 lshift

    4 roll ( size ++string q mod-aug subv2 ) +

    i 3 * 1+ 6 pick fb64-output ! ( size ++string q )

    rot rot ( size subv3 ++string )

    swap 4 + swap ( size subv3 +++string )
    2dup drop 4 ( size subv3 +++string +++string/4 )

    fb64-field ( size subv3 +++string number ) 8 lshift
    3 roll ( size +++string number subv3 ) +

    i 3 * 2 + 5 pick fb64-output ! ( size +++string )


    swap 4 + swap ( size ++++string )

  loop ( buffer-number job-size garbage garbage ) drop drop



  swap 0 swap fb64-output swap
  3 * ( array word-size ) ;


: fb64-stream ( array word-size -- string )
   3 / ( array job-size )

  s" " pad place

  0 do ( array )
    dup i 3 * 4 * + @ $1000000 /mod swap ( array q/...8 mod/.24 )
    fb64-out ( array q/...8 string )
    pad +place ( array q/...8 ) $FF and swap ( subv1/...8 array )

    dup i 3 * 1+ 4 * + @ $10000 /mod swap ( subv1/...8 +array q/..16 mod/..16 ) 8 lshift ( mod/.16. )
    3 roll + ( +array subv3/..16 subv2/.24 )
    fb64-out ( +array subv3/..16 string )
    pad +place ( +array subv3/..16 ) $FFFF and swap ( subv3/..16 +array )

    dup i 3 * 2 + 4 * + @ $100 /mod swap ( subv3/..16 ++array q/.24 mod/...8 ) 16 lshift ( mod/.8.. )
    3 roll + ( ++array subv5/.24 subv4/.24 )
    fb64-out ( ++array subv4/.24 string )
    pad +place ( ++array subv5/.24 )

    fb64-out ( ++array string )
    pad +place ( ++array )

  loop

  drop pad count swap fb64-pad 2 pick cmove fb64-pad swap
 ;


: b64-stream ( array word-size -- string )
  3 / ( array job-size )

  s" " pad place

  0 do ( array )
    dup i 3 * 4 * + @ $1000000 /mod swap ( array q/...8 mod/.24 )
    b64-out ( array q/...8 string )
    pad +place ( array q/...8 ) $FF and swap ( subv1/...8 array )

    dup i 3 * 1+ 4 * + @ $10000 /mod swap ( subv1/...8 +array q/..16 mod/..16 ) 8 lshift ( mod/.16. )
    3 roll + ( +array subv3/..16 subv2/.24 )
    b64-out ( +array subv3/..16 string )
    pad +place ( +array subv3/..16 ) $FFFF and swap ( subv3/..16 +array )

    dup i 3 * 2 + 4 * + @ $100 /mod swap ( subv3/..16 ++array q/.24 mod/...8 ) 16 lshift ( mod/.8.. )
    3 roll + ( ++array subv5/.24 subv4/.24 )
    b64-out ( ++array subv4/.24 string )
    pad +place ( ++array subv5/.24 )

    b64-out ( ++array string )
    pad +place ( ++array )

  loop

  drop pad count swap b64-pad 2 pick cmove b64-pad swap
 ;


 

There was an important caveat to this project, which slowed down my ability to obtain full results, when running on a 64-bit CPU.

It can happen that according to FORTH syntax, we are starting with a 32-bit value, and dividing it by a 24-bit modulus, expecting that our quotient will be a mere 8-bit quotient. Similarly, if we divide a 32-bit value by a 16-bit modulus, we expect to obtain a mere 16-bit quotient.

But, because the CPU is using 64-bit registers, we have unknowingly obtained a 40-bit and a 48-bit quotient, the higher bits of which are filled with garbage. We must therefore explicitly constrain our assumed 8-bit quotient to 8 bits, and our assumed 16-bit quotient to 16 bits, by masking. This bug really annoyed me, before I found it.

(Edit 07/29/2017 : I have examined my code more closely, because a main question which was still unanswered in my mind, was why a similar bug did not occur when I had successfully implemented ‘b64-parse’ and ‘fb64-parse’, without taking special consideration for the possibility that to perform a modulus-division by ‘$100’ in those two subroutines, might result in a quotient with more than 16 bits. Here we started out with a 24-bit number, and correctly obtained a 16-bit quotient.

That number originated as a summation that started with the constant ‘0’ . Further, it should matter not, what the literal value of the 32-bit word was, because as I had found in implementing ‘b64-stream’ and ‘fb64-stream’ , the garbage-bits that ended up in the quotients, cannot have originated within the 32-bit word, that FORTH is officially manipulating.

Because there were no errors in the initial implementation of ‘b64-parse’ or ‘fb64-parse’, I have to conclude that the extra bits entered through the basic Fetch operator, which is written ‘@’, as it behaves in a 64-bit build. Simply manipulating 32-bit words on the stack does not corrupt any higher register-bits, where the top of the stack is presumably optimized to be stored in CPU registers, not RAM locations. I.e., in the implementation of ‘b64-parse’, the sequence ‘ 4 roll ’ brings a 32-bit value named ‘subv2’ to the top of the stack, that has no garbage higher than the 16th bit. But a value obtained with the Fetch operator, followed by the modulus division, apparently does.

Apparently, what is needed in FORTH, in order to obtain a clean 32-bit value in a 64-bit build, is to give the instructions:

 


@ $FFFFFFFF and

 

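Fetch does not make sure that the higher field in the CPU register contains only zeroes, because in a 64-bit ‘gforth’ it is a 64-bit fetch: It reads one whole cell of 8 bytes. When 32-bit words are stored 4 bytes apart, as they are in my arrays, the upper half of the result is therefore the neighboring word. This is not a sign that ‘gforth’ was compiled without consideration for the 64-bit environment; fetching one cell is what ‘@’ is defined to do.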
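
The effect can be reproduced in isolation. The sketch below uses two hypothetical names, ‘fetch32’ and ‘two-words’, and assumes a little-endian machine that tolerates a store to an address which is only 4-byte aligned, as the listing above also does:

```forth
\ Hypothetical helper: fetch a cell, keep only its low 32 bits.
: fetch32 ( addr -- u32 )  @ $FFFFFFFF and ;

create two-words 16 allot     \ scratch area, large enough for both stores
two-words 16 erase

$11111111 two-words !         \ first 32-bit word
$22222222 two-words 4 + !     \ second 32-bit word, 4 bytes further on

two-words @ hex.              \ 64-bit little-endian build: $2222222211111111
two-words fetch32 hex.        \ $11111111
```

On a 32-bit build both lines would print the same value, which is why the masking only becomes necessary after moving to 64 bits.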

OTOH, there exists the standard 2-Fetch operator, which is written ‘2@’, which on a 32-bit FORTH assumes that a 32-bit address is on the top of the stack, and which fetches a 64-bit data-word into 2 explicit 32-bit positions on the top of the stack. (On a 64-bit build, it fetches two 64-bit cells.) Later, after ‘2@’ has been given, certain FORTH words and subroutines will treat pairs of stack-positions as double-width numbers. Because the width of both positions is explicit here, I expect to find no surprises.

Now, I could engage in some speculation, which is in fact how 64-bit builds work: If FORTH has been compiled as a 64-bit implementation, each stack-position is a 64-bit position. This coincides well with the fact that in 64-bit FORTH, the word ‘cells’ translates into ‘ 8 * ’ instead of into ‘ 4 * ’. And, when accessing my arrays ‘b64-ref-array’ and ‘fb64-ref-array’, just to use the word ‘cells’ generally continues to work. Each element of those arrays has a width of 64 bits, but I only chose to store a single ASCII character-code in each one.

If this were true, then to use the standard, 32-bit operations for double-width arithmetic should become unnecessary and wasteful, on a 64-bit FORTH. And to translate a single 64-bit position into two standard, 32-bit stack values, would only require a single operation, while to convert back should only require 2 operations. The 32-bit, double-width operations would still be defined for compatibility with 32-bit FORTH.

But then the catch with 64-bit FORTH is, that the standard Fetch operator fetches 64 bits, so that code which packs 32-bit words 4 bytes apart, as mine does, is no longer compatible unless it masks the result, as explained above. )


 

( Added Note about the Source-Code : )

The above example is one of the few which I actually encourage the reader to Copy-and-Paste, if the reader possesses a FORTH-interpreter. But if the reader wishes either to try out or use the code, then there are a few things he must know.

The actual code-box above is set to ‘contenteditable’. This means that the reader can actually position his cursor inside it, in such a way that the cursor stays. Then, he can use <Ctrl>+A to select the entire text, and then <Ctrl>+C to copy it to his clipboard.

Unfortunately this could also mean, that the reader can alter the contents of the box above, just by typing – either intentionally, or more probably, accidentally. If that happens, just press <F5>, or whichever other command refreshes the page on the browser, to refresh my original version of the code.

Also, merely Pasting the above text into a console-session of a FORTH-interpreter, will not work. And the reason is the newline-characters, each of which when Pasted, will prompt for some sort of reaction from the interactive session-window. This will even happen if a newline-character has been Pasted between a ‘:’ and a ‘;’ , resulting in an error message.

In order to include the above source code into a FORTH-interpreter-session, it would be necessary to create a new text file (usually, filenames that end in ‘.fs’ are used), and to paste the text into this file using a text-editor, not a word-processor. Then, this file can be included into a FORTH-interpreter-session. Under Linux, we can just give the filename on the command line that starts FORTH. But there are also ways to do that, from within a FORTH session.
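
From within a session, a file can be loaded with the standard word ‘included’, or in ‘gforth’ with ‘include’. This short sketch assumes that the file from the session shown earlier, ‘fb64-parse-6.fs’, lies in the current directory; only one of the two lines is needed:

```forth
include fb64-parse-6.fs          \ gforth, and Forth-2012 systems

s" fb64-parse-6.fs" included     \ the older, standard File-wordset form
```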

(Edit 07/26/2017 : )

Also please note, that ‘gforth’ possesses the string-concatenation syntax used above:

 


    pad +place

 

Some versions of FORTH do not recognize this form, and instead accept:

 


    pad append

 

Please also note that, as I stated in the linked, earlier posting, my code does not implement the ‘=’ symbol which usually belongs to Base-64, and instead assumes that an integer number of work-units is being supplied, that consist of either:

  • 4 * 3 Bytes of binary data, or
  • 16 Base-64 characters.

If the length-indicating numbers imply incomplete work-units, then those will not be processed. What this means in practice, is that any usage scenario will require either that embedded Base-64 strings be prepared using external software, such as the Linux ‘base64’ command, or that code written to output Base-64 be carefully crafted to feed complete work-units to the FORTH subroutines ‘b64-stream’ or ‘fb64-stream’.
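
The arithmetic of work-units can be written down directly. The three words below are hypothetical helpers, not part of the listing, and only restate the 12-byte and 16-character sizes given above:

```forth
\ Hypothetical helpers for the work-unit sizes: 12 bytes <-> 16 characters.
: work-units   ( nbytes -- n )        12 / ;              \ complete units only
: b64-length   ( nbytes -- nchars )   work-units 16 * ;   \ characters produced
: usable-bytes ( nbytes -- nbytes' )  work-units 12 * ;   \ bytes actually encoded

24 b64-length .     \ 32
30 usable-bytes .   \ 24, the trailing 6 bytes would be ignored
180 b64-length .    \ 240
```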

Also, if the reader does wish to use this code, please note the two lines as above:

 


3 2 b64-array b64-output
3 2 b64-array fb64-output

 

The only way I could get this to work for now, was to tell the FORTH interpreter to allocate 2 * 2 buffers of binary output statically, into which Base-64 input-strings are to be parsed, and that each buffer is to hold a maximum of 3 work-units.

The number of work-units per buffer must be equal, according to the present version.

These lines can and should be modified to suit the reader’s needs. It’s important that these buffers be understood, because when the subroutines ‘b64-parse’ and ‘fb64-parse’ finish, all they do is place the addresses of these buffers on the stack. Hence, any changes made to buffer (0) if it is being output to more than once, will have a persistent effect on the data which an earlier invocation ‘sees’, even though these subroutines are only meant to be executed sequentially.

The two parse-subroutines expect to be followed immediately by a numeral that states which buffer is to be used, counting from (0), and using infix notation. Because of the problems in using infix notation with FORTH, this also means that if these parsing-subroutines are compiled into a larger piece of code, then it is that larger piece of code which will expect to receive the infix number when it is executed, not ‘b64-parse’ or ‘fb64-parse’ at the time they are compiled into it.

And, the encoder-functions ‘b64-stream’ and ‘fb64-stream’ have the additional limitation, that for internal purposes, they can only concatenate strings up to a maximum length of 255 characters. This implies 15 work-units at most, which correspond to 180 Bytes of input or to 240 characters of output.
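
The behavior of such an infix number can be tried out separately. In this sketch, ‘infix-number’ is reproduced from the listing above, while ‘twice-next’ is a hypothetical wrapper which shows that the number is read when the wrapper runs, not when it is compiled:

```forth
\ Reproduced from the listing: parse the next token of input as a number.
: infix-number ( "number" -- n )
  0. bl word count >number 2drop drop ;

\ Hypothetical wrapper: no number follows 'infix-number' here ...
: twice-next ( "number" -- n )
  infix-number 2 * ;

\ ... it follows the wrapper instead, at run time.
twice-next 21 .   \ 42
```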

(Edit 08/02/2017 : )

Just as I felt I needed to improve the subroutines ‘b64-parse’ and ‘fb64-parse’, over their use of a globally-declared output-buffer, a homologous improvement has been made to the subroutines ‘b64-stream’ and ‘fb64-stream’, because they use the global, built-in string-concatenation tool named ‘pad’, which is supplied by FORTH as a generic 256-byte array. This array normally holds a ‘c-string’, as opposed to an ‘s-string’. In FORTH, a string does not end in a NULL character to denote its ending, but is rather preceded by a numeric value, that denotes its length, followed in some way by a packed array of 8-bit ASCII-values.

In a FORTH s-string, the numeric value sits on top of the address of the array on the stack, and can become quite high, depending on how large the character-array is.

A c-string exists on the stack as the address of a single array, the first byte of which states the length, followed directly by the first character (if there is any) of the string. Because the maximum value that a single byte can have is (255), this means that a c-string can have 255 characters at most, and there is no point in making the array larger than 256 bytes.

What some people may not know about FORTH, is that when using c-strings and operations on ‘pad’, we are not obliged to use the one, global array ‘pad’ that the language defines. FORTH is not even a weakly-typed language, it is a completely untyped language. We can define a 5-byte array, name it ‘pad2’, and use ‘place’ on it, as I did above. The magic is in the FORTH-word ‘place’.

Well the problem with my ‘b64-stream’ and ‘fb64-stream’ was, that FORTH only has 1 such array predefined, and when we use ‘count’ on it, to convert our c-string into an s-string, ‘count’ performs no allocations. ‘count’ simply takes the first byte as the length, and leaves a pointer on the stack, to the second byte within ‘pad’. ‘count’ then pushes the count, onto the stack as a separate value.
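
A small sketch may make the two representations concrete. The buffer ‘my-pad’ is a hypothetical name; ‘place’ and ‘count’ are the same words used in the listing:

```forth
create my-pad 16 allot          \ 1 count byte + up to 15 characters

s" abc" my-pad place            \ store "abc" as a c-string
my-pad c@ .                     \ 3, the count byte
my-pad count type               \ abc
my-pad count drop my-pad - .    \ 1, the text starts one byte past the count
```

Nothing is copied by ‘count’; the address it returns still points into ‘my-pad’.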

If we had regular functions in FORTH, that routinely perform allocations, this little language would get into trouble, because in addition, it has no garbage-collector. Garbage-collection belongs to the realm of high-level languages, and FORTH is a low-level language. FORTH is supposed to have a small Dictionary, and some small amount of space with static allocations, in which to do all its work.

But what I have done, was to declare two new 240-byte arrays, named ‘b64-pad’ and ‘fb64-pad’. We could even use those the same way I used ‘pad2’ above, but this time, use these as our output-buffers, from which ‘count’ can make an s-string each time. But I suggest this would not be how to use them.

That way, somebody who uses our code could also freely change the contents of the built-in ‘pad’ tool afterward, and not discover that in doing so, the user also destroyed the output of my functions.

This trivial approach would also not be complete, because according to how I just described it, the built-in ‘count’ subroutine leaves an address on the stack, with an odd byte-number – i.e. that is neither 64-bit nor 32-bit aligned. If all that the following code wanted to do, was fetch individual characters from that address, as ‘c@’ also does, then this would be fine. The built-in subroutine ‘type’ reveals no errors.

I have defined the output-buffers to be s-strings, the arrays of which are 32-bit if not 64-bit aligned, so that to format the output correctly, required the use of the built-in ‘cmove’ subroutine. This work was no longer trivial, so I’ve implemented it. The upgrade should allow entire cells of data to be copied from the output-buffer in one operation, so that the regular ‘@’ operator can be used on its addresses.

Trying to perform a Fetch from a non-aligned address can result in a run-time error, on platforms that enforce alignment.

 


pad count swap fb64-pad 2 pick cmove fb64-pad swap

 


 

I suppose then, that I should also warn the reader about what ‘allot’ does. It accepts a numeric parameter on the stack, and reserves exactly that many bytes of data space, directly after whatever was defined last; in my example of ‘pad2’ above, (5). It does not round the request up to a whole number of cells. If the next item needs to start on a cell boundary, then it is alignment which skips the few remaining bytes, either through the word ‘align’, or because ‘create’ aligns the data space by itself before it defines the next name. So in my case, because each cell is 8 bytes large, the 5 bytes of ‘pad2’ end up occupying 1 cell by the time the next array is created. In other cases, a cell might be 4 bytes large, so that 2 cells might be occupied.

If ‘allot’ is fed the parameter (0), it reserves nothing, and it leaves nothing on the stack either. The address which later code receives does not come from ‘allot’, but from the name that ‘create’ defined, which pushes the address at which its data space begins. That address remains valid even if zero bytes were allotted after it; it simply coincides with the start of whatever is defined next.

If a programmer wanted to allocate zero bytes, due to the quirky behavior of his program, then his code would not fail at that point. But anything he stored to that address would overwrite whatever was defined next, without any error message…
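
What ‘allot’ reserves on a given system can be measured rather than assumed, by comparing ‘here’ before and after. This is a minimal sketch; ‘empty-buf’ is a hypothetical name, and the commented values are the ones a standard system should print:

```forth
here 5 allot here swap - .    \ 5, exactly the bytes requested
here 0 allot here swap - .    \ 0, nothing reserved

create empty-buf 0 allot      \ a name with no bytes of its own
empty-buf here = .            \ -1 (true): its address is where the next data will go
```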

(Edit 08/02/2017 : )

One feature which high-level languages such as Java, C and C++ have, is to allow the programmer to initialize variables, where he has declared them.

But because FORTH insists on global declarations of anything which is not on the stack, the initialization does not usually take place in the global space. Yet this does not mean, that variables and arrays can go uninitialized. We must initialize everything as before, but may be prone to forget doing so, as this language has no features to remind us.

In the examples of ‘b64-stream’ and ‘fb64-stream’ above, giving the code before the loop:

 


s" " pad place

 

Initializes the array ‘pad’, by setting its first byte to zero, which also implies an empty string. After that, multiple applications inside the loop, of:

 


pad +place

 

Grow that string. And then, the number of bytes copied to ‘b64-pad’ and ‘fb64-pad’, follow from the size of the string in ‘pad’, as well as that same size being pushed back onto the stack, as an indication of how many of the bytes of those arrays are valid, for the next piece of FORTH-code to use as input.
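
The sequence of initializing and then growing the string can be tried out on its own. This sketch assumes a FORTH which provides ‘place’ and ‘+place’, as ‘gforth’ does; ‘demo-concat’ is a hypothetical word:

```forth
\ Hypothetical word: build "abcdef" in 'pad' the way the encoders do.
: demo-concat ( -- c-addr len )
  s" " pad place        \ initialize: count byte = 0, empty string
  s" abc" pad +place    \ grow the string
  s" def" pad +place    \ grow it again
  pad count ;           \ convert to address and length

demo-concat type        \ abcdef
```

If the first line of the definition is left out, the result depends on whatever ‘pad’ happened to contain before.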

Dirk

 
