Difference between revisions of "BASIC file formats"

Revision as of 13:00, 29 March 2015

BASIC programs written for TI BASIC and Extended BASIC are not stored as plain text in memory. This differs from assembler programs, which are edited as text files and then assembled into a Tagged Object Code file.

Plain text storage is not appropriate for BASIC. If the program were stored as plain text, the BASIC interpreter would have to parse each line when the program runs, identifying the commands and their arguments, before it could execute the line. This is typical of today's scripting languages, but it would be far too slow here; TI BASIC and Extended BASIC are already quite slow compared with other platforms.

Instead, BASIC lines are tokenized. Every command, special character, or character sequence that has a meaning in BASIC is represented by a one-byte code, the token. Some examples:

Command	Token (hex)
NEW	00
SAVE	07
EDIT	09
PRINT	9c
&	b8
"..." (quoted string)	c7
SEG$	d8
VALIDATE	fe

You can find a complete table here.
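As a minimal sketch of how such a token table can be used, the excerpt above can be written as a Python dictionary. Only the values listed in the excerpt are included; the names `TOKENS`, `KEYWORDS` and `token_for` are made up for this illustration and are not part of any TI tool.

```python
# Token values copied from the excerpt above (hex). This is NOT the full table.
TOKENS = {
    "NEW": 0x00,
    "SAVE": 0x07,
    "EDIT": 0x09,
    "PRINT": 0x9C,
    "&": 0xB8,
    "SEG$": 0xD8,
    "VALIDATE": 0xFE,
}

# Marker that introduces a quoted string (see the "..." row of the table).
QUOTED_STRING = 0xC7

# Reverse mapping, for turning a token byte back into its keyword.
KEYWORDS = {code: word for word, code in TOKENS.items()}


def token_for(word: str) -> int:
    """Return the one-byte token for a keyword (hypothetical helper)."""
    return TOKENS[word.upper()]


print(f"{token_for('print'):02x}")  # 9c
print(KEYWORDS[0xD8])               # SEG$
```

A lookup in one direction corresponds to tokenizing a line on entry, the reverse lookup to listing it again.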

So let us take a simple BASIC line like

PRINT "HELLO"

There will not be a string like "PRINT" in memory, because the parser recognizes this word as a command and replaces it with its token. The command is followed by a string enclosed in quotes. Its contents can be anything, so the parser must copy it into memory as is, preceded by the quoted-string token and a length byte.

Finally, the line is converted to the following byte sequence:

09	9c	c7	05	48	45	4c	4c	4f	00
line length	PRINT	"..."	string length	H	E	L	L	O	end
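The byte sequence above can be reproduced with a few lines of Python. This is only a sketch for this one statement form (PRINT followed by a single quoted string); the function name `encode_print_string` is hypothetical, and the token values are the ones from the table.

```python
PRINT_TOKEN = 0x9C          # PRINT, from the token table
QUOTED_STRING_TOKEN = 0xC7  # introduces a quoted string
END_OF_LINE = 0x00          # terminator at the end of every line


def encode_print_string(text: str) -> bytes:
    """Tokenize the statement PRINT "<text>" (hypothetical helper)."""
    payload = text.encode("ascii")
    body = (
        bytes([PRINT_TOKEN, QUOTED_STRING_TOKEN, len(payload)])
        + payload
        + bytes([END_OF_LINE])
    )
    if len(body) > 255:
        raise ValueError("line does not fit a one-byte length")
    # The leading length byte counts everything after itself,
    # including the terminating 00.
    return bytes([len(body)]) + body


line = encode_print_string("HELLO")
print(line.hex(" "))  # 09 9c c7 05 48 45 4c 4c 4f 00
```

Note that the line length (09) covers the nine bytes that follow it, while the string length (05) covers only the characters of the string.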

Sample program

Let's have a look at a real Extended BASIC program. This is output from TIImageTool showing the contents of a PROGRAM file.

000000: 00 3f 37 a7 37 98 37 d7 00 28 37 a9 00 1e 37 ac     .?7.7.7..(7...7.
000010: 00 14 37 b2 00 0a 37 ca 02 8b 00 05 96 52 4f 57     ..7...7......ROW
000020: 00 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04     .....ROW...1....
000030: 54 45 53 54 b4 52 4f 57 00 0e 8c 52 4f 57 be c8     TEST.ROW...ROW..
000040: 01 31 b1 c8 02 32 30 00                             .1...20.

The numbers on the left (xxxxxx:) are the offsets from the beginning of the file. On the right we see the ASCII representation of the bytes, with unprintable characters shown as dots. The offsets and the ASCII column are not part of the file; they are added for better readability.
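If you want to experiment with the sample, the dump can be turned back into raw bytes. The following sketch assumes the layout shown above (hex offset, colon, up to 16 bytes separated by single spaces, then the ASCII column after a wider gap); it is not TIImageTool code, and the names `dump_to_bytes` and `ascii_column` are made up here.

```python
import re

DUMP = """\
000000: 00 3f 37 a7 37 98 37 d7 00 28 37 a9 00 1e 37 ac     .?7.7.7..(7...7.
000010: 00 14 37 b2 00 0a 37 ca 02 8b 00 05 96 52 4f 57     ..7...7......ROW
000020: 00 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04     .....ROW...1....
000030: 54 45 53 54 b4 52 4f 57 00 0e 8c 52 4f 57 be c8     TEST.ROW...ROW..
000040: 01 31 b1 c8 02 32 30 00                             .1...20.
"""

# Offset and colon, then 1..16 groups of "space + two hex digits".
# The match stops at the wider gap before the ASCII column.
HEX_PART = re.compile(r"^[0-9a-fA-F]+:((?: [0-9a-fA-F]{2}){1,16})")


def dump_to_bytes(dump: str) -> bytes:
    """Collect the hex bytes of every dump line, ignoring offsets and ASCII."""
    data = bytearray()
    for line in dump.splitlines():
        match = HEX_PART.match(line)
        if match:
            data += bytes.fromhex(match.group(1))
    return bytes(data)


def ascii_column(row: bytes) -> str:
    """Render bytes as in the dump: printable ASCII as is, the rest as dots."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in row)


data = dump_to_bytes(DUMP)
print(len(data))                # 72
print(ascii_column(data[:16]))  # .?7.7.7..(7...7.
```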

No commands are visible in the ASCII column, but after reading the paragraphs above we should not expect any.

First we cut away the offsets and the ASCII column and add some line breaks so that the file structure becomes visible. We also join some pairs of bytes, because they belong together as 16-bit words.

003f 37a7 3798 37d7 
0028 37a9 
001e 37ac
0014 37b2 
000a 37ca 
02 8b 00
05 96 52 4f 57 00 
17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04 54 45 53 54 b4 52 4f 57 00 
0e 8c 52 4f 57 be c8 01 31 b1 c8 02 32 30 00
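Joining two bytes into a word means taking the first byte as the high half and the second as the low half (00 3f becomes 003f), which is big-endian order. The sketch below regroups the first 24 bytes of the file this way with Python's `struct` module; the split into one row of four words and four rows of two words simply follows the listing above.

```python
import struct

# The first 24 bytes of the sample file.
head = bytes.fromhex(
    "00 3f 37 a7 37 98 37 d7 "
    "00 28 37 a9 00 1e 37 ac "
    "00 14 37 b2 00 0a 37 ca"
)

# ">12H" = twelve unsigned 16-bit words, big-endian (high byte first).
words = struct.unpack(">12H", head)

# One row of four words, then four rows of two words, as in the listing.
rows = [words[0:4]] + [words[i:i + 2] for i in range(4, 12, 2)]

formatted = [" ".join(f"{w:04x}" for w in row) for row in rows]
for text in formatted:
    print(text)
```

The printed rows match the first five rows of the listing.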

The bytes are still the same; only the layout has changed. We can now analyse the contents of the file.

Meaning	Contents
Header	003f 37a7 3798 37d7
Line Number Table	0028 37a9
	001e 37ac
	0014 37b2
	000a 37ca
Program lines	02 8b 00
	05 96 52 4f 57 00
	17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04 54 45 53 54 b4 52 4f 57 00
	0e 8c 52 4f 57 be c8 01 31 b1 c8 02 32 30 00
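The three parts can be separated programmatically. The sketch below is not an authoritative description of the format: this page does not say what the header words mean. It ASSUMES that the second and third header words are the memory addresses of the last and first byte of the line number table, that the table directly follows the header in the file, and that each table entry is a line number followed by the address of the first token of that line (the length byte sits just before it). These assumptions fit the sample, but treat them as unverified here. The name `parse_program` is hypothetical.

```python
import struct

FILE = bytes.fromhex(
    "00 3f 37 a7 37 98 37 d7 00 28 37 a9 00 1e 37 ac "
    "00 14 37 b2 00 0a 37 ca 02 8b 00 05 96 52 4f 57 "
    "00 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04 "
    "54 45 53 54 b4 52 4f 57 00 0e 8c 52 4f 57 be c8 "
    "01 31 b1 c8 02 32 30 00"
)

HEADER_SIZE = 8  # four 16-bit words


def parse_program(data: bytes) -> dict[int, bytes]:
    """Map each line number to its tokenized bytes (hypothetical helper)."""
    # ASSUMPTION: header words 2 and 3 are the addresses of the last and
    # first byte of the line number table. Words 1 and 4 are not interpreted.
    _word1, table_end, table_start, _word4 = struct.unpack(
        ">4H", data[:HEADER_SIZE]
    )
    table_size = table_end - table_start + 1

    def offset_of(address: int) -> int:
        # ASSUMPTION: the byte right after the header lives at table_start.
        return HEADER_SIZE + (address - table_start)

    lines = {}
    for pos in range(HEADER_SIZE, HEADER_SIZE + table_size, 4):
        number, pointer = struct.unpack(">HH", data[pos:pos + 4])
        start = offset_of(pointer)
        length = data[start - 1]  # length byte precedes the first token
        lines[number] = data[start:start + length]
    return lines


program = parse_program(FILE)
for number in sorted(program):
    print(number, program[number].hex(" "))
```

For the sample file this yields the line numbers 10, 20, 30 and 40, each paired with one of the four program line records from the table (without its length byte). The table lists the lines in descending order, so sorting is needed to print them in program order.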

TODO: continue

@@ Line 43: / Line 43: @@
 There will not be a string like "PRINT" in memory, because the parser recognized this word as a command and replaced it with its token. Second, there is a string following the command, which is enclosed in quotes. The contents can be anything, so the parser must copy it into memory as is.
-Eventually, the line is converted to the following byte sequence:
+Finally, the line is converted to the following byte sequence:
 {| class="plainc"
@@ Line 51: / Line 51: @@
 |}
+=== Sample program ===
+Let's have a look at a real Extended BASIC program. This is an output of [[TIImageTool]] which shows the contents of a [[PROGRAM]] file.
+ 000000: 00 3f 37 a7 37 98 37 d7 00 28 37 a9 00 1e 37 ac     .?7.7.7..(7...7.
+ 000010: 00 14 37 b2 00 0a 37 ca 02 8b 00 05 96 52 4f 57     ..7...7......ROW
+ 000020: 00 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04     .....ROW...1....
+ 000030: 54 45 53 54 b4 52 4f 57 00 0e 8c 52 4f 57 be c8     TEST.ROW...ROW..
+ 000040: 01 31 b1 c8 02 32 30 00                             .1...20.
+The numbers on the left (xxxxx:) are the offset from the beginning of the file. At the right side we see the ASCII representation of the bytes, where unprintable characters are shown by a dot. The offsets and the ASCII column are not part of the file but added for better readability.
+There are no commands to be seen, but we should expect nothing like that, after reading the above paragraphs.
+At first we cut away the offsets and the ASCII column, and we add some line breaks so we see the file structure. We join some bytes together as they are parts of words.
+ 003f 37a7 3798 37d7
+ 0028 37a9
+ 001e 37ac
+ 0014 37b2
+ 000a 37ca
+ 02 8b 00
+ 05 96 52 4f 57 00
+ 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04 54 45 53 54 b4 52 4f 57 00
+ 0e 8c 52 4f 57 be c8 01 31 b1 c8 02 32 30 00
+Everything is still the same. We can now analyse the contents of the file.
+{| class="plainc"
+! Meaning
+! Contents
+|-
+| Header
+| style="text-align:left" | 003f 37a7 3798 37d7
+|-
+| rowspan="4" | Line Number Table
+| style="text-align:left" |0028 37a9
+|-
+| style="text-align:left" |001e 37ac
+|-
+| style="text-align:left" | 0014 37b2
+|-
+| style="text-align:left" | 000a 37ca
+|-
+| rowspan="4" | Program lines
+| style="text-align:left" | 02 8b 00
+|-
+| style="text-align:left" | 05 96 52 4f 57 00
+|-
+| style="text-align:left" | 17 a2 f0 b7 52 4f 57 b3 c8 01 31 b6 b5 c7 04 54 45 53 54 b4 52 4f 57 00
+|-
+| style="text-align:left" | 0e 8c 52 4f 57 be c8 01 31 b1 c8 02 32 30 00
+|}
 '''TODO: continue'''
