Text-to-Speech

From Ninerpedia
Revision as of 18:28, 14 December 2014 by Stephen Shaw (talk | contribs) (New article on disk based Text to Speech TI Software)

Text to Speech was a disk-based program, catalog number PHD5076, that permitted the TI-99/4A to speak any text from within an Extended BASIC program. It did not restrict speech to the built-in vocabulary of the speech synthesiser. Both the speech synthesiser peripheral and an Extended BASIC module were required.

Unfortunately, running a program from disk also required the Peripheral Expansion Box, a disk controller, a disk drive and expansion memory, which together made a costly purchase.

The Terminal Emulator II (TE2) module also permitted text to speech, without requiring the expensive peripherals. Like the TE2 module, the Text to Speech disk permitted intonation and pitch to be altered.

Extra CALLs

The disk made a package of additional subroutines available to Extended BASIC programmers:

  • SETUP which is the Text to Speech—English initialization subroutine;
  • XLAT which is the text-to-allophone translator; and
  • SPEAK which is the allophone-to-LPC (Linear Predictive Coding) translator.

Briefly, the steps involved in using these subroutines are initializing the system's expansion memory, loading the Text to Speech—English subroutines, then calling and executing the subroutines.

Initialisation

The routines may be loaded into memory for use on their own at the command line, or the initialisation steps can be included in a user-written Extended BASIC program. Because the routines and their data occupy memory, the user will have less memory available for their own program.

To make the routines available they must be loaded into memory from the disk as follows:

CALL INIT :: CALL LOAD("DSK1.SETUP","DSK1.XLAT","DSK1.SPEAK")

Note that this process may take as long as two minutes.

Now the database needs to be loaded:

CALL LINK("SETUP","DSK1.DATABASE")

CALL XLAT

The XLAT routine creates an allophone string from any given text string.

The general format of a call to XLAT is

    CALL LINK("XLAT", text-string, allophone-string)

In this call, the text-string is a string expression giving the text string to be translated; and allophone-string is a string variable (or string array element) that receives the resulting allophone string.

If the text string to be translated is longer than 128 bytes, an error message STRING TRUNCATED is returned. If the resulting allophone string is longer than 255 bytes, an error message SPEECH STRING TOO LONG is returned.
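The two length limits can be pictured with a short Python model. This is only a sketch of the rules stated above, not the TI routine: the function name `xlat_length_error` and the constant names are hypothetical, one byte per character is assumed, and checking the text limit before the allophone limit is also an assumption.

```python
# Hypothetical model of the XLAT length limits described above.
MAX_TEXT_BYTES = 128       # longest text string before STRING TRUNCATED
MAX_ALLOPHONE_BYTES = 255  # longest allophone string before SPEECH STRING TOO LONG


def xlat_length_error(text, allophones):
    """Return the error message the stated limits would trigger, or None."""
    if len(text) > MAX_TEXT_BYTES:
        return "STRING TRUNCATED"
    if len(allophones) > MAX_ALLOPHONE_BYTES:
        return "SPEECH STRING TOO LONG"
    return None
```

A text of exactly 128 bytes passes the first check; 129 bytes does not.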

An example of the XLAT routine call is:

  CALL LINK("XLAT", "PRESS Y FOR YES. PRESS N FOR NO", B$)

In addition to English words, the routine recognises special characters. If it finds a character that is neither alphabetic nor a recognised special character, the word is terminated at that point.

The special characters recognized by the XLAT routine can be divided into four functional groups:

  • numerical characters,
  • pause and break characters,
  • inflection symbols, and
  • special symbols.

Numeric Characters

The numerical character group consists of the standard numerical characters 0 through 9, and the symbols: , . - +

The characters 0 through 9 are always pronounced in the usual way. The other symbols, however, are recognized only if they directly precede a numerical character; only then are they pronounced.
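The "directly precedes a numerical character" rule can be sketched in Python as follows. The helper name `numeric_symbol_is_spoken` is hypothetical, and the sketch models only the rule as worded here, not how XLAT is actually implemented.

```python
# Hypothetical model: is a numeric-group symbol pronounced at this position?
NUMERIC_SYMBOLS = ",.-+"
DIGITS = "0123456789"


def numeric_symbol_is_spoken(text, index):
    """True if the symbol at text[index] is immediately followed by a digit."""
    if text[index] not in NUMERIC_SYMBOLS:
        raise ValueError("not a numeric-group symbol")
    following = text[index + 1:index + 2]  # empty string at end of text
    return following != "" and following in DIGITS
```

Under this model the "-" in "-5" is spoken, while the "-" in "A - 5" is not, because a space separates it from the digit.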

Pause Characters

This character group consists of the characters:

   .   ,   !   ?   :    ;

Note that a period or comma must be followed by a space unless it is the last character in the string.

These characters generate a pause code, the length of which depends upon the character. The "," comma symbol causes a short break (2 frames), while all of the others cause a long break (9 frames). The duration of one frame is approximately 25 milliseconds.
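The frame counts convert to approximate durations as in the Python sketch below. The names `FRAME_MS`, `PAUSE_FRAMES` and `pause_ms` are hypothetical, and 25 ms per frame is the approximate figure quoted above.

```python
# Hypothetical model of pause lengths, using the frame counts given above.
FRAME_MS = 25  # approximate duration of one frame, in milliseconds

# The comma gives a short break; every other pause character a long one.
PAUSE_FRAMES = {",": 2, ".": 9, "!": 9, "?": 9, ":": 9, ";": 9}


def pause_ms(char):
    """Approximate pause length in milliseconds for a pause character."""
    return PAUSE_FRAMES[char] * FRAME_MS
```

So a comma pauses for roughly 50 ms (2 × 25), and the other characters for roughly 225 ms (9 × 25).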

In addition to generating pause codes, these characters also affect the inflection contour of the sentence if a primary stress-point (vocal emphasis) has been indicated. The "," and "?" characters both specify a rising contour for the preceding phrase, while all the other characters generate the standard falling contour.

Inflection Characters

As noted above, inflection can be influenced by punctuation (the pause characters), but there are also other characters that affect it.

This group consists of two main symbols, the ";" semi-colon and the "_" underline, plus one other symbol, the ">".

  • The ";" symbol denotes primary stress.
  • The "_" symbol denotes secondary stress.
  • The ">" symbol is the shift indicator and is used to shift the stress within a word or phrase.

When you input a word or phrase, for example, the word IMPORTANT, without any stress-points, it will have the standard falling contour.

If you want the primary stress to be on the second syllable, type in ";>IMPORTANT". The ">" shifts the stress from the first syllable to the second.

The primary stress-point triggers a rising or falling pitch that starts at the inflection symbol and continues to the end of the phrase.

The pause and break symbols within the phrase determine whether the pitch rises or falls. The "?" and "," make the pitch rise and all other symbols make the pitch fall.

The following examples illustrate the use of the primary stress-point and the effect that the pause and break characters have on those stress-points.

    THE SUN RISES IN THE ; WEST?
    THE SUN ; SETS IN THE WEST.
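The rising and falling rule applied to these two phrases can be modelled in Python as below. The function name `phrase_contour` is hypothetical, and the sketch assumes the contour is decided by the pause character that ends the phrase.

```python
# Hypothetical model: choose the contour from the phrase's closing pause character.
RISING = {"?", ","}  # these make the pitch rise; all others make it fall


def phrase_contour(phrase):
    """Return 'rising' or 'falling' for a phrase ending in a pause character."""
    last = phrase.rstrip()[-1:]  # empty string if the phrase is blank
    return "rising" if last in RISING else "falling"
```

Under this model "THE SUN RISES IN THE ; WEST?" takes a rising contour from the primary stress-point onward, and "THE SUN ; SETS IN THE WEST." a falling one.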

If a secondary stress-point is used, it forces the specific syllable to be pronounced at a higher pitch than the other syllables within the word or phrase.

If several secondary stress-points are used, the first one starts at a higher pitch and each subsequent secondary stress-point steps down to a lower pitch, while still remaining higher than the pitch level of the syllables that have no stress.

Normally, if you use secondary stress-points within a phrase, you should use a primary stress-point also. If a primary stress-point is not used, then all syllables after the last secondary stress-point will take on a flat monotone level.

Only one primary stress-point can be used in a phrase. If you use more than one, every primary stress-point after the first will be changed to a secondary stress-point.
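The effect of this rule on an input phrase can be sketched in Python. The function name `demote_extra_primaries` is hypothetical, and the sketch makes the simplifying assumption that every ";" in the phrase is a stress mark; it only illustrates the rule and is not the XLAT implementation.

```python
# Hypothetical model: keep the first primary mark, demote the rest to secondary.
PRIMARY = ";"
SECONDARY = "_"


def demote_extra_primaries(phrase):
    """Rewrite a phrase so only its first primary stress mark remains."""
    first = phrase.find(PRIMARY)
    if first == -1:
        return phrase  # no primary stress-point at all
    head = phrase[:first + 1]
    tail = phrase[first + 1:].replace(PRIMARY, SECONDARY)
    return head + tail
```

For example, a phrase entered as ";ONE ;TIME ;ONLY" would be treated as if it were ";ONE _TIME _ONLY".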

Below are more examples of the usage of the inflection symbols.

    ;PRACTICAL
        Primary stress-point on the first syllable
    ;>OBSCURE
        Primary stress-point on the second syllable
    ;>>GRAVITATION
        Primary stress-point on the third syllable
    _ONE _TIME ; ONLY?
        Secondary stress-points and primary stress-point within a phrase with a rising contour.


Special Symbols

This group consists of the symbols:

   @   $   %   &   *   (   )  =    /

These symbols are pronounced as follows:

   Symbol      Pronounced as
   @           at
   $           dollar
   %           percent
   &           and
   *           asterisk
   (           open
   )           close
   =           equals
   /           slash
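The table above amounts to a simple lookup, sketched here in Python. The names `SPECIAL_SYMBOLS` and `spoken_symbols` are hypothetical; the sketch only lists the words for the special symbols found in a text and does not model how XLAT builds allophones.

```python
# Hypothetical lookup of the special symbols and the words spoken for them.
SPECIAL_SYMBOLS = {
    "@": "at",
    "$": "dollar",
    "%": "percent",
    "&": "and",
    "*": "asterisk",
    "(": "open",
    ")": "close",
    "=": "equals",
    "/": "slash",
}


def spoken_symbols(text):
    """List the words spoken for each special symbol in text, in order."""
    return [SPECIAL_SYMBOLS[ch] for ch in text if ch in SPECIAL_SYMBOLS]
```

For instance, the symbols in "A & B = (C)" would be spoken as "and", "equals", "open", "close".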