Speech contouring algorithms for TI Text to Speech—English

From Ninerpedia
Revision as of 20:53, 14 December 2014 by Stephen Shaw (talk | contribs) (Brief note on speech contouring algorithms)

Extracted from the Text-to-Speech disk program manual.

Text to Speech—English uses a pre-defined set of rules to translate secondary and primary stress-points into pitch variations.

This page describes how those rules interpret the stress-points, and what effect the stress-points have on the pitch in a phrase.

Sentence profiles can be subdivided into two major groups:

    1. Rising phrase mode
    2. Falling phrase mode

Typically, a rising phrase mode occurs in sentences terminated by a comma (",") or a question mark ("?").

The falling phrase mode prevails in any other situation.
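
The choice between the two modes can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the program's actual code: the function name phrase_mode and the idea of passing the phrase as a plain string are assumptions, and since the manual says "typically", the real rules may have exceptions that this sketch ignores.

```python
def phrase_mode(phrase: str) -> str:
    """Pick a phrase mode from the terminating punctuation (hypothetical helper).

    A phrase ending in "," or "?" is treated as rising;
    anything else, including an empty phrase, is treated as falling.
    """
    stripped = phrase.rstrip()  # ignore trailing spaces after the punctuation
    if stripped.endswith((",", "?")):
        return "rising"
    return "falling"


if __name__ == "__main__":
    for text in ["Are you ready?", "If it rains,", "Press any key."]:
        print(repr(text), "->", phrase_mode(text))
```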

Stress-points apply only to vowel allophones, which are all grouped in the range 1 through 73. The remaining allophones play no part in sentence profiling.
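
As a rough illustration, the vowel range can be used to pick out the positions in a phrase that are eligible for a stress-point. The sketch below assumes a phrase is held as a list of integer allophone numbers; that representation and the names VOWEL_ALLOPHONES and stressable_positions are hypothetical and do not come from the manual.

```python
# Vowel allophones occupy numbers 1 through 73 inclusive (per the manual).
VOWEL_ALLOPHONES = range(1, 74)


def stressable_positions(allophones):
    """Return the indexes of the allophones that may carry a stress-point.

    `allophones` is assumed to be a list of integer allophone numbers.
    """
    return [i for i, number in enumerate(allophones)
            if number in VOWEL_ALLOPHONES]


if __name__ == "__main__":
    # Made-up allophone numbers, chosen only to show the filtering.
    print(stressable_positions([5, 80, 73, 74, 1]))
```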

Falling phrase mode

A falling phrase centers around a falling pitch on the primary stress-point.

If the primary stress-point is followed by one or more secondary stress-points, the pitch on the primary stress-point will fall from above the average pitch level, back to the average pitch level.

Rising phrase mode

A rising phrase, like a falling phrase, is centered around the primary stress-point.

As the name implies, the primary stress-point will follow a rising contour. However, if the primary stress-point is followed by one or more vowels, the rising contour is spread out over all the vowels that follow it, starting at the default level on the primary stress-point itself and ending above the default level on the last vowel.
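
The spreading of the rise over the following vowels might be pictured as below. This is a sketch under several assumptions: the manual does not say how the rise is divided between the vowels, so an even (linear) step per vowel is assumed here, and the function name spread_rise, the default_level and rise parameters, and their units are all hypothetical. The case with no following vowels, where the rise takes place on the primary stress-point itself, is not modelled.

```python
def spread_rise(following_vowels, default_level, rise):
    """Return one pitch level per vowel, from the primary stress-point onward.

    The first value is the default level (the primary stress-point) and the
    last value is default_level + rise (the final vowel). Linear spacing
    between them is an assumption, not something the manual specifies.
    """
    if following_vowels < 1:
        raise ValueError("this sketch needs at least one following vowel")
    step = rise / following_vowels
    return [default_level + step * k for k in range(following_vowels + 1)]


if __name__ == "__main__":
    # Primary stress-point followed by two vowels, arbitrary pitch units.
    print(spread_rise(2, 100.0, 20.0))  # [100.0, 110.0, 120.0]
```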