Author: Alec Bergamini
How can I get my application to read text?
Answer:
On Aug 11, 2001 Microsoft released the SAPI 5.1 SDK. This is significant because
SAPI 5.1 is fully automated. That is, you can use it from any language that
supports OLE automation. The SAPI objects are not ActiveX controls and can be either
early or late bound.
In this article I’m going to show you how to get and install the SAPI 5.1 SDK. Then
I’m going to show how to use the SDK to convert text to synthesized speech in a Delphi
application. The synthesized speech is played over your computer’s speakers. I tested
this in Delphi 5 and 6.
To get SAPI 5.1 you need to go to Microsoft’s Speech.net Technologies web site at
http://www.microsoft.com/speech
and follow the link to the download. Right next to the download link is the release
notes link. READ THE RELEASE NOTES! Especially if your development machine is using
a default language other than US English.
If you are running a beta version of the XP operating system you might have some
problems. This is because SAPI 5.1 is built into XP and the most recent public beta
of XP as of this writing (RC 2) includes an earlier version of SAPI 5.1. Don’t try
to install the release version of SAPI 5.1 into XP; it will not work.
Once you have read the release notes, follow the link to the Speech SDK 5.1 Download
page. In most cases all you need to download is the link labeled “Speech SDK 5.1
(68 MB)”. This contains the SDK, the documentation and the free Microsoft English
text to speech and speech recognition engines. The download is very large, 68 MB,
so unless you have a high speed connection to the internet you might want to order
the SDK CD from Microsoft.
…. Time passes while you download or wait for the postman ….
Ok, now you have the SAPI 5.1 SDK. Run the speechsdk51.exe to install it on your
development system.
DELPHI 6 Users IMPORTANT
There is a bug in the type library import in Delphi 6; see the article "Delphi 6 -
Imported Automation Events Bug". This sample will still work with the unit created
by the type library import in Delphi 6, but only because none of the events for the
component are used. If you want to use any of the SpVoice events you will need to
read the article "Delphi 6 - Imported Automation Events Bug".
What you need to do now is make Delphi aware of the new SAPI automation objects. To
do this, start up Delphi 5 or 6 (I didn’t try earlier versions) and go to Project |
Import Type Library. In the Import Type Library dialog highlight “Microsoft Speech
Object Library (Version 5.1)”. If you don’t find this in the list then something’s
wrong with the installation of SAPI 5.1.
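If the library is missing from the list, one quick way to check the installation is to ask COM whether the SpVoice class is registered at all. The following is only a minimal sketch: it assumes the ProgID 'SAPI.SpVoice' (the name SAPI 5.x registers for the voice object) and uses ProgIDToClassID from Delphi's ComObj unit. The program name CheckSapi is made up for illustration. Note that a successful lookup only shows that some SAPI 5.x voice class is registered; it does not prove that the 5.1 type library is installed.

```pascal
program CheckSapi;

{$APPTYPE CONSOLE}

uses
  SysUtils, ActiveX, ComObj;

begin
  CoInitialize(nil); // a console program has to initialize COM itself
  try
    try
      // ProgIDToClassID raises EOleSysError when the ProgID is unknown.
      // 'SAPI.SpVoice' is assumed to be the ProgID of the voice object.
      Writeln('SAPI.SpVoice is registered as ',
        GUIDToString(ProgIDToClassID('SAPI.SpVoice')));
    except
      on E: EOleSysError do
        Writeln('SAPI.SpVoice is not registered: ', E.Message);
    end;
  finally
    CoUninitialize;
  end;
end.
```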
Delphi is going to want to put the SAPI components on your ActiveX palette page. I
recommend you put these on a new palette page called “SAPI 5” since the number of
components installed is large (19). You may also want to choose a “Unit dir name”
of something other than the default. Make sure the “Generate Component Wrapper”
check box is checked and press the >Install< button.
In the Install dialog choose the “Into new package” tab and in the “File name:”
field give a package name like “SAPI5.dpk”. Press the browse button and make sure
the dpk is created in the same directory where you created the components. Actually
this isn’t completely necessary; it just helps keep things together. In the Install
dialog’s Description field give some meaningful description like “SAPI 5 automation
components”. Press OK.
Press yes in the confirm dialog and the new components will be created and
installed.
If you now look in the directory you specified for the components you should find
SpeechLib_TLB.pas (and dcr) which contains all the component code as well as
interface, const, type and other useful information. This is your most valuable
piece of documentation on the SDK. I’ve found it even better than the Microsoft
SAPI 5.1 documentation, which is pretty good. This directory should also contain (if
you followed the above instructions) the SAPI5.dpk which is your package source.
If you go to the far eastern end of your component palette you should find the new
SAPI5 palette page with its 19 speech components.
Now for the fun part.
Let’s make an application that can synthesize speech. In Delphi start a new
application and drop a button on the form. On the SAPI5 palette page find the
SpVoice component and drop it on the form. On my machine this component is the 5th
one reading from left to right.
Now create an OnClick event for your button that looks something like this:
procedure TForm1.Button1Click(Sender: TObject);
begin
  // Speak the text synchronously using the default flags
  SpVoice1.Speak('Hello world!', SVSFDefault);
end;
Run the program and press the button. Cool, huh?
At this level it’s amazingly simple. The SpVoice object’s Speak method is very
powerful. This power comes from the second parameter. For the above example I
chose to use the default mode, which causes the Speak method to return only when
the synthesis is complete, not to purge pending speech requests, and to respond to
special XML control tags embedded in the text.
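As mentioned at the start, the SAPI objects can also be used late bound, without the imported components. The sketch below makes the same default-mode call from a console program. It assumes the ProgID 'SAPI.SpVoice' and passes 0, the numeric value of SVSFDefault; the program and procedure names are made up for illustration.

```pascal
program SpeakLateBound;

{$APPTYPE CONSOLE}

uses
  ActiveX, ComObj;

procedure SayHello;
var
  Voice: OleVariant;
begin
  // Late binding: the object is created by name and the Speak call is
  // resolved at run time through IDispatch.
  Voice := CreateOleObject('SAPI.SpVoice');
  Voice.Speak('Hello world!', 0); // 0 = SVSFDefault, returns when finished
end; // Voice goes out of scope here, which releases the COM object

begin
  CoInitialize(nil); // a console program has to initialize COM itself
  try
    SayHello;
  finally
    CoUninitialize;
  end;
end.
```

Late binding gives up compile-time checking of method names and parameters, so the early-bound component is usually the more comfortable choice inside a Delphi form.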
The SDK’s documentation is contained in sapi.chm which you will find in the
\Program Files\Microsoft Speech SDK 5.1\Docs\Help directory.
Sapi.chm contains a lot of information. To go directly to the meat of the subject,
go to the last folder on the outline’s first level, titled Automation, and go down to
SpVoice and then to the Speak method. Read what’s there and also be sure to follow
the link to the SpeechVoiceSpeakFlags info. You will find that in addition to just
speaking passed-in text, the Speak method can do much more. Some of the more
interesting flags let you:
Pass in a file name and speak the text in the file. (SVSFIsFilename)
Make the function either return immediately (asynchronously) or only after the
synthesis is complete (synchronously). If you speak asynchronously there are events
available to fire when the speech is done. (SVSFlagsAsync)
Embed flags in the text that can control various aspects of the synthesis like
pitch, rate, emphasis, and much more (see the included White Paper titled “XML TTS
Tutorial”). I found this feature a bit addicting as I attempted to make the
synthesized voice sing. (SVSFIsXML)
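The flags can be combined with "or". The following is a rough late-bound sketch, not a definitive implementation, that speaks XML-tagged text asynchronously and then waits for the voice to finish with the WaitUntilDone method. The two constants are declared locally with the values from the SpeechVoiceSpeakFlags enumeration (SVSFlagsAsync = 1, SVSFIsXML = 8); the sample text and the program and procedure names are made up for illustration.

```pascal
program SpeakFlags;

{$APPTYPE CONSOLE}

uses
  ActiveX, ComObj;

const
  // Numeric values of two SpeechVoiceSpeakFlags members. The imported
  // SpeechLib_TLB unit declares the same names for early-bound code.
  SVSFlagsAsync = 1;
  SVSFIsXML     = 8;

procedure SpeakMarkedUpText;
var
  Voice: OleVariant;
begin
  Voice := CreateOleObject('SAPI.SpVoice');
  // SVSFlagsAsync queues the text and returns at once;
  // SVSFIsXML tells SAPI to interpret the embedded tags.
  Voice.Speak('This is the normal voice. ' +
    '<pitch middle="10">This is higher.</pitch> ' +
    '<rate speed="-5">This is slower.</rate>',
    SVSFlagsAsync or SVSFIsXML);
  Writeln('Speak has returned.');
  // Block until the queued speech is finished (-1 = no timeout).
  Voice.WaitUntilDone(-1);
  Writeln('Speech is done.');
end;

begin
  CoInitialize(nil); // a console program has to initialize COM itself
  try
    SpeakMarkedUpText;
  finally
    CoUninitialize;
  end;
end.
```

With the SpVoice component on a form the call is the same, for example SpVoice1.Speak(SomeText, SVSFlagsAsync or SVSFIsXML), using the constants from the imported unit instead of local ones.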
One interesting thing I found (but not documented) was that you can speak a web
site’s title by setting the flag to SVSFIsFilename and passing a URL. If you are
connected to the internet, try replacing the Speak line in the sample with
  SpVoice1.Speak('http://www.o2a.com', SVSFIsFilename);
and run it.
Even more bizarre is that you can use the Speak method to play wav files. Try
  SpVoice1.Speak('C:\WINNT\MEDIA\Windows Logon Sound.wav', SVSFIsFilename);
There’s a lot more to SAPI than text to speech and there’s more to text to speech
than what I’ve covered here. Hopefully this will be the first of a number of
articles on SAPI but I’ll only do them if you’re interested so please be sure to
comment. Also I’m completely open to suggestions on what you’d like to see next (if
anything at all).
If you want to talk privately I’m at alecb@o2a.com.