Chapter 20

Building the Voice-Activated Text Reader




In this chapter, you'll build a complete Visual Basic 4.0 application that uses both speech recognition and text-to-speech services. You'll use the MDINOTE project that ships with Visual Basic 4.0 as a starting framework for creating a voice-activated text reader. This application will allow users to use voice commands to load text documents and then tell the workstation to play back the loaded documents using TTS services.

You'll learn how to declare your own custom voice commands and how to use the voice commands automatically built by Windows each time you load an application. You'll also learn how to add menu options and toolbar buttons that provide TTS services for your application. When you are done with the example project in this chapter, you'll know how to build SAPI-enabled applications using Visual Basic 4.0 or any other VBA-compliant language.

Designing the Application

The first stage in the process of building a SAPI-enabled Windows application is designing the SAPI interface. Adding SAPI services to an application is much like adding other Windows extensions (such as messaging, telephony, and so on). Usually, you can start from an existing application framework and add SAPI services where appropriate.

So that you can focus on the process of incorporating SAPI services into an application, the project in this chapter starts from a working Visual Basic 4.0 application: the MDINOTE.VBP project. This application allows users to create, edit, and save text files.

Note
This project ships with Visual Basic 4.0 and can be found in the SAMPLES\MDI folder of the Visual Basic home directory. You can also find it on the CD-ROM that ships with this book.

For the example in this chapter, you'll add both speech recognition and text-to-speech services to this application. When you complete the project outlined here, you'll have a fully functional text editor that provides both SR and TTS services to users.

Adding TTS Services

Adding TTS services to the MDINOTE.VBP project is straightforward. You need to provide an option that allows users to tell the TTS engine to read the text in the active document. You should also allow users to fast-forward the reader, rewind it, and stop it as needed. The easiest way to do this is to provide a set of menu options that correspond to the Speak, AudioFastForward, AudioRewind, and StopSpeaking methods of the Voice Text object. In addition to the menu options, you'll also add matching buttons to the toolbar.

Adding TTS services also requires some initialization code to declare and initialize the Voice Text object.

Adding SR Services

Adding speech recognition services to Windows 95 applications is very easy. As soon as the SAPI services are installed and activated, all the menu options of a Windows application become valid voice commands. In other words, even if you have done no coding at all, your Windows 95 applications are ready to receive and respond to voice commands.

Tip
As soon as you load a Windows application, its menu options are activated as valid voice commands. You can view the valid voice commands at any moment by saying What Can I Say? or by clicking the alternate mouse button while pointing to the Microsoft Voice balloon icon in the system tray.

In addition to the default menu option commands, you'll also add a set of custom commands for the MDINOTE.VBP project. These are one-word commands that the user can speak. They correspond to the set of toolbar buttons that appear on the form while the project is running.

Along with the code that declares the menu commands and their objects, you'll also write code to check the CommandSpoken property and respond to the voice commands as needed.

Coding the MDISpeech Module

The first job in adding speech services to a project is declaring global variables and adding the initialization routines for starting and ending the application. If you are adding TTS services, you need to add code that will respond to the various TTS engine requests (Speak, FastForward, Rewind, and Stop). If you are enabling custom SR voice commands, you'll need to add code that registers the new commands.

Note
You'll also need code that checks the CommandSpoken property of the Voice Command object to see if one of your custom commands was uttered by the user. You'll add that code to the main form later in this chapter.

Declaring the Global Variables

For the project described in this chapter, you'll need to add both SR and TTS services. First, load the MDINOTE project and add a new BAS module. Set its Name property to modSpeech and save it as MDISP.BAS. Next add the code shown in Listing 20.1 to the general declaration section of the module.


Listing 20.1. Adding code to the declaration section.
Option Explicit
'
' *********************************************
' This module adds SAPI support to the program.
' *********************************************

'
' VCmd speech stuff
Global objVCmd As Object ' SR Command object
Global objVMenu As Object ' SR Menu object
Global cVCmdMenu() As String ' SR command strings
Global lSRCmd As Long ' SR command ID
'
' VTxt Speech stuff
Global objVText As Object ' TTS object
Global bVText As Boolean ' TTS flag
Global Const vTxtSpeak = 0
Global Const vTxtForward = 1
Global Const vTxtRewind = 2
Global Const vTxtStop = 3

The first several code lines declare the variables and objects needed to provide SR services. The SAPI voice command object library requires two different objects: the Voice Command object and the Voice Menu object. You'll also need an array to hold the commands to be added to the menu and a variable to hold the command ID returned in the CommandSpoken property.

The TTS service requires only one object. You'll use a Boolean flag to hold the activity status of the TTS engine returned by the IsSpeaking property. The last four constants are used by Visual Basic to keep track of menu and button options selected by the user.

Save the code module as MDISP.BAS and the project as MDINOTE.VBP before continuing on to the next section.

Coding the InitSAPI and UnInitSAPI Routines

The next code section you need to add is the code that will initialize and uninitialize SAPI services for the application. The first routine (InitSAPI) will be called when the program first starts. The second routine (UnInitSAPI) will be called when the program ends. Add a new subroutine called InitSAPI to the modSpeech module. Enter the code shown in Listing 20.2.


Listing 20.2. Adding the InitSAPI routine.
Public Sub InitSAPI()
    '
    ' voice command objects
    Set objVCmd = CreateObject("Speech.VoiceCommand")
    objVCmd.Register "" ' use default location
    objVCmd.Awake = True ' awaken speech services
    Set objVMenu = objVCmd.MenuCreate(App.EXEName, "MDI", 1033, "", vcmdmc_CREATE_TEMP)
    InitVoice ' go build command list
    '
    ' voice text objects
    Set objVText = CreateObject("Speech.VoiceText")
    objVText.Register "", App.EXEName
    objVText.Enabled = True
    InitVText ' go build menu/buttons
    '
End Sub

This routine first initializes the Voice Command objects, registers the application, and starts the SR engine. The call to the InitVoice routine adds the custom voice commands to the declared menu. You'll see that code in the next section.

After handling the registration of SR services, the routine adds TTS services to the project. After initializing the Voice Text object, registering the application, and enabling TTS, the InitVText routine is called to build the menu and button options on the forms. You'll see this code later.
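InitSAPI assumes that both CreateObject calls succeed. As a minimal sketch, assuming you would rather detect a workstation without speech services than let the program stop with a runtime error, the object creation could be wrapped in a small helper. The name TryCreateSpeechObject is hypothetical and is not part of the chapter's listings.

```vb
' Hypothetical helper: attempts to create an OLE Automation object
' and reports success or failure instead of raising a runtime error.
Public Function TryCreateSpeechObject(ByVal cProgID As String, objResult As Object) As Boolean
    '
    On Error GoTo CreateFailed
    Set objResult = CreateObject(cProgID) ' fails if the server is not installed
    TryCreateSpeechObject = True
    Exit Function
    '
CreateFailed:
    Set objResult = Nothing ' leave the caller's variable empty
    TryCreateSpeechObject = False
    '
End Function
```

With a helper like this, InitSAPI could test the result of TryCreateSpeechObject("Speech.VoiceCommand", objVCmd) and skip the remaining SR setup, perhaps after showing a message, when it returns False.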

The next set of code to add is the UnInitSAPI routine. This routine will be called at the end of the program. The UnInitSAPI routine deactivates all SAPI objects and releases all links to SAPI services. Add a new subroutine to the modSpeech module and enter the code shown in Listing 20.3.


Listing 20.3. Adding the UnInitSAPI code.
Public Sub UnInitSAPI()
    '
    ' remove sapi services
    '
    objVMenu.Active = False ' stop menu
    Set objVMenu = Nothing  ' remove link
    objVCmd.Awake = False   ' close down SR
    Set objVCmd = Nothing   ' remove link
    '
    objVText.Enabled = False ' stop TTS
    Set objVText = Nothing  ' remove link
    '
End Sub
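UnInitSAPI, as listed, assumes that all three objects were created successfully. The following sketch shows one way the same cleanup might be written so that it skips any object that was never set. It relies on the global variables declared in Listing 20.1; the routine name SafeUnInitSAPI is hypothetical.

```vb
' Hypothetical variant of UnInitSAPI that tolerates missing objects.
Public Sub SafeUnInitSAPI()
    '
    If Not objVMenu Is Nothing Then
        objVMenu.Active = False ' stop menu
        Set objVMenu = Nothing  ' remove link
    End If
    '
    If Not objVCmd Is Nothing Then
        objVCmd.Awake = False   ' close down SR
        Set objVCmd = Nothing   ' remove link
    End If
    '
    If Not objVText Is Nothing Then
        objVText.Enabled = False ' stop TTS
        Set objVText = Nothing   ' remove link
    End If
    '
End Sub
```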

Coding the InitVoice Routine

The InitVoice routine adds all the new custom commands to the menu object declared in the InitSAPI routine. After it adds the new commands, a timer object is initialized and enabled. The timer will be used to check for a valid voice command. Add a new subroutine called InitVoice and enter the code shown in Listing 20.4.


Listing 20.4. Adding the InitVoice routine.
Public Sub InitVoice()
    '
    ' build added voice menu commands
    '
    Dim x As Integer
    ReDim Preserve cVCmdMenu(11) As String
    '
    cVCmdMenu(1) = "New"
    cVCmdMenu(2) = "Open"
    cVCmdMenu(3) = "Exit"
    cVCmdMenu(4) = "Toggle Toolbar"
    cVCmdMenu(5) = "Read"
    cVCmdMenu(6) = "Forward"
    cVCmdMenu(7) = "Rewind"
    cVCmdMenu(8) = "Stop"
    cVCmdMenu(9) = "Cut"
    cVCmdMenu(10) = "Copy"
    cVCmdMenu(11) = "Paste"
    '
    For x = 1 To 11
        objVMenu.Add 100 + x, cVCmdMenu(x), "MDI Menu", cVCmdMenu(x)
    Next x
    objVMenu.Active = True
    '
    ' start timer loop
    frmMDI!SRTimer.Interval = 500 ' every 1/2 second
    frmMDI!SRTimer.Enabled = True ' turn it on
    '
End Sub

As you can see, adding custom menu options involves adding the command strings to the menu object and then activating the menu object. The code that sets the timer is needed to poll the CommandSpoken property every half second.
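Because InitVoice registers each command with the ID 100 + x, the array position and the command ID can always be converted into each other. The sketch below illustrates that relationship. The constant vCmdBase and both function names are hypothetical; the chapter's listings use the literal value 100.

```vb
' Hypothetical constant; place it in the declaration section of the module.
Global Const vCmdBase = 100

' Return the command ID that InitVoice assigns to array slot x.
Public Function VCmdIDFromIndex(ByVal x As Integer) As Long
    VCmdIDFromIndex = vCmdBase + x
End Function

' Return the command text for a command ID, or "" if the ID is out of range.
' The array must already be dimensioned (InitVoice does this with ReDim).
Public Function VCmdNameFromID(cNames() As String, ByVal lID As Long) As String
    Dim lIndex As Long
    '
    lIndex = lID - vCmdBase
    If lIndex >= 1 And lIndex <= UBound(cNames) Then
        VCmdNameFromID = cNames(lIndex)
    Else
        VCmdNameFromID = ""
    End If
    '
End Function
```

After InitVoice has run, a call such as VCmdNameFromID(cVCmdMenu, 105) would return the string stored in slot 5 of the array, which can be handy when you are tracing which command the engine reported.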

Tip
You do not have to add custom commands to SAPI-enabled applications that have a declared menu. All menu items are automatically registered as voice commands by the operating system. The commands added here are really shortcuts to existing menu options.

Coding the InitVText Routine

When you add TTS services to the application, you need only to initialize the TTS object and then "turn on" the menu and/or command buttons that allow users to gain access to the TTS engine. In this application a set of buttons for the toolbar needs initialization. The code in Listing 20.5 shows how this is done.


Listing 20.5. Adding the InitVText routine.
Public Sub InitVText()
    '
    ' set up vText buttons and menu
    '
    Dim cDir As String
    Dim cPic(4) As String
    Dim x As Integer
    '
    cDir = "d:\sams\cdg\chap20\mdi\"
    cPic(0) = "arw01rt.ico" ' read
    cPic(1) = "arw01up.ico" ' forward
    cPic(2) = "arw01lt.ico" ' rewind
    cPic(3) = "arw01dn.ico" ' stop
    '
    For x = 0 To 3
        frmMDI.ImgVText(x).Picture = LoadPicture(cDir & cPic(x))
    Next x
    '
    VTextAction vTxtStop ' force "stop"
    '
    ' start timer loop
    frmMDI.TTSTimer.Interval = 500 ' every 1/2 second
    frmMDI.TTSTimer.Enabled = True ' turn it on
    '
End Sub

You'll notice a call to the VTextAction routine. This routine handles the TTS service requests (Speak, FastForward, Rewind, and StopSpeaking). A timer is also enabled in order to track the active status of the TTS engine.
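Listing 20.5 loads the icons from a fixed directory, so the path must match your own installation. As a sketch only, assuming you copy the four icon files into the same folder as the application, the path could be built from App.Path instead. The function names JoinPath and IconPath are hypothetical.

```vb
' Hypothetical helper: joins a directory and a file name,
' adding a backslash only when the directory lacks one.
Public Function JoinPath(ByVal cDir As String, ByVal cFile As String) As String
    If Right$(cDir, 1) <> "\" Then
        cDir = cDir & "\"
    End If
    JoinPath = cDir & cFile
End Function

' Hypothetical helper: full path of an icon stored beside the application.
Public Function IconPath(ByVal cFile As String) As String
    IconPath = JoinPath(App.Path, cFile)
End Function
```

Inside the loop, the LoadPicture call would then read LoadPicture(IconPath(cPic(x))). The check for a trailing backslash matters because App.Path includes one only when the application runs from a root directory.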

Coding the VTextAction Routine

The VTextAction routine is the code that handles the various TTS service requests made by the user. This one set of code can handle all of the playback options for the program. It also handles the enabling of the toolbar buttons and the menu options. Listing 20.6 shows the code needed for the VTextAction subroutine.


Listing 20.6. Adding the VTextAction routine.
Public Sub VTextAction(Index As Integer)
    '
    ' handle request to start/stop reading
    '
    Static bVPause As Boolean
    '
    Screen.ActiveForm.MousePointer = vbHourglass
    '
    ' handle service request
    Select Case Index
        Case vTxtSpeak ' speak=0
            If Not bVText Then
                If Len(Trim(Screen.ActiveForm.Text1.Text)) <> 0 Then
                    objVText.Speak Screen.ActiveForm.Text1.Text, vtxtsp_NORMAL
                    bVText = True
                End If
            End If
        Case vTxtForward ' fast forward=1
            If bVText Then
                objVText.AudioFastForward
            End If
        Case vTxtRewind ' rewind=2
            If bVText Then
                objVText.AudioRewind
            End If
        Case vTxtStop ' stop speaking=3
            If bVText Then
                objVText.StopSpeaking
                bVText = False
                bVPause = False
            End If
    End Select
    '
    ' update menu
    If bVText Then
        With Screen.ActiveForm
            .mnuVText(0).Enabled = False
            .mnuVText(1).Visible = True
            .mnuVText(2).Visible = True
            .mnuVText(3).Visible = True
        End With
    Else
        If Screen.ActiveForm.Caption <> "MDI NotePad" Then
            With Screen.ActiveForm
                .mnuVText(0).Enabled = True
                .mnuVText(1).Visible = False
                .mnuVText(2).Visible = False
                .mnuVText(3).Visible = False
            End With
        End If
    End If
    '
    ' update buttons
    If bVText Then
        frmMDI.ImgVText(0).Visible = True
        frmMDI.ImgVText(1).Visible = True
        frmMDI.ImgVText(2).Visible = True
        frmMDI.ImgVText(3).Visible = True
    Else
        frmMDI.ImgVText(0).Visible = True
        frmMDI.ImgVText(1).Visible = False
        frmMDI.ImgVText(2).Visible = False
        frmMDI.ImgVText(3).Visible = False
    End If
    '
    ' no open text pages
    If Not AnyPadsLeft() Then
        frmMDI.ImgVText(0).Visible = False
    End If
    '
    Screen.ActiveForm.MousePointer = vbDefault
    '
End Sub

There are three parts to this one routine. The first part of the code handles the actual TTS request. Only the first option (vTxtSpeak) involves any coding. Since the MDINOTE project uses a single text box for user input, this text box is automatically used as the source for all TTS playback.

Note
You'll notice that an index value is used to power this routine. This index value comes from the array of image controls that serve as toolbar buttons or from a menu array. You'll build these later. Control and menu arrays are excellent ways to build compact code in Visual Basic.

The second part of the code handles the enabling and disabling of the menu items on the edit form. When the edit form is opened, a single menu item is activated (Read). Once the TTS engine starts reading, the other options (Rewind, Forward, and Stop) are made visible. The third and last part of this routine makes sure the proper buttons appear on the toolbar. This works the same as the menu array. You'll build the menu and control array in the next section.
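The button logic described above reduces to two rules: the Read button is shown whenever a text page is open, and the other three buttons are shown only while the engine is speaking. The sketch below expresses those rules with a loop over the control array. The routine name ShowVTextButtons is hypothetical; it uses the frmMDI form, the ImgVText array, and the constants from Listing 20.1.

```vb
' Hypothetical helper: applies the same button rules as the
' "update buttons" section of VTextAction, using a loop.
Public Sub ShowVTextButtons(ByVal bSpeaking As Boolean, ByVal bPadsOpen As Boolean)
    '
    Dim x As Integer
    '
    ' Read button: visible whenever at least one text page is open
    frmMDI.ImgVText(vTxtSpeak).Visible = bPadsOpen
    '
    ' Forward, Rewind, and Stop: visible only while the engine is speaking
    For x = vTxtForward To vTxtStop
        frmMDI.ImgVText(x).Visible = bSpeaking
    Next x
    '
End Sub
```

Under these assumptions, the call ShowVTextButtons bVText, AnyPadsLeft() would produce the same visible buttons as the original If...Else block followed by the AnyPadsLeft check.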

This is the end of the code module for the project. Save the module as MDISP.BAS and the project as MDINOTE.VBP before continuing.

Modifying the MDINote Forms

There are two forms in the MDINOTE project that must be modified: the frmMDI form and the frmNotePad form. The frmMDI form holds the button array and calls the InitSAPI and UnInitSAPI routines. It also has code to handle the timer controls. The frmNotePad form holds the menu array and has code to control the starting and stopping of the TTS engine.

Modifying the MDI Form

The first task in modifying the frmMDI form is adding the button array to the toolbar. You'll use image controls to hold the button images. You'll also need to add code to the Form_Load and Form_Unload events. Finally, you'll add code for the two timer events.

Adding the Timer and Button Objects

First, you need to open the frmMDI form and add an array of four image controls and then add two timers. Table 20.1 shows the controls that need to be added to the form. Refer to Figure 20.1 and Table 20.1 for the placement of the image controls on the form.

Figure 20.1 : Placing the controls on the frmMDI form.

Table 20.1. Adding the controls to the frmMDI form.
Control     Property   Setting
VB.Timer    Name       SRTimer
            Left       6180
            Top        0
VB.Timer    Name       TTSTimer
            Left       6480
            Top        60
VB.Image    Name       ImgVText
            Height     330
            Index      3
            Left       3120
            Stretch    -1 'True
            Top        0
            Visible    0 'False
            Width      375
VB.Image    Name       ImgVText
            Height     330
            Index      2
            Left       2760
            Stretch    -1 'True
            Top        0
            Visible    0 'False
            Width      375
VB.Image    Name       ImgVText
            Height     330
            Index      1
            Left       2400
            Stretch    -1 'True
            Top        0
            Visible    0 'False
            Width      375
VB.Image    Name       ImgVText
            Height     330
            Index      0
            Left       2040
            Stretch    -1 'True
            Top        0
            Visible    0 'False
            Width      375

Be sure to draw the image controls and the timers onto the toolbar. Also note that the ImgVText control is a control array of four controls. You'll use the index value returned by this array to tell the VTextAction routine what service was requested by the user.

Note
You don't need to set the Picture property of the image controls at design time. This is handled by the InitVText routine you wrote earlier.

After adding these controls, save the form (MDI.FRM) and the project (MDINOTE.VBP) before continuing.

Adding Code to the Form_Load and Form_Unload Events

After adding the controls, you need to add some code to the Form_Load and Form_Unload events. This code will be executed only once, at the start or end of the program. Listing 20.7 shows the complete Form_Load code with the single line that calls the InitSAPI routine added at the end. Modify the Form_Load event code to look like the code in Listing 20.7.


Listing 20.7. Modifying the Form_Load event code.
Private Sub MDIForm_Load()
    '
    ' Application starts here (Load event of Startup form).
    Show
    ' Always set the working directory to the directory containing the application.
    ChDir App.Path
    ' Initialize the document form array, and show the first document.
    ReDim Document(1)
    ReDim FState(1)
    Document(1).Tag = 1
    FState(1).Dirty = False
    ' Read System registry and set the recent menu file list control array appropriately.
    GetRecentFiles
    ' Set global variable gFindDirection which determines which direction
    ' the FindIt function will search in.
    gFindDirection = 1
    '
    InitSAPI ' <<< added for SAPI >>>
    '
End Sub

You also need to make a similar modification to the Form_Unload event code. Listing 20.8 shows the entire code listing for the Form_Unload event with the call to UnInitSAPI at the start of the routine. Modify the Form_Unload code to match the code in Listing 20.8.


Listing 20.8. Modifying the Form_Unload event code.
Private Sub MDIForm_Unload(Cancel As Integer)
    '
    UnInitSAPI ' <<< added for SAPI >>>
    '
    ' If the Unload event was not cancelled (in the QueryUnload events for the Notepad forms),
    ' there will be no document window left, so go ahead and end the application.
    If Not AnyPadsLeft() Then
        End
    End If
    '
End Sub

Coding the Timer and Button Events

The final code you need to add to the frmMDI form is the code to handle the button array and the code to handle the timer events. First, add the code in Listing 20.9 to the ImgVText_Click event. This will pass the array index to the VTextAction routine to handle the TTS service request.


Listing 20.9. Adding code to the ImgVText_Click event.
Private Sub ImgVText_Click(Index As Integer)
    '
    VTextAction Index ' handle vText SAPI request
    '
End Sub

The code for the TTSTimer event is also quite simple. The timer event checks whether the TTS engine is currently speaking and stores the result in the global variable bVText. The VTextAction routine reads this flag the next time it is called to decide how to display the menu and button array objects. Add the code in Listing 20.10 to the TTSTimer_Timer event.


Listing 20.10. Adding code to the TTSTimer_Timer event.
Private Sub TTSTimer_Timer()
    '
    bVText = objVText.IsSpeaking ' load results
    '
End Sub
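The timer in Listing 20.10 only records the engine status; it does not redraw the menu or the toolbar. If you also wanted the display to refresh when the engine finishes reading on its own, one possible approach is to watch for the change from speaking to idle. The following is a sketch under that assumption, not part of the chapter's project. The Static variable bWasSpeaking is hypothetical.

```vb
' Hypothetical variant of the TTSTimer_Timer event that refreshes
' the menu and buttons once when the engine stops by itself.
Private Sub TTSTimer_Timer()
    '
    Static bWasSpeaking As Boolean ' status seen on the previous tick
    '
    bVText = objVText.IsSpeaking ' load results
    '
    ' speaking on the last tick but idle now: redraw menu and buttons
    If bWasSpeaking And Not bVText Then
        VTextAction vTxtStop
    End If
    '
    bWasSpeaking = bVText
    '
End Sub
```

Because bVText is already False when VTextAction runs here, the routine skips the StopSpeaking call and only performs its menu and button updates.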

The code for the SRTimer_Timer event is more involved. The SRTimer must check the value returned by the CommandSpoken property to see if it is a valid custom command. If the value is part of the Select Case structure, the corresponding program routine is called. Add the code shown in Listing 20.11 to the SRTimer_Timer event.


Listing 20.11. Adding code to the SRTimer_Timer event.
Private Sub SRTimer_Timer()
    '
    ' check status of SR Engine
    '
    lSRCmd = objVCmd.CommandSpoken
    objVCmd.CommandSpoken = 0 ' clear command
    '
    Select Case lSRCmd
        Case 101 ' new
            FileNew
        Case 102 ' open
            FOpenProc
        Case 103 ' exit
            Unload frmMDI
        Case 104 ' toggle toolbar
            OptionsToolbarProc frmMDI
        Case 105 ' read
            VTextAction vTxtSpeak
        Case 106 ' forward
            VTextAction vTxtForward
        Case 107 ' rewind
            VTextAction vTxtRewind
        Case 108 ' stop
            VTextAction vTxtStop
        Case 109 ' cut
            EditCutProc
        Case 110 ' copy
            EditCopyProc
        Case 111 'paste
            EditPasteProc
    End Select
    '
    lSRCmd = 0
End Sub

Warning
You'll notice that the CommandSpoken property is copied into the lSRCmd variable and then immediately set to zero. This is an important step. If you fail to clear the CommandSpoken property, your program can get locked into a loop that keeps executing the last requested command. To prevent this, be sure to clear the CommandSpoken property as soon as you read it.
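One way to make sure the read and the clear always happen together is to place both steps in a single function. The sketch below does this using only the CommandSpoken property shown in Listing 20.11. The function name FetchSpokenCommand is hypothetical.

```vb
' Hypothetical helper: returns the ID of the last spoken command
' and clears the property in the same call.
Public Function FetchSpokenCommand(objCmd As Object) As Long
    '
    FetchSpokenCommand = objCmd.CommandSpoken ' read the command ID
    objCmd.CommandSpoken = 0                  ' clear it right away
    '
End Function
```

With this helper in the modSpeech module, the first two statements of the SRTimer_Timer event could be replaced by lSRCmd = FetchSpokenCommand(objVCmd).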

Those are all the modifications needed to the frmMDI form. Save the form and the project before continuing.

Modifying the NotePad Form

There are only two main modifications to the frmNotePad form. First, you need to add the TTS menu options to the form. Next, you need to add a few lines of code to the Form_Load and Form_Unload events. Once these are done, you are ready to test your finished application.

Adding the Menu Options

You need to add four menu options to the File menu. These four options correspond to the TTS engine service options: Speak, FastForward, Rewind, and StopSpeaking. Refer to Table 20.2 and Figure 20.2 to see how to add these menu options to the frmNotePad form.

Figure 20.2 : Using the Menu Editor to add the TTS engine options.

Table 20.2. Adding the TTS engine options to the menu.
Control     Property   Setting
VB.Menu     Name       mnuVText
            Caption    "&Read"
            Index      0
VB.Menu     Name       mnuVText
            Caption    "&Forward"
            Index      1
            Visible    0 'False
VB.Menu     Name       mnuVText
            Caption    "R&ewind"
            Index      2
            Visible    0 'False
VB.Menu     Name       mnuVText
            Caption    "S&top"
            Index      3
            Visible    0 'False
VB.Menu     Name       mnuFileSp02
            Caption    "-"

Note that all but the first menu option (Read) have their Visible property set to False. You'll show the other options only after the TTS engine starts speaking some text.

After adding the menu object, you need to add some code to the mnuVText_Click event to handle menu selections. Since this is a menu array, you'll only need one line of code to pass the selected service request to the VTextAction routine. Listing 20.12 shows the code you should add to the mnuVText_Click event.


Listing 20.12. Adding code to the mnuVText_Click event.
Private Sub mnuVText_Click(Index As Integer)
    '
    VTextAction Index ' handle vtext service request
    '
End Sub

Adding Code to the Form_Load and Form_Unload Events

You need to add only one line to each of the Form_Load and Form_Unload events. This line forces the TTS engine to stop speaking any text already in progress. That prevents the engine from attempting to speak two sets of text at once.

Listing 20.13 shows the complete Form_Load event with the SAPI-related line at the end. Modify the code in Form_Load to match the code in Listing 20.13.


Listing 20.13. Modifying the Form_Load event.
Private Sub Form_Load()
    Dim i As Integer

    mnuFontName(0).Caption = Screen.Fonts(0)
    For i = 1 To Screen.FontCount - 1
        Load mnuFontName(i)
        mnuFontName(i).Caption = Screen.Fonts(i)
    Next
    '
    VTextAction vTxtStop ' <<< force stop all speaking >>>
    '
End Sub

The code modification to the Form_Unload event is also quite easy. Listing 20.14 shows the whole Form_Unload event code with the SAPI-related line at the end. Modify the Form_Unload event code to match that shown in Listing 20.14.


Listing 20.14. Modifying the Form_Unload event code.
Private Sub Form_Unload(Cancel As Integer)
    FState(Me.Tag).Deleted = True

    ' Hide the toolbar edit buttons if no notepad windows exist.
    If Not AnyPadsLeft() Then
        frmMDI!imgcutbutton.Visible = False
        frmMDI!imgcopybutton.Visible = False
        frmMDI!imgPasteButton.Visible = False
        gToolsHidden = True
        GetRecentFiles
    End If
    '
    VTextAction vTxtStop ' force stop all speaking
    '
End Sub

That is the end of the code modification for the MDINOTE.VBP project. Save this form (MDINOTE.FRM) and the project (MDINOTE.VBP) before beginning to test your new SAPI-enabled version of the MDINOTE project.

Testing the SAPI-Enabled MDI NotePad

You now have a SAPI-enabled MDI NotePad project ready to test.

Once you compile the MDINOTE project, you can view the new voice commands that were added and also test the TTS playback of text documents.

Warning
Be sure your workstation has speech services installed and activated before you start this project. If not, you may encounter an error and may have to reboot your system.

First, start the MDINOTE application and ask your workstation to tell you what commands are available (ask What can I say?). You should see two additional sections in the voice menu. The first one ("MDI NotePad voice commands") was added by the operating system when you first loaded the program. The Windows operating system will automatically create a set of voice commands that matches the menus of the program. Figure 20.3 shows the list of commands automatically built by Windows under the "MDI NotePad..." heading.

Figure 20.3 : Viewing the automatic voice commands.

You'll also see the custom commands added to the command list under the "MDINOTE" heading. These were added by the InitVoice routine in the project.

You can test the SR options of the program by speaking any of the menu commands or custom voice commands. When you say New, you should see a new blank page appear, ready for text input. You can test the TTS services by loading a text document and selecting Read from the menu or speaking the Read voice command. You'll see a set of buttons appear on the toolbar and an expanded list on the File menu (see Figure 20.4).

Figure 20.4 : Viewing the expanded file menu during a Read command.

Summary

In this chapter, you learned how to add SR and TTS services to Visual Basic 4.0 applications using the OLE Voice Command and Voice Text libraries. You modified the MDINOTE project that ships with Visual Basic 4.0 to add options to speak command words and have the TTS engine read loaded text documents back to you.

You also learned that you do not need to add any code to Windows programs in order to make them SR-capable. Every menu option that is declared in a Windows program is automatically loaded as a valid SAPI voice command by the Windows operating system (as long as SAPI services are active). You also learned how to add custom voice commands to speed access to key menu items.

The next chapter is a summary of the SAPI section. The next section of the book describes the Windows telephony application programming interface and how you can use it to control incoming and outgoing voice and data telephone calls.