Text to Speech Conversion

Initializing the pyttsx3 engine

pyttsx3 is a text-to-speech conversion library for Python. The pyttsx3 engine is initialized once and then used for text-to-speech conversion.

import pyttsx3

# Initialize the pyttsx3 engine for text-to-speech conversion
engine = pyttsx3.init()

Unlike alternative libraries, pyttsx3 works offline.

The included text-to-speech engines are:

  • sapi5

  • nsss

  • espeak
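Each of these engines belongs to a platform: sapi5 is used on Windows, nsss on macOS, and espeak on other platforms such as Linux. The sketch below shows one way the driver could be named explicitly through the driverName argument of pyttsx3.init(). The helpers default_driver_for and init_engine are hypothetical names introduced only for illustration; they are not part of pyttsx3 or of this project.

```python
import platform


def default_driver_for(system_name):
    """Return the pyttsx3 driver name usually used on the given OS."""
    drivers = {"Windows": "sapi5", "Darwin": "nsss"}
    # Anything else (Linux, for example) falls back to espeak.
    return drivers.get(system_name, "espeak")


def init_engine():
    """Create an engine with an explicitly named driver (hypothetical helper)."""
    import pyttsx3  # imported lazily so the mapping above works without it

    return pyttsx3.init(driverName=default_driver_for(platform.system()))
```

Calling pyttsx3.init() without arguments normally picks a suitable driver for the platform on its own, so naming the driver is mainly useful for making the choice visible.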

Voice Speed Property

This is the rate, in words per minute, at which Arya or Aryan speaks the text. The default is 200. The rate is currently set to 150 so that the voice is clearer.

engine.setProperty("rate", 150)
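To get a feel for what the rate means, the sketch below estimates how long a sentence takes to speak at a given words-per-minute rate. It is only an approximation under the assumption that every word takes the same time; real engines vary with word length and punctuation. The function estimated_seconds and the sample sentence are made up for illustration.

```python
def estimated_seconds(text, rate_wpm):
    """Roughly estimate how long `text` takes to speak at `rate_wpm`."""
    word_count = len(text.split())
    return word_count / rate_wpm * 60


sentence = "Hello, I am Aryan, your AI friend."
print(estimated_seconds(sentence, 200))  # default rate
print(estimated_seconds(sentence, 150))  # slower rate used here
```

For this seven-word sentence the estimate is about 2.1 seconds at the default rate and about 2.8 seconds at 150, which is why the lower rate sounds more deliberate.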

Voice Assignment

A voice is assigned to the respective AI Friend (Aryan or Arya). The assignment also depends on the operating system: different voices are used depending on whether Aryan is running on Windows or Linux.

For Windows (Dev and Testing)

  • Arya - Zira voice (female)

  • Aryan - David voice (male)

For Linux (Testing and Prod)

  • Arya - 'english_rp+f3'

  • Aryan - 'english_rp+m1'

*No female voice is available among the default 60 voices in the pyaudio lib, so voices from the espeak lib are used on Linux.

import platform

# Registry path under which the Windows voice tokens are stored
WINDOWS_VOICE_TOKENS = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens"
on_windows = platform.system() == "Windows"

if aifriend == "Arya":
    if on_windows:
        # For Windows, the female voice is set to Zira
        engine.setProperty('voice', WINDOWS_VOICE_TOKENS + r"\TTS_MS_EN-US_ZIRA_11.0")
    else:
        # For Linux, the female voice is set to english_rp+f3
        engine.setProperty('voice', 'english_rp+f3')
else:
    if on_windows:
        # For Windows, the male voice is set to David
        engine.setProperty('voice', WINDOWS_VOICE_TOKENS + r"\TTS_MS_EN-US_DAVID_11.0")
    else:
        # For Linux, the male voice is set to english_rp+m1
        engine.setProperty('voice', 'english_rp+m1')
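Hard-coded voice ids only work when that exact voice is installed. As a hedged alternative, the sketch below looks a voice up by keyword among the voice objects that engine.getProperty("voices") returns in pyttsx3, each of which has an id and a name. The helper find_voice_id is hypothetical, and the sample voices are shortened stand-ins so that the example runs without a speech engine.

```python
from types import SimpleNamespace


def find_voice_id(voices, keyword):
    """Return the id of the first voice whose name or id contains `keyword`."""
    needle = keyword.lower()
    for voice in voices:
        if needle in (voice.name or "").lower() or needle in voice.id.lower():
            return voice.id
    return None  # no installed voice matched


# Stand-in voice objects (illustrative ids and names only);
# with pyttsx3 the list would come from engine.getProperty("voices").
sample_voices = [
    SimpleNamespace(id="TTS_MS_EN-US_DAVID_11.0", name="Microsoft David Desktop"),
    SimpleNamespace(id="TTS_MS_EN-US_ZIRA_11.0", name="Microsoft Zira Desktop"),
]
print(find_voice_id(sample_voices, "zira"))
```

If the lookup returns None, the caller can leave the engine's default voice in place instead of setting an id that does not exist.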

Engine Execution

The engine is executed to convert the given text (command) into speech, as in the snippet below.

engine.say(command)  
engine.runAndWait()

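engine.say(command): This function queues the text to be spoken.

engine.runAndWait(): This function processes the queued commands and blocks until the speech has finished, which makes the speech audible on the system.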
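Because say() only queues text and runAndWait() does the actual speaking, several phrases can be queued before a single runAndWait() call. The sketch below illustrates that order of calls with RecordingEngine, a hypothetical stand-in that mimics the two methods so the example runs without audio hardware. It is not how pyttsx3 is implemented, and speak_all is likewise an illustrative helper.

```python
class RecordingEngine:
    """Hypothetical stand-in for a pyttsx3 engine that records calls."""

    def __init__(self):
        self.queued = []
        self.spoken = []

    def say(self, text):
        # Like pyttsx3, only queue the text; nothing is spoken yet.
        self.queued.append(text)

    def runAndWait(self):
        # Process everything queued so far, then empty the queue.
        self.spoken.extend(self.queued)
        self.queued.clear()


def speak_all(engine, commands):
    """Queue every command, then speak them in one blocking call."""
    for command in commands:
        engine.say(command)
    engine.runAndWait()


engine = RecordingEngine()
speak_all(engine, ["Hello", "How can I help?"])
```

With a real engine from pyttsx3.init(), the same speak_all function would speak the phrases one after another and return once the last one has finished.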
