Are you interested in transforming text into spoken words with a few lines of Python code? In this blog post, we’ll explore how to build a simple Text-to-Speech (TTS) application using the gTTS (Google Text-to-Speech) library. This is a fantastic way to get started with natural language processing and voice synthesis.
Why Text-to-Speech?
Text-to-Speech technology converts written text into audio, enabling applications like voice assistants, audiobooks, and more. It can enhance accessibility for visually impaired users and improve user experiences in various applications.
We’ll use the gTTS library, which provides a simple interface to Google Translate’s TTS API.
First, you need to install the gTTS library. Open your terminal or command prompt and run:
pip install gtts
Start by importing the libraries the program needs to take text and convert it into speech:
# Import the libraries
from gtts import gTTS
import os
If an import fails, the library is most likely not installed. You can install any Python library from your terminal with a single command: pip install followed by the package name.
With the libraries installed and imported, create a function that takes the user's text and an optional language code, so the program can convert that text to speech:
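If you are not sure which package is missing, a quick check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name is_installed is made up for this example and is not part of gTTS.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the one third-party package this tutorial needs.
for name in ("gtts",):
    if is_installed(name):
        print(f"{name}: installed")
    else:
        print(f"{name}: missing (try: pip install {name})")
```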
def text_to_speech(text, language='en'):
Next, choose the language of the speech. “en” means English; other codes are available, such as “pt” for Portuguese (older examples use “pt-br”, which recent gTTS releases may report as deprecated):
language = 'en' #English
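Since the language code comes from the user, it may help to check it before handing it to gTTS. The sketch below assumes the tts_langs() function from gtts.lang, which returns a dictionary of supported codes and language names; pick_language and supported_languages are hypothetical helper names used only for illustration.

```python
def pick_language(code, supported, default="en"):
    """Return `code` if it is a supported language code, otherwise `default`."""
    # Codes are case-sensitive (for example "zh-CN"), so only whitespace is trimmed.
    code = (code or "").strip()
    return code if code in supported else default


def supported_languages():
    """Ask gTTS for its language table (a dict of code -> language name)."""
    # Imported here so pick_language can be used and tested without gTTS installed.
    from gtts.lang import tts_langs
    return tts_langs()
```

A call such as pick_language(user_input, supported_languages()) would then fall back to English whenever the user types an unknown code.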
Now pass the text and language to gTTS to convert the text to speech, and store the result in a variable. Setting slow to False tells gTTS to read the text at normal speed; slow=True produces slower speech:
tts = gTTS(text=text, lang=language, slow=False)
Next, save the audio as an MP3 file so that it can be played:
# Save the audio file
audio_file = "output.mp3"
tts.save(audio_file)
Now play the converted audio file. On Windows, use the “start” command followed by the name of the MP3 file; the commented-out lines show the equivalent commands for macOS and Linux:
# Play the audio file
os.system(f"start {audio_file}")     # For Windows
# os.system(f"afplay {audio_file}")  # For macOS
# os.system(f"mpg321 {audio_file}")  # For Linux
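Instead of commenting lines in and out for each operating system, the player can be chosen at run time. The following is a rough sketch, not the only way to do it: player_command and play are hypothetical helpers, and the Linux branch assumes mpg321 is installed.

```python
import platform
import subprocess


def player_command(audio_file, system=None):
    """Build the command that plays `audio_file` on the given operating system."""
    system = system or platform.system()
    if system == "Windows":
        # "start" is a cmd.exe built-in; the empty string is the window title it expects
        return ["cmd", "/c", "start", "", audio_file]
    if system == "Darwin":  # macOS
        return ["afplay", audio_file]
    # Anything else is treated as Linux here; assumes mpg321 is installed
    return ["mpg321", audio_file]


def play(audio_file):
    """Run the platform's player; raises if the player is missing or fails."""
    subprocess.run(player_command(audio_file), check=True)
```

Passing the command as a list rather than a formatted string also avoids problems with file names that contain spaces.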
Now it’s time to add an entry point to execute our application.
# Main section for standalone script execution
if __name__ == "__main__":
    text = input("Enter the text you want to convert to speech: ")
    language = input("Enter the language code (default is 'en' for English): ") or 'en'
    text_to_speech(text, language)
Here’s a well-organized version of the code:
from gtts import gTTS
import os


def text_to_speech(text, language='en'):
    try:
        # Create a gTTS object
        tts = gTTS(text=text, lang=language, slow=False)

        # Save the audio file
        audio_file = "output.mp3"
        tts.save(audio_file)

        # Play the audio file
        os.system(f"start {audio_file}")     # For Windows
        # os.system(f"afplay {audio_file}")  # For macOS
        # os.system(f"mpg321 {audio_file}")  # For Linux
    except Exception as e:
        print(f"An error occurred: {e}")


if __name__ == "__main__":
    text = input("Enter the text you want to convert to speech: ").strip()
    if not text:
        print("Text cannot be empty!")
    else:
        language = input("Enter the language code (default is 'en' for English): ").strip() or 'en'
        text_to_speech(text, language)
Save the above script to a file, for example, tts.py. Then, run the script using Python:
python tts.py
When prompted, enter the text you want to convert to speech and the language code if different from the default (English). The script will generate an audio file (output.mp3) and play it.
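If you would rather pass the text on the command line than type it at a prompt, the standard argparse module is one option. This is only a sketch of an alternative entry point; the --lang flag and the parse_args helper are choices made for this example, not something the script above defines.

```python
import argparse


def parse_args(argv=None):
    """Parse the text and an optional language code from the command line."""
    parser = argparse.ArgumentParser(description="Convert text to speech with gTTS.")
    parser.add_argument("text", help="text to convert to speech")
    parser.add_argument("--lang", default="en", help="language code (default: en)")
    return parser.parse_args(argv)
```

With this in tts.py, the main section could call args = parse_args() and then text_to_speech(args.text, args.lang), so that python tts.py "Hello world" --lang fr runs without any prompts.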
Building the Streamlit Application
import streamlit as st
from gtts import gTTS
import os


# Function to convert text to speech
def text_to_speech(text, language='en'):
    try:
        # Create a gTTS object
        tts = gTTS(text=text, lang=language, slow=False)

        # Save the audio file
        audio_file = "output.mp3"
        tts.save(audio_file)
        return audio_file
    except Exception as e:
        st.error(f"An error occurred: {e}")
        return None


# Streamlit app
st.title("Text to Speech App")

# Input text
text = st.text_area("Enter the text you want to convert to speech:")

# Input language
language = st.text_input("Enter the language code (default is 'en' for English):", 'en')

if st.button("Convert to Speech"):
    if text.strip():
        audio_file = text_to_speech(text, language)
        if audio_file:
            # Display audio player
            st.audio(audio_file, format='audio/mp3')

            # Provide a download link
            with open(audio_file, "rb") as file:
                btn = st.download_button(
                    label="Download Audio",
                    data=file,
                    file_name="output.mp3",
                    mime="audio/mp3"
                )
    else:
        st.error("Text cannot be empty!")
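Because every visitor of the app writes to the same output.mp3, two people converting text at the same time may overwrite each other's file. One possible variation keeps the audio in memory instead, using gTTS's write_to_fp method. The function name text_to_mp3_bytes and its tts_class parameter are assumptions made for this sketch; the parameter exists only so the helper can be tried out without calling the real service.

```python
import io


def text_to_mp3_bytes(text, language="en", tts_class=None):
    """Return the MP3 data for `text` as bytes, without writing a file to disk."""
    if tts_class is None:
        # Imported lazily so the helper can be exercised with a stand-in class
        from gtts import gTTS
        tts_class = gTTS
    buffer = io.BytesIO()
    # write_to_fp streams the audio into any file-like object
    tts_class(text=text, lang=language, slow=False).write_to_fp(buffer)
    return buffer.getvalue()
```

The returned bytes can be handed to st.audio(data, format='audio/mp3') and to the data argument of st.download_button, in place of the file path and the open file used above.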
Save the script above as app.py and make sure Streamlit is installed (pip install streamlit). To run your Streamlit app, navigate to the directory where you saved app.py and run:
streamlit run app.py
I hope you liked this article on converting text to speech using Python. Feel free to ask your questions in the comments section below.