Text-to-Speech Application with Python

Are you interested in transforming text into spoken words with a few lines of Python code? In this blog post, we’ll explore how to build a simple Text-to-Speech (TTS) application using the gTTS (Google Text-to-Speech) library. This is a fantastic way to get started with natural language processing and voice synthesis.

Why Text-to-Speech?

Text-to-Speech technology converts written text into audio, enabling applications like voice assistants, audiobooks, and more. It can enhance accessibility for visually impaired users and improve user experiences in various applications.

We’ll use the gTTS library, which provides a simple interface to Google Translate’s TTS API. Because it calls an online service, the script needs an internet connection to work.

First, you need to install the gTTS library. Open your terminal or command prompt and run:

pip install gtts

We start by importing the libraries the program needs, gTTS for the conversion and os for playing the result:

# Import the libraries
from gtts import gTTS
import os

If an import fails, the library is probably not installed. You can install any Python package from your terminal with pip install followed by the package name. The os module ships with Python and needs no installation.
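As a quick check before running the script, you can ask Python which of the required packages it cannot find. The sketch below is one way to do that with the standard library; the helper name missing_packages is made up for this example and is not part of gTTS.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that Python cannot locate."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


# "gtts" is the import name of the gTTS package; "os" ships with Python.
not_installed = missing_packages(["gtts", "os"])
if not_installed:
    print("Install these first: pip install " + " ".join(not_installed))
```

If nothing is printed, both modules can be imported.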

With the libraries installed and imported, we create a function that takes the user’s text and a language code and converts the text to speech.

def text_to_speech(text, language='en'):

The language parameter selects the language of the speech. Its default, “en”, means English. You can also pass “pt” for Portuguese, “fr” for French, and many others. Written as a standalone assignment, the choice looks like this:

language = 'en'  # English
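If the language code comes from user input, it can help to check it before calling gTTS. The following is a minimal sketch, assuming you pass in a dictionary of supported codes; gTTS exposes one through gtts.lang.tts_langs(), while the helper name pick_language and the sample dictionary are hypothetical.

```python
def pick_language(code, supported, default="en"):
    """Return the matching key from `supported`, or `default` if none matches."""
    wanted = (code or "").strip().lower()
    # Compare case-insensitively but return the code exactly as it is spelled
    # in the dictionary (some codes contain capitals).
    by_lowercase = {key.lower(): key for key in supported}
    return by_lowercase.get(wanted, default)


# Illustrative subset only; with gTTS installed you could instead use:
#     from gtts.lang import tts_langs
#     supported = tts_langs()
supported = {"en": "English", "pt": "Portuguese", "fr": "French"}

print(pick_language(" PT ", supported))     # prints: pt
print(pick_language("klingon", supported))  # prints: en
```

Falling back to a default keeps the script running on a typo; raising an error instead would be an equally reasonable choice.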

Next we pass the text and language to gTTS, which creates the object that performs the conversion, and store it in a variable. Setting slow to False tells gTTS to read the text at normal speed rather than slowly:

tts = gTTS(text=text, lang=language, slow=False)

After that we save the audio to an MP3 file so that we can play it.

# Save the audio file
audio_file = "output.mp3"
tts.save(audio_file)
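Because the file name is fixed, every run overwrites output.mp3. If you would rather keep earlier results, one option is to derive the name from the text itself. This is a sketch, not part of gTTS; the function name output_name and the naming scheme are assumptions for illustration.

```python
import hashlib
import re


def output_name(text, language="en", max_words=4):
    """Build a file name such as 'hello-world-en-<hash>.mp3' from the text."""
    # Keep the first few words, lowercased, with only ASCII letters and digits.
    words = re.findall(r"[a-z0-9]+", text.lower())[:max_words]
    slug = "-".join(words) or "speech"
    # A short hash of the full text keeps different inputs apart.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{language}-{digest}.mp3"


audio_file = output_name("Hello, world!")
print(audio_file)  # starts with "hello-world-en-" and ends with ".mp3"
```

The result can be passed to tts.save() in place of the fixed name.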

Now let’s play the converted audio file. On Windows, use the “start” command followed by the name of the MP3 file. The commented lines show the equivalent commands for macOS and Linux:

# Play the audio file
os.system(f"start {audio_file}")  # For Windows
# os.system(f"afplay {audio_file}")  # For macOS
# os.system(f"mpg321 {audio_file}")  # For Linux
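Rather than commenting lines in and out by hand, the script can pick the player from the operating system it runs on. The sketch below uses the same three commands as above and assumes that mpg321 is installed on Linux; the function names playback_command and play are invented for this example.

```python
import platform
import subprocess


def playback_command(audio_file, system=None):
    """Return the command (as an argument list) that plays `audio_file`."""
    system = system or platform.system()
    if system == "Windows":
        # "start" is built into cmd.exe; the empty string is the window title.
        return ["cmd", "/c", "start", "", audio_file]
    if system == "Darwin":  # macOS
        return ["afplay", audio_file]
    # Anything else is treated as Linux; assumes mpg321 is installed.
    return ["mpg321", audio_file]


def play(audio_file):
    """Play the file with the player chosen for the current system."""
    subprocess.run(playback_command(audio_file), check=False)
```

Calling play("output.mp3") then works unchanged on all three systems. Passing the arguments as a list also avoids trouble with file names that contain spaces.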

Now it’s time to add an entry point to execute our application.

# Main section for standalone script execution
if __name__ == "__main__":
    text = input("Enter the text you want to convert to speech: ")
    language = input("Enter the language code (default is 'en' for English): ") or 'en'
    text_to_speech(text, language)

Here’s a well-organized version of the code

from gtts import gTTS
import os

def text_to_speech(text, language='en'):
    try:
        # Create a gTTS object
        tts = gTTS(text=text, lang=language, slow=False)
        
        # Save the audio file
        audio_file = "output.mp3"
        tts.save(audio_file)
        
        # Play the audio file
        os.system(f"start {audio_file}")  # For Windows
        # os.system(f"afplay {audio_file}")  # For macOS
        # os.system(f"mpg321 {audio_file}")  # For Linux
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    text = input("Enter the text you want to convert to speech: ").strip()
    if not text:
        print("Text cannot be empty!")
    else:
        language = input("Enter the language code (default is 'en' for English): ").strip() or 'en'
        text_to_speech(text, language)

Save the above script to a file, for example, tts.py. Then, run the script using Python:

python tts.py

When prompted, enter the text you want to convert to speech and the language code if different from the default (English). The script will generate an audio file (output.mp3) and play it.
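If you prefer to pass the text on the command line instead of answering prompts, the standard argparse module can replace the input() calls. The sketch below only parses the arguments; the option names --lang and --slow are choices made for this example.

```python
import argparse


def build_parser():
    """Create a parser for: python tts.py "some text" [--lang CODE] [--slow]."""
    parser = argparse.ArgumentParser(description="Convert text to speech.")
    parser.add_argument("text", help="the text to convert")
    parser.add_argument("--lang", default="en", help="language code (default: en)")
    parser.add_argument("--slow", action="store_true", help="read the text slowly")
    return parser


# Parsing an explicit list here; call parse_args() with no arguments
# in a real script to read the actual command line.
args = build_parser().parse_args(["Hello there", "--lang", "pt"])
print(args.text, args.lang, args.slow)  # Hello there pt False
```

The parsed values can then be handed to text_to_speech(args.text, args.lang).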

Building the Streamlit Application

import streamlit as st
from gtts import gTTS
import os

# Function to convert text to speech
def text_to_speech(text, language='en'):
    try:
        # Create a gTTS object
        tts = gTTS(text=text, lang=language, slow=False)
        
        # Save the audio file
        audio_file = "output.mp3"
        tts.save(audio_file)
        
        return audio_file
    except Exception as e:
        st.error(f"An error occurred: {e}")
        return None

# Streamlit app
st.title("Text to Speech App")

# Input text
text = st.text_area("Enter the text you want to convert to speech:")

# Input language
language = st.text_input("Enter the language code (default is 'en' for English):", 'en')

if st.button("Convert to Speech"):
    if text.strip():
        audio_file = text_to_speech(text, language)
        if audio_file:
            # Display audio player
            st.audio(audio_file, format='audio/mp3')
            
            # Provide a download link
            with open(audio_file, "rb") as file:
                btn = st.download_button(
                    label="Download Audio",
                    data=file,
                    file_name="output.mp3",
                    mime="audio/mp3"
                )
    else:
        st.error("Text cannot be empty!")
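Writing every result to output.mp3 means that two people using the app at the same time would share one file. One alternative is to keep the audio in memory using gTTS’s write_to_fp() method. The following is a sketch under that assumption; the helper name synthesize_to_bytes and its tts_factory parameter are hypothetical and exist only to make the function easy to test without a network connection.

```python
from io import BytesIO


def synthesize_to_bytes(text, language="en", tts_factory=None):
    """Return the MP3 data for `text` as bytes, without touching the disk."""
    if tts_factory is None:
        # Imported lazily so the helper can be exercised offline with a stand-in.
        from gtts import gTTS
        tts_factory = gTTS
    buffer = BytesIO()
    # write_to_fp streams the MP3 data into any writable file-like object.
    tts_factory(text=text, lang=language, slow=False).write_to_fp(buffer)
    return buffer.getvalue()
```

The returned bytes can be passed directly to st.audio(audio_bytes, format='audio/mp3') and to the data argument of st.download_button, so no file needs to be written or reopened.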

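Save the code above as app.py and install Streamlit with pip install streamlit if you have not already. To run your Streamlit app, navigate to the directory where you saved app.py and run: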

streamlit run app.py

I hope you liked this article on converting text to speech using Python. Feel free to ask your questions in the comments section below.

Shivan Kumar

Kaggle Master & Senior Data Scientist (Ambitious, Adventurous, Attentive)

