What is a Text to Speech conversion project?

Text to speech conversion is a simple application that takes text in any written form, such as a PDF or a text file, and creates an audio file that you can listen to at your own pace. With the help of this application you can create your own audiobooks.

How does text to speech conversion work?

The application encodes each word as a unique vector, and a neural network then combines these vectors and converts them into voice.
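To make the idea of encoding words as vectors a little more concrete, here is a toy sketch that gives every distinct word its own one-hot vector. It illustrates the first step only and is not how gTTS or any production speech system works internally, since real systems learn much richer representations. The function names build_vocabulary and one_hot are made up for this illustration.

```python
def build_vocabulary(text):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a vector with a single 1 at the position assigned to the word."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


vocab = build_vocabulary("the cat saw the dog")
print(vocab)                  # {'the': 0, 'cat': 1, 'saw': 2, 'dog': 3}
print(one_hot("cat", vocab))  # [0, 1, 0, 0]
```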

Build a Text to Speech Conversion API using Python

Hello, I hope you are doing well. It is a delight to build an application in which we tell the machine to do something for us, and Python, one of the most popular languages, makes such projects easy to create. Are you curious how a machine is able to speak, and how it is able to understand human language? Tools you use today, such as Alexa, Siri, and Google Assistant, rely on a technology called NLP (Natural Language Processing). If you want to know more about NLP, please read this blog. Before going deeper, though, it is important to get to know the technology closely, and that is what this article will help you with. We are going to implement a mini project called text to speech conversion using plain Python. So fasten your seat belt, start your Python IDE, and get ready to learn and code.

Overview of Text to Speech

Text to speech is the process of converting text into voice. A text to speech project takes text from a file, or text we provide directly, and converts it into voice with the help of Google's prebuilt library. The project is very helpful for anyone who struggles with reading PDFs and documents. You can also build your own text reader, where you upload any PDF and it reads the document aloud for you.

Hands-on: Implementing the Text to Speech API

First, we will implement a basic application that converts text, either typed directly or read from a text file, into voice. For this, you need to install the Google text to speech library, known as gTTS. You can install it using the pip command.

pip install gTTS
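A minimal sketch of such a basic program might look like the following. The helper names read_text, save_speech, and play_with_default_app, as well as the file names, are assumptions made for this illustration. The gTTS import sits inside the function so that the file-reading helper can be used on its own, and the os-based playback shown here works on Windows only.

```python
import os


def read_text(path):
    """Return the contents of a UTF-8 text file as a single string."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def save_speech(text, out_path="output.mp3", lang="en", slow=False):
    """Convert text to speech with gTTS and save it as an MP3 file."""
    # Imported here so read_text() stays usable without gTTS installed.
    from gtts import gTTS

    speech = gTTS(text=text, lang=lang, slow=slow)
    speech.save(out_path)  # gTTS needs an internet connection for this step
    return out_path


def play_with_default_app(path):
    """Open the audio file in the default player (os.startfile is Windows only)."""
    os.startfile(path)


# Example usage with hypothetical file names:
# text = read_text("story.txt")
# play_with_default_app(save_speech(text, "story.mp3"))
#
# Reading French text instead:
# save_speech("Bonjour tout le monde", "bonjour.mp3", lang="fr")
```

If you want to see which language codes your installed version supports, gtts.lang.tts_langs() returns them as a dictionary of code and language name.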

This is a simple program that helps you read all your PDFs and text. You can give it the name of your PDF or the directory where it is stored, and it will start reading for you. We can also change the language as per our preference; for example, 'fr' is used for French. You can look up other language codes online. Now you must be wondering whether we have to edit and run the program every time we want to use it. The answer is no! Every project has a user interface in the form of a GUI, a website, or an app, so let's implement a user interface for the project.

Implementing a GUI for the Text to Speech API using Tkinter

Tkinter is a popular Python library for building GUIs, and we will use it to implement a simple GUI for text to speech conversion.

Step-1) Import libraries and set the window

The first step is to import all the required libraries: gtts, Tkinter, and playsound. If you do not have gtts or playsound installed, you can install them using the pip command in the command prompt. Tkinter is part of Python's standard library, so it is not installed with pip. playsound is a Python module used to play audio. It has only one function, named playsound, whose required argument is the path or URL of the audio file. It works with both MP3 and WAV files.

pip install gtts
pip install playsound

Now that the libraries are installed, let's write the code of the application. Create a new Python file and start implementing the GUI.
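As a rough sketch, the window setup for this step could be written as below. The title, the window size, and the function name create_window are assumptions made for illustration. In this sketch only Tkinter is imported up front, and gtts and playsound would be imported where they are actually used.

```python
from tkinter import Tk

WINDOW_TITLE = "Text to Speech"  # assumed title
WINDOW_SIZE = "400x300"          # assumed width x height in pixels


def create_window():
    """Create the top-level window and set its basic properties."""
    root = Tk()
    root.title(WINDOW_TITLE)
    root.geometry(WINDOW_SIZE)
    root.resizable(False, False)  # keep the layout fixed
    return root
```

Wrapping the setup in a function keeps the file importable without opening a window. A script would simply call root = create_window() near the top.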

Step-2) Initialize the window with header and footer

Define the top-level header and bottom footer of our application to make it look attractive.

Label(): widget used to add a label to the window.
font: the font style in which the text is written.
pack(): a Tkinter geometry manager that organizes widgets in blocks.
Msg: a string-type variable.
Entry(): used to create an input text field (text area).
textvariable: links the entry widget to a variable so that its current text can be retrieved.
place(): positions a widget at a specific location by defining x and y coordinates.
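The widgets listed above could be put together roughly as follows. The label texts, fonts, coordinates, and the function name add_header_footer are assumptions chosen for illustration, not the original layout.

```python
from tkinter import Entry, Label, StringVar


def add_header_footer(root):
    """Add a header, an input field, and a footer to the given window."""
    # Header at the top of the window.
    Label(root, text="TEXT TO SPEECH", font="arial 20 bold").pack()

    # Footer at the bottom of the window.
    Label(root, text="Built with gTTS and Tkinter", font="arial 10").pack(side="bottom")

    # Prompt and text field, placed at explicit coordinates.
    Label(root, text="Enter text", font="arial 12").place(x=20, y=60)
    msg = StringVar(root)  # holds whatever the user types
    entry_field = Entry(root, textvariable=msg, width=50)
    entry_field.place(x=20, y=100)

    return msg, entry_field
```

The function returns the variable and the entry widget so that later steps can read or clear the text.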

Step-3) Implement Functions

Now, we will implement the main functionality of our application. The first function converts text to speech. The second function exits the application, and the third resets the text.

message: stores the value from entry_field.
text: the sentence that should be read.
lang: the language in which the text should be read.
slow: used to read the text more slowly. By default it is False.
speech: stores the voice converted from the text.
.save() function: creates and saves a new file with the output in MP3 format.
playsound(): used to play the sound. We can also use the os module to start playback, as we saw in the snippet above.
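A possible shape for the three functions is sketched below. The function names, the default file name speech.mp3, and the check for empty input are assumptions made for this illustration. The functions take the text variable and the window as parameters, so they work with any object that offers get() and set().

```python
def text_to_speech(msg, lang="en", out_path="speech.mp3"):
    """Read the text held in msg aloud; return the saved file path, or None if empty."""
    message = msg.get()
    if not message.strip():
        return None  # nothing to read, so skip the conversion

    # Imported here so the other functions work without these packages.
    from gtts import gTTS
    from playsound import playsound

    speech = gTTS(text=message, lang=lang, slow=False)
    speech.save(out_path)  # needs an internet connection
    playsound(out_path)
    return out_path


def reset(msg):
    """Clear the text in the entry field."""
    msg.set("")


def exit_app(root):
    """Close the window and end the application."""
    root.destroy()
```

Returning early on empty input avoids asking gTTS to convert an empty string, which it refuses to do.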

Step-4) Define Buttons

Now that we have written the functions, it's time to call them and put them to use. We will create three buttons in a row, one for each function.

Button(): widget used to display a button on the window.
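One way to lay out the three buttons in a row is sketched here. The button captions, fonts, coordinates, and the function name add_buttons are assumptions for illustration. The callbacks are passed in as parameters, so you can hand over your own play, reset, and exit functions.

```python
from tkinter import Button


def add_buttons(root, on_play, on_reset, on_exit):
    """Place PLAY, RESET, and EXIT buttons side by side and return them."""
    row_y = 160  # assumed vertical position of the button row
    play_button = Button(root, text="PLAY", font="arial 12 bold",
                         command=on_play, width=6)
    reset_button = Button(root, text="RESET", font="arial 12 bold",
                          command=on_reset, width=6)
    exit_button = Button(root, text="EXIT", font="arial 12 bold",
                         command=on_exit, width=6)

    play_button.place(x=20, y=row_y)
    reset_button.place(x=120, y=row_y)
    exit_button.place(x=220, y=row_y)
    return play_button, reset_button, exit_button
```

Each command must be a callable that takes no arguments, so a function that needs parameters is usually wrapped in a lambda before it is passed in.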

That's the end of our application. Finally, call the mainloop() method to keep the window open after the program starts.

root.mainloop()

Run the program from the terminal using the command below in the working directory. The final output is shown in the demo output image.

>>python text_to_speech.py

Summary

We have implemented a simple application that helps us read our PDFs aloud and understand how a machine, such as Google Assistant, speaks in our native language. In the upcoming article, we will see how to recognize speech and convert speech to text, as well as how to build language translators with voice cloning. If you have any queries or suggestions, please let us know. It will help us and others learn something new and different.
Be safe, keep learning, keep smiling😊

Thank you!
