Real Voice – AI Text To Speech Plugin For WordPress
Plugin Description
Real Voice is a versatile text-to-speech plugin for WordPress. It supports five text-to-speech services (the browser's SpeechSynthesis, Amazon Polly, Google Text-to-Speech AI, Azure Text to speech, and ElevenLabs) in a single $29 package.
It comes with a customizable audio player, a dedicated dashboard to monitor the API calls to the text-to-speech services, and many customization options.
Supported Text to Speech Services
SpeechSynthesis (Web Speech API)
This option uses the text-to-speech functionality built into the browser, with no need to subscribe to a cloud service.
Technically, the conversion is performed with the SpeechSynthesis interface of the Web Speech API. Since all major browsers now support it, speech synthesis is a viable option for production websites.
The options exposed by SpeechSynthesis can be set in the Real Voice plugin settings:
- Language – With this option, you can select the language of the utterance.
- Voice – Select one of the voices available in the user’s browser.
- Pitch – The pitch value determines the perceived “highness” or “lowness” of the generated voice.
- Rate – Select the speed at which the utterance should be spoken.
- Volume – Set the volume at which the utterance is spoken.
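As a rough illustration of how these options map onto the Web Speech API (not the plugin's actual code), the sketch below copies a settings object onto an utterance and clamps each value to the range the API accepts. The `VoiceSettings` shape and the `applyVoiceSettings` helper are hypothetical names introduced for this example.

```typescript
// Hypothetical settings object mirroring the options listed above.
interface VoiceSettings {
  lang: string;   // BCP 47 language tag, e.g. "en-US"
  pitch: number;  // Web Speech API range: 0 to 2 (default 1)
  rate: number;   // Web Speech API range: 0.1 to 10 (default 1)
  volume: number; // Web Speech API range: 0 to 1 (default 1)
}

// Structural stand-in for SpeechSynthesisUtterance, so the helper can
// also be exercised outside a browser.
interface UtteranceLike {
  lang: string;
  pitch: number;
  rate: number;
  volume: number;
}

const clampTo = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Copies the settings onto the utterance, clamping each numeric value
// to the range accepted by the Web Speech API.
function applyVoiceSettings<T extends UtteranceLike>(
  utterance: T,
  settings: VoiceSettings
): T {
  utterance.lang = settings.lang;
  utterance.pitch = clampTo(settings.pitch, 0, 2);
  utterance.rate = clampTo(settings.rate, 0.1, 10);
  utterance.volume = clampTo(settings.volume, 0, 1);
  return utterance;
}

// In a browser, the helper could be used like this:
//   const utterance = applyVoiceSettings(new SpeechSynthesisUtterance(text), settings);
//   utterance.voice = speechSynthesis.getVoices().find(v => v.lang === settings.lang) ?? null;
//   speechSynthesis.speak(utterance);
```

Because the list of voices comes from `speechSynthesis.getVoices()`, it depends on the visitor's browser and operating system, which is why the Voice option refers to the voices available in the user's browser.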
Amazon Polly
Amazon Polly is a cloud service, part of Amazon Web Services (AWS), that converts text into spoken audio.
It supports a wide selection of standard (TTS) and neural (NTTS) voices across many languages.
In the Real Voice plugin, we included all the essential options to get the most out of Amazon Polly. Specifically, you will be able to configure:
- AWS Region – The AWS region that you prefer to use.
- Voice ID – Select one of the many voices available in Amazon Polly.
- Engine – Select between Standard and Neural. We recommend using the Neural engine to produce the most natural and human-like text-to-speech voices possible.
- Language Code – Select one of the language codes supported by Amazon Polly.
- Lexicon Names – Here, you can set the lexicon names you want to apply during the synthesis.
- Output Format – Select between mp3 and ogg_vorbis.
- Sample Rate – Multiple sample rates are available.
- Text Type – Plain text and SSML are supported.
For more information, see the Amazon Polly features page.
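The Amazon Polly options above correspond closely to the parameters of Polly's SynthesizeSpeech operation. The sketch below shows one way such settings could be turned into a request; the `PollySettings` shape and the `buildPollyParams` helper are hypothetical and are not taken from the plugin.

```typescript
// Field names follow the parameters of Amazon Polly's SynthesizeSpeech operation.
interface PollySynthesizeParams {
  Engine: "standard" | "neural";
  LanguageCode?: string;
  LexiconNames?: string[];
  OutputFormat: "mp3" | "ogg_vorbis";
  SampleRate?: string; // Polly expects the sample rate as a string, e.g. "22050"
  Text: string;
  TextType: "text" | "ssml";
  VoiceId: string;
}

// Hypothetical shape for the values saved in the plugin settings.
// The AWS region is not part of the request itself; it configures the client.
interface PollySettings {
  voiceId: string;
  engine: "standard" | "neural";
  languageCode?: string;
  lexiconNames: string[];
  outputFormat: "mp3" | "ogg_vorbis";
  sampleRate?: number;
  textType: "text" | "ssml";
}

// Builds the request parameters, leaving out optional fields that are unset.
function buildPollyParams(text: string, s: PollySettings): PollySynthesizeParams {
  const params: PollySynthesizeParams = {
    Engine: s.engine,
    OutputFormat: s.outputFormat,
    Text: text,
    TextType: s.textType,
    VoiceId: s.voiceId,
  };
  if (s.languageCode) params.LanguageCode = s.languageCode;
  if (s.lexiconNames.length > 0) params.LexiconNames = s.lexiconNames;
  if (s.sampleRate !== undefined) params.SampleRate = String(s.sampleRate);
  return params;
}
```

With the AWS SDK for JavaScript v3, an object like this could be passed to `SynthesizeSpeechCommand` on a `PollyClient` created for the chosen region.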
Google Text-to-Speech AI
Text-to-Speech AI is a service available in Google Cloud that converts text into natural-sounding speech using an API powered by the best of Google’s AI technologies.
This service supports a large number of voices and languages. Google categorizes the voices based on the technology used to produce them. Technical details are available in the Google Cloud Text-to-Speech documentation.
It’s easy to configure the audio generated by Google Text-to-Speech AI with the options included in the Real Voice settings:
- Audio Encoding – This option allows you to select the encoding of the audio files.
- Speaking Rate – Here, you can select the speed at which the utterance is spoken.
- Pitch – Select the relative highness or lowness of the voice.
- Gain – The volume gain, in dB, applied to the generated audio.
- Effects Profile ID – With this option, you can apply specific audio profiles to the generated speech.
- Language Code – Here, you can select the language of the utterance.
- Voice Name – Use this field to choose one of the many voices the service provides.
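To make the list above more concrete, here is a minimal sketch of the JSON body that the Cloud Text-to-Speech `text:synthesize` REST method accepts, assembled from settings like these. The `GoogleTtsSettings` shape and the `buildGoogleSynthesizeBody` helper are hypothetical names used only for illustration.

```typescript
// REST endpoint of the Cloud Text-to-Speech v1 synthesize method.
const GOOGLE_TTS_ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize";

// Hypothetical shape for the values saved in the plugin settings.
interface GoogleTtsSettings {
  audioEncoding: string;      // e.g. "MP3" or "OGG_OPUS"
  speakingRate: number;       // 1.0 is the normal speed
  pitch: number;              // semitones relative to the default pitch
  volumeGainDb: number;       // gain in dB relative to the default volume
  effectsProfileId: string[]; // e.g. ["headphone-class-device"]
  languageCode: string;       // e.g. "en-US"
  voiceName: string;          // e.g. "en-US-Wavenet-D"
}

// Builds the request body: the text to speak, the voice selection,
// and the audio configuration.
function buildGoogleSynthesizeBody(text: string, s: GoogleTtsSettings) {
  return {
    input: { text },
    voice: { languageCode: s.languageCode, name: s.voiceName },
    audioConfig: {
      audioEncoding: s.audioEncoding,
      speakingRate: s.speakingRate,
      pitch: s.pitch,
      volumeGainDb: s.volumeGainDb,
      effectsProfileId: s.effectsProfileId,
    },
  };
}
```

The body would be sent as JSON in a POST request to `GOOGLE_TTS_ENDPOINT`; the response carries the audio as a base64 string in its `audioContent` field.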
Azure Text to speech
Text to speech is a service available in Microsoft Azure that converts text to lifelike speech.
This powerful service comes with a wide variety of voices that you can test in the voice gallery.
Let’s see the Azure Text to speech options included in the Real Voice plugin:
- Region – Select the Azure region that best fits your needs.
- User Agent – A custom value used to identify the requests performed by the Real Voice plugin to the cloud service.
- Output Format – The format in which the audio files should be encoded. This option determines the quality and the space occupied by the generated audio files.
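The three options above each end up in a different part of a request to the Azure text to speech REST API: the region selects the host, while the user agent and output format travel as headers. The sketch below assembles such a request under those assumptions; `AzureTtsSettings`, `buildAzureRequest`, and the language, voice name, and subscription key parameters are hypothetical and are not part of the options listed.

```typescript
// Escapes characters that have a special meaning in XML, so arbitrary
// post text can be embedded safely in the SSML body.
const escapeXml = (value: string): string =>
  value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");

// Hypothetical shape for the values saved in the plugin settings.
interface AzureTtsSettings {
  region: string;       // e.g. "westeurope"
  userAgent: string;    // identifies the calling application
  outputFormat: string; // e.g. "audio-24khz-48kbitrate-mono-mp3"
}

// Builds a description of the HTTP request without sending it.
function buildAzureRequest(
  text: string,
  lang: string,
  voiceName: string,
  subscriptionKey: string,
  s: AzureTtsSettings
) {
  const ssml =
    `<speak version="1.0" xml:lang="${escapeXml(lang)}">` +
    `<voice name="${escapeXml(voiceName)}">${escapeXml(text)}</voice>` +
    `</speak>`;
  return {
    url: `https://${s.region}.tts.speech.microsoft.com/cognitiveservices/v1`,
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": subscriptionKey,
      "Content-Type": "application/ssml+xml",
      "X-Microsoft-OutputFormat": s.outputFormat,
      "User-Agent": s.userAgent,
    },
    body: ssml,
  };
}
```

Lower-bitrate output formats produce smaller files at the cost of audio quality, which is the trade-off the Output Format option controls.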
ElevenLabs
ElevenLabs is a software company developing natural-sounding speech synthesis and text-to-speech software using artificial intelligence and deep learning.
This service can generate audio in multiple languages using its AI models.
In Real Voice, we included these ElevenLabs options:
- Voice ID – This option determines the voice to be used.
- Optimize Streaming Latency – Use this option to reduce the latency of the audio generation, at some cost in quality.
- Stability – Select how stable the voice is and the randomness between each generation.
- Similarity Boost – Optimize for clear, artifact-free voices or enhance for speaker resemblance.
- Style – Select the style of the voice.
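As an illustration of where these options go in a request, the sketch below assembles a call to the ElevenLabs text-to-speech REST endpoint. It assumes that the voice settings take values between 0 and 1 and that the latency option is an integer from 0 to 4; the `ElevenLabsSettings` shape and the `buildElevenLabsRequest` helper are hypothetical.

```typescript
// Hypothetical shape for the values saved in the plugin settings.
interface ElevenLabsSettings {
  voiceId: string;
  optimizeStreamingLatency: number; // assumed integer range: 0 to 4
  stability: number;                // assumed range: 0 to 1
  similarityBoost: number;          // assumed range: 0 to 1
  style: number;                    // assumed range: 0 to 1
}

// Restricts a value to the 0..1 interval.
const toUnitInterval = (value: number): number => Math.min(1, Math.max(0, value));

// Builds a description of the HTTP request without sending it.
function buildElevenLabsRequest(text: string, apiKey: string, s: ElevenLabsSettings) {
  const latency = Math.min(4, Math.max(0, Math.round(s.optimizeStreamingLatency)));
  return {
    url:
      `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(s.voiceId)}` +
      `?optimize_streaming_latency=${latency}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text,
      voice_settings: {
        stability: toUnitInterval(s.stability),
        similarity_boost: toUnitInterval(s.similarityBoost),
        style: toUnitInterval(s.style),
      },
    }),
  };
}
```

Clamping before sending is a design choice of this sketch: it keeps an out-of-range setting from turning into a rejected API call.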
Customize the audio player
We built a custom HTML audio player that can be customized by the user from the plugin settings.
For example, you can set the color of every element in the player and adjust its typography: font size, font style, font weight, line height, and font family. You can even load a custom font family from Google Fonts, and add a drop shadow in a color of your choice.
Our custom audio player is also responsive. From the plugin options, you can set the responsive breakpoint used to switch the player UI from desktop to mobile.
Monitor your API calls from a dedicated dashboard
The Dashboard menu allows you to monitor the requests sent by the plugin to the cloud services used to generate the audio version of the articles.
Specifically, here you can:
- Read summary statistics like the total number of requests and the number of characters sent in a specific period.
- Visualize the requests to the API in a line chart.
- Browse individual API requests in a paginated table, which includes the logged message returned by the cloud service.
- Filter the data by time interval or by text-to-speech service.
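The statistics described above can be pictured as simple operations over a log of requests. The sketch below is an illustration only: the `ApiRequestLog` record shape and both helpers are hypothetical and say nothing about how the plugin actually stores or queries its data.

```typescript
// Hypothetical shape of one logged API request.
interface ApiRequestLog {
  service: string;    // e.g. "amazon-polly"
  characters: number; // number of characters sent in the request
  timestamp: number;  // milliseconds since the Unix epoch
  message: string;    // message returned by the cloud service
}

// Time interval (inclusive) and optional service to filter by.
interface LogFilter {
  from: number;
  to: number;
  service?: string;
}

// Keeps only the requests inside the interval and, if given, for one service.
function filterRequests(logs: ApiRequestLog[], filter: LogFilter): ApiRequestLog[] {
  return logs.filter(
    (log) =>
      log.timestamp >= filter.from &&
      log.timestamp <= filter.to &&
      (filter.service === undefined || log.service === filter.service)
  );
}

// Summary statistics: total requests and total characters sent.
function summarizeRequests(logs: ApiRequestLog[]) {
  return {
    totalRequests: logs.length,
    totalCharacters: logs.reduce((sum, log) => sum + log.characters, 0),
  };
}

// Returns one page of rows; pages are numbered from 1.
function paginate<T>(rows: T[], page: number, perPage: number): T[] {
  const start = (page - 1) * perPage;
  return rows.slice(start, start + perPage);
}
```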
Configure the plugin behavior
For the Real Voice plugin, we built a settings menu with React that currently offers 65 customization options. Here is what you can do from this menu.
Configure the post types where you want to add the audio player
The plugin allows you to enable the audio player only on specific post types. For example, if you want to add the text-to-speech audio player to your blog articles and exclude standard pages, add “Posts” in this selector.
Add custom text before or after the audio player
You can optionally display a custom message before or after the player. This message can be used, for example, to inform the visitor about the possibility of listening to the audio version of the article.
Display the spoken words
You can optionally display the words currently spoken by the player. Note that this feature is available only with the SpeechSynthesis audio player.
Additional audio content
Configure specific text that should be spoken before or after the post’s content.
Read the title
Automatically prepend the post’s title to the content that should be spoken.
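Taken together, the title and additional-content options determine the text that is sent to the text-to-speech service. The sketch below shows one plausible way to assemble it. The order used here (title, then the text before, the content, and the text after) is an assumption, and `SpokenTextOptions` and `buildSpokenText` are hypothetical names.

```typescript
// Hypothetical options mirroring the settings described above.
interface SpokenTextOptions {
  readTitle: boolean; // prepend the post's title
  textBefore: string; // extra text spoken before the content
  textAfter: string;  // extra text spoken after the content
}

// Joins the enabled, non-empty parts with blank lines between them.
// Assumed order: title, text before, content, text after.
function buildSpokenText(title: string, content: string, o: SpokenTextOptions): string {
  const parts = [o.readTitle ? title : "", o.textBefore, content, o.textAfter];
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join("\n\n");
}
```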
Automatically generate the audio files
The plugin can generate the audio file for a post either manually or automatically when the post is viewed on the website's front end. You can control this behavior with a dedicated option.
Customize the capabilities
Configure who has access to specific plugin features by setting custom WordPress capabilities. For example, you can allow the editors to generate the audio files from the post editor, allow access to the dashboard with the statistics only to the site administrator, and more.