flite_hts_engine: at last, good quality Puppy speech

mcewanw
Posts: 3169
Joined: Thu 16 Aug 2007, 10:48

0.91.mce02.pet uploaded

#16 Post by mcewanw »

Changes: Fixed scripts to allow wav output.

This version of flite_hts_engine writes either to a wav file or to stdout (which allows piping directly to aplay; no fifo required).
github mcewanw

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#17 Post by amigo »

Great work, William - no problem about the diff. I appreciate the fine work that you have done in the past and on this.

Text has to be 'vetted' before it is passed to flite and other text-to-speech programs, as they don't always know what to do with it. Text with formatting like:
##########
==========
underscores and other special chars will not be read pleasantly at all.
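
One way to do that kind of vetting is a small filter run before the text reaches the synthesiser. The sketch below is only an illustration: clean_for_tts is a made-up helper name, not part of flite or flite_hts_engine, and the set of characters it strips is an assumption to adjust for your own texts.

```shell
#!/bin/sh
# clean_for_tts: hypothetical helper (not shipped with flite_hts_engine).
# Reads text on stdin and writes a speech-friendlier version to stdout.
clean_for_tts() {
    # 1. turn decoration characters (#, =, _, *, ~, |) into spaces
    # 2. collapse runs of whitespace into a single space
    # 3. trim a leading or trailing space
    # 4. drop lines that are now empty (e.g. rows of ##### or =====)
    sed -e 's/[#=_*~|]/ /g' \
        -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' -e 's/ $//' \
        -e '/^$/d'
}
```

It could then be used along the lines of: flitet "$(clean_for_tts < notes.txt)" | aplay (notes.txt being a placeholder file name).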

mcewanw
Posts: 3169
Joined: Thu 16 Aug 2007, 10:48

#18 Post by mcewanw »

flite_hts_engine sometimes gets oddly muddled. Try these two simple examples. On my machine at least, the second example flops...

flitet "Here is the new thread" | aplay

flitet "Here is the new thread and" | aplay

A long text always contains such difficulties here and there, which flite_hts_engine makes a mess of. Otherwise it is great; if only it was consistent in that sense!

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#19 Post by amigo »

I've been looking at my flite sources again. I had forgotten that I had found a file called speak.c on the net a long time ago, which someone had written to read a file line-by-line using flite. Actually, the program links in the flite libs so it is standalone. Anyway, it may be a good candidate to do the same with flite_hts_engine.

I'll post it tomorrow after working out (again) the details of how to use it. I had built flite and speak using 'diet' or uClibc, so the build recipe needs to be re-worked for using glibc.

I was also looking at the full sources for flite - especially the built-in sound server/client bits. It might not be too hard to build this into flite_hts and thereby reduce some of the latency of starting aplay each time.
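
Until something like speak.c is adapted, the line-by-line idea can be roughed out in shell. This is only a sketch under assumptions: speak_lines is a hypothetical wrapper name, it relies on the flitet "text" | aplay usage shown earlier in this thread, and it starts aplay once per line, so it has exactly the latency mentioned above.

```shell
#!/bin/sh
# speak_lines: hypothetical wrapper, not part of flite_hts_engine.
# Reads text on stdin and speaks it one line at a time.
# TTS_CMD and PLAY_CMD default to the flitet and aplay commands used in this thread.
speak_lines() {
    # '|| [ -n "$line" ]' also handles a final line with no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        # skip empty lines rather than starting the synthesiser for nothing
        [ -n "$line" ] || continue
        # </dev/null stops the TTS command from eating the rest of the input
        "${TTS_CMD:-flitet}" "$line" < /dev/null | "${PLAY_CMD:-aplay}"
    done
}
```

Usage would be something like: speak_lines < story.txt (story.txt being a placeholder file name).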

Trobin
Posts: 968
Joined: Fri 19 Aug 2005, 03:16
Location: BC Canada

#20 Post by Trobin »

Any way to attach this to emacspeak?
http://speakpup.blogspot.com

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#21 Post by jemimah »

Here is a gtkdialog frontend for flite. It takes an argument for the filename too, so you can use it with RoxRightClicks.
Attachments
flite-speak.gz
(546 Bytes) Downloaded 639 times

abushcrafter
Posts: 1418
Joined: Fri 30 Oct 2009, 16:57
Location: England

#22 Post by abushcrafter »

There's a new version out. Could you package it up please? Could you also slow down the speed at which it talks? With the current package, I find it stumbles on a 1 MB text file. Is that a buffer size issue? If so, could you give the new version (if you package it up) a larger buffer please? Thanks so far.

I have attached a bug fix of jemimah's gtkdialog frontend for flite.
Attachments
flite-speak-0.0.2.bz2
(829 Bytes) Downloaded 503 times
[url=http://www.adobe.com/flashplatform/]adobe flash is rubbish![/url]
My Quote:"Humans are stupid, though some are clever but stupid." http://www.dependent.de/media/audio/mp3/System_Syn_Heres_to_You.zip http://www.systemsyn.com/

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#23 Post by amigo »

Yes, there is a new version of flite available. It now comes with three or four voices which you can choose from using command-line options. Yeah! The female voice seems to be the most natural sounding.

jemimah
Posts: 4307
Joined: Wed 26 Aug 2009, 19:56
Location: Tampa, FL

#24 Post by jemimah »

I don't think flite-hts-engine has been updated - the regular flite is much larger.

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#25 Post by amigo »

You are correct, jemimah, flite-hts-engine has not been updated. The flite API has changed, so flite-hts-engine may or may not be adapted to the new API. flite-hts-engine has much better voice quality, but there are problems with it garbling some texts, and flite-hts can only read fairly short texts.

I've just found something new called svox-pico, which is used on the Android platform. There is a pico2wave utility (if you use the Ubuntu sources) which provides several very nice voices to choose from. The problem is that it only outputs a *.wav file. Maybe one of us can figure out how to get it to use stdout, or some audio lib to output sound directly...
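
In the meantime, a temporary file can stand in for stdout. The sketch below is a workaround, not a patch to pico2wave: pico_say is a hypothetical helper name, and it assumes pico2wave's -l (language) and -w (output wav file) options as found in the Ubuntu package, plus aplay for playback.

```shell
#!/bin/sh
# pico_say: hypothetical helper wrapping pico2wave, which only writes .wav files.
# PICO_CMD, PICO_LANG and PLAY_CMD are illustrative overrides, not pico2wave settings.
pico_say() {
    # pico2wave wants an output name ending in .wav, so use a private temp dir
    pico_dir=$(mktemp -d) || return 1
    pico_wav="$pico_dir/speech.wav"
    # synthesise to the temp file, then play it only if synthesis succeeded
    "${PICO_CMD:-pico2wave}" -l "${PICO_LANG:-en-US}" -w "$pico_wav" "$1" \
        && "${PLAY_CMD:-aplay}" "$pico_wav"
    pico_rc=$?
    # clean up whether or not playback worked
    rm -rf "$pico_dir"
    return "$pico_rc"
}
```

Usage would be something like: pico_say "hello world". The extra file write adds a little latency compared with a direct pipe.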

I'm still searching for something light and usable...

mcewanw
Posts: 3169
Joined: Thu 16 Aug 2007, 10:48

New version of flite_hts_engine uploaded

#26 Post by mcewanw »

flite_hts_engine is a reasonably good quality Text To Speech (TTS) synthesiser in an incredibly small package considering it includes the voice data (approx 1.6MB download).

A new version (deb package for DebianDog and dotpet for Puppy), which seems to have fixed the problem of occasionally garbled text, can be downloaded, for now at least, from the first post of this thread.

William

greengeek
Posts: 5789
Joined: Tue 20 Jul 2010, 09:34
Location: Republic of Novo Zelande

#27 Post by greengeek »

Thanks William - I've just tried the dotpet and it works well on my slacko 5.6 derivative. It seems to respond well to commas and full stops too. Here is my sample:

# flitet "hello world, I am going to the bathroom. I may be some time" | aplay
(I am always searching for TTS and STT methods for a voice-controlled pup that I need to make some improvements on...)

EDIT : It makes an interesting difference if I add a hyphen after a vowel:

# flitet "hello world, I am going to the ba-throom. I may be some time" | aplay
EDIT2 : It also seems to have the ability to change the inflection of the last word in the sentence. There is a slight difference in the ending of these two phrases:

#  flitet "Where is president kennedies brain" | aplay
and

#  flitet "Where, is president kennedies brain" | aplay

mcewanw
Posts: 3169
Joined: Thu 16 Aug 2007, 10:48

Mage Platform for Performative Speech Synthesis

#28 Post by mcewanw »

Thanks, greengeek - you post interesting examples and observations regarding the subtleties of using flite_hts_engine.

As I mentioned in my first post in this thread, you may also find the project http://mage.numediart.org/ of interest. Mage is easy to compile (in DebianDog at least - it uses cmake rather than make during compilation) and is nicely described in a number of videos on its website. I have compiled Mage on my computer and am working on it just now. I hope to provide a useful dotpet/deb for that too eventually, but it is early days, so I don't know if I will succeed or if too much work is involved. The Mage videos are worth a look though, since they show future possibilities surrounding some of the research going on out there.

William

EDIT: The required build system proved too large for the resources my machine has... Also the HTS code in Mage doesn't seem to be used or immediately usable with a realtime TTS like flite, as far as I could determine. Best I managed was to use its HTS batch binary to create a wav file from a provided .lab text file. Wasn't a particularly interesting exercise; flite_hts_engine appears to be more appropriate for practical use in Puppy as things are at the moment.
Last edited by mcewanw on Sat 06 Sep 2014, 10:35, edited 1 time in total.

Flash
Official Dog Handler
Posts: 13071
Joined: Wed 04 May 2005, 16:04
Location: Arizona USA

#29 Post by Flash »

Sounds a lot better if you put hyphens like this:

# flitet "hel-lo world, I am going to the bath-room. I may be some time" | aplay

sunburnt
Posts: 5090
Joined: Wed 08 Jun 2005, 23:11
Location: Arizona, U.S.A.

#30 Post by sunburnt »

Very nice William, this is something to inspire most anyone.

I'm thinking BaCon may be able to help if the output is a problem.

mcewanw
Posts: 3169
Joined: Thu 16 Aug 2007, 10:48

#31 Post by mcewanw »

sunburnt wrote:
Very nice William, this is something to inspire most anyone.

I'm thinking BaCon may be able to help if the output is a problem.

Hi Terry,

Yes, a bit of pre-processing of the input text can work wonders. However, which voices a person considers more realistic or better sounding depends to some extent on their cultural background. Here in New Zealand, for example, people speak English with very clipped short vowels, whereas in Scotland, where I come from, long vowels are the norm! So when a Kiwi (i.e. New Zealander) says 'cat' it sounds something like 'cut' to my ears, and when I say 'cat', I imagine it sounds like 'caaat' to theirs! (And when they say 'sex' it sounds like 'six' to me, so I often misunderstand their jokes and humour! More worrying is the probability that when I say 'six' they think I'm saying 'sex'...)

By the way, I now have a version of DebianDog as a full install on a usb stick, set up basically as a development environment. For the moment its purpose is to allow me to check out different programming environments I'm interested in, so the usb stick currently includes Lazarus (with Free Pascal), openFrameworks (a C++ framework), and I've just finished compiling BaCon from sources. Of course, the first two create pretty big executables compared to BaCon, there being lots of overhead from dependencies loaded by the frameworks themselves. For Lazarus, a stripped executable of around 3 MB or more can be expected, but the programming environment is like Delphi and very nice to use. BaCon can clearly produce much, much tinier GUI executables by comparison, though not as small, I think, as straight C with gtk+ calls (the xhippo GUI executable, for example, is under 50 kB). However, executable size is less important to me nowadays than RAM usage and ease of creation... Whilst gtkdialog-based scripts are tiny, the RAM used by gtkdialog is really quite substantial, and a good BaCon or straight C/gtk+ program can probably do much better in many cases.

One of the many great things about DebianDog is just how easy it is to install these kind of programming frameworks - leaves more time for actually playing with them! :-)

Almost summer here though, so truth is I'm not at the computer quite as much at the moment, but looking forward to at last trying out BaCon.

William

sunburnt
Posts: 5090
Joined: Wed 08 Jun 2005, 23:11
Location: Arizona, U.S.A.

#32 Post by sunburnt »

Hey William; sounds like you're really fine-tuning your dev environment.
Yes BaCon entails a small overhead when using HUG for GTK windows.
And BaCon itself is not quite as efficient as C itself of course, but close.
The exec. files are largish, but I have a compile command that helps.
And using upx on the final exec. files really brings them down to normal.

Remember to download hug.bac from the BaCon home page for GTK stuff.
And get the BaCon and HUG HTML docs; I made text versions so no browser is needed.

I keep the BaCon files in /media/sda3/BaCon/BaCon
And make links for /bacon and /hug.bac in: /usr/share/Bacon/
This way we don't have Puppy's problem of losing everything with an
inevitable Save file corruption, same thing relocatable apps give us.

Keep all apps, media, and docs outside of the Save file on a real drive.

Yeah, DebianDog sure makes things easy that I struggled with in Puppy.
And the changeover learning curve was not all that bad ( multiuser ).
My time has been rather limited, but I'm hoping to get back at all of this.

What are your plans for BaCon? I'll help, and the forum guys are great!
Good to talk to you William, keep the projects coming!
.
