What device?
Only x86 Linux, since we’ll be using Wine to run a SAPI server.
Under which conditions?
On Debian Jessie.
What to expect?
Excellent quality Text to Speech (TTS from here on) for any textual input, in French (English would work too, but English voices are easier to find), without an Internet connection, using Best-Of-Vox voices.
What were the pitfalls?
Wine and Xvfb have to be running (but they use almost no memory or CPU while idle).
Best-Of-Vox voices cost $39 in the SAPI version, but only $3 in the Android version. Why?
Why bother using SAPI and emulating Win32?
Following my previous article, we have narrowed down the voice providers to CereProc and Voxygen. I contacted the Voxygen team, and while they were very friendly, their commercial embedded Linux option is out of my budget. However, since they sell a lot of voices, either via Android or SAPI, and since I’ve already covered how to use Android voices, this time I’ve tried to use x86-based SAPI voices.
Since Intel’s Bay Trail CPUs, x86-64 hardware is available with the same performance per watt as ARM devices, which opens the door to many existing libraries and, most notably, to Windows software.
Step by step
Step 1: Build and install Wine
First, you need to download a recent Wine, since the version shipped with Jessie is older than the minimum required by SAPI for Linux (S4L for short). You also need a debug build, because a debugger is required later on.
At the time of writing, I’ve downloaded version 1.7.41.
Because Wine runs 32-bit x86 Windows code (it emulates the Win32 environment, not the CPU), you need to install multiarch libraries so that 32-bit code can run on your 64-bit system. On Debian this is easy, just type:
$ sudo dpkg --add-architecture i386
[...]
$ sudo apt-get build-dep wine
Then go into the folder where you extracted the Wine archive, and run the usual autogen / configure / make / sudo make install sequence.
Wine is a real mess to build: it takes many different tricks (when I hacked on it, I had to google a lot to be able to build it with minimal dependencies).
Step 2: Install S4L
Follow instructions from here.
Update: The website seems down so here’s the link to the file you need to download.
You can ignore the failing tests in pyspda.
At the end of the process, you should be able to use sapilektor to produce a nice “Microsoft SAM” jingle as a wav file (if not, re-read the installation instructions from the site above, you’ve probably missed something).
Step 3: Install Best-Of-Vox voice
Once you’ve ordered the voice you like on the Best-Of-Vox website, you’ll receive an email with a link to download a Win32 .exe installer, and a “coupon”, which is what you need to get a license.
Start an X server on your computer (or remotely, as I did), and set the Wine prefix you need:
$ export WINEPREFIX=/home/whatever/.winesapi
$ winecfg
  -> Change windows version to WinXP
$ winetricks winhttp
  -> If you don't have winetricks, you'll find it here:
  -> http://winetricks.org/winetricks
$ wine BOV-SAPI-Helene_126.96.36.199_1.exe
  -> This might or might not show a crash dump while installing for a "update.exe" process, but you don't care
When the installation dialog shows an “enter license code” prompt, you’ll need to enter the code you received by email after your purchase.
However, the installation will crash or fail if you press “Next”, so you need a “hack” to get past this step.
Until you get the “Timeout error” message box, you’ll need to hack with winetricks: a default Wine installation lacks many basic features in the DLLs it implements, so you need to download many DLLs from Microsoft. Winetricks does this for you. Unfortunately, I’ve installed many of them, and I don’t know which ones are absolutely necessary. Here’s what I have in my wineprefix:
$ ./winetricks list-installed
crypt32
dotnet20
msasn1
msls31
msxml3
msxml6
secur32
setupapi
vcrun2005
vcrun2008
winhttp
wininet
wsh56vb
  -> install with "winetricks wininet winhttp wsh56vb etc..."
If you get a “Message 42” or any error other than “timeout”, then play with winetricks. You cannot continue until you get the “Timeout” error message box.
I’ve unpacked the installer and disassembled its script. The installer tries to post a lot of system-specific information to https://keys.voxygen.fr. Unfortunately, Wine’s default socket implementation does not map 1:1 to what the installer expects, so the request times out and you can’t get past this last installation step.
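To make the “play with winetricks” step less of a guessing game, you can compare your prefix against the list above. This is only a minimal sketch: the `WANTED` variable and the `missing_verbs` function are hypothetical names, and it assumes `winetricks list-installed` prints one verb per line. It does not tell you which verbs are strictly required, only which ones from my list you are missing.

```shell
#!/bin/sh
# Verbs installed in the article's wineprefix (copied from the list above).
# WANTED and missing_verbs are illustrative names, not part of winetricks.
WANTED="crypt32 dotnet20 msasn1 msls31 msxml3 msxml6 secur32 setupapi vcrun2005 vcrun2008 winhttp wininet wsh56vb"

# Read the installed verbs on stdin (one per line) and print every verb
# from WANTED that is not among them.
missing_verbs() {
    installed=$(cat)
    for verb in $WANTED; do
        # -x: match the whole line, -F: treat the verb as a fixed string
        printf '%s\n' "$installed" | grep -qxF "$verb" || printf '%s\n' "$verb"
    done
}

# Typical use (not executed here):
#   ./winetricks list-installed | missing_verbs
```

Whatever it prints can then be passed to winetricks by hand, one or several verbs at a time.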
The returned data from the server is a license file (a .lic file) that’s absolutely required to get the SAPI voice to work.
So you need to hack it, and here’s how:
Start the installation and go up to the dialog asking for the serial code. Enter the code, but don’t press the “Next” button.
In another terminal, enter these commands:
$ winedbg
Winedbg> info process
 pid      threads  executable (all id:s are in hex)
 00000010 1        'explorer.exe'
 0000000e 5        'services.exe'
 [...]
// Look for the pid of BOV-SAPI...7.4.1.tmp *tmp is important*
Winedbg> attach 0x23
// Use the PID from the list above
0xf77a8d5e __kernel_vsyscall+0xe in [vdso].so: int $0x80
Winedbg>
// In the installation window, press the "Next" button, and quickly press Ctrl+c in this terminal
Winedbg> <Ctrl+c>
Ctrl-C: stopping debuggee
0xf774dd5e __kernel_vsyscall+0xe in [vdso].so: int $0x80
Winedbg> bt
// Walk up with the "up" command until you get into the socket code
Winedbg> up
Winedbg> list
[...]
ret = recv(fd, msg, len, flags);
[...]
Winedbg> fini
Winedbg> set ret = 1
// This replaces the timeout return code from recv with a success code
Winedbg> c
Voilà, you should now get the last message from the installer dialog. More importantly, the license file has been created and the installer has written its registry entries, so the SAPI voice is working.
In my case, the Helene voice was listed by the “sapiconfig” command, so I only needed to validate it with “sapiconfig -s”, and I was able to use it with sapilektor and sapitest too.
Without this hack (that is, when interrupting the installer abruptly at the license page), SAPI failed with a 0x80045012 error (exception in SAPI engine) when using the Helene voice.
Also, if the above does not work for you, you can replace the “Ctrl+c” sequence while waiting in winedbg by:
Winedbg> b WinHttpSendRequest
Winedbg> c
// Enter the coupon code, and when it breaks in the debugger:
// Inspect the stack
Winedbg> x/1x $ebp+0x14
0017c65c
// This will change for you
Winedbg> x/1024c 0x0017c65c
// Use the number from your output
0x0017c65c: p r o t o c o l v e r s i o n =
0x0017c67c: 1 & p r o d u c t = B O V _ S A
0x0017c69c: P I & e n g i n e v e r s i o n
0x0017c6bc: = 7 . 3 & u i d = x x x x x x x
[...]
Copy the dumped part into a text editor, and remove the address prefixes and all the extra spaces (so you get “protocolversion=1&product=…”).
Then use curl to POST the data above to “https://keys.voxygen.fr/ask”. If you are very careful, you’ll get back a license file that you can save in your installation folder as “\voices\Helene\licence\voxygen.lic”.
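If you would rather not clean the dump by hand, the cleanup can be scripted. The sketch below assumes you saved the output of the x/1024c command to a file with one “0x…:” address per line, as winedbg prints it. The `clean_dump` name is hypothetical. You still have to cut off anything after the end of the request string yourself, since the dump can run past it.

```shell
#!/bin/sh
# Hypothetical helper: turn winedbg "x/1024c" output (read on stdin) into
# a single-line request body.
#  1. sed strips the leading "0x0017c65c: " style address on each line
#  2. tr removes the spaces winedbg puts between characters, plus newlines
clean_dump() {
    sed -e 's/^0x[0-9a-fA-F]*:[[:space:]]*//' | tr -d ' \t\n'
}

# Typical use (not executed here; dump.txt and body.txt are example names):
#   clean_dump < dump.txt > body.txt
```

Removing every space is acceptable here only under the assumption that the request is URL-encoded, so it contains no literal spaces of its own.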
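The curl call could look like the sketch below. The endpoint is the one mentioned above; the `request_license` function and its two arguments are hypothetical. I have not verified which headers the server expects, so treat this as a starting point rather than a known-good request.

```shell
#!/bin/sh
# Sketch only: POST the cleaned request body and save the reply.
#   $1: file containing "protocolversion=1&product=..." (example: body.txt)
#   $2: where to write the server's answer (example: voxygen.lic)
request_license() {
    body_file=$1
    out_file=$2
    # --data @file sends the file content as a form-encoded POST body;
    # --fail makes curl return an error on HTTP error codes (4xx/5xx).
    curl --fail --silent --show-error \
         --data "@$body_file" \
         --output "$out_file" \
         https://keys.voxygen.fr/ask
}

# Typical use (not executed here):
#   request_license body.txt voxygen.lic
```

Check that the file you get back really looks like a license and not like an error page before copying it into the voice’s folder.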
This is very painful to get right, so the “replace the return value from recv” method is much easier.