Server crash, and moving to a new hosting provider


After 6 long years, my good old Shuttle server died a week ago :(
Even though both the power supply and the CPU burned out (!), there was absolutely no data damage.

This little fellow used to run web sites, home-automation services and share the internet connection between all my devices.

I tried to replace my Shuttle with a plain-vanilla mini-PC I quickly shopped for. Then … I ran into this fuc*ing UEFI/Secure Boot mess which wouldn’t let me restore my GNU/Linux system, went mad, and returned this piece of crap one hour later (yes, for one hour I owned a Windows 8 system. I’m still ashamed of myself).

Since most of my home services are now running on Raspberry Pis, there was no real need for me to keep self-hosting this blog. Thus, I got myself a regular internet box from my ISP and found a hosting provider for the web sites.

I’m back home, after a few days in Florence (Italy). And quantum-bits.org is back online :)

Hopefully without too many regressions.


Published by: Fred on August 23rd, 2013 | Filed under Analog life, Random thoughts

Project “Jarvis”: step five (look at me)


Tonight, I’ll go for a very simple hack: connect a webcam, detect motion and stream a live feed over HTTP. Not sure how it’ll fit into Project “Jarvis”, but who knows …

Hardware

First things first: the hardware. For some reason, the only webcam I had was an old Apple iSight. You know, the old FireWire one… Since there is no IEEE 1394a port on the Raspberry Pi, I had to buy a new one.

I settled for a Hewlett-Packard HD-2300 USB webcam. I took a chance, since it was not on the hardware compatibility list, but it was available, reasonably priced for its category, and didn’t look too bad (I know, it’s silly, but looks are actually one of my buying criteria):

Nevertheless, it appeared right away:

root@applepie /etc/motion # lsusb
Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp. 
Bus 001 Device 004: ID 050d:1102 Belkin Components F7D1102 N150/Surf Micro Wireless Adapter v1000 [Realtek RTL8188CUS]
Bus 001 Device 005: ID 050d:0234 Belkin Components F5U234 USB 2.0 4-Port Hub
Bus 001 Device 006: ID 04b8:0007 Seiko Epson Corp. Printer
Bus 001 Device 007: ID 03f0:e207 Hewlett-Packard
root@applepie /etc/motion #

A subsequent “lsusb -v” command gave me nice details about the webcam. Cool :cool:

Setting up “Motion”

Now, the software part. Motion is a nice piece of Open Source software that (among other things):

  • Does motion detection (and optionally records video and/or frames whenever motion is detected)
  • Takes timed snapshots regardless of motion detection
  • Serves a live video stream over IP in MJPEG format

The installation was pretty easy:

apt-get install motion

I only changed a few settings (excerpted below). Namely:

  • I turned “start_motion_daemon” to “yes” in /etc/default/motion to enable the daemon
  • I switched “location” to “on” in /etc/motion/motion.conf to turn motion detection on
  • I switched “webcam_localhost” to “off” in /etc/motion/motion.conf to enable access to live streams from anywhere on the LAN
  • I tweaked the “text_left” setting to let the message “ApplePie” appear on the bottom left of the streams
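
Put together, the changed lines look roughly like this (a sketch based on the option names above; exact option names may vary between Motion versions):

# /etc/default/motion
start_motion_daemon=yes

# /etc/motion/motion.conf
location on
webcam_localhost off
text_left ApplePie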

I left all the other settings as they were (including frame rate and resolution, as this was only a test), and fired up the daemon:

/etc/init.d/motion start

I pointed Firefox at http://applepie:8081 (8081 being the default streaming port used by motion), aimed the webcam at my “Forbidden Planet” poster and shook the cam a bit to simulate motion:

Not bad :) It thought that Robby the robot moved. Now, that’s kinda cool :cool:

I guess that’s it for tonight (and it was quite easy, as it is widely documented on many other web sites).

Note: for some reason, I have not yet received a password for the account I requested on the elinux.org wiki, even though I got the e-mail confirmation message and activated the account. I guess I’ll update the Raspberry Pi compatibility list later.


Published by: Fred on February 20th, 2013 | Filed under Free Software, Raspberry Pi

Project “Jarvis”: step four (GUI)


Here comes another step in the design of project “Jarvis”. For once, let’s not be too geeky. I feel artsy today. I’ll focus on the GUI and fire up Inkscape and the Gimp. Yeah, still a bit geeky, I know…

 
Jarvis-like GUI

Let’s try the first idea that comes to mind: create an interface … just like Marvel’s Jarvis. I googled a bit, watched a couple of scenes from the movies and came up with a first rendering:

And it sucks :poop: !
Yes, Jarvis in the movies looks badass. But as a GUI, no matter how hard I tried, there is no way it can be useful for anything at all.

Siri-like GUI

The second idea would be a Siri-like interface. Just like I drew in the first sketches. Here’s a first rendering:

And … that sucks too :poop: ! It is just a boring copy of Apple’s Siri :yawn:

Plus, I want something dead simple that provides natural ways to navigate between Wolfram|Alpha’s pods.

Haze-like GUI

Haze is a fantastic weather forecast app for iOS, with a very unique interface. Almost poetic. That’s definitely what I want for project Jarvis :) !

Here’s a sketch of what I have in mind:

I feel the GUI should be as minimalistic and intuitive as possible:

  • A simple button to trigger voice acquisition
  • A smaller button to raise a keyboard and enter a query
  • A third button to share the results
  • On top of these buttons, the textual reformulation of the query
  • At the center, Jarvis’ answer to the query
  • Around the main bubble, various pods from Wolfram|Alpha

 
I like that :silly: !

And I guess that’ll be all for this weekend !


Published by: Fred on February 17th, 2013 | Filed under Art, Free Software, Raspberry Pi

Project “Jarvis”: step three (the brain)


During the last steps, the feasibility of voice recording, speech-to-text and text-to-speech was studied. Now comes another difficult part: the brain !

The last steps involved the use of external services for voice recognition and text-to-speech capabilities.

When it comes to Jarvis’ brain, the idea is twofold:

  • Onboard answer engine: part of the analysis will be done onboard with simple regular expressions (as seen in the “proof of concept” step).
  • External answer engine: the other part of the analysis is triggered whenever the first one fails. More than a fallback engine, this service should enrich the answer as much as possible.

Wolfram|Alpha is a computational knowledge engine. It’s an online service that answers factual queries directly by computing the answer in terms of structured data, rather than providing a list of documents or web pages that might contain the answer as a search engine would do. It’s a pretty good candidate for Jarvis.

Jarvis’ Workflow

This simplified illustration describes Jarvis’ workflow, including voice acquisition, external speech-to-text, parsing & analyzing (including the Wolfram|Alpha service), external text-to-speech and actions:

Wolfram|Alpha

Wolfram|Alpha provides a web-based API which allows clients to submit free-form queries. The API is implemented as a standard REST protocol using HTTP GET requests. Each result is returned as a descriptive XML structure wrapping the requested content format.

Roughly speaking, these results are divided into two kinds of sections: assumptions and pods.

The assumption sections tell you what assumptions Wolfram|Alpha made about certain parts of the input, and what the alternative values are for each assumption.

The pod sections each correspond to one category of result. Each pod has a title (“Input interpretation” is the title of the first pod) and some content (a GIF image). Pods may also have additional features, such as a plaintext representation that appears in a popup when you mouse over the image, and JavaScript buttons that replace the pod with different information in an AJAX-style operation.

The web-service API can be accessed from any language that supports web requests and XML. Furthermore, the Wolfram|Alpha community provides a set of bindings for Ruby, Perl, Python, PHP, .Net (yuck!), Java (and Mathematica, of course). :cool:

I created an account on Wolfram|Alpha and applied for an AppID (which I received right away). A free account allows up to 2,000 queries a month, which should be plenty.
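
Just to fix ideas, here is a minimal sketch of what a raw query looks like, using the plain REST endpoint rather than any binding (a hypothetical example: WOLFRAM_APPID stands in for the real AppID, and the plaintext format is requested instead of the default GIF images):

<?php
// Query the Wolfram|Alpha REST API and print each pod's plaintext content.
// Assumes allow_url_fopen is enabled so simplexml_load_file() can fetch a URL.
$appid = 'WOLFRAM_APPID';
$query = 'distance from the Earth to the Moon';

$url = 'http://api.wolframalpha.com/v2/query?appid=' . $appid
     . '&input=' . urlencode($query) . '&format=plaintext';

$xml = simplexml_load_file($url);   // <queryresult> with one <pod> per category of result

foreach ($xml->pod as $pod) {
    echo $pod['title'], ":\n";                  // e.g. "Input interpretation"
    foreach ($pod->subpod as $subpod) {
        echo '  ', trim($subpod->plaintext), "\n";
    }
}
?>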

I downloaded the PHP binding to my Raspberry Pi, read the documentation (no, I didn’t, I’m kidding. Who reads manuals ???), and in less than 100 lines of code I got this running:

It’s a first try, but it looks promising. Jarvis may soon have a brain.


Published by: Fred on February 17th, 2013 | Filed under Free Software, Raspberry Pi

Project “Jarvis”: step two (speak to me)


In my previous post, I conducted a few experiments with speech recognition via Google’s Speech API and got enough results to push project “Jarvis” a bit further.
Now it is time for Jarvis to speak !

 
Text-To-Speech engines

There are many “Text-To-Speech” engines already packaged for the Raspberry Pi. Namely:

  • espeak: eSpeak is a compact Open Source speech synthesizer (for English and other languages). It is available as a shared library and as a command-line program to speak from a file or from stdin. It can be used as a front-end to mbrola diphone voices.
  • festival: the Festival Speech Synthesis System is a multi-lingual Open Source speech synthesizer which offers Text-To-Speech capabilities through various APIs.
  • flite: festival-lite is a small run-time speech synthesis engine developed at Carnegie Mellon University, derived from Festival.

Let’s install and try these three engines:

apt-get install espeak
apt-get install festival
apt-get install flite

Unfortunately, I ran into a set of broken packages when I tried to install the mbrola voices for espeak and festival:

root@applepie ~ # apt-get install mbrola-en1 mbrola-fr1 mbrola-fr4  mbrola-us1 mbrola-us2 mbrola-us3 festvox-en1 festvox-us1 festvox-us2 festvox-us3
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
 
The following packages have unmet dependencies:
 mbrola-en1 : Depends: mbrola but it is not installable
 mbrola-fr1 : Depends: mbrola but it is not installable
 mbrola-fr4 : Depends: mbrola but it is not installable
 mbrola-us1 : Depends: mbrola but it is not installable
 mbrola-us2 : Depends: mbrola but it is not installable
 mbrola-us3 : Depends: mbrola but it is not installable
E: Unable to correct problems, you have held broken packages.

This meant that the outputs from espeak and festival would quite probably be rather poor in quality. Thus, I introduced a new contender as an external service: the Google Text-to-Speech API.

Here’s a little benchmark comparing the speech output of each engine on the same quote from 2001: A Space Odyssey.

Benchmark #1: espeak

Getting a .wav file from plain text is quite easy:

espeak "Look Dave, I can see you're really upset about this" --stdout > espeak.wav

Here’s the .wav output from espeak:

espeak

As expected, it is really bad. It reminds me of the speech synthesizer I used to play with on my Atari 1040STF in the 80's :(

Benchmark #2: festival

Getting a .wav file from plain text is also easy:

echo "Look Dave, I can see you're really upset about this" | text2wave -o festival.wav

And the resulting .wav output is:

festival

Less robotic, but still very far from what I need for Jarvis :(

Benchmark #3: flite

Getting speech output from flite is as simple as it is from espeak and festival:

echo "Look Dave, I can see you're really upset about this" | flite -o flite.wav

And the resulting .wav goes like this:

flite

Better. It’s getting HAL-like, but I really need something closer to a real human voice.

Benchmark #4: Google TTS

Google Text-To-Speech is a private REST API. Getting results is less straightforward but nonetheless very easily manageable. Here’s a little PHP script:

<?php
$voice = urlencode("Look Dave, I can see you're really upset about this");
$cmd = '/usr/bin/curl -A "Mozilla" "http://translate.google.com/translate_tts?tl=en_gb&ie=UTF-8&q=' . $voice . '" > google.mp3';
shell_exec($cmd);
?>

And here’s the result (converted to the same .wav format):

Google (en_gb)

Much, much better :cool: Maybe a little too slow. Let’s play with localization and switch from British English to US English:

Google (en_us)

Surprisingly, the US voice is female :D
Not bad. Now, let’s try a French version:

Google (fr_fr)

Really good. Also a female voice. It is actually very close to the synthetic voice used at SNCF (French Railroads) stations. Kind of a scary voice. It feels like … I’m gonna miss a f**king train.

I think I’m gonna settle for the British voice from Google’s Text-To-Speech engine.

I’ll have to rely (once more) on an external service, but an electronic butler has to be British :p


Published by: Fred on February 16th, 2013 | Filed under Free Software, Raspberry Pi

Project “Jarvis”: step one (proof of concept)


Adding Siri to both my old iPad 1 and iPhone 4 was a failure :(
Jailbreaking went fine, but messing with SiriPort was a complete disaster, and it took me nearly 2 hours to turn these devices back into something other than a brick.

 
And thus … no SiriProxy for me. But then again, why should I mess with existing closed-source crap, when I can build my own stuff ? Hum ?

Project “Jarvis”

Here comes Project “Jarvis”. Ok, the name sucks… I shouldn’t watch these Marvel movies. And the logo is no more than a copy of Siri’s own logo, with a touch of Raspberry color. I’ll work on these later: now, it is time to put the ideas behind this project to the test.

The principles are quite simple:

  • 1 – A mobile App is used to record a simple question and send it to the Raspberry Pi
  • 2 – The Raspberry Pi transforms the recorded voice into something understandable by Google’s Speech API and pushes the result to it
  • 3 – Google’s Speech API returns its voice-to-text interpretation as a JSON data structure
  • 4 – The Raspberry Pi parses the data, builds something out of it and sends its answer back to the mobile App (and possibly to a Home Automation system)
  • 5 – The mobile App prints out the answer to the question
  • 6 – Applause and tears of joy

 
Proof of concept

First, let’s record a simple question. “Quelle heure est-il ?” (What time is it ?) will be a good start:

Then, let’s send it to the Raspberry Pi:

scp heure.caf root@applepie:/opt/jarvis

In order to get it interpreted by Google’s Speech API, one has to convert the recording from Apple’s CAF (Core Audio Format) to the much more standard FLAC format:

apt-get install ffmpeg
ffmpeg -i heure.caf heure.flac

Let’s send it to Google Speech API:

curl -i -X POST -H "Content-Type:audio/x-flac; rate=44100" -T heure.flac "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=fr-FR&maxresults=10&pfilter=0"

After a second or two, I got the answer from Google:

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Disposition: attachment
Date: Sun, 10 Feb 2013 22:50:42 GMT
Expires: Sun, 10 Feb 2013 22:50:42 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE
Transfer-Encoding: chunked
 
{"status":0,"id":"f75093db420033490c2424cdb58de963-1","hypotheses":[{"utterance":"quel heure est il","confidence":0.61982137},{"utterance":"quelle heure est il"},{"utterance":"quel temps fait il"},{"utterance":"quelle heure est-il"},{"utterance":"quel temps va til"}]}

Not bad :cool:
 
Polishing up

First, let’s write a few lines of PHP on the Raspberry Pi (see previous post for the details of the Nginx/PHP installation; a minimal sketch follows the list):

  • to trigger the ffmpeg conversion
  • to send the converted FLAC recording to Google’s speech-to-text engine
  • to get the JSON data structure back
  • to parse the JSON result (a few regexps would do)
  • to send back a well-thought-out answer to the question
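
Here is a minimal sketch of what such a script could look like (a hypothetical example, not the actual code: the file paths, the single regexp and the date-based answer are placeholders, and it assumes ffmpeg and the PHP curl extension are installed):

<?php
// 1. Convert the CAF recording uploaded by the mobile App to FLAC
$caf  = '/opt/jarvis/heure.caf';
$flac = '/opt/jarvis/heure.flac';
shell_exec('ffmpeg -y -i ' . escapeshellarg($caf) . ' ' . escapeshellarg($flac));

// 2. Send the FLAC file to Google's speech-to-text engine
$url = 'https://www.google.com/speech-api/v1/recognize'
     . '?xjerr=1&client=chromium&lang=fr-FR&maxresults=10&pfilter=0';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: audio/x-flac; rate=44100'));
curl_setopt($ch, CURLOPT_POSTFIELDS, file_get_contents($flac));
$json = curl_exec($ch);
curl_close($ch);

// 3. Parse the JSON data structure and keep the most likely hypothesis
$data = json_decode($json, true);
$utterance = isset($data['hypotheses'][0]['utterance'])
           ? $data['hypotheses'][0]['utterance'] : '';

// 4. A very crude "brain": one regular expression per known question
if (preg_match('/quel(le)? heure/i', $utterance)) {
    $answer = 'Il est ' . date('H\hi');   // e.g. "Il est 23h42"
} else {
    $answer = "Je n'ai pas compris la question.";
}

// 5. Send the answer back to the mobile App
echo $answer;
?>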

 
Then, let’s fire up Xcode and, with the help of the Core Audio API documentation, write down a few lines of Objective-C:

Pretty cool for 2 hours’ work :cool:

 
Now what ?

I guess the proof of concept is conclusive :)

Now, the trick is that it is not exactly fast. Almost … as slow as Siri.

The exchange with Google is the bottleneck. Also, I’d rather not depend on a private external API. I guess one of the next steps will be to see how PocketSphinx would fit into this project.

The CAF-to-FLAC conversion could also be done on the iOS side of the project. I’ll check out this project later: https://github.com/jhurt/FLACiOS.

Also, Jarvis is literally speechless. Adding some “text-to-wav” functionality shouldn’t be too hard, since espeak and festival are already packaged for Raspbian.

Then, of course, I’ll have to put a bit of thought into Jarvis’s brain (text analyzer) and hook the Raspberry Pi to some kind of Home Automation system.

And the iOS part needs a lot of looooove.

But I guess, that’s enough for a first step.


Published by: Fred on February 11th, 2013 | Filed under Free Software, Raspberry Pi
