Hack 53 A Talking, Lip-Synched Avatar 
Synchronize an animated head to the speech
synthesizer.
After completing the
speech synthesizer [Hack #52], I
showed it to Adam Phillips, and a few days later,
he'd drawn a suitably robotic character named
Charlie, shown in Figure 7-3.

Adam also provided a complete set of mouth shapes for lip sync
animations, as shown in Figure 7-4. Many of the
mouth shapes are used for more than one allophone. For example, the
symbol JShCh shows the mouth shape used for
three different allophones: "j for
jump," "sh for
ship," and "ch for
Charlie."

The next step was to map each of the 77 allophones to one of the 13
mouth shapes, in a way that allows a script to recognize when each
allophone is spoken and display the appropriate mouth shape.
First, I arranged the 13 mouth shapes as keyframes in a movie clip
instance named mouth, as shown in Figure 7-5.

I then created an array to link the frame numbers in mouth to the
allophone names. Rather than create a separate array element for each
allophone, I created a separate array element for each mouth shape.
Why? Because there are far fewer mouth shapes than allophones (13
mouth shapes versus 77 allophones). I took advantage of a quick way of
searching through an array [Hack #79].
Here's my solution.
First, I created an array of strings, one per mouth shape. Each
string consists of all the allophones that are associated with that
mouth shape. Each allophone is padded on both sides by spaces, so a
search for " r " cannot accidentally match the "r" at the end of
"aer."
// Define an array of mouth shapes with the corresponding allophones.
var shapes = new Array( );
shapes[0] = " space ";
shapes[1] = " b bb m p ";
shapes[2] = " a aer ay ee er err i ii ";
shapes[3] = " aa ";
shapes[4] = " r ";
shapes[5] = " o ";
shapes[6] = " or ow oy ";
shapes[7] = " oo ou ouu w wh ";
shapes[8] = " ck d dd dth g gg h hh n ng nn s t tt z zh ";
shapes[9] = " c e ear k y yy ";
shapes[10] = " f u uh ";
shapes[11] = " ch sh j ";
shapes[12] = " l ll ";
shapes[13] = " th ";
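To see why the padding matters, here is a minimal sketch that searches one of the strings above with and without the surrounding spaces. The variable names (vowels, naive, padded, member) are made up for this illustration. The code sticks to syntax shared by ActionScript and JavaScript, so it should behave the same on a Flash timeline or in a browser console.

```javascript
// One padded string, copied from shapes[2] in the hack.
var vowels = " a aer ay ee er err i ii ";

// Unpadded search: "r" is found inside "aer", which is a false match.
var naive = vowels.indexOf("r") != -1;                 // true

// Padded search: " r " never occurs in this string, so there is no match.
var padded = vowels.indexOf(" " + "r" + " ") != -1;    // false

// A genuine member is still found when padded the same way.
var member = vowels.indexOf(" " + "aer" + " ") != -1;  // true
```

Because every entry, including the first and last in each string, has a space on both sides, the same padded search works for any allophone.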
The mouthAnim( ) function looks up an allophone string
(such as "th") in the
shapes array. It then uses the index of the matching
element to jump to the frame in movie clip
mouth that contains the most appropriate mouth
shape. The first mouth shape is at frame 10, the second at frame 20,
and so on. To better see how this works, uncomment the trace( ) action
(the commented-out line) when you run the FLA.
function mouthAnim(phone) {
  for (var i = 0; i < shapes.length; i++) {
    if (shapes[i].indexOf(" " + phone + " ") != -1) {
      // trace(phone + " found in " + shapes[i]);
      mouth.gotoAndStop((i+1)*10);
      break;
    }
  }
}
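As an alternative to scanning the strings on every call, the mapping could be inverted once into a lookup object keyed by allophone name. This is only a sketch of that idea, not the hack's own code: buildFrameTable, sampleShapes, and frames are hypothetical names, and only the first three mouth shapes are reproduced to keep it short. It uses the same frame rule as mouthAnim( ) and only syntax common to ActionScript and JavaScript.

```javascript
// Subset of the shapes array from the hack, for illustration only.
var sampleShapes = [" space ", " b bb m p ", " a aer ay ee er err i ii "];

// Hypothetical helper: invert the array into { allophone: frameNumber }.
function buildFrameTable(shapeList) {
  var table = {};
  for (var i = 0; i < shapeList.length; i++) {
    var names = shapeList[i].split(" ");
    for (var k = 0; k < names.length; k++) {
      if (names[k] != "") {               // skip empty strings left by the padding
        table[names[k]] = (i + 1) * 10;   // same frame rule as mouthAnim( )
      }
    }
  }
  return table;
}

var frames = buildFrameTable(sampleShapes);
// frames["space"] is 10, frames["m"] is 20, frames["aer"] is 30.
// An allophone that is not listed, such as frames["xyz"], is undefined.
```

With such a table, a lookup is a single property access, at the cost of building the table once up front. For a list this small, the string search in mouthAnim( ) is likely just as practical.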
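The full listing, updated from the earlier speech synthesizer hack
[Hack #52], follows. The lip-sync additions are the two calls to
mouthAnim( ), the mouthAnim( ) function itself, and the shapes array: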
// Play the next allophone each time the previous sound finishes.
makePhrase = function ( ) {
  if (soundCount < soundMax) {
    soundCount++;
    speech.attachSound(aPhones[soundCount]);
    mouthAnim(aPhones[soundCount]);
    speech.start( );
  } else {
    delete speech.onSoundComplete;
  }
};
// Split a "|"-delimited phrase into allophone names and start speaking.
function say(phrase) {
  var i = 0;
  var j = 0;
  aPhones = new Array( );
  // Start each element as an empty string so += never appends to undefined.
  aPhones[0] = "";
  for (i = 0; i < phrase.length; i++) {
    if (phrase.charAt(i) != "|") {
      aPhones[j] += phrase.charAt(i);
      if (phrase.charAt(i) == " ") {
        // A space in the phrase stands for the silent "space" allophone.
        aPhones[j] = "space";
      }
    } else {
      j++;
      aPhones[j] = "";
    }
  }
  speech.attachSound(aPhones[0]);
  mouthAnim(aPhones[0]);
  speech.start( );
  speech.onSoundComplete = makePhrase;
  soundCount = 0;
  // The phrase ends with "|", so the last allophone is at index j-1.
  soundMax = j-1;
}
// Show the mouth shape whose allophone list contains phone.
function mouthAnim(phone) {
  for (var i = 0; i < shapes.length; i++) {
    if (shapes[i].indexOf(" " + phone + " ") != -1) {
      // trace(phone + " found in " + shapes[i]);
      mouth.gotoAndStop((i+1)*10);
      break;
    }
  }
}
var speech = new Sound(this);
var shapes = new Array( );
shapes[0] = " space ";
shapes[1] = " b bb m p ";
shapes[2] = " a aer ay ee er err i ii ";
shapes[3] = " aa ";
shapes[4] = " r ";
shapes[5] = " o ";
shapes[6] = " or ow oy ";
shapes[7] = " oo ou ouu w wh ";
shapes[8] = " ck d dd dth g gg h hh n ng nn s t tt z zh ";
shapes[9] = " c e ear k y yy ";
shapes[10] = " f u uh ";
shapes[11] = " ch sh j ";
shapes[12] = " l ll ";
shapes[13] = " th ";
say("h|e|ll|oo| | | | | |h|ow| |ar| |y|ouu| | | | |tt|u|d|ay| |");
stop( );
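The phrase passed to say( ) is a list of allophone names separated by "|", with a lone space standing for a pause. As a rough sketch of that parsing step in isolation, the hypothetical helper below (parsePhrase is not part of the hack) uses split( ) instead of walking the string character by character. It should give the same allophone list as say( ) for well-formed phrases like the one above, though it has not been checked against every edge case of the original loop.

```javascript
// Hypothetical helper: turn a "|"-delimited phrase into an array of allophone names.
function parsePhrase(phrase) {
  var parts = phrase.split("|");
  var phones = [];
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] == " ") {
      phones.push("space");     // a lone space is the silent "space" allophone
    } else if (parts[i] != "") {
      phones.push(parts[i]);    // skip the empty string after the trailing "|"
    }
  }
  return phones;
}

var phones = parsePhrase("h|e|ll|oo| |h|ow| |");
// phones is ["h", "e", "ll", "oo", "space", "h", "ow", "space"]
```

Each name in the result is both the linkage identifier of a sound to attach and the key that mouthAnim( ) searches for in the shapes array.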
The Poser application [Hack #32]
can be made to lip sync to an animation (although you may have to buy
a separate application, Mimic by Daz,
http://www.daz3d.com, to do this). This allows you
either to create animated speaking characters/avatars or to create
guide images to help you in creating your animation keyframes.
Oddcast (http://www.oddcast.com)
is a good example of a professional application using speaking
avatars.
—Art input and animation expertise from Adam
Phillips