May 12, 2009

Clouds from Both Sides

Wordle describes itself as "a toy for generating 'word clouds' from text."

It's one thing to read that, compared to Shakespeare's nearly 30,000-word vocabulary, Racine only uses about 4,000 words in his entire works.

It's another thing to take the 1,642 words he used in Phèdre, and paint a picture with the top hundred and fifty:

FrenchWordle.jpg

The 2009 Stratford production of Phèdre will be the world premiere of a new version translated and adapted by Timberlake Wertenbaker. Here's her English word cloud:

TimberlakeWordle.jpg

And for comparison, here's what the play looks like using Robert Bruce Boswell's 1909 translation, Phaedra:

EnglishWordle.jpg

Apparently the Wordlization process is language-sensitive, as it has an option to "remove common words" in either English or French, without which the result would look something like this:

CommonEnglish.jpg

Computer programmers call these common words "stop words", and search engines like Google have traditionally filtered them out of queries.
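
For the curious, here is a rough guess at the counting that sits behind a word cloud, sketched in Python. It is not Wordle's actual code: the function name `top_words` and the tiny `STOP_WORDS` list are made up for illustration, and a real stop-word list would be much longer (and different for French).

```python
import re
from collections import Counter

# A tiny illustrative stop-word list. This is an assumption for the sketch,
# not the list Wordle actually uses.
STOP_WORDS = {"the", "and", "of", "to", "a", "i", "in", "my", "is", "that", "you", "it"}


def top_words(text, n=150, stop_words=STOP_WORDS):
    """Return the n most frequent words in text as (word, count) pairs,
    skipping anything in stop_words."""
    # Lowercase the text, then pull out runs of letters (accented ones
    # included), allowing one internal apostrophe as in "don't".
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = "The sun and the sea and the sun"
    print(top_words(sample, n=2))                    # stop words removed
    print(top_words(sample, n=1, stop_words=set()))  # stop words left in
```

Passing an empty set as `stop_words` shows why the option matters: without the filter, words like "the" crowd out everything else.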

The whole computers-and-languages idea got me thinking. I tried running the original French text through Google Translate, to see how much a machine's literal-minded word cloud would differ from that of a human translator.

EnglishWordle3.jpg

A popular parlour-game from the early World Wide Web involved feeding text through several different languages in BabelFish and then translating the results back into English.

Noble et brillant auteur d'une triste famille,
Toi, dont ma mère osait se vanter d'être fille,
Qui peut-être rougis du trouble où tu me vois,
Soleil, je te viens voir pour la dernière fois.

Run through Google Translate, this gives us:

Noble and brilliant author of a sad family,
You, whom my mother dared claim to be girl
Who can be ashamed of trouble when you see me,
Sun, I just see the last time.

But by the time the website Lost in Translation has BabelFished it back and forth through Japanese, Chinese, German, Italian, Portuguese and Spanish, we end up with this:

The sad group is noble, wonderful author, with respect is to me substantial, if in me it is, if girl to doubt the ashamed one correctly, Sunday the search that the matrix in the terminal of fetthaltigen, because it is I.

And hey, why stop there? The 2007 Dublin Fringe Festival saw a real, live production of The BabelFish Tartuffe, "a new version of Moliere's classic comedy as translated into English by the internet."

But as much as we would all like to sit through five acts of The Deep Blue Phèdre, there is one bug in the ointment. Google Translate took the character name Panope and spat it out the other end as "geoduck".

Pronounced "gooey duck", this is, apparently, the common name for Panopea abrupta.

The world's largest burrowing clam.

Which is as good a reason as any to choose this translator...

Uhura.jpg

...over this.

Terminator.jpg

Skynet don't do word clouds.

Posted by Alison Humphrey at May 12, 2009 04:52 PM