Fall 1998



Writer's Block





Technology


My Computer Doesn't Understand Me
Automated Translation Tools

by S. D. Liddiard

Captain Kirk never needed to learn Klingonese (although courses are now offered) because the universal translator of the Starship Enterprise instantly converted the speech of all space aliens into flawless, nuanced English. A machine that automatically renders translations from one language to another was already a dream when computers were invented earlier this century. In the thirty-odd years since the mid-sixties, when the original Star Trek TV series sent Captain Kirk and his crew to boldly greet new species of aliens in English, computer technology on this planet has grown spectacularly. Computer hardware and software are much more powerful and sophisticated, and work has continued on the automation of translation. How much closer are we now to a universal translator than we were then?

Progress Toward the Universal Translator

The universal translator of the Starship Enterprise has three essential components that are currently under separate development: speech synthesis, speech recognition, and machine translation.

Speech synthesis is fairly well advanced. In the early eighties, I worked with an early speech synthesis system that pieced together sentences from pre-recorded words and sentence fragments. Soon after that I was exposed to a more sophisticated commercial system that was able to produce synthesized speech from text files stored on a computer. Most people are familiar with the rather stilted and mechanical product of speech synthesizers, such as the one used by physicist Stephen Hawking: although odd and awkward, it is understandable. What makes synthesized speech seem so unnatural is the absence of the pauses and emphasis that underline the meaning of normal discourse.

Speech recognition software is still in its early stages, even though it has been under development for at least as long as speech synthesis. Until relatively recently, voice recognition programs had very limited vocabularies, needed to be "trained" to become familiar with each user's voice, and were unable to cope with continuous speech. While progress is being made, a number of fairly intractable problems remain: there are often no recognizable breaks between words, some phonemes and even some words may fail to be pronounced, and homonyms are difficult to identify, even in context, without understanding the meaning of an utterance.

Machine translation was a high-priority research subject in the 1950s and 1960s. During this period, researchers were convinced that automatic translation of scientific and technical documentation was within reach. Toward the end of the sixties, the U.S. National Academy of Sciences published a report concluding that resources being spent on machine translation were being wasted until some more fundamental questions of language processing had been solved. Research tailed off until the 1980s, when a number of successes were demonstrated, such as the METEO system, developed by the University of Montreal to produce French translations of weather reports. Machine translation still struggles, however, when faced with literature and other examples of natural language.

Straight to the Heart

None of these three technologies has yet been a straightforward success. All of them need considerably more development to reach the level of sophistication and reliability of, say, word processing software. Machine translation has perhaps received less general attention than the other two technologies because its applications to everyday life are less obvious. Applications of this technology are nevertheless more and more in demand.

Machine translation is, of course, the heart of the universal translator. Even if we never get speech synthesis and speech recognition working properly, we should still be able to have automated translation of written text. In fact, software for translating text is gaining an ever higher profile. Just in time, too.

Increasing Imperative

The world is shrinking. With the Internet, and the shrinking cost and increased speed of global communication, the physical distance between places is becoming less of a barrier to international contact and commerce. The barrier that remains is language. If you cannot speak to the person at the other end of the line, no communication can take place.

Nowadays, business is global. Multinational corporations rule the day. The most successful companies do business all over the planet. In order to conduct business successfully in so many places, global enterprise must deal with two language-related issues: localization and internationalization. Localization is the process of making a product usable by consumers in a particular place. Internationalization is the process of making a company known and able to do business in a number of different countries. Both of these requirements involve communicating in other languages.

In June 1997, at the Localization Industry Standards Association forum in Washington, D.C., Franz Rau, director of Microsoft Corporation's division of Internal Tools and Programs, said that in 1990 the U.S. computer industry translated 20% of its documents into 30 languages, but that by 2005, 60% of the industry's documents would be translated into 80 languages worldwide.

The increasing imperative to do business in other languages means a boom in the translation services industry. Can the industry cope with the increasing demand? Translators do not grow on trees. There is much more to translation than just being able to speak two languages: a long and difficult training period is required to learn how to render a message (not just a text) to a new audience in a way that makes the message just as meaningful as it was to its original audience.

One way that business can keep up with the increasing demand is to use automated translation tools.

Unrealistic Expectations

A number of successes like METEO were reported in the 1980s. Some translation services began using automated translation tools to improve their productivity. As word of the success of automated translation spread, some potential translation clients developed unrealistic expectations.

In 1990, UPS issued a request for proposals for translation services. One of the requirements was a turnaround time of no more than four hours from submission of original text to delivery of its translation. Evidently UPS believed that automated translation was developed to the point that this was possible. Of course it wasn't in 1990 and it still isn't today.

Less Than Perfect

Automated translation still suffers from a major drawback: its results are not reliably accurate. Sometimes the product of a translation process is not even close to the meaning of the original. At worst, depending on the audience and the subject, every word may need to be scrutinized by an expert translator familiar with both languages and the subject area. At best, a knowledgeable reader may be able to get the gist of a text written in an unfamiliar language.

There are a number of companies offering automated translation tools for sale on the Web. A number of them offer evaluation packages so that you can see just how well they work. I took a look at the one that is most easily available, the translation service offered by the AltaVista Search site on the Web. This service is provided by Systran Software Inc.

I wanted to see how well the automated translation software worked with simple English sentences. I chose two sentences that are commonly used as typing exercises: "Now is the time for all good men to come to the aid of the party." and "The quick brown fox jumps over the lazy dog." So that we can all understand the results, I took the text produced in the other language and had the translation service translate it back into English. The results were uneven and varied from language to language.

Now is the time for all good men to come to the aid of the party.

This sentence was the more difficult of the two because it does not follow common English syntax and because some of the words can have more than one meaning either in English or the target language.

  • French: Is now the hour for all the good men to come using the part.
  • German: Is now the time for all good men to come to the aid of the involved one.
  • Italian: Hour is the moment for all the good men to come to the subsidy of the party.
  • Spanish: Now it is the time for all the good men to come to the aid of the party.
  • Portuguese: It is now the moment for all the men good to come to dae (automatic device of input) of the party.

As you can see, some of the translations are a little off the mark. The Spanish one is the closest to the original. The French translation seems very odd indeed, but the actual French language version was not nearly so bad. The translation software chose one of two acceptable French translations for "party" (not the one I would have chosen), a word that is more commonly rendered back into English as "part." The literal translation of "to the aid of" is a French idiom meaning "using." You can see how the ambiguities in the original sentence led to differences that were compounded on the translation back to English.

The quick brown fox jumps over the lazy dog.

This sentence fared much better than the first because it is a simple declarative sentence using concrete nouns, and leaves little room for ambiguity.

  • French: The fast brown fox jumps over the lazy dog.
  • German: The fast brown fox branches over the lazy dog.
  • Italian: The fast vixen brown jumps over the lazy dog.
  • Spanish: The fast brown fox jumps concluído the sluggish dog.
  • Portuguese: The fast brown fox jumps on the sluggish dog.

These translations are all fairly close, with only a couple of small oddities to raise your eyebrows at. (Are there no male foxes in Italy? The Spanish-English translator seems to be out of synch with the English-Spanish translator.)

It is important to remember that this is not really a fair test. Each sentence was translated twice, first to the target language and then back to English. Any small imperfections in the first translation, which may not have been enough to render the sentence incomprehensible, have been magnified by the second translation.
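The round-trip test described above can be sketched in a few lines of Python. This is only an illustration, not the method used by any translation product: the function names round_trip and word_overlap are made up for this example, the word-overlap score is a deliberately crude measure, and the translator functions are left as placeholders that a reader would have to supply. The sample data is the set of back-translations of the fox sentence reported above.

```python
import string
from typing import Callable

# A "translator" here is any function that takes text and returns text.
# Real translation engines are NOT included; these are placeholders.
Translator = Callable[[str], str]


def round_trip(text: str, forward: Translator, backward: Translator) -> str:
    """Translate into the target language, then back into the source language."""
    return backward(forward(text))


def _words(sentence: str) -> list[str]:
    """Lower-case the words and strip surrounding punctuation."""
    return [w.strip(string.punctuation).lower() for w in sentence.split()]


def word_overlap(original: str, returned: str) -> float:
    """Fraction of the original's words that also appear in the returned text."""
    original_words = _words(original)
    returned_words = set(_words(returned))
    kept = sum(1 for w in original_words if w in returned_words)
    return kept / len(original_words)


ORIGINAL = "The quick brown fox jumps over the lazy dog."

# Back-translations as reported in the article.
BACK_TRANSLATIONS = {
    "French": "The fast brown fox jumps over the lazy dog.",
    "German": "The fast brown fox branches over the lazy dog.",
    "Italian": "The fast vixen brown jumps over the lazy dog.",
    "Spanish": "The fast brown fox jumps concluído the sluggish dog.",
    "Portuguese": "The fast brown fox jumps on the sluggish dog.",
}

if __name__ == "__main__":
    for language, sentence in BACK_TRANSLATIONS.items():
        print(f"{language}: {word_overlap(ORIGINAL, sentence):.2f}")
```

A score like this only counts surviving words. It says nothing about word order or meaning, so it would rate "fast" as a miss even though it is a perfectly good rendering of "quick." That is one more reason the round trip is an unfair test.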

What Is Machine Translation Good For?

There are four basic types of machine translation software. Each type of software addresses a specific translation application that is related to the direction of the translation (into my language vs. into your language) and the level of precision required (informal vs. formal), as shown in the following table.

              Into my language    Into your language
  Informal           1                    2
  Formal             3                    4

Stand-alone translation software can be used successfully for applications 1 and 2, which are appropriate for e-mail communication between peers or for browsing the Web. Products like the one used at AltaVista are appropriate.

For applications 3 and 4, where precision is required for legal, technical, or marketing documents, human intervention is needed. A number of developers make Machine-Aided Human Translation (MAHT) tools for human translators. This process is sometimes called semi-automated translation. This software typically produces a first draft that is reviewed and corrected by a human translator. The best of these products are able to "learn" from the editing sessions. Most of them are also able to accept lists of prescribed equivalents so that industry- or company-specific jargon can be accurately substituted.
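A list of prescribed equivalents can be pictured as a simple lookup table applied to the text. The Python sketch below is an assumption about how such a list might be applied, not a description of any particular MAHT product: the function name apply_glossary and the sample glossary entries are invented for illustration. It matches whole words only, ignores case, and tries longer terms first so that a multi-word term is not broken up by a shorter one.

```python
import re


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace each glossary term in text with its prescribed equivalent."""
    if not glossary:
        return text
    # Longest terms first, so "hard disk" is matched before "disk".
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): equivalent for term, equivalent in glossary.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


# Hypothetical English-to-French glossary for a computer manufacturer.
GLOSSARY = {
    "hard disk": "disque dur",
    "disk": "disque",
    "software": "logiciel",
}

if __name__ == "__main__":
    print(apply_glossary("Back up the hard disk before installing software.", GLOSSARY))
```

In a real tool the substitution would be combined with the translation of the rest of the sentence. Here the surrounding English is left untouched so that the effect of the glossary alone is visible.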

Machine translation works best in situations where a "sublanguage" can be used. An example of a sublanguage is Standard English, which is intended for comprehension by non-native speakers of English. It has a restricted vocabulary and simplified syntax, and it avoids the use of idioms. Texts written in such a sublanguage can be translated automatically with very little human intervention.
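To make the idea of a restricted vocabulary and simplified syntax concrete, here is a rough Python sketch of a checker that flags text straying outside a sublanguage. Everything in it is an assumption made for illustration: the function name check_sublanguage, the tiny approved word list, and the use of a maximum sentence length as a stand-in for "simplified syntax." A real controlled-language specification would be far more detailed.

```python
import re


def check_sublanguage(text, approved_words, max_words_per_sentence=20):
    """Return a list of (sentence number, problem) pairs found in text."""
    problems = []
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    for number, sentence in enumerate(sentences, start=1):
        words = re.findall(r"[a-z']+", sentence.lower())
        if len(words) > max_words_per_sentence:
            problems.append((number, f"sentence has {len(words)} words"))
        for word in words:
            if word not in approved_words:
                problems.append((number, f"word not approved: {word}"))
    return problems


# Hypothetical approved vocabulary for a maintenance manual.
APPROVED = {"open", "close", "the", "valve", "slowly", "then"}

if __name__ == "__main__":
    for problem in check_sublanguage("Open the valve slowly. Then shut the valve.", APPROVED):
        print(problem)
```

A writer would run text through a check like this before it goes to the translation software, rewording each flagged sentence until nothing is reported.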

Fundamental Problem

Machine translation, while imperfect, is a valuable technology that has valid current uses. It is not a substitute for a skilled human translator. The results of an automated translation process will continue to be suspect and even humorous until software researchers find a way to solve a fundamental problem.

The solution to this problem will make reliable machine translation possible and it will also pave the way for true speech synthesis that does not sound mechanical and for speech recognition that can detect the difference between homonyms. What is this fundamental problem? Computers do not understand speech. When a method is found for computers to recognize the meaning of utterances in some way that is analogous to human understanding, these technologies will finally work as we expect them to: like on the Starship Enterprise.

S. D. Liddiard studied French, German, Latin, and Russian for many years as a student. He wishes he could translate from one to another as well as today's machine translation computer programs.

 
