Automated Translation (part III)

Automatic translation (AT)
Automatic translation, or machine translation, requires no human intervention. There are two major approaches:[1] rule-based and statistical automatic translation. Some systems, however, combine the two (hybrid systems).

Rule-based machine translation
This type of translation relies on dictionaries and on rules of grammar and conjugation. It is the classic method behind commercially available programs such as Reverso and Babelfish. Although these programs are widely available and often free, they reach their limits as soon as a text is complex or contains ambiguous expressions.

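For example, an automatic translation program will correctly translate the idiom “Être dans de beaux draps” (to be in a mess), but will struggle with “Il a dormi dans de beaux draps,” which it is liable to render as “He slept in a mess” even though the sentence literally means “He slept in fine sheets.”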
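To make the limitation concrete, here is a minimal sketch of a rule-based translator in Python. The idiom table, the phrase dictionary, and the `translate` function are all hypothetical and invented for illustration; they are not how Reverso or Babelfish actually work. The sketch simply applies an idiom rule before a phrase-by-phrase dictionary lookup.

```python
import re

# Hypothetical rule tables, invented for this illustration only.
IDIOMS = {
    "dans de beaux draps": "in a mess",
}
PHRASES = {
    "il": "he",
    "a dormi": "slept",
    "être": "to be",
}

def translate(sentence: str) -> str:
    """Apply the idiom rules first, then the phrase dictionary."""
    text = sentence.lower().rstrip(".")
    for table in (IDIOMS, PHRASES):
        for french, english in table.items():
            # Whole-word match so short entries do not fire inside other words
            text = re.sub(rf"\b{re.escape(french)}\b", english, text)
    return text

print(translate("Être dans de beaux draps"))
print(translate("Il a dormi dans de beaux draps"))
```

The idiom rule fires in both sentences because it has no way of knowing that the second one is meant literally. A real system has far more rules, but each rule faces the same problem: it applies whenever its pattern matches, whatever the context.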

Statistical automatic translation
This model appears to be the most promising, so I will spend more time on it.

In very general terms, statistical automatic translation works in two stages. First, bilingual texts are loaded into the system’s database, as in a translation memory (see Automated Translation, part II). Then, based on this content, the system uses statistical calculations (probabilities) to determine the most likely translation.

This is the principle that drives Google Translate.
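The two stages can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a description of any real engine: the four-entry corpus and the function names `train` and `best_translation` are assumptions of mine, and actual systems rely on far larger corpora and more elaborate probability models.

```python
from collections import Counter, defaultdict

# Hypothetical bilingual corpus: (source phrase, human translation) pairs.
CORPUS = [
    ("dans de beaux draps", "in a mess"),
    ("dans de beaux draps", "in a mess"),
    ("dans de beaux draps", "in a fine mess"),
    ("dans de beaux draps", "in fine sheets"),
]

def train(corpus):
    """Stage 1: count how often each translation was used for each source phrase."""
    counts = defaultdict(Counter)
    for source, target in corpus:
        counts[source][target] += 1
    return counts

def best_translation(counts, source):
    """Stage 2: return the most frequent translation and its estimated probability."""
    options = counts.get(source)
    if not options:
        return None  # the phrase never appeared in the corpus
    total = sum(options.values())
    target, n = options.most_common(1)[0]
    return target, n / total

model = train(CORPUS)
print(best_translation(model, "dans de beaux draps"))
```

The system has no notion of what the phrase means. It picks “in a mess” only because human translators chose that rendering most often in the reference texts, which is why the content of the corpus matters so much.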

From a purely linguistic standpoint, the quality of automated translation essentially depends on four criteria:

  • The volume of bilingual reference documents (the corpus) fed into the system.
  • The match between this corpus and the subject area of the texts to translate, since specialized translation involves specialized language and vocabulary. This is also the case for big businesses, which often have their own style, terminology, and typical ways of saying things.
  • The quality of this corpus. Like humans, machines need high-quality educational documents in order to learn properly. The quality of learning will depend on the quality of language in the source and target language materials that the machine is given to “digest.”
  • The quality of the text to translate (quality of writing). Ambiguity is the number one enemy of machine translation. A machine can only predict a translation if the original text is written properly, if the chosen words have the right meaning, and if grammatical rules have been followed.
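The second criterion, the match between corpus and subject area, is easy to illustrate with a toy example. The two corpora below and the `most_likely` function are hypothetical, chosen only to show how the same French word can come out differently depending on the reference documents.

```python
from collections import Counter

# Hypothetical reference corpora: (French word, translation used) pairs.
GENERAL_CORPUS = [("avocat", "avocado"), ("avocat", "avocado"), ("avocat", "lawyer")]
LEGAL_CORPUS = [("avocat", "lawyer"), ("avocat", "lawyer"), ("avocat", "counsel")]

def most_likely(corpus, word):
    """Return the translation used most often for `word` in the given corpus."""
    counts = Counter(target for source, target in corpus if source == word)
    return counts.most_common(1)[0][0] if counts else None

print(most_likely(GENERAL_CORPUS, "avocat"))
print(most_likely(LEGAL_CORPUS, "avocat"))
```

With the general corpus the word comes out as “avocado”; with the legal corpus it comes out as “lawyer.” A law firm feeding its own documents into the system would therefore expect better results than one relying on a general-purpose corpus.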

In part IV (final installment), we will see what this means in practice…

(to be continued)

Translated by Joachim Lépine, C. Tr.


[1] A third principle has also recently emerged, namely example-based automatic translation, but I will not address it here.
