Friday, June 14, 2013

Machine translation FAIL


One of the things I like to test translation software with is formal French complimentary closings.  French uses long, gorgeous, wordy passages where we'd just say "sincerely" in English, so it's useful to determine whether the software recognizes the function of the text.  I was recently demonstrating this, and got the following result (click to embiggen):




For those of you who don't read French, the phrase input here is a French complimentary closing, appropriate to a formal business letter. With the exception of one serious error, the English is a reasonable literal translation.

There are two problems here, one macro and one micro.

The macro problem is that the French is a complimentary closing, and the English is not.  English complimentary closings are things like "Sincerely," or "Yours truly," and that's how this sentence should be translated.  The actual words don't matter; the message is "This is to indicate that I am ending the letter in the prescribed manner, and the next thing you see will be my signature."

And the micro problem is that, on a word-for-word level, it translated the French "Madame" (i.e. Ms. or Ma'am) with the English "Sir", thereby addressing the recipient as the wrong gender.  Not only is this clearly unacceptable, it's something even the most simplistic machine translation should be able to handle. Even if an individual text in their corpus got misaligned, they should have some mechanism to recognize that "Sir" is not the most common translation of "Madame". Even a calque of the French ("Madam") would be a better translation than "Sir", which is a sure sign of a particularly bad translation. I'm quite surprised to see this happening in 2013.
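To make that "some mechanism" a bit more concrete, here's a minimal sketch of the kind of sanity check I have in mind. It's purely illustrative: the aligned word pairs are toy data I made up, the function name is hypothetical, and I'm not claiming this is how Google Translate (or any real system) works internally. The idea is just that if you count how often each target word is aligned with a source word, one misaligned pair shouldn't be able to outvote the usual translation.

```python
from collections import Counter

# Toy word-aligned pairs (source word, target word), invented for illustration.
# One pair is deliberately misaligned, as might happen in a real corpus.
ALIGNED_PAIRS = [
    ("Madame", "Madam"),
    ("Madame", "Madam"),
    ("Madame", "Ms."),
    ("Madame", "Sir"),  # the misaligned outlier
    ("Monsieur", "Sir"),
    ("Monsieur", "Sir"),
]


def most_common_translation(source_word, pairs):
    """Return the target word most often aligned with source_word, or None."""
    counts = Counter(target for source, target in pairs if source == source_word)
    if not counts:
        return None  # the source word never appears in the aligned data
    return counts.most_common(1)[0][0]


print(most_common_translation("Madame", ALIGNED_PAIRS))  # prints "Madam"
```

With this toy data, the single stray "Sir" loses to "Madam", which is the behaviour I'd expect even from a very simplistic system.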

8 comments:

laura k said...

#LeastImportantThing: So "Madam" is less correct than Ms. or Ma'am? I'm glad to know that, as I've written about hating the "Dear Sir or Madam" salutation.

Plus I learned a new word: calque. I looked it up on Wikipedia, but I'm still not completely clear on the concept.

laura k said...

^^ Googled it and read the Wiki entry.

impudent strumpet said...

As it's used in my corner of the real world, a calque is a too-literal translation that's not idiomatic in the target language, but not outright wrong. Mistranslating the Spanish embarazada as embarrassed isn't a calque, it's a faux ami. (Translation terminology as we learn it in Canada tends to be in French.) Calques are more about adhering to the wording and structure of the source text when you'd never land on that wording in the target language. It's the sort of thing that happens when you're relying too heavily on a dictionary.

And now I'm having a lot of trouble thinking of examples that are calques. I'll add to this after I work tomorrow. Apart from the erroneous "Sir", the translation provided by Google Translate in the original blog post is pretty much a calque (although it's less fruitful an example than whatever I'll think of while working or showering tomorrow.)

impudent strumpet said...

Meanwhile, if anyone reading this has examples of calques in French or Spanish, post them here!

impudent strumpet said...

Also, I don't think "Madam" is less correct than "Ms." or "Ma'am", I think they have different functions in English. Ms. is the title, Ma'am is how you'd address someone verbally, and Madam is what you'd use in writing if it were necessary to do so (although circumlocuting is generally more stylish). Addressing someone verbally as "Madam" would come across as kind of phoney-formal, and writing "Dear Sir or Ma'am" would just be odd.

laura k said...

Thanks for the great explanation. I'll come back for examples.

impudent strumpet said...

I had an example of a calque right there staring me in the face in my own comment! In the world of translation (and often in French class in general), we use the term "faux amis" to refer to the "embarazada" = "embarrassed" error. (For those who don't speak Spanish, "embarazada" means "pregnant".) Translation training in Canada tends to use French terminology even in English, so you get a bunch of anglophones walking around saying things like "faux amis" and no one blinks an eye.

The non-French-influenced English term for "faux amis" is "false cognates". However, sometimes people calque the French "faux amis" to make the English "false friends."

As you can see, it's not outright wrong, but it does smell like a translation, and generally sounds more foreign and less natural than the uninfluenced English.

laura k said...

Great examples, now I understand the concept.