Friday, 13 September 2013

The mindlessness of Google Translate

‘La pomme a mangé le garçon’ is a bizarre sentence, but an easily comprehensible one (if you speak French).  It means ‘The apple ate the boy’.  What does Google Translate make of it?

'The boy ate the apple'
Nul points

Bob Berwick at Faculty of Language has an explanation of why this happens.  In a nutshell: GT works by bombarding problems with corpus statistics, while paying very little attention to things like grammatical structure or thematic role.  Since ‘the boy ate the apple’ is statistically a much more ‘likely’ sentence than ‘the apple ate the boy’, and both sentences contain English translations of all and only the words in the French source sentence, the former wins out.  Berwick’s take-home message relates to the dangers of overusing statistics (Bayes’ Theorem in particular) in place of doing serious linguistics.
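To make the mechanism concrete, here is a toy sketch in Python.  It is emphatically not how Google Translate is implemented; it only illustrates the general idea described above.  I assume the French words have already been translated one by one into a bag of English words, and that a tiny made-up corpus (`CORPUS`, invented for this example) stands in for the web-scale statistics.  The names `score` and `most_likely_order` are likewise my own.  The sketch scores every possible ordering of the words with a smoothed bigram model and keeps the most ‘likely’ one.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy corpus, invented for illustration only.
CORPUS = [
    "the boy ate the apple",
    "the girl ate the apple",
    "the boy ate the bread",
    "the dog ate the bone",
]


def bigrams(tokens):
    """Adjacent word pairs, with sentence-boundary markers added."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return list(zip(padded, padded[1:]))


# Count how often each word pair, and each left-hand word, occurs.
bigram_counts = Counter()
context_counts = Counter()
for line in CORPUS:
    for left, right in bigrams(line.split()):
        bigram_counts[(left, right)] += 1
        context_counts[left] += 1

VOCAB = {word for pair in bigram_counts for word in pair}


def score(tokens):
    """Add-one smoothed bigram probability of a word sequence."""
    probability = 1.0
    for left, right in bigrams(tokens):
        probability *= (bigram_counts[(left, right)] + 1) / (
            context_counts[left] + len(VOCAB)
        )
    return probability


def most_likely_order(words):
    """Pick the ordering of `words` that the corpus finds most probable.

    Nothing here knows which word is the subject and which the object.
    """
    candidates = set(permutations(words))
    return " ".join(max(candidates, key=score))


# Word-by-word gloss of 'La pomme a mangé le garçon' (an assumed input).
bag_of_words = ["the", "apple", "ate", "the", "boy"]
best = most_likely_order(bag_of_words)
print(best)
```

With this corpus the model prefers the boy-eats-apple ordering, simply because ‘apple ate’ never occurs in the data while ‘boy ate’ does.  Grammatical role plays no part in the decision.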

Notwithstanding mishaps like this, however, Google Translate is remarkably successful in general.  Furthermore, overall it is significantly more successful than previous attempts at machine translation that paid much more attention to notions that are central in our best linguistic theories: things like grammatical structure (e.g. clause composition) and thematic role (e.g. verb subject/object).

It is possible to draw many morals from this scenario.  At the very least, we can say the following: it is possible to write a computer program that mimics a human cognitive activity rather well, operating in a way that is nothing like the way that human cognition works.  This is something to bear in mind amid the multifarious claims made on behalf of artificial intelligence.

N.B.: Of course, we really didn’t need this example to see that human cognition works nothing like Google Translate.  Of course native speakers aren’t carrying n-grams around in their heads.  Of course native speakers’ linguistic knowledge doesn’t amount to knowing statistical distributions of collocations of words … right?


Doug said...

Yeah - I work at a company that does that statistical AI thing. The employees are of two camps: those who actually have convinced themselves that humans might indeed carry around n-grams, and operate off collocations... and the parents. :-D

mattghg said...