Ricochet Member of the Day
We have Turkish members here on Ricochet (quite some number, actually), and of course we have Bill Walsh, but when a member named John H left a particularly apt response in Turkish to one of my posts, I thought--who is this guy?
So I looked up his profile:
Just another beer-drinking Texan with a Ph.D. in biochemistry, proficiency in Portuguese, and 150,000 miles on his bicycle. My vocation now is computer programming; my impossible mission, to make cat and Toyota Echo ownership look totally masculine. (Well, the cats are easy: I just treat 'em like horses, slapping their flanks and singing to 'em. As for the Echo, I don't know...at least it has a 5-speed manual.) My interests - machine translation of Turkic languages, the unavoidable faultiness of computer models, the once and future Yugoslavia, all the lusophone world, and pedaling up to people in other area codes and watching their eyes telegraph But you're not wearing Spandex! - are summarized at http://www.machine-altaica.com/.
Doesn't Ricochet have the most interesting members?
John, if you crack the problem of machine-translating Turkic languages, you will be my personal hero. I'll leave it to you to explain how you're approaching it.
It's a hugely challenging problem because Turkish is agglutinative, ridiculously morphologically complex, and has a flexible word order--no simple "subject-verb-object" stuff around here. Machine translation, even if it results in a correctly translated root, usually messes up the morphemes. Nine times out of ten, you get gobbledygook.
Google Translate, you'll see, can't even begin to fathom a phrase such as Avrupalılaştıramadıklarımızdanmışsınız--which means, "It seems you're one of those people we were not able to Europeanize." It just gives up.
Which by the way also sums up the state of Turkey's EU accession negotiations.

Comments:
Jun '10
Re: Ricochet Member of the Day
There are a lot of very funny member profiles. Look 'em up and you'll be pleased. This one is definitely one of the best I've seen. Bravo.
Sep '10
Re: Ricochet Member of the Day
Wow, a post so idiotic that it actually made me cancel my account.
Oh well,
Poof
Re: Ricochet Member of the Day
GT Speetzen: Wow, a post so idiotic that it actually made me cancel my account.
Oh well,
Poof · Apr 23 at 12:41am
Interesting example. I've never found a machine translator that could render this sentence into anything like the syntax a native Turkish speaker would use to express this concept.
May '10
Re: Ricochet Member of the Day
John H should be posting more.
GT Speetzen thankfully won't be.
Feb '11
Re: Ricochet Member of the Day
GT Speetzen: Wow, a post so idiotic that it actually made me cancel my account.
Oh well,
Poof · Apr 23 at 12:41am
budala
Apr '11
Re: Ricochet Member of the Day
Now I have to go add something to my profile. I usually leave those things blank/empty, but, then usually there's no possibility of being awarded Member of the Day. I'm nothing if not competitive. :-)
Is it interesting that I'm currently trying to translate hieroglyphics into Gaelic and that I don't wear gabardine while speed walking? Too derivative?
Re: Ricochet Member of the Day
Lady Bertrum: Now I have to go add something to my profile. I usually leave those things blank/empty, but, then usually there's no possibility of being awarded Member of the Day. I'm nothing if not competitive. :-)
Is it interesting that I'm currently trying to translate hieroglyphics into Gaelic and that I don't wear gabardine while speed walking? Too derivative? · Apr 23 at 7:58am
You can have "comment of the day." How's that?
May '10
Re: Ricochet Member of the Day
See, I just feel like you should get more than Member of the Day for Turkish-English translation. That's seriously hard stuff. It's not for nothin' that Turkey has one of the lowest English-proficiency scores on the planet. It's an Asian steppe language. Almost no correlation.
Being thus discouraged from ever being Member of the Day, I'll just continue posting innuendo, horserace handicapping, Walter Russell Mead rip-offs, and innuendo. Also innuendo.
PS, not sure my profile would cut any ice.
Edited on Apr 23, 2011 at 10:02am
Nov '10
Re: Ricochet Member of the Day
One of the reasons why Turkish machine translation hasn't been very successful so far is not that it's fundamentally hard for a machine, but that not much money and time has been expended on it, Claire. Yes, I know human beings find it hard to learn to agglutinate; but machines don't. Agglutinative morphology is in fact one of the easier morphologies for machines to handle: the word can be broken up into easily analyzable chunks. Morphemes in Turkish have a few fixed forms varying predictably according to vowel harmony, and so they're easily recognizable by machines. What's called "non-concatenative" morphology (i.e. where morphemes are embedded inside other morphemes, as in Arabic) is far harder. Yet compare how (relatively) well Google Translate does on Arabic compared to Turkish.
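To make the "easily analyzable chunks" point concrete, here is a minimal toy sketch in Python. It is not anyone's actual analyzer: the function names (`harmonize`, `segment`), the suffix-template notation ("A" for the a/e alternation, "I" for the ı/i/u/ü alternation), and the tiny root list are all assumptions made for illustration. It also assumes lowercase input and ignores consonant changes, buffer letters, and exceptions to harmony, which a real system would need to handle.

```python
# Toy illustration of Turkish vowel harmony and suffix segmentation.
# All names and the template notation are hypothetical, for illustration only.

BACK = set("aıou")
FRONT = set("eiöü")
ROUNDED = set("ouöü")
VOWELS = BACK | FRONT


def last_vowel(word):
    """Return the last vowel of a lowercase Turkish word."""
    for ch in reversed(word):
        if ch in VOWELS:
            return ch
    raise ValueError("no vowel found in %r" % word)


def harmonize(stem, template):
    """Fill in a suffix template so it agrees with the stem's last vowel.

    'A' becomes a (after back vowels) or e (after front vowels).
    'I' becomes ı, i, u or ü, agreeing in both backness and rounding.
    """
    v = last_vowel(stem)
    a_form = "a" if v in BACK else "e"
    if v in BACK:
        i_form = "u" if v in ROUNDED else "ı"
    else:
        i_form = "ü" if v in ROUNDED else "i"
    return template.replace("A", a_form).replace("I", i_form)


def _strip(stem, rest, templates):
    """Recursively peel harmonized suffixes off the front of `rest`."""
    if not rest:
        return []
    for template in templates:
        form = harmonize(stem, template)
        if rest.startswith(form):
            tail = _strip(stem + form, rest[len(form):], templates)
            if tail is not None:
                return [form] + tail
    return None


def segment(word, roots, templates):
    """Split a word into a known root plus suffixes, or return None."""
    for root in roots:
        if word.startswith(root):
            parts = _strip(root, word[len(root):], templates)
            if parts is not None:
                return [root] + parts
    return None


if __name__ == "__main__":
    roots = ["ev", "kitap", "göz", "okul"]
    templates = ["lAr", "ImIz"]  # plural, "our"
    for w in ["evlerimiz", "kitaplarımız", "gözlerimiz", "okullarımız"]:
        print(w, "->", segment(w, roots, templates))
```

The design point is the one made above: because each suffix has only a handful of predictable surface forms, the analyzer can generate the expected form from the stem and compare, rather than search a large table of irregular variants.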
May '10
Re: Ricochet Member of the Day
Google translation systems work exclusively on the basis of corpus-based statistics. It's a matter of religion for them, it seems. Google-style translation is good for getting an idea of what a text is about, which can be quite useful.
However, I am not sure whether they even touch what's traditionally known as morphology at all. And Mr. Aristar is right on the money about agglutinating morphology: e.g., for Turkish it's more or less solved -- check the work of Kemal Oflazer, of Sabanci U. and CMU campus in the Emirates.
The real reason for bad machine translation (MT), however, is the difficulty of extracting and manipulating the meaning of text. That is a very difficult and expensive task, but it is a strong prerequisite for high-quality translation.
Being interested in computational semantics of natural languages, I worked in MT for many years until the statisticians took over promising to get results cheaper and sooner. I had to find a different set of applications -- which, in the end, proved a boon... But that's another story.
Re: Ricochet Member of the Day
Really? It's a matter of investment rather than the inherent difficulty of the problem? I did not know that. I wonder if now's the time to be hitting up the famous new breed of Turkish gazillionaires to invest in it, then?
Aug '10
Re: Ricochet Member of the Day
Here's why machine translation won't ever be adequate for any language pair. Imagine two such machines, both perfect: whatever you give one, its translation is unambiguously translated back by the other. Well, right away you see two problems: you, an outsider, have to seed the conversation ("Uh, OK...'The cheeseheads are out in force at Lambeau'") and then the two devices cycle pointlessly ("Lambeau'da çok peynirbaşları var" by definition inevitably returns "The cheeseheads are out in force at Lambeau"). Only intrusions from life itself can keep this going in a nontrivial manner, and to program machines to keep up with life is to do their work for them.
Claire, thanks for the citation - you've shown far more generosity than I ever have - and keep l'arnin' that Turkish! You'll know you've arrived when you no longer consciously "figure out" where all the infixes go. "Figuring out" is all a machine can ever do.
Re: Ricochet Member of the Day
I cannot wait to test this example on actual Turkish native speakers.
Do you agree that the difference between the quality of machine translation of Turkish and of Arabic right now is simply a matter of investment?
Aug '10
Re: Ricochet Member of the Day
A prideful man, I always shoo away help from browsers that want to translate for me. Which means, really, I just don't look at a lot of stuff. I stick to sites I can read unassisted, or at least get through with a dictionary. (Turkish is in the latter category. I'm hardly conversant in it but I know nearly all the grammar. Turkish grammar may be utterly unfamiliar but as you have undoubtedly observed, it is also commendably regular - it really was not that hard to program for it, even sitting on a garden bench with my right hand grasping a cold one and my left hand petting a cat.) So, I just can't compare machine translators. And I've never had much interest in Arabic. I got to recognizing "Israel," "Palestine," and a number, which was always a body count. Sigh.
I really think the future (maybe it is already the present) of machine translation is something like the so-called "data miner" I prototyped on my site: something that finds connections, reducing a huge corpus to a much smaller one, only then to be read and recast by bilingual humans.