ths1104

And without the Rosetta Stone ? part.2

Posted in Language by ths1104 on 21/11/2010

See: And without the Rosetta Stone ? part.1

The first thing to be done when trying to translate such a text seems to transliterate it. It will then be easier to analyse it with common data mining tools. The text being written in Greek alphabet, we can use tools such as transliterate.com to obtain the transliteration of our text:

euthys de parastas moi Peisandros “andres” ephē “bouleutai, egō ton andra touton endeiknyō hymin siton te eis tous polemious eisagagonta kai kōpeas.” kai to pragma ēdē pan diēgeito hōs epeprakto. en de tō tote ta enantia phronountes dēloi ēsan ēdē hoi epi stratias ontes tois tetrakosiois. kagō — thorybos gar dē toioutos egigneto tōn bouleutōn — [kai] epeidē egignōskon apoloumenos, euthys prospēdō pros tēn hestian kai lambanomai tōn hierōn. hoper moi kai pleistou axion egeneto en tō tote: eis gar tous theous echonta oneidē houtoi me mallon tōn anthrōpōn eoikasi kateleēsai, boulēthentōn te autōn apokteinai me houtoi ēsan hoi diasōsantes. desma te hysteron kai kaka hosa te kai hoia tō sōmati ēneschomēn, makron an eiē moi legein. hou dē kai malist᾽ emauton apōlophyramēn: hostis touto men en hō edokei ho dēmos kakousthai, egō anti toutou kaka eichon, touto de epeidē ephaineto hyp᾽ emou peponthōs, palin au kai dia tout᾽ egō apōllymēn. hōste hodon te kai poron mēdamē eti einai moi eutharsein: hopoi gar trapoimēn, pantothen kakon ti moi ephaineto hetoimazomenon. all᾽ homōs kai ek toutōn toioutōn ontōn apallageis ouk estin ho ti heteron ergon peri pleionos epoioumēn ē tēn polin tautēn agathon ti ergasasthai.

Of course for an unknown script, a transliteration table has to be created.

The next step would probably be to compile some statistics. What could be interesting to know ? What program should I use ?

Tagged with: ,

And without the Rosetta Stone ? part.1

Posted in Language by ths1104 on 13/11/2010

Is it possible to translate a text written in an unknown language ?

Suppose Greek is an unknown language. You have found an old document with the following script. You want to translate it. What would you do ?

εὐθὺς δὲ παραστάς μοι Πείσανδρος “ἄνδρες” ἔφη “βουλευταί, ἐγὼ τὸν ἄνδρα τοῦτον ἐνδεικνύω ὑμῖν σῖτόν τε εἰς τοὺς πολεμίους εἰσαγαγόντα καὶ κωπέας.” καὶ τὸ πρᾶγμα ἤδη πᾶν διηγεῖτο ὡς ἐπέπρακτο. ἐν δὲ τῶ τότε τὰ ἐναντία φρονοῦντες δῆλοι ἦσαν ἤδη οἱ ἐπὶ στρατιᾶς ὄντες τοῖς τετρακοσίοις. κἀγώ — θόρυβος γὰρ δὴ τοιοῦτος ἐγίγνετο τῶν βουλευτῶν — [καὶ] ἐπειδὴ ἐγίγνωσκον ἀπολούμενος, εὐθὺς προσπηδῶ πρὸς τὴν ἑστίαν καὶ λαμβάνομαι τῶν ἱερῶν. ὅπερ μοι καὶ πλείστου ἄξιον ἐγένετο ἐν τῷ τότε: εἰς γὰρ τοὺς θεοὺς ἔχοντα ὀνείδη οὗτοί με μᾶλλον τῶν ἀνθρώπων ἐοίκασι κατελεῆσαι, βουληθέντων τε αὐτῶν ἀποκτεῖναί με οὗτοι ἦσαν οἱ διασῴσαντες. δεσμά τε ὕστερον καὶ κακὰ ὅσα τε καὶ οἷα τῷ σώματι ἠνεσχόμην, μακρὸν ἂν εἴη μοι λέγειν. οὗ δὴ καὶ μάλιστ᾽ ἐμαυτὸν ἀπωλοφυράμην: ὅστις τοῦτο μὲν ἐν ᾧ ἐδόκει ὁ δῆμος κακοῦσθαι, ἐγὼ ἀντὶ τούτου κακὰ εἶχον, τοῦτο δὲ ἐπειδὴ ἐφαίνετο <εὖ> ὑπ᾽ ἐμοῦ πεπονθώς, πάλιν αὖ καὶ διὰ τοῦτ᾽ ἐγὼ ἀπωλλύμην. ὥστε ὁδόν τε καὶ πόρον μηδαμῇ ἔτι εἶναί μοι εὐθαρσεῖν: ὅποι γὰρ τραποίμην, πάντοθεν κακόν τί μοι ἐφαίνετο ἑτοιμαζόμενον. ἀλλ᾽ ὅμως καὶ ἐκ τούτων τοιούτων ὄντων ἀπαλλαγεὶς οὐκ ἔστιν ὅ τι ἕτερον ἔργον περὶ πλείονος ἐποιούμην ἢ τὴν πόλιν ταύτην ἀγαθόν τι ἐργάσασθαι.

Some precisions :

  • No Greek dictionary, no Rosetta Stone…
  • Suppose this is the unique document available.
  • As the problem to know whether a script is written with phonetic characters or not is already in itself a difficult issue, to simplify things you can suppose to begin that the text you try to translate is actually written with an alphabet. And that’s why I choose a text in Greek.
  • I choose this text randomly among the collection of texts of the Perseus Digital Library.  A translation is available but please don’t use it. Don’t cheat with Google translate 😉
  • All suggestions are welcome. If you have any idea, whether you have the time to explore it or not, please post it here before.

Edit (Nov 21st, 2010)

It can be argued that even if Greek is an unknown language, some European languages are descended from it. We should then find some graphic analogy between Greek and modern languages. I would probably allow us to translate some words from the original text. But I am not interested in this kind of considerations here because I am searching a general method to translate any unknown language. So let’s say this text has no relation with any known language.

This paragraph is just to remember that I am aware that Greek is inherently close to English, as sentence for example have the same structure in both languages. It will therefore facilitate our analysis. But if the analysis is successful we will try to translate other languages.