Google/Yandex Translation Detection in the Patterns Identifying System of Multilingual Texts

Authors

  • Vladimir Kulikov
  • Valentina Kulikova
  • Gulnur Yerkebulan

DOI:

https://doi.org/10.47839/ijc.20.1.2094

Keywords:

Google Translate, Yandex.Translate, English, Russian, Kazakh, FuzzyWuzzy

Abstract

The objective of this work is to develop a script that evaluates how well online translators translate text from one language to another. Two services were evaluated: Google Translate and Yandex.Translate. The analysis covers 147 news items (about 1800 sentences) in English, Kazakh and Russian, taken from the website astana.gov.kz, from which a corpus of parallel texts in the three languages was created. The script was developed for the “sentence” pattern, with the prospect of extending it to the “text” pattern. We analyzed errors in the following categories: untranslated or omitted words, extra words, incorrect word endings, incorrect word order, punctuation errors, mutilated translation and incorrect translation. Based on the analysis of the obtained data, we conclude that Yandex.Translate translates Russian text into Kazakh or English better than Google Translate. The developed comparison script and error analysis script are openly available on the Internet.
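The sentence-level comparison described above can be illustrated with a minimal sketch. It is not the authors' script: it assumes a reference (human) translation is available for each sentence, and it substitutes Python's standard `difflib.SequenceMatcher` for the fuzzy matching library named in the keywords. The function names `normalize`, `similarity` and `word_differences` and the sample sentences are hypothetical, and only two of the listed error categories (omitted words and extra words) are approximated.

```python
import re
from difflib import SequenceMatcher


def normalize(sentence):
    """Lowercase a sentence and return its word tokens, dropping punctuation.

    The pattern \\w+ is Unicode-aware for str input, so Cyrillic and
    Kazakh letters are kept as word characters.
    """
    return re.findall(r"\w+", sentence.lower())


def similarity(reference, candidate):
    """Return a 0-100 similarity score between two normalized sentences.

    Assumption: difflib stands in for the fuzzy string matching
    used in the original work.
    """
    ref_text = " ".join(normalize(reference))
    cand_text = " ".join(normalize(candidate))
    return round(100 * SequenceMatcher(None, ref_text, cand_text).ratio())


def word_differences(reference, candidate):
    """Approximate two error categories by comparing word tokens.

    Returns (omitted, extra): reference words missing from the candidate,
    and candidate words that do not occur in the reference.
    """
    ref_words = normalize(reference)
    cand_words = normalize(candidate)
    omitted = [w for w in ref_words if w not in cand_words]
    extra = [w for w in cand_words if w not in ref_words]
    return omitted, extra


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the corpus.
    reference = "The city opened a new park."
    candidate_a = "The city opened a new park"
    candidate_b = "The city has opened a park."

    for name, candidate in [("A", candidate_a), ("B", candidate_b)]:
        score = similarity(reference, candidate)
        omitted, extra = word_differences(reference, candidate)
        print(name, score, "omitted:", omitted, "extra:", extra)
```

Scoring each service's output against the same reference sentence and averaging over the corpus would give one way to rank translators. Categories such as incorrect word endings or word order would need additional steps, for example stemming, which this sketch leaves out.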

Published

2021-03-29

How to Cite

Kulikov, V., Kulikova, V., & Yerkebulan, G. (2021). Google/Yandex Translation Detection in the Patterns Identifying System of Multilingual Texts. International Journal of Computing, 20(1), 72-77. https://doi.org/10.47839/ijc.20.1.2094

Section

Articles