UPSI Digital Repository (UDRep)
Abstract: Universiti Pendidikan Sultan Idris
Machine translation (MT) is a computer-aided application that translates one natural language into another. Many online language translation tools exist, but so far none offers translation of text sequences for the under-resourced Kadazandusun language. Although web-based and mobile Kadazandusun dictionaries are available, these systems translate only one word at a time. Hence, this paper presents a preliminary study of Malay-to-Kadazandusun translation. Basic word-to-word translation with dictionary alignment, based on Direct Machine Translation (DMT), is selected to begin the exploration of the translation domain; DMT is one of the earliest translation methods and relies on the word-to-word approach (sequence-to-sequence model). The paper investigates the under-resourced language and the task of translating from the Malay language to the Kadazandusun language or vice versa. It presents the application, the process, and the results of the system according to the basic Kadazandusun word arrangement (Verb-Subject-Object), and it reports translation quality using the Bilingual Evaluation Understudy (BLEU) score. The process involves several phases: data collection (word pair translation), preprocessing, text selection, translation procedures, and performance evaluation. The preliminary approach is shown to achieve BLEU scores of up to 0.5, which indicates that the translations are readable but require post-editing for better comprehension. The findings are significant for the quality of under-resourced language translation and serve as a starting point for other machine translation methodologies, such as statistical or deep learning-based translation. © 2023 by the author(s).
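The pipeline the abstract describes (dictionary lookup, Verb-Subject-Object arrangement, BLEU evaluation) can be illustrated with a minimal sketch. This is not the authors' system. The lexicon is hypothetical and its target-side strings are placeholders rather than real Kadazandusun words. The reordering rule assumes a three-token Subject-Verb-Object input, and the `bleu` function is a simplified single-reference, unsmoothed variant limited to bigrams. The names `LEXICON`, `translate_direct`, `ngrams`, and `bleu` are all assumptions introduced for illustration.

```python
import math
from collections import Counter

# Hypothetical word-pair dictionary. The target-side strings are placeholders,
# NOT real Kadazandusun words.
LEXICON = {
    "saya": "KDZ_saya",
    "makan": "KDZ_makan",
    "nasi": "KDZ_nasi",
}


def translate_direct(sentence, lexicon, reorder_svo_to_vso=True):
    """Word-to-word lookup; words missing from the lexicon pass through unchanged."""
    tokens = sentence.lower().split()
    # Naive reordering (assumption): a three-token input is treated as
    # Subject-Verb-Object and rearranged to Verb-Subject-Object.
    if reorder_svo_to_vso and len(tokens) == 3:
        subject, verb, obj = tokens
        tokens = [verb, subject, obj]
    return [lexicon.get(tok, tok) for tok in tokens]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU: one reference, uniform weights, no smoothing."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as often
        # as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precision_sum += math.log(matched / total) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum)


reference = ["KDZ_makan", "KDZ_saya", "KDZ_nasi"]
with_reordering = translate_direct("Saya makan nasi", LEXICON)
without_reordering = translate_direct("Saya makan nasi", LEXICON, reorder_svo_to_vso=False)
print(with_reordering, bleu(with_reordering, reference))
print(without_reordering, bleu(without_reordering, reference))
```

With the reordering step the toy output matches the placeholder reference exactly. Without it, every word is still found in the dictionary but no bigram matches, so the unsmoothed score falls to zero. This shows why word order matters to BLEU even when all dictionary lookups succeed.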