Conditional Random Fields Applied to Arabic Orthographic-Phonetic Transcription

Journal title

Archives of Acoustics




vol. 46


No 2


Cherifi, El-Hadi : Department of Electronics, Signal and Communications Laboratory, National Polytechnic School, El-Harrach 16200, Algiers, Algeria ; Guerti, Mhania : Department of Electronics, Signal and Communications Laboratory, National Polytechnic School, El-Harrach 16200, Algiers, Algeria



Orthographic-To-Phonetic Transcription ; Conditional Random Fields ; text-to-speech ; Arabic speech synthesis ; Modern Standard Arabic

Divisions of PAS

Technical Sciences




Committee on Acoustics PAS, PAS Institute of Fundamental Technological Research, Polish Acoustical Society








DOI: 10.24425/aoa.2021.136574


Archives of Acoustics; 2021; vol. 46; No 2; 237-247