Item Details

Learners’ Feedback Regarding ASR-based Dictation Practice for Pronunciation Learning

Issue: Vol 36 No. 2 (2019)

Journal: CALICO Journal

DOI: 10.1558/cj.34738

Abstract:

Although early ASR-based dictation programs were criticized for lack of accuracy and explicit feedback for L2 pronunciation practice, teachers and researchers have shown renewed interest. However, little is known about student reactions to ASR-based dictation practice. This qualitative study examines student perspectives, identifying advantages and challenges to working with dictation software and generating ideas for the ideal ASR dictation program. Advanced ESL participants (n=16) worked with Windows Speech Recognition in a three-week hybrid pronunciation workshop. The study identifies many themes, including advantages such as ease of use, usefulness for pronunciation learning due to feedback provided, and heightened awareness of pronunciation issues, but also disadvantages, such as frustrating levels of recognition, particularly in the first attempt, doubts about the program's transcription abilities, and lack of convenience. Participants reported that convenience and greater support in pronunciation practice would be important for an ideal program.

Author: Shannon McCrocklin

References:

Blankenship, B. (1991). Second language vowel perception. Journal of the Acoustical Society of America, 90, 2252–2252. https://doi.org/10.1121/1.401514

Celce-Murcia, M., Brinton, D., & Goodwin, J. (2010). Teaching pronunciation (2nd ed.). Cambridge, England: Cambridge University Press.

Cincarek, T., Gruhn, R., Hacker, C., Nöth, E., & Nakamura, S. (2008). Automatic pronunciation scoring of words and sentences independent from the non-native’s first language. Computer Speech and Language, 23, 65–88. https://doi.org/10.1016/j.csl.2008.03.001

Coniam, D. (1999). Voice recognition software accuracy with second language speakers of English. System, 27, 49–64. https://doi.org/10.1016/S0346-251X(98)00049-9

Cordier, D. (2009). Speech recognition software for language learning: Toward an evaluation of validity and student perceptions (Doctoral dissertation). http://scholarcommons.usf.edu.

Creswell, J. W., & Plano Clark, V. L. (2007). Designing and conducting mixed methods research. Thousand Oaks, CA: Sage Publications.

Cucchiarini, C., & Strik, H. (2018). Automatic Speech Recognition. In O. Kang, R. I. Thomson, & J. M. Murphy (Eds.), The Routledge handbook of contemporary English pronunciation (pp. 556–569). New York, NY: Routledge.

Derwing, T., Munro, M., & Carbonaro, M. (2000). Does popular speech recognition software work with ESL speech? TESOL Quarterly, 34(3), 592–603. https://doi.org/10.2307/3587748

Flege, J. E., Munro, M. J., & Fox, R. A. (1993). Auditory and categorical effects on cross-language vowel perception. Journal of the Acoustical Society of America, 95, 3623–3641. https://doi.org/10.1121/1.409931

Gao, Y., Xie, Y., Cao, W., & Zhang, J. (2015). A study on robust detection of pronunciation erroneous tendency based on deep neural network. Proceedings from INTERSPEECH 2015, Dresden, Germany, 693–696. Available from https://www.isca-speech.org/archive/interspeech_2015.

Hincks, R. (2003). Speech technologies for pronunciation feedback and evaluation. ReCALL, 15(1), 3–20. https://doi.org/10.1017/S0958344003000211

Hincks, R. (2015). Technology and learning pronunciation. In M. Reed & J. Levis (Eds.), The handbook of English pronunciation (pp. 505–519). Malden, MA: John Wiley & Sons.

Johnson, B., & Turner, L. A. (2003). Data collection strategies in mixed methods research. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 297–319). Thousand Oaks, CA: Sage Publications.

Levis, J., & Suvorov, R. (2014). Automated speech recognition. In C. Chapelle (Ed.), The encyclopedia of applied linguistics. http://onlinelibrary.wiley.com/.

Levy, M. (2015). The role of qualitative approaches to research in CALL contexts: Closing in on the learner’s experience. CALICO Journal, 32(3), 554–568. https://doi.org/10.1558/cj.v32i3.26620

Liakin, D., Cardoso, W., & Liakina, N. (2014). Learning L2 pronunciation with a mobile speech recognizer: French /y/. CALICO Journal, 32(1), 1–25. https://doi.org/10.1558/cj.v32i1.25962

Liakin, D., Cardoso, W., & Liakina, N. (2017). Mobilizing instruction in a second-language context: Perceptions of two speech technologies. Languages, 2(3), 1–21. https://doi.org/10.3390/languages2030011

Madriz, E. (2000). Focus groups in feminist research. In N. Denzin & Y. Lincoln (Eds.), Handbook of qualitative research (2nd ed.) (pp. 835–850). Thousand Oaks, CA: Sage Publications.

McCrocklin, S. (2016). Pronunciation learner autonomy: The potential of automatic speech recognition. System, 57, 25–42. https://doi.org/10.1016/j.system.2015.12.013

McCrocklin, S. (2019). ASR-based dictation practice for second language pronunciation improvement. Journal of Second Language Pronunciation, 5(1), 98–118.

Mroz, A. (2018). Seeing how people hear you: French learners experiencing intelligibility through automatic speech recognition. Foreign Language Annals, 51(3), 1–21. https://doi.org/10.1111/flan.12348

Neri, A., Cucchiarini, C., & Strik, H. (2003). Automatic speech recognition for second language learning: How and why it actually works. Proceedings from the 15th ICPhS, Barcelona, Spain, 1157–1160.

Neri, A., Mich, O., Gerosa, M., & Giuliani, D. (2008). The effectiveness of computer assisted pronunciation training for foreign language learning by children. Computer Assisted Language Learning, 21(5), 393–408. https://doi.org/10.1080/09588220802447651

Saito, K., & Lyster, R. (2012). Effects of form-focused instruction and corrective feedback on L2 pronunciation development of /ɹ/ by Japanese learners of English. Language Learning, 62(2), 595–633. https://doi.org/10.1111/j.1467-9922.2011.00639.x

Schwienhorst, K. (2008). Learner autonomy and CALL environments. New York, NY: Routledge.

Sheerin, S. (1997). An exploration of the relationship between self-access and independent learning. In P. Benson & P. Voller (Eds.), Autonomy and independence in language learning (pp. 54–65). London, England: Longman.

Strik, H., Neri, A., & Cucchiarini, C. (2008). Speech technology for language tutoring. Proceedings of Language and Speech Technology (LangTech ’08) Conference, Rome, Italy, 73–76.

Tepperman, J. (2009). Hierarchical methods in automatic pronunciation evaluation (Doctoral dissertation). Ann Arbor, MI: UMI Dissertation Services.

Wallace, L. (2016). Using Google Web Speech as a springboard for identifying personal pronunciation problems. Proceedings of the 7th Annual Pronunciation in Second Language Learning and Teaching Conference. Retrieved from https://apling.engl.iastate.edu/alt-content/uploads/2016/08/PSLLT7_July29_2016_B.pdf.

Wang, H., Qian, W., & Meng, H. (2013). Predicting gradation of L2 English mispronunciations using crowdsourced ratings and phonological rules. Proceedings from Speech and Language Technology in Education 2013. Grenoble, France, 1–5.

Wang, Y. H., & Young, S. S. C. (2015). Effectiveness of feedback for enhancing English pronunciation in an ASR-based CALL system. Journal of Computer Assisted Learning, 31, 493–504. https://doi.org/10.1111/jcal.12079