The contribution of dynamic versus static formant information in conversational speech
Issue: Vol 27 No. 1 (2020)
Journal: International Journal of Speech Language and the Law
Subject Areas: Linguistics
DOI: 10.1558/ijsll.41058
Abstract:
The relative contributions of static and dynamic formant representations to speaker-specificity were investigated in conversational speech and in two vowels varying in inherent spectral change. Using polynomial fits, the contribution of the dynamic formant coefficients to speaker-specificity, relative to that of the formant intercept, was investigated in the diphthongal vowel [ei] taken from English and Dutch conversational speech. The [ei] tokens were sampled from various linguistic contexts and analysed in a likelihood-ratio (LR) approach. Results show that formant dynamics contain speaker-specific information in conversational speech, even though the high contextual variation seems to reduce their effect relative to that reported in earlier work. Because vowels differ in inherent dynamicity, the added value of dynamic formant information was also compared between vowels differing in inherent spectral change: using Dutch data, the contribution of formant dynamics to speaker-specificity was compared between [ei] and [aː] tokens produced by the same speakers. Formant dynamics in conversational speech contributed to speaker-specificity only in the diphthong [ei], not in the monophthong [aː].
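As an illustration of the static versus dynamic distinction described in the abstract, the following is a minimal sketch of a polynomial fit to a single formant track. It is not the author's implementation: the function name `fit_formant_track`, the polynomial degree, the normalisation of time to [0, 1], and the sample F2 values are all assumptions made for this example.

```python
import numpy as np

def fit_formant_track(freqs_hz, degree=2):
    """Fit a polynomial to one formant track sampled at equal time steps.

    Time is normalised to [0, 1] (an assumption for this sketch), so the
    intercept is the fitted frequency at vowel onset.
    Returns (intercept, dynamic_coeffs), with dynamic_coeffs ordered
    linear, quadratic, ...
    """
    freqs = np.asarray(freqs_hz, dtype=float)
    t = np.linspace(0.0, 1.0, freqs.size)
    # np.polyfit returns coefficients from the highest degree down to the constant.
    coeffs = np.polyfit(t, freqs, degree)
    intercept = coeffs[-1]
    dynamic_coeffs = coeffs[:-1][::-1]
    return intercept, dynamic_coeffs

# Hypothetical F2 measurements (Hz) across one [ei] token; not data from the study.
f2_track = [1800, 1850, 1910, 1980, 2040, 2100, 2150, 2190, 2220]
intercept, dynamic_coeffs = fit_formant_track(f2_track)

# A "static" representation keeps only the intercept; a "dynamic"
# representation adds the higher-order coefficients.
static_features = np.array([intercept])
dynamic_features = np.concatenate(([intercept], dynamic_coeffs))
```

Under these assumptions, comparing a speaker-discrimination system trained on `static_features` with one trained on `dynamic_features` would correspond to the comparison of intercept-only and intercept-plus-dynamics representations that the abstract describes.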
Author: Willemijn Heeren
References:
Adank, P., van Hout, R. and Van de Velde, H. (2007) An acoustic description of northern and southern standard Dutch II: Regional varieties. The Journal of the Acoustical Society of America 121(2): 1130–1141.
Boersma, P. and Weenink, D. (2018) Praat. Doing phonetics by computer [Computer program]. Version 6.0.42.
Byrd, D. (1994) Relations of sex and dialect to reduction. Speech Communication 15(1-2): 39–54.
Fejlová, D., Lukeš, D. and Skarnitzl, R. (2013) Formant contours in Czech vowels: Speaker-discriminating potential. Proceedings of Interspeech 2013 3182–3186, 25–29 August 2013, Lyon, France.
Gold, E. A. (2014) Calculating likelihood ratios for forensic speaker comparisons using phonetic and linguistic parameters. PhD dissertation, University of York, UK.
Gussenhoven, C. (1999) Dutch. In International Phonetic Association, and International Phonetic Association Staff (ed.) Handbook of the International Phonetic Association. A guide to the use of the International Phonetic Alphabet 74–77, Cambridge: Cambridge University Press.
Author (2018) Title.
Hughes, V., Wood, S. and Foulkes, P. (2016) Strength of forensic voice comparison evidence from the acoustics of filled pauses. The International Journal of Speech, Language and the Law 23(1): 99–132.
Hughes, V. (2017) Sample size and the multivariate kernel density likelihood ratio: How many speakers are enough? Speech Communication 94: 15–29.
Ingram, J. C. L., Prandolini, R. and Ong, S. (1996) Formant trajectories as indices of phonetic variation for speaker identification. Forensic Linguistics 3(1): 129–145.
Johnson, K., Ladefoged, P. and Lindau, M. (1993) Individual differences in vowel production. The Journal of the Acoustical Society of America 94(2): 701–714.
Jones, D. (1957) An outline of English phonetics. Cambridge: W. Heffer and Sons Ltd.
Künzel, H. J. (2001) Beware of the ‘telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Linguistics 8(1): 80–99.
Aitken, C. G. G. and Lucy, D. (2004) Evaluation of trace evidence in the form of multivariate data. Applied Statistics 53: 109–122.
McDougall, K. (2004) Speaker-specific formant dynamics: an experiment on Australian English /aɪ/. Speech, Language and the Law 11(1): 103–130.
McDougall, K. (2006) Dynamic features of speech and the characterization of speakers: towards a new approach using formant frequencies. International Journal of Speech, Language and the Law 13(1): 89–126.
McDougall, K. and Nolan, F. (2007) Discrimination of speakers using the formant dynamics of /u:/ in British English. In J. Trouvain and W. Barry (eds) Proceedings of the 16th International Congress of Phonetic Sciences 1825–1828, 6–10 August 2007, Saarbrücken, Germany.
Moos, A. (2010) Long-term formant distributions as a measure of speaker characteristics in read and spontaneous speech. The Phonetician 101: 7–24.
Morrison, G. S. and Nearey, T. M. (2007) Testing theories of vowel inherent spectral change. Journal of the Acoustical Society of America 122: EL15–22.
Morrison, G. S. (2007) Matlab implementation of Aitken & Lucy’s (2004) forensic likelihood-ratio software using multivariate-kernel-density estimation, Downloaded from https://geoff-morrison.net/#MVKD, last visited on 28-11-2019.
Morrison, G. S. (2008) Forensic voice comparison using likelihood ratios based on polynomial curves fitted to the formant trajectories of Australian English /aɪ/. The International Journal of Speech, Language and the Law 15(2): 249–266.
Morrison, G. S. (2009a) Likelihood-ratio forensic voice comparison using parametric representations of the formant trajectories of diphthongs. Journal of the Acoustical Society of America 125(4): 2387–2397.
Morrison, G. S. (2009b) train_llr_fusion_robust.m, Downloaded from https://geoff-morrison.net/#TrainFus, last visited on 28-11-2019.
Morrison, G. S., Zhang, C. and Rose, P. (2011) An empirical estimate of the precision of likelihood ratios from a forensic-voice-comparison system. Forensic Science International 208: 59–65.
Nearey, T. M. and Assmann, P. F. (1986) Modeling the role of inherent spectral change in vowel identification. Journal of the Acoustical Society of America 80: 1297–1308.
Nolan, F. and Grigoras, C. (2005) A case for formant analysis in forensic speaker identification. Speech, Language and the Law 12(2): 143–173.
Nolan, F., McDougall, K., de Jong, G. and Hudson, T. (2009) The DyViS database: style-controlled recordings of 100 homogeneous speakers for forensic phonetic research. The International Journal of Speech, Language and the Law 16(1): 31–57.
Oostdijk, N. H. J. (2000) Het Corpus Gesproken Nederlands [The Spoken Dutch corpus]. Nederlandse Taalkunde 5: 280–284.
Peterson, G. E. and Barney, H. L. (1952) Control methods used in a study of the vowels. The Journal of the Acoustical Society of America 24(2): 175–184.
Roach, P. (2004) British English. Received pronunciation. Journal of the International Phonetic Association 34(2): 239–245.
Rose, P. (1999) Long- and short-term within-speaker differences in the formants of Australian hello. Journal of the International Phonetic Association 29(1): 1–31.
Rose, P. (2015) Forensic voice comparison with monophthongal formant trajectories – a likelihood ratio-based discrimination of “schwa” vowel acoustics in a close social group of young Australian females. Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing 4819–4823.
Schindler, C. and Draxler, C. (2013) Using spectral moments as a speaker specific feature in nasals and fricatives. Proceedings of Interspeech 2793–2796, Lyon, France, 25–29 August 2013.
Thaitechawat, S. and Foulkes, P. (2011) Discrimination of speakers using tone and formant dynamics in Thai. Proceedings of ICPhS XVII 1978–1981, Hong Kong, 17–21 August 2011.
Van den Heuvel, H. (1996) Speaker variability in acoustic properties of Dutch phoneme realisations. PhD dissertation, Radboud University Nijmegen.
Van de Velde, H. (1996) Variatie en verandering in het gesproken Standaard-Nederlands. Nijmegen: Katholieke Universiteit Nijmegen.
Van Leeuwen, D. A. (2008) SRE-tools, a software package for calculating performance metrics for NIST speaker recognition evaluations. Downloaded from http://sretools.googlepages.com/, last visited on 02-03-2020.
Voeten, C. C. (submitted) The adoption of sound change. Synchronic and diachronic processing of regional variation in Dutch. PhD dissertation, Leiden University.
Weirich, M. and Simpson, A. P. (2018) Individual differences in acoustic and articulatory undershoot in a German diphthong – Variation between male and female speakers. Journal of Phonetics 71: 35–50.