Ahadi S. M. and Woodland P. C. (1995) : "Rapid speaker adaptation using model prediction," Proc. ICASSP 1995, pp. 684-687. |
Aust H., Oerder M., Seide F., and Steinbiss V. (1994) : "Experience with the Philips automatic train timetable information system," Proc. of IEEE Workshop on Interactive Voice Technology for Telecommunications Applications (IVTTA94), pp. 67-72. |
Bertenstam J., Blomberg M., Carlson R., Elenius K., Granström B., Gustafson J., Hunnicutt S., Högberg J., Lindell R., Neovius L., de Serpa-Leitao A., and Ström N. (1995a) : "The Waxholm application database," Proc. EUROSPEECH '95, Madrid, pp. 833-836. |
Bertenstam J., Blomberg M., Carlson R., Elenius K., Granström B., Gustafson J., Hunnicutt S., Högberg J., Lindell R., Neovius L., de Serpa-Leitao A., Nord L., and Ström N. (1995b) : "Spoken dialogue data collection in the Waxholm project," STL-QPSR, KTH, 1/1995, pp. 50-73. |
Blomberg M., Carlson R., Elenius K., Granström B., Gustafson J., Hunnicutt S., Lindell R., and Neovius L. (1993) : "An experimental dialogue system: Waxholm," Proc. EUROSPEECH '93, pp. 1867-1870. |
Bourlard H. (1995) : "Towards increasing speech recognition error rates," Proc. EUROSPEECH '95, pp. 883-894. |
Bourlard H. and Morgan N. (1993) : "Continuous speech recognition by connectionist statistical methods," IEEE Trans. on Neural Networks, Vol. 4(6), pp. 893-909. |
Bourlard H. and Wellekens C. J. (1988) : "Links between Markov models and multilayer perceptrons," IEEE Trans. on PAMI, 12(12), pp. 1167-1178.
Brown P. F., Lee C-H, and Spohrer J. C. (1983) : "Bayesian adaptation in speech recognition," Proc. ICASSP 1983, pp. 761-764. |
Carlson R. and Glass J. (1992a) : "Vowel classification based on analysis-by-synthesis," STL-QPSR 4/1992, pp. 17-27, Dept. of Speech Communication and Music Acoustics, KTH, Sweden. |
Carlson R. and Glass J. (1992b) : "Vowel classification based on analysis-by-synthesis," Proc. ICSLP 1992, pp. 575-578. |
Carlson R. and Hunnicutt S. (1996) : "Generic and domain-specific aspects of the Waxholm NLP and dialog modules," Proc. ICSLP 1996, pp. 677-680. |
Chang J. and Glass J. (1997) : "Segmentation and modeling in segment-based recognition," Proc. EUROSPEECH 1997, pp. 1199-1202. |
Cohen J. (1996): "The summers of our discontent," Proc. ICSLP 1996, distributed on CDROM version. |
Dalsgaard P. and Baekgaard A. (1994) : "Spoken language dialogue systems," Proc. of Artificial Intelligence, Infix. Presented at the CRIM/FORWISS workshop on Progress and Prospects of Speech Research and Technology, Munich. |
Digalakis V. V., Ostendorf M., and Rohlicek J. R. (1992) : "Fast algorithms for phone classification and recognition using segment-based models," IEEE Trans. on Signal Processing, Vol. 40, pp. 2885-2896. |
Elenius K. and Takacs G. (1990) : "Acoustic-phonetic recognition of continuous speech by artificial neural networks," STL-QPSR 2-3/1990, pp. 1-44, KTH, Dept. of Speech Communication and Music Acoustics, Sweden. |
English T. M. and Boggess L. C. (1992) : "Back-propagation training of a neural network for word spotting," Proc. ICASSP '92, Vol. 2, pp. 357-360. |
Fant G., Liljencrants J., and Lin Q. (1985) : "A four-parameter model of glottal flow," STL-QPSR 4/85, pp. 1-13, KTH, Dept. of Speech, Music and Hearing, Sweden. |
Gauvain J. L. and Lee C. H. (1994) : "Maximum a posteriori estimation for multivariate Gaussian observations of Markov chains," IEEE Trans. Speech and Audio Processing, Vol. 2(2), pp. 806-814. |
Gish H. (1990) : "A probabilistic approach to the understanding and training of neural network classifiers," Proc. ICASSP '90, pp. 1361-1364. |
Glass J., Chang J., and McCandless M. (1996) : "A probabilistic framework for feature-based speech recognition," Proc. ICSLP '96, pp. 2277-2280. |
Glass J., Flammia G., Goodine D., Phillips M., Polifroni J., Sakai S., Seneff S., and Zue V. (1995) : "Multilingual spoken language understanding in the MIT Voyager system," Speech Communication 17/1-2, pp. 1-18. |
Hazen T. J. and Glass J. R. (1997) : "A comparison of novel techniques for instantaneous speaker adaptation," Proc. EUROSPEECH 1997, pp. 2047-2050. |
Hetherington L. and McCandless M. (1996) : "SAPPHIRE: An extensible speech analysis and recognition tool based on Tcl/Tk," Proc. ICSLP '96, pp. 1942-1945. |
Hetherington L., Phillips M., Glass J., and Zue V. (1993) : "A* word network search for continuous speech recognition," Proc. ICASSP '93, pp. 1533-1536. |
Hopcroft J. and Ullman J. (1979) : Introduction to automata theory, languages and computation, Addison-Wesley, ISBN 0-201-02988-X. |
Huang X. D. and Lee K. F. (1991) : "On speaker-independent, speaker-dependent and speaker-adaptive speech recognition," Proc. ICASSP 1991, pp. 877-880. |
Kershaw D. J., Hochberg M. M., and Robinson A. J. (1996) : "Context-dependent classes in a hybrid recurrent network-HMM speech recognition system," In: Advances in Neural Information Processing Systems, Vol. 8, eds: Touretzky D. S., Mozer M. C., and Hasselmo M. E., Morgan Kaufmann. |
Ladefoged P. and Broadbent D. E. (1957) : "Information conveyed by vowels," JASA 29(1), pp. 99-104. |
Le Cun Y., Denker J. S., and Solla S. A. (1990) : "Optimal brain damage," In: Advances in Neural Information Processing Systems Vol. II, ed: Touretzky D. S., pp. 589-605, San Mateo, California IEEE, Morgan Kaufmann. |
Leggetter C. J. and Woodland P. C. (1994) : "Speaker adaptation of continuous density HMMs using multivariate linear regression," Proc. ICSLP 1994, pp. 451-454. |
Levin E. (1990) : "Word recognition using hidden control neural architecture," Proc. ICASSP '90, Vol. 1, pp. 433-436. |
Li K. P., Naylor J. A., and Rossen M. L. (1992) : "A whole word recurrent neural network for keyword spotting," Proc. ICASSP '92, Vol. 2, pp. 81-84. |
Mitchell C. D., Harper M. P., and Jamieson L. H. (1996) : "Stochastic observation hidden Markov models," Proc. ICASSP '96, pp. 617-620. |
Necioglu B. F., Ostendorf M., and Rohlicek J. R. (1992) : "A Bayesian approach to speaker adaptation for the stochastic segment model," Proc. ICASSP 1992, pp. I-437 - I-440. |
Ney H. and Aubert X. (1994) : "A word graph algorithm for large vocabulary, continuous speech recognition," Proc. ICSLP '94, pp. 1355-1358. |
Peckham J. (1993) : "A new generation of spoken dialog systems: results and lessons from the SUNDIAL Project," Proc. EUROSPEECH '93, pp. 33-40. |
Richard M. D. and Lippmann R. P. (1991) : "Neural network classifiers estimate Bayesian a posteriori probabilities," Neural Computation, Vol. 3, pp. 461-483. |
Robinson A. J. (1994) : "An application of recurrent nets to phone probability estimation," IEEE Trans. on Neural Networks, Vol. 5(2), pp. 298-305. |
Robinson T. and Fallside F. (1991) : "A recurrent error propagation network speech recognition system," Computer Speech & Language 5:3, pp. 259-274. |
Schiel F. (1993) : "A new approach to speaker adaptation by modelling pronunciation in automatic speech recognition," Speech Communication Vol. 13, pp. 281-286. |
Sietsma J. and Dow R. J. F. (1991) : "Creating artificial neural networks that generalize," Neural Networks, 4(1), pp. 67-69. |
Sjölander K. and Gustafson J. (1997) : "An integrated system for teaching spoken dialogue systems technology," Proc. EUROSPEECH '97, pp. 1927-1930. |
Soong F. K. and Huang E. F. (1991) : "A tree-trellis based fast search for finding the N best sentence hypotheses in continuous speech recognition," Proc. ICASSP '91, pp. 713-716. |
Strange W. (1989): "Evolving theories of vowel perception," JASA 85(5), pp. 2081-2087. |
Ström N. (1992): "Development of a recurrent time-delay neural net speech recognition system," STL-QPSR 2-3/1992, pp. 1-44, KTH, Dept. of Speech Communication and Music Acoustics, Sweden. |
Ström N. (1994a) : "Optimising the lexical representation to improve A* lexical search," STL-QPSR 2-3/1994, pp. 113-124. |
Ström N. (1994b): "Experiments with a new algorithm for fast speaker adaptation," Proc. ICSLP 1994, pp. 459-462. |
Ström N. (1995) : "Generation and minimisation of word graphs in continuous speech recognition," Proc. Workshop on Automatic Speech Recognition, pp. 125-126, Snowbird, Utah. |
Ström N. (1997) : "A tonotopic artificial neural network architecture for phoneme probability estimation," To appear in Proc. of the 1997 IEEE Workshop on Speech Recognition and Understanding, Santa Barbara, CA. |
Sutton S., de Villiers J., Schalkwyk J., Fanty M., Novick D., and Cole R. (1996) : "Technical specification of the CSLU toolkit," Tech. Report No. CSLU-013096, CSLU, Dept. of Computer Science and Engineering, Oregon Graduate Institute of Science and Technology, Portland, OR. |
Tebelskis J. and Waibel A. (1990): "Large vocabulary recognition using linked predictive neural networks," Proc. ICASSP '90, Vol. 1, pp. 437-440. |
Verbrugge R. R. and Strange W. (1976): "What information enables a listener to map a talker's vowel space," JASA 60(1), pp. 198-212. |
Viterbi A. J. (1967) : "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Trans. Information Theory, Vol. IT-13, pp. 260-269. |
Waibel A., Hanazawa T., Hinton G., Shikano K. and Lang K. (1987) : "Phoneme recognition using time-delay neural networks," ATR Technical Report TR-006, ATR, Japan. |