Ahadi S. M. and Woodland P. C. (1995) : "Rapid
speaker adaptation using model prediction," Proc. ICASSP 1995,
pp. 684-687. |
Aust H., Oerder M., Seide F., and Steinbiss V. (1994) : "Experience
with the Philips automatic train timetable information system," Proc.
of IEEE Workshop on Interactive Voice Technology for Telecommunications
Applications (IVTTA94), pp. 67-72. |
Bertenstam J., Blomberg M., Carlson R., Elenius K., Granström
B., Gustafson J., Hunnicutt S., Högberg J., Lindell R., Neovius L.,
de Serpa-Leitao A., and Ström N. (1995a) : "The Waxholm application
database," Proc. EUROSPEECH '95, Madrid, pp. 833-836. |
Bertenstam J., Blomberg M., Carlson R., Elenius K., Granström
B., Gustafson J., Hunnicutt S., Högberg J., Lindell R., Neovius L.,
de Serpa-Leitao A., Nord L., and Ström N. (1995b) : "Spoken dialogue
data collection in the Waxholm project," STL-QPSR, KTH, 1/1995,
pp. 50-73. |
Blomberg M., Carlson R., Elenius K., Granström B., Gustafson K.,
Hunnicutt S., Lindell R., and Neovius L. (1993) : "An experimental dialogue
system: Waxholm," Proc. EUROSPEECH '93, pp. 1867-1870. |
Bourlard H. (1995) : "Towards increasing speech recognition error rates,"
Proc. EUROSPEECH '95, pp. 883-894. |
Bourlard H. and Morgan N. (1993) : "Continuous speech recognition by
connectionist statistical methods," IEEE Trans. on Neural Networks,
Vol. 4(6), pp. 893-909. |
Bourlard H. and Wellekens C. J. (1988) : "Links between Markov Models
and Multilayer Perceptrons," IEEE Trans. on PAMI, 12(12), pp. 1167-1178. |
Brown P. F., Lee C-H, and Spohrer J. C. (1983) : "Bayesian adaptation
in speech recognition," Proc. ICASSP 1983, pp. 761-764. |
Carlson R. and Glass J. (1992a) : "Vowel classification based on analysis-by-synthesis,"
STL-QPSR 4/1992, pp. 17-27, Dept. of Speech Communication and Music
Acoustics, KTH, Sweden. |
Carlson R. and Glass J. (1992b) : "Vowel classification based on analysis-by-synthesis,"
Proc. ICSLP 1992, pp. 575-578. |
Carlson R. and Hunnicutt S. (1996) : "Generic and domain-specific aspects
of the Waxholm NLP and dialog modules," Proc. ICSLP 1996, pp. 677-680. |
Chang J. and Glass J. (1997) : "Segmentation and modeling in segment-based
recognition," Proc. EUROSPEECH 1997, pp. 1199-1202. |
Cohen J. (1996) : "The summers of our discontent," Proc. ICSLP 1996,
distributed on the CD-ROM version. |
Dalsgaard P. and Baekgaard A. (1994) : "Spoken language dialogue systems,"
Proc. of Artificial Intelligence, Infix. Presented at the CRIM/FORWISS
workshop on Progress and Prospects of Speech Research and Technology,
Munich. |
Digalakis V. V., Ostendorf M., and Rohlicek J. R. (1992) : "Fast algorithms
for phone classification and recognition using segment-based models," IEEE
Trans. on Signal Processing, Vol. 40, pp. 2885-2896. |
Elenius K. and Takacs G. (1990) : "Acoustic-phonetic recognition of
continuous speech by artificial neural networks," STL-QPSR 2-3/1990,
pp. 1-44, KTH, Dept. of Speech Communication and Music Acoustics, Sweden. |
English T. M. and Boggess L. C. (1992) : "Back-propagation training
of a neural network for word spotting," Proc. ICASSP '92, Vol. 2,
pp. 357-360. |
Fant G., Liljencrants J., and Lin Q. (1985) : "A four-parameter model
of glottal flow," STL-QPSR 4/85, pp. 1-13, KTH, Dept. of Speech,
Music and Hearing, Sweden. |
Gauvain J. L. and Lee C. H. (1994) : "Maximum a posteriori estimation
for multivariate Gaussian observations of Markov chains," IEEE Trans.
Speech and Audio Processing, Vol. 2(2), pp. 806-814. |
Gish H. (1990) : "A probabilistic approach to the understanding and
training of neural network classifiers," Proc. ICASSP '90, pp. 1361-1364. |
Glass J., Chang J., and McCandless M. (1996) : "A probabilistic framework
for feature-based speech recognition," Proc. ICSLP '96, pp. 2277-2280. |
Glass J., Flammia G., Goodine D., Phillips M., Polifroni J., Sakai
S., Seneff S., and Zue V. (1995) : "Multilingual spoken language understanding
in the MIT voyager system," Speech Communication 17/1-2,
pp. 1-18. |
Hazen T. J. and Glass J. R. (1997) : "A comparison of novel techniques
for instantaneous speaker adaptation," Proc. EUROSPEECH 1997, pp.
2047-2050. |
Hetherington L. and McCandless M. (1996) : "SAPPHIRE: An extensible
speech analysis and recognition tool based on Tcl/Tk," Proc. ICSLP '96,
pp. 1942-1945. |
Hetherington L., Phillips M., Glass J., and Zue V. (1993) : "A* word
network search for continuous speech recognition," Proc. ICASSP '93,
pp. 1533-1536. |
Hopcroft J. and Ullman J. (1979) : Introduction to automata theory,
languages and computation, Addison-Wesley, ISBN 0-201-02988-X. |
Huang X. D. and Lee K. F. (1991) : "On speaker-independent, speaker-dependent
and speaker-adaptive speech recognition," Proc. ICASSP 1991, pp.
877-880. |
Kershaw D. J., Hochberg M. M., and Robinson A. J. (1996) : "Context-dependent
classes in a hybrid recurrent network-HMM speech recognition system," In:
Advances in Neural Information Processing Systems, Vol. 8,
eds: Touretzky D. S., Mozer M. C., and Hasselmo M. E., Morgan Kaufmann. |
Ladefoged P. and Broadbent D. E. (1957) : "Information conveyed by
vowels," JASA 29(1), pp. 99-104. |
Le Cun Y., Denker J. S., and Solla S. A. (1990) : "Optimal brain damage,"
In: Advances in Neural Information Processing Systems Vol. II,
ed: Touretzky D. S., pp. 589-605, San Mateo, California IEEE, Morgan Kaufmann. |
Leggetter C. J. and Woodland P. C. (1994) : "Speaker adaptation of
continuous density HMMs using multivariate linear regression," Proc.
ICSLP 1994, pp. 451-454. |
Levin E. (1990) : "Word recognition using hidden control neural architecture,"
Proc. ICASSP '90, Vol. 1, pp. 433-436. |
Li K. P., Naylor J. A., and Rossen M. L. (1992) : "A whole word recurrent
neural network for keyword spotting," Proc. ICASSP '92, Vol. 2,
pp. 81-84. |
Mitchell C. D., Harper M. P., and Jamieson L. H. (1996) : "Stochastic
observation hidden Markov models," Proc. ICASSP '96, pp. 617-620. |
Necioglu B. F., Ostendorf M., and Rohlicek J. R. (1992) : "A Bayesian
approach to speaker adaptation for the stochastic segment model," Proc.
ICASSP 1992, pp. I-437 - I-440. |
Ney H. and Aubert X. (1994) : "A word graph algorithm for large vocabulary,
continuous speech recognition," Proc. ICSLP '94, pp. 1355-1358. |
Peckham J. (1993) : "A new generation of spoken dialog systems: results
and lessons from the SUNDIAL Project," Proc. EUROSPEECH '93, pp.
33-40. |
Richard M. D. and Lippmann R. P. (1991) : "Neural network classifiers
estimate Bayesian a posteriori probabilities," Neural Computation,
Vol. 3, pp. 461-483. |
Robinson A. J. (1994) : "An application of recurrent nets to
phone probability estimation," IEEE Trans. on Neural Networks, Vol. 5(2),
pp. 298-305. |
Robinson T. and Fallside F. (1991) : "A recurrent error propagation
network speech recognition system," Computer Speech & Language
5:3, pp. 259-274. |
Schiel F. (1993) : "A new approach to speaker adaptation by modelling
pronunciation in automatic speech recognition," Speech Communication
Vol. 13, pp. 281-286. |
Sietsma J. and Dow R. J. F. (1991) : "Creating artificial neural networks
that generalize," Neural Networks, 4(1) pp. 67-69. |
Sjölander K. and Gustafson J. (1997) : "An integrated system for
teaching spoken dialogue systems technology," Proc. EUROSPEECH '97,
pp. 1927-1930. |
Soong and Huang (1991) : "A tree-trellis based fast search for finding
the N best sentence hypotheses in continuous speech recognition," Proc.
ICASSP '91, pp. 713-716. |
Strange W. (1989) : "Evolving theories of vowel perception," JASA
85(5), pp. 2081-2087. |
Ström N. (1992) : "Development of a recurrent time-delay
neural net speech recognition system," STL-QPSR 2-3/1992, pp. 1-44,
KTH, Dept. of Speech Communication and Music Acoustics, Sweden. |
Ström N. (1994a) : "Optimising the lexical representation to improve
A* lexical search," STL-QPSR 2-3/1994, pp. 113-124. |
Ström N. (1994b) : "Experiments with a new algorithm for fast speaker
adaptation," Proc. ICSLP 1994, pp. 459-462. |
Ström N. (1995) : "Generation and minimisation of word graphs in
continuous speech recognition," Proc. Workshop on Automatic Speech Recognition,
pp. 125-126, Snowbird, Utah. |
Ström N. (1997) : "A tonotopic artificial
neural network architecture for phoneme probability estimation," To
appear in Proc. of the 1997 IEEE Workshop on Speech Recognition and Understanding,
Santa Barbara, CA. |
Sutton S., de Villiers J., Schalkwyk J., Fanty M., Novick D. and Cole
R. (1996) : "Technical specification of the CSLU toolkit," Tech. Report
No. CSLU-013096, CSLU, Dept. of Computer Science and Engineering, Oregon
Graduate Institute of Science and Technology, Portland, OR. |
Tebelskis J. and Waibel A. (1990) : "Large vocabulary recognition using
linked predictive neural networks," Proc. ICASSP '90, Vol. 1,
pp. 437-440. |
Verbrugge R. R. and Strange W. (1976) : "What information enables a
listener to map a talker's vowel space," JASA 60(1), pp.
198-212. |
Viterbi A. J. (1967) : "Error bounds for convolutional codes and an asymptotically
optimum decoding algorithm," IEEE Trans. Information Theory, Vol.
IT-13, pp. 260-269. |
Waibel A., Hanazawa T., Hinton G., Shikano K. and Lang K. (1987) :
"Phoneme recognition using time-delay neural networks," ATR Technical
Report TR-006, ATR, Japan. |