Conferences related to Speech Recognition


2013 7th Conference on Speech Technology and Human-Computer Dialogue (SpeD 2013)

“SpeD 2013” will bring together academics and industry professionals from universities, government agencies, and companies to present their achievements in speech technology and related fields. The conference is an international forum reflecting some of the latest tendencies in spoken language technology and human-computer dialogue research, as well as some of the most recent applications in this area.

  • 2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD 2011)

    SpeD 2011 will bring together academics and industry professionals from universities, government agencies and companies to present their achievements and the latest tendencies in spoken language technology and human-computer dialogue research as well as some of the most recent applications in this area.

  • 2009 5th Conference on Speech Technology and Human-Computer Dialogue (SpeD 2009)

    The 5th Conference on Speech Technology and Human-Computer Dialogue (Constanta, Romania) brings together academics and industry professionals from universities, government agencies, and companies to present their achievements in speech technology and related fields. SpeD 2009 is an international forum reflecting some of the latest tendencies in spoken language technology and human-computer dialogue research, as well as some of the most recent applications in this area.


2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII)

The conference will address, but is not limited to, the following topics:

  • Computational and psychological models of emotion
  • Affect in arts, entertainment, and multimedia
  • Bodily manifestations of affect (facial expressions, posture, behavior, physiology)
  • Databases for emotion processing: development and issues
  • Affective interfaces and applications (games, learning, dialogue systems, ...)
  • Ecological and continuous emotion assessment
  • Affect in social interactions

  • 2009 3rd International Conference on Affective Computing and Intelligent Interaction (ACII 2009)

    The conference series on Affective Computing and Intelligent Interaction is the premier international forum for state-of-the-art research on affective and multimodal human-machine interaction and systems. Every other year, the ACII conference plays an important role in shaping related scientific, academic, and higher-education programs. This year, we are especially soliciting papers discussing Enabling Behavioral and Socially-Aware Human-Machine Interfaces, in areas including psychology.


2013 IEEE International Conference on Multimedia and Expo (ICME)

ICME promotes the exchange of the latest advances in multimedia technologies, systems, and applications from both the research and development perspectives of the circuits and systems, communications, computer, and signal processing communities.

  • 2012 IEEE International Conference on Multimedia and Expo (ICME)

    The IEEE International Conference on Multimedia & Expo (ICME) is the flagship multimedia conference sponsored by four IEEE societies. It promotes the exchange of the latest advances in multimedia technologies, systems, and applications from both the research and development perspectives of the circuits and systems, communications, computer, and signal processing communities.

  • 2011 IEEE International Conference on Multimedia and Expo (ICME)

    Topics include:

    • Speech, audio, image, video, and text processing
    • Signal processing for media integration
    • 3D visualization, animation, and virtual reality
    • Multi-modal multimedia computing systems and human-machine interaction
    • Multimedia communications and networking
    • Multimedia security and privacy
    • Multimedia databases and digital libraries
    • Multimedia applications and services
    • Media content analysis and search
    • Hardware and software for multimedia systems
    • Multimedia standards and related issues
    • Multimedia qu ...

  • 2010 IEEE International Conference on Multimedia and Expo (ICME)

    A flagship multimedia conference sponsored by four IEEE societies, ICME serves as a forum to promote the exchange of the latest advances in multimedia technologies, systems, and applications from both the research and development perspectives of the circuits and systems, communications, computer, and signal processing communities.

  • 2009 IEEE International Conference on Multimedia and Expo (ICME)

    IEEE International Conference on Multimedia & Expo is a major annual international conference with the objective of bringing together researchers, developers, and practitioners from academia and industry working in all areas of multimedia. ICME serves as a forum for the dissemination of state-of-the-art research, development, and implementations of multimedia systems, technologies and applications.


2013 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)

The ASRU workshop meets every two years and has a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding.


2013 International Carnahan Conference on Security Technology (ICCST)

This international conference is a forum for all aspects of physical, cyber and electronic security research, development, systems engineering, testing, evaluation, operations and sustainability. The ICCST facilitates the exchange of ideas and information.

  • 2012 IEEE International Carnahan Conference on Security Technology (ICCST)

    Research, development, and user aspects of security technology, including principles of operation, applications, and user experiences.

  • 2011 International Carnahan Conference on Security Technology (ICCST)

    This annual conference is the world's longest-running international technical symposium on security technology. It is a forum for collaboration on all aspects of physical, cyber, and electronic security research, development, systems engineering, testing, evaluation, operations, and sustainment. The ICCST facilitates the exchange of ideas and the sharing of information on both new and existing technology and systems. Conference participants are encouraged to consider the impact of their work on society. The ICCST provides a foundation for support to authorities and agencies responsible for security, safety, and law enforcement in the use of available and future technology.

  • 2010 IEEE International Carnahan Conference on Security Technology (ICCST)

    The ICCST is a forum for researchers and practitioners in both new and existing security technology, providing an interchange of knowledge through paper presentations and publication of proceedings that have been selected by the ICCST organizing committee.

  • 2009 International Carnahan Conference on Security Technology (ICCST)

    The conference is directed toward the research, development, and user aspects of electronic security technology.

  • 2008 International Carnahan Conference on Security Technology (ICCST)

    The ICCST is directed toward the research and development aspects of electronic security technology, including the operational testing of the technology. It establishes a forum for the exchange of ideas and dissemination of information on both new and existing technology. Conference participants are stimulated to consider the impact of their work on society. The Conference is an interchange of knowledge through the presentation of learned papers that have been selected by the ICCST organizing committee.

  • 2007 IEEE International Carnahan Conference on Security Technology (ICCST)

  • 2006 IEEE International Carnahan Conference on Security Technology (ICCST)



Periodicals related to Speech Recognition


Audio, Speech, and Language Processing, IEEE Transactions on

Speech analysis, synthesis, and coding; speech recognition; speaker recognition; language modeling; speech production and perception; speech enhancement. In audio: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. The scope for the transactions includes SPEECH PROCESSING: transmission and storage of speech signals; speech coding; speech enhancement and noise reduction; ...


Pattern Analysis and Machine Intelligence, IEEE Transactions on

Statistical and structural pattern recognition; image analysis; computational models of vision; computer vision systems; enhancement, restoration, segmentation, feature extraction, shape and texture analysis; applications of pattern analysis in medicine, industry, government, and the arts and sciences; artificial intelligence, knowledge representation, logical and probabilistic inference, learning, speech recognition, character and text recognition, syntactic and semantic processing, understanding natural language, expert systems, ...


Selected Areas in Communications, IEEE Journal on

All telecommunications, including telephone, telegraphy, facsimile, and point-to-point television, by electromagnetic propagation, including radio; wire; aerial, underground, coaxial, and submarine cables; waveguides, communication satellites, and lasers; in marine, aeronautical, space, and fixed station services; repeaters, radio relaying, signal storage, and regeneration; telecommunication error detection and correction; multiplexing and carrier techniques; communication switching systems; data communications; communication theory; and wireless communications.


Systems, Man and Cybernetics, Part A, IEEE Transactions on

Systems engineering, including efforts that involve issue formulation, issue analysis and modeling, and decision making and issue interpretation at any of the life-cycle phases associated with the definition, development, and implementation of large systems. It also includes efforts that relate to systems management, systems engineering processes, and a variety of systems engineering methods such as optimization, modeling, and simulation. ...




Xplore Articles related to Speech Recognition


Recognition of voiced speech from the bispectrum

Anastasios Delopoulos; Maria Rangoussi; Janne Andersen. 8th European Signal Processing Conference (EUSIPCO 1996), 1996

Recognition of voiced speech phonemes is addressed in this paper using features extracted from the bispectrum of the speech signal. Voiced speech is modeled as a superposition of coupled harmonics, located at frequencies that are multiples of the pitch and modulated by the vocal tract. For this type of signal, nonzero bispectral values are shown to be guaranteed by the ...
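The bispectral property this abstract relies on can be illustrated with a toy estimate (a minimal sketch with invented signal parameters, not the authors' feature extractor): for phase-coupled harmonics, the direct bispectrum estimate B(f1, f2) = E[X(f1) X(f2) X*(f1+f2)] is large at bifrequencies on the pitch-harmonic grid and near zero elsewhere.

```python
import numpy as np

def bispectrum_estimate(x, n_seg, f1, f2):
    """Direct-method bispectrum: average X(f1) X(f2) conj(X(f1+f2)) over segments."""
    segs = x[: len(x) // n_seg * n_seg].reshape(n_seg, -1)
    X = np.fft.rfft(segs, axis=1)
    return np.mean(X[:, f1] * X[:, f2] * np.conj(X[:, f1 + f2]))

rng = np.random.default_rng(0)
n, seg = 64, 200              # 200 segments of 64 samples
t = np.arange(n * seg)
f0 = 8                        # "pitch" bin within a 64-sample segment
phase = rng.uniform(0, 2 * np.pi)
# phase-coupled harmonics at f0, 2*f0, 3*f0 (phases add up) plus noise
x = (np.cos(2 * np.pi * f0 / n * t + phase)
     + np.cos(2 * np.pi * 2 * f0 / n * t + 2 * phase)
     + np.cos(2 * np.pi * 3 * f0 / n * t + 3 * phase)
     + 0.1 * rng.standard_normal(n * seg))

coupled = abs(bispectrum_estimate(x, seg, f0, f0))        # (f0, f0): 2*f0 is present
uncoupled = abs(bispectrum_estimate(x, seg, f0 + 3, f0))  # off the harmonic grid
print(coupled > 10 * uncoupled)
```

The coupled-bifrequency value stands far above the off-grid one, which is the kind of pitch-grid feature the paper extracts for phoneme recognition.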


Improving vocabulary independent HMM decoding results by using the dynamically expanding context

M. Kurimo. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1998

A method is presented to correct phoneme strings produced by a vocabulary independent speech recognizer. The method first extracts the N best matching result strings using mixture density hidden Markov models (HMMs) trained by neural networks. Then the strings are corrected by the rules generated automatically by the dynamically expanding context (DEC). Finally, the corrected string candidates and the extra ...
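The HMM decoding step this abstract builds on can be sketched with a minimal Viterbi pass over a toy discrete-observation model (an illustration only; the states, symbols, and probabilities below are invented, and Kurimo's system uses mixture-density HMMs trained by neural networks plus N-best extraction):

```python
import numpy as np

def viterbi(log_A, log_B, log_pi, obs):
    """Most likely state path for an observation sequence under an HMM.
    log_A: (S, S) log transition probs; log_B: (S, O) log emission probs."""
    S, T = log_A.shape[0], len(obs)
    delta = np.empty((T, S))           # best log-prob ending in each state
    psi = np.zeros((T, S), dtype=int)  # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: prev i -> cur j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# toy 2-phoneme model: state 0 ~ "a", state 1 ~ "k"; 2 discrete acoustic symbols
A = np.log(np.array([[0.8, 0.2], [0.3, 0.7]]))
B = np.log(np.array([[0.9, 0.1], [0.2, 0.8]]))
pi = np.log(np.array([0.5, 0.5]))
print(viterbi(A, B, pi, [0, 0, 1, 1, 1]))   # → [0, 0, 1, 1, 1]
```

A real recognizer decodes over thousands of context-dependent states and keeps the N best paths rather than only the single best one.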


ANGIE: a new framework for speech analysis based on morpho-phonological modelling

S. Seneff; R. Lau; H. Meng. Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), 1996

    This paper describes a new system for speech analysis, ANGIE, which characterizes word substructure in terms of a trainable grammar. ANGIE captures morpho-phonemic and phonological phenomena through a hierarchical framework. The terminal categories can be alternately letters or phone units, yielding a reversible letter-to-sound/sound-to-letter system. In conjunction with a segment network and acoustic phone models, the system can produce phonemic-to- ...


On existence of optimal boundary value between early reflections and late reverberation

Arkadiy Prodeus; Olga Ladoshko. 2014 IEEE 34th International Conference on Electronics and Nanotechnology (ELNANO), 2014

    Enhancement of speech distorted by reverberation is a topical problem that has been actively studied in the last decade. However, it is still extremely difficult to find clear recommendations on the choice of the boundary value between early reflections and late reverberation that is optimal in the sense of criteria such as speech recognition accuracy and speech quality. Another problem is getting of ...


Third-order cumulant-based wiener filtering algorithm applied to robust speech recognition

Josep M. Salavedra; Javier Hernando. 8th European Signal Processing Conference (EUSIPCO 1996), 1996

    In previous works [5], [6], we studied speech enhancement algorithms based on the iterative Wiener filtering method of Lim and Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. In our algorithms, however, we consider an AR estimation by means of cumulant analysis. This work extends some preceding papers due to the ...
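The Wiener filtering idea this line of work builds on can be sketched, under simplifying assumptions, as a single frequency-domain pass with a known noise spectrum (the paper instead iterates an AR estimate and uses third-order cumulants; this toy version only shows the basic gain rule G = max(P_x - P_n, 0) / P_x, with invented signal parameters):

```python
import numpy as np

def wiener_denoise(noisy, noise_psd):
    """One frequency-domain Wiener pass: attenuate bins where noise dominates."""
    X = np.fft.rfft(noisy)
    p_x = np.abs(X) ** 2
    gain = np.maximum(p_x - noise_psd, 0.0) / np.maximum(p_x, 1e-12)
    return np.fft.irfft(gain * X, n=len(noisy))

rng = np.random.default_rng(1)
n = 1024
t = np.arange(n)
clean = np.sin(2 * np.pi * 50 * t / n)      # stand-in for a voiced speech frame
noise = 0.5 * rng.standard_normal(n)
noisy = clean + noise
# assume the noise PSD is known; in practice it is estimated in speech pauses
noise_psd = np.abs(np.fft.rfft(noise)) ** 2
denoised = wiener_denoise(noisy, noise_psd)
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
print(err_after < err_before)
```

The iterative variants re-estimate the speech spectrum from the current denoised output and repeat the pass, which is where the AR (and, in this paper, cumulant-based) spectral model enters.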



Educational Resources on Speech Recognition


IEEE-USA E-Books

  • Conclusion

    Neural Networks for Pattern Recognition takes the pioneering work in artificial neural networks by Stephen Grossberg and his colleagues to a new level. In a simple and accessible way it extends embedding field theory into areas of machine intelligence that have not been clearly dealt with before. Following a tutorial of existing neural networks for pattern classification, Nigrin expands on these networks to present fundamentally new architectures that perform real-time pattern classification of embedded and synonymous patterns and that will aid in tasks such as vision, speech recognition, sensor fusion, and constraint satisfaction. Nigrin presents the new architectures in two stages. First he presents a network called Sonnet 1 that already achieves important properties such as the ability to learn and segment continuously varied input patterns in real time, to process patterns in a context-sensitive fashion, and to learn new patterns without degrading existing categories. He then removes simplifications inherent in Sonnet 1 and introduces radically new architectures. These architectures have the power to classify patterns that may have similar meanings but that have different external appearances (synonyms). They have also been designed to represent patterns in a distributed fashion, both in short-term and long-term memory. Albert Nigrin is Assistant Professor in the Department of Computer Science and Information Systems at American University.

  • Becoming Real

    This chapter contains sections titled: Automatic Speech Recognition Takes Off, Making Things Real, Speaking to Your Personal Computer, The Wildfire Effect, Why Don't You Just Tell Me?, Controlling Errors, Corona and Altech, Nuance and SpeechWorks, The Art of Speech Design, Inviting Computers to Dinner, Products and Solutions, How May I Help You?

  • Glossary


  • Neural map applications

    For several years it has been known that a self-organizing array of neural elements with modifiable synaptic weights is capable of forming a topologically ordered map of data to which it has been exposed. Neural maps are useful when the perceptual relationship between stimuli is reflected in the pattern space from which they are drawn. When patterns with this type of natural structure are applied to a neural map, they cause neural firings whose spatial positions correspond to the perceptual relationship between the patterns. For example, if the neural array is exposed to spectral vectors which represent different speech sounds, particularly the vowel sounds, different neural elements in the array become attuned to different types of sound. The relative positions of neural elements which are excited by different speech sounds reflect the phonetic relationship between the same set of sounds. This property, originally discovered by Kohonen, has been exploited to make speech recognition systems using neural maps and is described in more detail later in the chapter. In this chapter we explore ways in which pattern recognition and feature extraction may be obtained using neural mapping techniques. In particular, the difference between supervised and unsupervised neural learning systems is considered, and it is shown that important pattern processing properties are obtainable using unsupervised systems.

  • Bibliography


  • Epilogue: Siri ... What's the Meaning of Life?

    Stanley Kubrick's 1968 film 2001: A Space Odyssey famously featured HAL, a computer with the ability to hold lengthy conversations with his fellow space travelers. More than forty years later, we have advanced computer technology that Kubrick never imagined, but we do not have computers that talk and understand speech as HAL did. Is it a failure of our technology that we have not gotten much further than an automated voice that tells us to "say or press 1"? Or is there something fundamental in human language and speech that we do not yet understand deeply enough to be able to replicate in a computer? In The Voice in the Machine, Roberto Pieraccini examines six decades of work in science and technology to develop computers that can interact with humans using speech, and the industry that has arisen around the quest for these technologies. He shows that although the computers today that understand speech may not have HAL's capacity for conversation, they have capabilities that make them usable in many applications today and are on a fast track of improvement and innovation. Pieraccini describes the evolution of speech recognition and speech understanding processes from waveform methods to artificial intelligence approaches to statistical learning and modeling of human speech based on a rigorous mathematical model, specifically hidden Markov models (HMMs). He details the development of dialog systems, the ability to produce speech, and the process of bringing talking machines to the market. Finally, he asks a question that only the future can answer: will we end up with HAL-like computers or something completely unexpected?

  • No title

    This book is about HCI research in an industrial research setting. It is based on the experiences of two researchers at the IBM T. J. Watson Research Center. Over the last two decades, Drs. John and Clare-Marie Karat have conducted HCI research to create innovative usable technology for users across a variety of domains. We begin the book by introducing the reader to the context of industrial research as well as a set of common themes or guidelines to consider in conducting HCI research in practice. Then case study examples of HCI approaches to the design and evaluation of usable solutions for people are presented and discussed in three domain areas: conversational speech technologies, personalization in eCommerce, and security and privacy policy management technologies. In each of the case studies, the authors illustrate and discuss examples of HCI approaches to design and evaluation that worked well and those that did not. They discuss what was learned over time about different HCI methods in practice, and changes that were made to the HCI tools used over time. The Karats discuss trade-offs and issues related to time, resources, and money, and the value derived from different HCI methods in practice. These decisions are ones that need to be made regularly in the industrial sector. Similarities and differences with the types of decisions made in this regard in academia are discussed. The authors then use the context of the three case studies in the three research domains to draw insights and conclusions about the themes that were introduced in the beginning of the book. The Karats conclude with their perspective about the future of HCI industrial research.
    Table of Contents: Introduction: Themes and Structure of the Book / Case Study 1: Conversational Speech Technologies: Automatic Speech Recognition (ASR) / Case Study 2: Personalization in eCommerce / Case Study 3: Security and Privacy Policy Management Technologies / Insights and Conclusions / The Future of Industrial HCI Research

  • Index

    Finite-state devices, which include finite-state automata, graphs, and finite-state transducers, are in wide use in many areas of computer science. Recently, there has been a resurgence of the use of finite-state devices in all aspects of computational linguistics, including dictionary encoding, text processing, and speech processing. This book describes the fundamental properties of finite-state devices and illustrates their uses. Many of the contributors pioneered the use of finite automata for different aspects of natural language processing. The topics, which range from the theoretical to the applied, include finite-state morphology, approximation of phrase-structure grammars, deterministic part-of-speech tagging, application of a finite-state intersection grammar, a finite-state transducer for extracting information from text, and speech recognition using weighted finite automata. The introduction presents the basic theoretical results in finite-state automata and transducers. These results and algorithms are described and illustrated with simple formal language examples as well as natural language examples. Contributors: Douglas Appelt, John Bear, David Clemenceau, Maurice Gross, Jerry R. Hobbs, David Israel, Megumi Kameyama, Lauri Karttunen, Kimmo Koskenniemi, Mehryar Mohri, Eric Laporte, Fernando C. N. Pereira, Michael D. Riley, Emmanuel Roche, Yves Schabes, Max D. Silberztein, Mark Stickel, Pasi Tapanainen, Mabry Tyson, Atro Voutilainen, Rebecca N. Wright. Language, Speech, and Communication series

  • The Speech Recognition Problem

    This chapter contains sections titled: Introduction, The “Dimensions of Difficulty”, Related Problems and Approaches, Conclusions, Problems

  • Index

    This collection of essays by 12 members of the MIT staff provides an inside report on the scope and expectations of current research in one of the world's major AI centers. The chapters on artificial intelligence, expert systems, vision, robotics, and natural language provide both a broad overview of current areas of activity and an assessment of the field at a time of great public interest and rapid technological progress. Contents: Artificial Intelligence (Patrick H. Winston and Karen Prendergast). Knowledge-Based Systems (Randall Davis). Expert-System Tools and Techniques (Peter Szolovits). Medical Diagnosis: Evolution of Systems Building Expertise (Ramesh S. Patil). Artificial Intelligence and Software Engineering (Charles Rich and Richard C. Waters). Intelligent Natural Language Processing (Robert C. Berwick). Automatic Speech Recognition and Understanding (Victor W. Zue). Robot Programming and Artificial Intelligence (Tomas Lozano-Perez). Robot Hands and Tactile Sensing (John M. Hollerbach). Intelligent Vision (Michael Brady). Making Robots See (W. Eric L. Grimson). Autonomous Mobile Robots (Rodney A. Brooks). W. Eric L. Grimson, author of From Images to Surfaces: A Computational Study of the Human Early Vision System (MIT Press 1981), and Ramesh S. Patil are both Assistant Professors in the Department of Electrical Engineering and Computer Science at MIT. AI in the 1980s and Beyond is included in the Artificial Intelligence Series, edited by Patrick H. Winston and Michael Brady.
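The topological ordering described in the "Neural map applications" entry above can be sketched with a minimal 1-D Kohonen self-organizing map (a toy sketch with invented parameters; the synthetic two-cluster data stands in for spectral vectors of two vowel sounds, and this is not the chapter's actual system):

```python
import numpy as np

def train_som(data, n_units, epochs=50, lr=0.5, sigma=2.0, seed=0):
    """1-D Kohonen map: pull the winning unit and its neighbors toward each sample."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(data.min(), data.max(), size=(n_units, data.shape[1]))
    idx = np.arange(n_units)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs            # shrink learning rate and neighborhood
        for x in rng.permutation(data):
            winner = np.argmin(np.linalg.norm(w - x, axis=1))
            h = np.exp(-((idx - winner) ** 2) / (2 * (sigma * decay + 1e-3) ** 2))
            w += (lr * decay) * h[:, None] * (x - w)
    return w

# toy "spectral vectors": two 2-D clusters standing in for two vowel sounds
rng = np.random.default_rng(1)
a = rng.normal([0.0, 0.0], 0.1, size=(50, 2))
b = rng.normal([1.0, 1.0], 0.1, size=(50, 2))
w = train_som(np.vstack([a, b]), n_units=10)
# after training, different map positions respond to the two different "sounds"
win_a = int(np.argmin(np.linalg.norm(w - a.mean(0), axis=1)))
win_b = int(np.argmin(np.linalg.norm(w - b.mean(0), axis=1)))
print(win_a != win_b)
```

Distinct sounds excite distinct map positions, which is the property the chapter describes being exploited for speech recognition.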



Standards related to Speech Recognition


No standards are currently tagged "Speech Recognition"

