193 resources related to Speech Generation
- Topics related to Speech Generation
- IEEE Organizations related to Speech Generation
- Conferences related to Speech Generation
- Periodicals related to Speech Generation
- Most published Xplore authors for Speech Generation
2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
CVPR is the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers.
The ICASSP meeting is the world's largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class speakers, tutorials, exhibits, and over 50 lecture and poster sessions.
2019 IEEE International Symposium on Information Theory (ISIT)
Information theory and coding theory and their applications in communications and storage, data compression, wireless communications and networks, cryptography and security, information theory and statistics, detection and estimation, signal processing, big data analytics, pattern recognition and learning, compressive sensing and sparsity, complexity and computation theory, Shannon theory, quantum information and coding theory, emerging applications of information theory, information theory in biology.
The IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC) is the premier forum for researchers to present their latest findings in the area of asynchronous design.
ICPR will be an international forum for discussions on recent advances in the fields of Pattern Recognition, Machine Learning and Computer Vision, and on applications of these technologies in various fields.
The IEEE Aerospace and Electronic Systems Magazine publishes articles concerned with the various aspects of systems for space, air, ocean, or ground environments.
Speech analysis, synthesis, coding, speech recognition, speaker recognition, language modeling, speech production and perception, speech enhancement. In audio: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. The scope for the proposed transactions includes SPEECH PROCESSING - Transmission and storage of speech signals; speech coding; speech enhancement and noise reduction; ...
The IEEE Transactions on Automation Sciences and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. We welcome results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, ...
Broad coverage of concepts and methods of the physical and engineering sciences applied in biology and medicine, ranging from formalized mathematical theory through experimental science and technological development to practical clinical applications.
Broadcast technology, including devices, equipment, techniques, and systems, covering the production, distribution, transmission, and propagation aspects.
2013 IEEE International Conference on Emerging Trends in Computing, Communication and Nanotechnology (ICECCN), 2013
ISSPA '99. Proceedings of the Fifth International Symposium on Signal Processing and its Applications (IEEE Cat. No.99EX359), 1999
Proceedings of the 2002 IEEE Workshop on Speech Synthesis, 2002
IEE Colloquium on Prospects for Spoken Language Technology (Digest No: 1997/138), 1997
2016 International Conference on Knowledge Creation and Intelligent Computing (KCIC), 2016
ICASSP 2012 Plenary-Dr. Chin-Hui Lee
Brooklyn 5G Summit: MIMO Technology: Past, Present and Future
Robotic Governance Paving the Way for Generation 'R'
APEC 2012 - Dr. Fred Lee Plenary
ITEC 2014: Next Generation Combat Vehicle Electrical Power Architecture Development
A Flexible Testbed for 5G Waveform Generation and Analysis: MicroApps 2015 - Keysight Technologies
The NESC: Engaging the Next Generation
IMS 2012 Microapps - Phase Noise Choices in Signal Generation: Understanding Needs and Tradeoffs Riadh Said, Agilent
ICASSP 2012 - Opening Ceremony
IMS 2012 Microapps - The Next Generation of Communications Design, Validate, and Test Dr. Mark Pierpoint
APEC 2011-GaN Based Power Devices in Power Electronics
ICRA Keynote: Dr. Matt Mason
APEC 2015: KeyTalks - Solid State Lighting
ICASSP 2012 Plenary-Dr. Stephane Mallat
ECCE Plenary: Paul Hamilton, part 2
ICASSP 2011 Trends in Multimedia Signal Processing
IMS 2011 Microapps - Simulation and Evaluation of Communications Systems in Conformance With Third- and Fourth-Generation Wireless Standards
ECCE Plenary: Pedro Ray, part 2
ICASSP 2011 Trends in Design and Implementation of Signal Processing Systems
Silent speech generation is a promising idea that could assist physically challenged people who cannot convey information as an acoustic signal. Silent speech is generated by predicting the intended speech information from the neural activity involved in the process of speech production. The acquired speech is synthesized and given as acoustic feedback to the user with a delay of 50 ms. This paper briefly elucidates the process of acquiring the neural signal, preprocessing, and feature extraction for the production of a speech signal by means of a Brain-Machine Interface.
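The acquire/preprocess/extract pipeline described above can be sketched in miniature. This is an illustrative toy, not the paper's actual method: the sampling rate, frequency band, frame sizes, and log-energy feature are all assumptions, and the FFT-masking "filter" is a crude stand-in for proper neural-signal filtering.

```python
import numpy as np

def bandpass_fft(signal, fs, low, high):
    """Crude band-pass filter via FFT masking (illustrative, not clinical-grade)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def frame_features(signal, frame_len, hop):
    """Frame the signal and compute a per-frame log-energy feature."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        feats[i] = np.log(np.sum(frame ** 2) + 1e-10)
    return feats

fs = 1000                                  # Hz, assumed sampling rate
t = np.arange(fs) / fs
raw = np.sin(2 * np.pi * 20 * t) + 0.3 * np.random.randn(fs)  # toy "neural" trace
clean = bandpass_fft(raw, fs, low=8, high=30)  # keep an assumed band of interest
feats = frame_features(clean, frame_len=100, hop=50)
```

In a real BMI system these features would feed a decoder that predicts acoustic parameters; here they only illustrate the shape of the preprocessing stage.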
The paper gives an overall discussion of problems in Chinese natural speech generation. We consider not only how to convert text into speech but also how to generate the necessary text for text-to-speech conversion. A Chinese bidirectional grammar is developed to suit Chinese language understanding and generation. The system obtains the right text and generates speech with good naturalness and intelligibility using the Chinese text-to-speech conversion system.
A spoken dialogue system for information retrieval on academic documents has been developed with special attention to reply speech generation. To realize speech replies whose prosodic features are properly controlled to express dialogue focuses, a scheme was developed for directly generating the speech reply from the reply content. When developing the system, priority was first placed on automatic processing, and prosodic focus was controlled by rather simple rules (the original rules). Based on a listening test of the reply speech generated with the original rules, new rules were developed; through a further listening test these were revised (the revised rules). The validity of the revised rules was verified in an evaluation experiment, which also indicated that users had preferences regarding the intonation of the reply speech.
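Rule-based prosodic focus control of the "simple rules" kind described above can be sketched as follows. The scaling factors, token representation, and base values are illustrative assumptions, not the paper's actual rules.

```python
def apply_focus_rules(tokens, focus_index, base_f0=120.0, base_dur=0.25):
    """Return per-token (word, f0 in Hz, duration in s) targets,
    boosting pitch and duration of the focused token."""
    prosody = []
    for i, tok in enumerate(tokens):
        f0, dur = base_f0, base_dur
        if i == focus_index:      # simple rule: raise pitch, lengthen duration
            f0 *= 1.3
            dur *= 1.2
        prosody.append((tok, round(f0, 1), round(dur, 3)))
    return prosody

reply = ["the", "paper", "is", "in", "room", "204"]
targets = apply_focus_rules(reply, focus_index=4)  # focus the word "room"
```

A synthesizer would then realize these per-word targets; the paper's revised rules presumably refine exactly this kind of mapping based on listening tests.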
Some of the most important issues in the design of a dialogue system involve the modeling of linguistic context. The paper highlights a number of these issues, focusing on the language and speech generation components of such systems, and discusses their implications for the way in which context has to be modeled in a spoken dialogue system. We compare the 'dedicated' context models that have been proposed in theoretical and computational linguistics with the more general models proposed in artificial intelligence. Our main example of a dedicated context model is the context model of the Dial Your Disc (DYD) music information system (Van Deemter and Odijk, 1997), together with the better-known discourse representation theory of which this model is a variant. Our main example of a 'general' context model is the so-called 'ist' formalism (McCarthy, 1993).
A humanoid robot is a robot with human-like intelligence. In this research, the team built a robot called FLoW. The robot is designed to have human abilities, one of which is the ability to communicate. Communication requires a medium, one of which is sound. This system was built to support the research of the EEPIS Robotic Research Center (ER2C) in building the humanoid robot FLoW. For FLoW to communicate, it must be able to say words, that is, to perform speech generation. To generate sound, a text-to-speech synthesis system is built. Preprocessing uses a Finite State Automata (FSA) algorithm; the Indonesian language uses 11 syllable patterns. Testing was performed on the processing of 'words', 'sentences', and 'articles'. The success rate for 'words' and 'sentences' matches the separation of syllables in Indonesian more accurately than the processing of articles does. On newspaper articles, the system achieved a parsing success rate of 92.63%. The processed data were taken from five types of articles: economy, education, sports, politics, and law. Parsing performance on articles is lower due to names, titles, and foreign words that have not been absorbed into Indonesian.
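The FSA-style syllable separation used in the preprocessing step can be sketched with a greedy rule-based syllabifier. This is a deliberate simplification of the 11-pattern automaton described above: it handles only plain CV/CVC alternations and ignores Indonesian digraphs such as ng, ny, kh, and sy, so the rules here are illustrative assumptions.

```python
VOWELS = set("aiueo")

def syllabify(word):
    """Greedy rule-based syllabification for Indonesian-like words.
    Closes a syllable after a vowel when CV follows (CV.CV), and lets the
    first consonant of a cluster close the syllable (CVC.CV)."""
    syllables, cur = [], ""
    i = 0
    while i < len(word):
        ch = word[i]
        cur += ch
        if ch in VOWELS:
            rest = word[i + 1:]
            if len(rest) >= 2 and rest[0] not in VOWELS and rest[1] in VOWELS:
                syllables.append(cur); cur = ""          # CV.CV boundary
            elif len(rest) >= 3 and rest[0] not in VOWELS and rest[1] not in VOWELS:
                cur += rest[0]; i += 1                   # consonant joins coda
                syllables.append(cur); cur = ""          # CVC.CV boundary
            elif rest and rest[0] in VOWELS:
                syllables.append(cur); cur = ""          # V.V hiatus
        i += 1
    if cur:
        syllables.append(cur)
    return syllables
```

For example, `syllabify("belajar")` yields `["be", "la", "jar"]`. Names and unabsorbed foreign words break such rules, which is consistent with the lower parsing rate the paper reports on articles.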
The emotional function of the human mind plays an important role in decision-making, memory, action, and good communication. In particular, the emotional characteristics of the voice are very important for warm communication, successful business, good human-to-human relationships, and good care for children and the elderly. Meanwhile, the market for service robots such as educators, helpers, secretaries, deliverers, and guides has been growing because of an aging population and complicated social situations, and an emotion function is needed in those areas. The emotional characteristics of a voice depend on pitch contour, acoustic energy, vocal tract features, and speech energy. We therefore need to consider how to apply and implement an emotional voice function for service robots. However, its implementation is very difficult, and recognition is also not easy, because of the variety of emotion patterns in voice. This paper suggests a method of voice emotion generation for user-demanded emotional talk in a service robot. A fuzzy rule based approach is introduced to generate emotion on user demand by controlling pitch contour, acoustic energy, vocal tract features, and speech energy.
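A fuzzy rule based mapping from an emotion parameter to voice controls can be sketched as below. The membership functions, the single "arousal" input, and the scaling factors are illustrative assumptions; the paper's actual rule base over pitch contour, acoustic energy, vocal tract, and speech energy is richer.

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership function on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def emotion_voice_params(arousal):
    """Map a [0, 1] arousal level to pitch/energy scaling via two fuzzy rules:
      IF arousal is LOW  THEN pitch x0.9, energy x0.8
      IF arousal is HIGH THEN pitch x1.3, energy x1.4
    defuzzified as a membership-weighted average."""
    mu_low = triangular(arousal, -0.5, 0.0, 0.6)
    mu_high = triangular(arousal, 0.4, 1.0, 1.5)
    total = mu_low + mu_high
    pitch = (mu_low * 0.9 + mu_high * 1.3) / total
    energy = (mu_low * 0.8 + mu_high * 1.4) / total
    return pitch, energy
```

Intermediate arousal levels blend the two rules smoothly, which is the practical appeal of the fuzzy approach for continuous voice control.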
A real-time implementation of a Text-to-Speech system is discussed. Details are given of the grapheme-to-phoneme process, prosodic modelling, and diphone synthesis. The CSTR user interface, which allows speech editing, is described.
In most spoken dialogue systems, text-to-speech conversion devices are used for reply speech generation. However, the use of such devices makes it difficult to reflect in the reply speech the higher-level linguistic (and para-/non-linguistic) information obtainable during the sentence generation process, which degrades reply speech quality mainly in its prosodic features. A method is therefore needed for directly converting the reply content into speech. This method, known as concept-to-speech conversion, was realized for reply speech generation in our spoken dialogue system for road guidance. It is an improved version of the one we formerly developed for an agent dialogue system. Reply sentence generation is conducted by pasting words and/or phrases at tag positions of a sentence frame, prepared in a tag-LISP form. To realize the concept-to-speech conversion, the syntactic structure of phrases in the user's input is kept and utilized for sentence generation. Several improvements, such as prosodic phrase boundary positioning using word-sequence probabilities, are also added to the prosodic control in speech synthesis. In the spoken dialogue system, a user is guided through conversation to reach a place marked on a map. Several dialogue management schemes were implemented to handle the problems caused by the imperfect road information given to the user and the system. A trial use of the system showed that smooth conversation between the user and the system was possible, and the results clearly indicated better prosodic control for the newly developed method compared to the original method.
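The sentence-frame idea, pasting words or phrases at tag positions while keeping the user's phrasing intact, can be sketched as simple slot filling. The frame syntax and slot names below are illustrative, not the paper's actual tag-LISP notation.

```python
def fill_frame(frame, slots):
    """Replace <tag> tokens in a sentence frame with words/phrases from slots,
    leaving all other frame tokens untouched."""
    out = []
    for token in frame:
        if token.startswith("<") and token.endswith(">"):
            out.append(slots[token[1:-1]])   # paste content at the tag position
        else:
            out.append(token)
    return " ".join(out)

# Hypothetical road-guidance frame; the pasted phrases could come straight
# from the user's input, preserving their syntactic structure.
frame = ["turn", "<direction>", "at", "the", "<landmark>"]
reply = fill_frame(frame, {"direction": "left", "landmark": "next corner"})
```

Because the filled phrases carry known syntactic structure, the synthesizer can place prosodic phrase boundaries around them rather than re-deriving everything from flat text.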
By highlighting the focus of an utterance to draw attention, emphasis plays an important role in speech interaction for expressing and understanding speaker intention. Emphatic speech synthesis therefore draws increasing interest in the text-to-speech (TTS) area. For emphatic speech synthesis, three problems remain: 1) sparseness of emphatic speech data; 2) flexibility of the trained model; and 3) inadequate modelling of secondary emphasis. Recently, recurrent neural networks (RNNs) and their bidirectional long short-term memory (BLSTM) variants for statistical parametric speech synthesis (SPSS) have shown adaptability and controllability in acoustic modelling and can thus address these problems. In this paper, we propose a novel conditional input layer for the conventional BLSTM-RNN approach that combines emphasis-specific vectors with linguistic features as input to produce emphatic speech trajectories. Experimental results from objective and subjective evaluations demonstrate that the proposed approach can produce emphatic speech trajectories with high quality and naturalness while requiring only an additional small-scale emphatic speech corpus.
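The conditional input idea, combining an emphasis-specific vector with per-frame linguistic features, can be sketched as feature concatenation. The number of emphasis levels, the one-hot encoding, and the feature dimensions are assumptions for illustration; the paper's actual emphasis vectors and network details may differ.

```python
import numpy as np

def conditional_inputs(linguistic, emphasis_id, n_levels=3):
    """Append a one-hot emphasis vector (e.g. none/secondary/primary) to every
    frame of linguistic features, forming the conditional input sequence that
    a BLSTM acoustic model would consume."""
    n_frames, _ = linguistic.shape
    onehot = np.zeros((n_frames, n_levels))
    onehot[:, emphasis_id] = 1.0
    return np.concatenate([linguistic, onehot], axis=1)

ling = np.random.rand(100, 40)               # 100 frames x 40 linguistic features
x = conditional_inputs(ling, emphasis_id=2)  # mark this span as emphasized
```

Because emphasis enters only as a conditioning code, the same trained model can switch emphasis levels at synthesis time, which is the flexibility the abstract points to.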
There are currently various technologies for converting written text into speech. Their common goal is the artificial generation of natural speech that is maximally understandable. Unfortunately, no such perfect converter yet exists. Research in this field is therefore worthwhile and growing, as it advances the performance of existing solutions and defines the current achievements in the area. Because the world's languages differ considerably in writing and speech, such converters cannot have universal application, which means that the development of this field varies across countries and languages. For local languages in particular, research in this field has not recorded significant progress, probably because the small number of users cannot justify the economic investment. From this perspective, the question is: what is the state of the Albanian language in this regard, and what are the possibilities for converting written Albanian text into spoken Albanian? This is precisely the response given in this paper.
No standards are currently tagged "Speech Generation"