Conferences related to IEEE Transactions on Speech and Audio Processing

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

The ICASSP meeting is the world's largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class speakers, tutorials, exhibits, and over 50 lecture and poster sessions.



Periodicals related to IEEE Transactions on Speech and Audio Processing

Audio, Speech, and Language Processing, IEEE Transactions on

Speech analysis, synthesis, coding, speech recognition, speaker recognition, language modeling, speech production and perception, and speech enhancement. Audio topics include transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. (8) (IEEE Guide for Authors) The scope for the proposed transactions includes SPEECH PROCESSING - Transmission and storage of speech signals; speech coding; speech enhancement and noise reduction; ...


Broadcasting, IEEE Transactions on

Broadcast technology, including devices, equipment, techniques, and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.


Communications, IEEE Transactions on

Telephone, telegraphy, facsimile, and point-to-point television, by electromagnetic propagation, including radio; wire; aerial, underground, coaxial, and submarine cables; waveguides, communication satellites, and lasers; in marine, aeronautical, space and fixed station services; repeaters, radio relaying, signal storage, and regeneration; telecommunication error detection and correction; multiplexing and carrier techniques; communication switching systems; data communications; and communication theory. In addition to the above, ...


Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on

Methods, algorithms, and human-machine interfaces for physical and logical design, including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, and documentation of integrated-circuit and systems designs of all complexities. Practical applications of aids resulting in producible analog, digital, optical, or microwave integrated circuits are emphasized.


Consumer Electronics, IEEE Transactions on

The design and manufacture of consumer electronics products, components, and related activities, particularly those used for entertainment, leisure, and educational purposes.


Xplore Articles related to IEEE Transactions on Speech and Audio Processing

Correction to “Linear Predictive Method for Improved Spectral Modeling of Lower Frequencies of Speech With Small Prediction Orders”

IEEE Transactions on Speech and Audio Processing, 2004


Maximum likelihood and minimum classification error factor analysis for automatic speech recognition

IEEE Transactions on Speech and Audio Processing, 2000

Hidden Markov models (HMMs) for automatic speech recognition rely on high dimensional feature vectors to summarize the short-time properties of speech. Correlations between features can arise when the speech signal is nonstationary or corrupted by noise. We investigate how to model these correlations using factor analysis, a statistical method for dimensionality reduction. Factor analysis uses a small number of parameters ...
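
As a rough illustration of the "small number of parameters" mentioned in this abstract, the sketch below assumes the textbook factor-analysis covariance form, Sigma = Lambda * Lambda^T + Psi with diagonal Psi, and compares its raw parameter count with that of a full covariance matrix. The function names and the example sizes (39 features, 2 factors) are hypothetical and are not taken from the paper.

```python
def full_covariance_params(d):
    """Free parameters in a symmetric d x d covariance matrix."""
    return d * (d + 1) // 2


def factor_analysis_params(d, f):
    """Raw parameter count for a d x f loading matrix plus d diagonal noise variances.

    This ignores the rotational redundancy of the loadings, so it is an upper bound.
    """
    return d * f + d


def fa_covariance(loadings, noise_vars):
    """Build Sigma = Lambda * Lambda^T + diag(Psi) from plain Python lists."""
    d = len(loadings)
    cov = [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(d)]
        for i in range(d)
    ]
    for i in range(d):
        cov[i][i] += noise_vars[i]
    return cov


if __name__ == "__main__":
    d, f = 39, 2  # hypothetical feature dimension and number of factors
    print(full_covariance_params(d), factor_analysis_params(d, f))
```

With these assumed sizes the factor-analysis form needs far fewer parameters than a full covariance matrix while still modeling correlations between features, which a diagonal covariance cannot do.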


On the applications of the interacting multiple model algorithm for enhancing noisy speech

IEEE Transactions on Speech and Audio Processing, 2000

The interacting multiple model (IMM) algorithm is applied to enhancing speech contaminated by additive white or colored noise. Noisy speech is modeled by a linear state-space model with Markovian switching parameters. The parameters are estimated from the training speech and noise processes. The simulation results show that the proposed method offers performance gains relative to the previous results with slightly ...


Multiple-point equalization of room transfer functions by using common acoustical poles

IEEE Transactions on Speech and Audio Processing, 1997

A multiple-point equalization filter using the common acoustical poles of room transfer functions is proposed. The common acoustical poles correspond to the resonance frequencies, which are independent of source and receiver positions. They are estimated as common autoregressive (AR) coefficients from multiple room transfer functions. The equalization is achieved with a finite impulse response (FIR) filter, which has the inverse ...


A coupled approach to ADPCM adaptation

IEEE Transactions on Speech and Audio Processing, 1994

The algorithms for adaptive quantization and adaptive prediction in existing adaptive differential pulse code modulation (ADPCM) are distinct and decoupled. The designs of the two algorithms are based on the assumption that they act independently whereas, due to the feedback configuration, they must interact in some way. The authors present a preliminary investigation of a new design approach where they ...
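
To make the "distinct and decoupled" baseline concrete, here is a minimal sketch of a conventional ADPCM loop in which the step-size rule and the predictor are designed separately. It is not the authors' coupled method. The fixed first-order predictor coefficient, the step multipliers, and all names are assumptions chosen for illustration only.

```python
A = 0.9                          # assumed fixed first-order predictor coefficient
MULT = (0.85, 1.0, 1.25, 1.75)   # assumed step multipliers, indexed by code magnitude
STEP_MIN, STEP_MAX = 1e-3, 1e3   # assumed step-size limits


def _decode_step(code, state):
    """Reconstruct one sample and update (prediction, step) from the code alone."""
    sign, mag = code
    prediction, step = state
    recon = prediction + sign * (mag + 0.5) * step
    # Backward adaptation: the new step depends only on the transmitted code.
    new_step = min(STEP_MAX, max(STEP_MIN, step * MULT[mag]))
    return recon, (A * recon, new_step)


def encode(samples, step0=0.1):
    """Quantize the prediction error; return the codes and the encoder's reconstruction."""
    state = (0.0, step0)
    codes, recon = [], []
    for x in samples:
        prediction, step = state
        err = x - prediction
        sign = 1 if err >= 0 else -1
        mag = min(3, int(abs(err) / step))
        code = (sign, mag)
        y, state = _decode_step(code, state)  # encoder tracks the decoder state
        codes.append(code)
        recon.append(y)
    return codes, recon


def decode(codes, step0=0.1):
    """Rebuild the signal from the codes using the same state updates as the encoder."""
    state = (0.0, step0)
    out = []
    for code in codes:
        y, state = _decode_step(code, state)
        out.append(y)
    return out
```

Because the quantized error feeds both the step-size update and the next prediction, the two adaptations interact through the feedback loop even though each rule here was chosen independently, which is the situation the abstract describes.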


Educational Resources on IEEE Transactions on Speech and Audio Processing

IEEE-USA E-Books

  • Correction to “Linear Predictive Method for Improved Spectral Modeling of Lower Frequencies of Speech With Small Prediction Orders”

  • Maximum likelihood and minimum classification error factor analysis for automatic speech recognition

    Hidden Markov models (HMMs) for automatic speech recognition rely on high dimensional feature vectors to summarize the short-time properties of speech. Correlations between features can arise when the speech signal is nonstationary or corrupted by noise. We investigate how to model these correlations using factor analysis, a statistical method for dimensionality reduction. Factor analysis uses a small number of parameters to model the covariance structure of high dimensional data. These parameters can be chosen in two ways: (1) to maximize the likelihood of observed speech signals, or (2) to minimize the number of classification errors. We derive an expectation-maximization (EM) algorithm for maximum likelihood estimation and a gradient descent algorithm for improved class discrimination. Speech recognizers are evaluated on two tasks, one small-sized vocabulary (connected alpha-digits) and one medium-sized vocabulary (New Jersey town names). We find that modeling feature correlations by factor analysis leads to significantly increased likelihoods and word accuracies. Moreover, the rate of improvement with model size often exceeds that observed in conventional HMMs.

  • On the applications of the interacting multiple model algorithm for enhancing noisy speech

    The interacting multiple model (IMM) algorithm is applied to enhancing speech contaminated by additive white or colored noise. Noisy speech is modeled by a linear state-space model with Markovian switching parameters. The parameters are estimated from the training speech and noise processes. The simulation results show that the proposed method offers performance gains relative to the previous results with slightly increased complexity.

  • Multiple-point equalization of room transfer functions by using common acoustical poles

    A multiple-point equalization filter using the common acoustical poles of room transfer functions is proposed. The common acoustical poles correspond to the resonance frequencies, which are independent of source and receiver positions. They are estimated as common autoregressive (AR) coefficients from multiple room transfer functions. The equalization is achieved with a finite impulse response (FIR) filter, which has the inverse characteristics of the common acoustical pole function. Although the proposed filter cannot recover the frequency response dips of the multiple room transfer functions, it can suppress their common peaks due to resonance; it is also less sensitive to changes in receiver position. Evaluation of the proposed equalization filter using measured room transfer functions shows that it can reduce the deviations in the frequency characteristics of multiple room transfer functions better than a conventional multiple-point inverse filter. Experiments show that the proposed filter enables 1-5 dB additional amplifier gain in a public address system without acoustic feedback at multiple receiver positions. Furthermore, the proposed filter reduces the reflected sound in room impulse responses without the pre-echo that occurs with a multiple-point inverse filter. A multiple-point equalization filter using common acoustical poles can thus equalize multiple room transfer functions by suppressing their common peaks.

  • A coupled approach to ADPCM adaptation

    The algorithms for adaptive quantization and adaptive prediction in existing adaptive differential pulse code modulation (ADPCM) are distinct and decoupled. The designs of the two algorithms are based on the assumption that they act independently whereas, due to the feedback configuration, they must interact in some way. The authors present a preliminary investigation of a new design approach where they perform joint adaptation based on a common cost function, considering the overall system. A “backward” adaptive algorithm is presented and applied to a simple example to demonstrate the feasibility of the general approach.

  • A fast algorithm for large vocabulary keyword spotting application

    Presents a fast algorithm for spotting a large number of keywords in unconstrained, continuous speech using an HMM-based continuous speech recognizer. This fast algorithm is based on a two-stage scheme. In the first stage, the forward-backward search is performed to detect the N most likely common subwords. In the second stage, the tree-trellis search is carried out to determine the optimum keyword by traversing the tree-structured vocabulary in an effective way. Compared with the conventional whole-word-based keyword spotting algorithm, the proposed fast algorithm can drastically reduce the computational cost.

  • Multiple description perceptual audio coding with correlating transforms

    In audio communication over a lossy packet network, concealment techniques are used to mitigate the effects of lost packets. This concealment is markedly improved if the compressed representation retains redundancy to aid in the estimation of lost information. A perceptual audio coder employing multiple description correlating transforms demonstrates this phenomenon.

  • A new class of doubletalk detectors based on cross-correlation

    A doubletalk detector (DTD) is used with an echo canceler to sense when far-end speech is corrupted by near-end speech. Its role is to freeze the adaptation of the model filter when near-end speech is present in order to avoid divergence of the adaptive algorithm. Several authors have proposed using the cross-correlation coefficient vector between the input signal vector x and the scalar output y for a DTD. We show in this paper that this measure is not appropriate and propose a modified form that meets, in an optimal way, the needs for an efficient DTD. By extension, we also propose a definition of the normalized cross-correlation matrix between two vectors and show a link with the coherence function.

  • An extended clustering algorithm for statistical language models

    An existing clustering algorithm is extended to deal with higher order N-grams and a faster heuristic version is developed. Even though results are not comparable to back-off trigram models, they outperform back-off bigram models when many millions of words of training data are not available.

  • On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate

    We present a framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities. The QB formulation is based on the theory of recursive Bayesian inference. The QB algorithm is designed to incrementally update the hyperparameters of the approximate posterior distribution and the CDHMM parameters simultaneously. By further introducing a simple forgetting mechanism to adjust the contribution of previously observed sample utterances, the algorithm is adaptive in nature and capable of performing on-line adaptive learning using only the current sample utterance. It can thus be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, and transducers. As an example, the QB learning framework is applied to on-line speaker adaptation and its viability is confirmed in a series of comparative experiments using a 26-letter English alphabet vocabulary.



Standards related to IEEE Transactions on Speech and Audio Processing

No standards are currently tagged "IEEE Transactions on Speech and Audio Processing"

