Viterbi algorithm

What Is the Viterbi Algorithm?

The Viterbi algorithm is a dynamic programming procedure for finding the most probable sequence of hidden states in a Markov model given a series of observed outputs. Proposed by Andrew Viterbi in 1967 in the context of decoding noisy digital communications channels, the algorithm solves the maximum-likelihood sequence detection problem with a computational cost that grows linearly in the number of observations rather than exponentially, as a naive search would require. It has since become one of the most widely deployed algorithms in digital communications, automatic speech recognition, and biological sequence analysis.

The algorithm operates on a trellis diagram: a graph whose columns represent time steps and whose nodes represent possible states at each step. At every column, the algorithm retains only the single most probable path reaching each state, discarding all others. This pruning strategy, which relies on the principle of optimal substructure, guarantees that the surviving path to any state at time t is the best extension of the globally optimal path up to that point. Research on Viterbi decoding for convolutional codes in IEEE Transactions on Information Theory documented the algorithm's application to error-correcting codes and established its computational properties in that context.

Dynamic Programming and Trellis Decoding

The Viterbi algorithm belongs to the family of dynamic programming methods, which decompose a large optimization problem into overlapping subproblems and cache their solutions to avoid redundant computation. In the trellis formulation, the state space at each time step typically contains a fixed number of nodes, so the algorithm makes exactly a constant multiple of T times S transitions, where T is the observation length and S is the number of states. For convolutional codes with constraint length K, the trellis has 2^(K-1) states; for a rate-1/2 code with K = 7, this yields 64 states, a tractable size for hardware implementation. The branch metric computed at each edge quantifies the mismatch between the predicted output and the observed symbol, either as a Hamming distance for hard decisions or as a Euclidean distance for soft-decision decoding, which achieves approximately 2 to 3 dB of additional coding gain.

Hidden Markov Models and Speech Processing

Beyond channel coding, the Viterbi algorithm is the standard decoding step in hidden Markov model (HMM) systems. An HMM models a process as a sequence of unobserved states, each emitting an observable symbol according to a probability distribution. Given an observation sequence, the Viterbi algorithm efficiently recovers the most likely state sequence, which corresponds to the decoded output. In automatic speech recognition, HMMs represent phonemes or sub-phoneme units, and the Viterbi search finds the word sequence most consistent with the acoustic observations. The combination of the Baum-Welch training algorithm and Viterbi decoding defined the architecture of commercial speech recognition systems from the 1980s through the early 2010s. Speaker recognition systems, which determine identity rather than transcribed words, also rely on this framework, using speaker-specific HMMs trained on enrollment utterances. The IEEE Xplore paper on innovations approaches to Viterbi decoding examines extensions of the classical algorithm that exploit signal structure to improve accuracy under non-standard noise conditions.

The algorithm's generality extends to any sequential inference problem with a Markov structure: biological sequence alignment uses Viterbi-style search over nucleotide or amino acid states, and optical character recognition applies it to sequence models of character transitions. Resources such as the Georgia Tech lecture notes on the Viterbi algorithm provide accessible mathematical derivations of the trellis recursion for students entering the field.

Applications

The Viterbi algorithm has applications in a range of fields, including:

  • Decoding convolutional and turbo codes in wireless and satellite communications
  • Automatic speech recognition and speaker identification
  • Biological sequence analysis, including gene finding and protein structure alignment
  • Natural language processing for part-of-speech tagging and named-entity recognition
  • Navigation and positioning, including GPS signal decoding
Loading…