Sum product algorithm
What Is the Sum Product Algorithm?
The sum product algorithm is a message-passing procedure for computing marginal probability distributions in probabilistic graphical models, operating by passing local messages along the edges of a factor graph until each node accumulates the information needed to report its marginal. First formalized in the 2001 IEEE Transactions on Information Theory paper by Kschischang, Frey, and Loeliger, the algorithm unifies a broad family of inference and decoding methods under a single computational framework. It applies exactly on cycle-free (tree-structured) graphs, where it delivers exact marginals in linear time, and is applied as an approximation on graphs with cycles, where it is known as loopy belief propagation. The algorithm is foundational to modern coding theory, Bayesian inference, and machine learning.
The sum product algorithm is closely related to belief propagation, the forward-backward algorithm, and the Viterbi algorithm, all of which can be viewed as specializations operating on particular graph structures. Its introduction as a unifying framework spurred rapid progress in the design and analysis of capacity-approaching error-correcting codes.
Factor Graphs and Message Passing
A factor graph is a bipartite graph containing two types of nodes: variable nodes and factor (function) nodes. A global function of many variables is expressed as a product of local functions, each depending on a subset of the variables, and the factor graph encodes which factors involve which variables. The foundational IEEE Transactions paper on factor graphs and the sum-product algorithm by Kschischang, Frey, and Loeliger describes how messages flowing from variable to factor and from factor to variable represent partial beliefs, computed as sums of products of incoming messages and the local function. Each variable node computes its marginal by multiplying all incoming messages from adjacent factor nodes; each factor node computes outgoing messages by summing the product of the factor function and all incoming variable messages over the domain of the non-target variable. The result, on a tree, is exact inference achieved in a single forward-backward pass.
Convergence and Loopy Belief Propagation
When the factor graph contains cycles, the sum product algorithm becomes iterative: messages are initialized, passed repeatedly until they converge to stable values, and the resulting pseudo-marginals are read from the variable nodes. This loopy variant does not guarantee exact marginals, but empirically produces good approximations on many practical graphs, particularly sparse ones. Turbo decoding of parallel concatenated convolutional codes, introduced in 1993 by Berrou, Glavieux, and Thitimajshima, was later recognized as an instance of loopy belief propagation on a graph with a single cycle. LDPC (low-density parity-check) codes, whose decoding likewise employs iterative message passing, achieve performance within a fraction of a decibel of the Shannon capacity limit in part because their sparse Tanner graphs have large girth, limiting cycle-induced errors. MIT OpenCourseWare material on algorithms for inference, including the forward-backward algorithm and sum-product on factor graphs, provides a formal treatment of these convergence properties.
Relationship to Classical Algorithms
The sum product algorithm subsumes several algorithms developed independently in different fields. The forward-backward (Baum-Welch) algorithm for hidden Markov models is an instance applied to a chain-structured factor graph. Pearl's belief propagation for Bayesian networks is another special case. The Kalman filter, used in linear Gaussian state-space models, can also be derived as sum-product message passing on a linear Gaussian factor graph. Stanford's lecture notes on belief propagation and variational inference treat this unification rigorously and connect it to statistical physics formulations of message passing in spin glasses.
Applications
The sum product algorithm has applications in a range of fields, including:
- Decoding of LDPC codes and turbo codes in wireless and satellite communications
- Bayesian network inference in medical diagnosis and risk assessment systems
- Hidden Markov model decoding in speech recognition and computational biology
- Computer vision for image segmentation and stereo reconstruction via Markov random fields
- Sensor network localization and distributed estimation