Autocorrelation
What Is Autocorrelation?
Autocorrelation is a statistical measure of the similarity between a signal or data sequence and a time-shifted copy of itself. It quantifies how strongly observations at one time are correlated with observations from the same process at earlier or later times; the time offset is called the lag. In its normalized form, the autocorrelation function (ACF) takes values between -1 and +1: +1 at a given lag indicates a perfect positive linear relationship, -1 a perfect negative one, and 0 no linear relationship. Autocorrelation is used in disciplines ranging from statistical signal processing and time series econometrics to radar engineering and speech analysis, wherever the internal temporal structure of a process matters.
The concept draws on classical probability theory and the theory of stochastic processes. For a wide-sense stationary process, the ACF depends only on the lag between observations and not on the absolute time, a property that simplifies analysis and underlies most practical estimation procedures.
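For reference, one common way to write these definitions is sketched below. The symbols (X_t for the process, τ for the lag, μ and σ² for the constant mean and variance) are notation assumed here for illustration and do not come from the text above.

```latex
R_{XX}(\tau) = \mathbb{E}\left[ X_t \, X_{t+\tau} \right],
\qquad
\rho(\tau) = \frac{\mathbb{E}\left[ (X_t - \mu)(X_{t+\tau} - \mu) \right]}{\sigma^{2}},
\qquad
\rho(0) = 1, \quad \rho(-\tau) = \rho(\tau), \quad |\rho(\tau)| \le 1 .
```

Under wide-sense stationarity neither expression depends on t, which is why a single function of the lag describes the whole process.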
Statistical Foundation and Computation
The sample autocorrelation at lag k is computed from a data sequence as the normalized inner product of the sequence with a copy of itself shifted by k positions. Before normalization, the value at lag zero equals the mean square of the sequence, which is its average power (or its variance, if the mean has been removed); dividing by this value gives a normalized ACF that equals 1 at lag zero. For real-valued sequences the ACF is symmetric about lag zero, and its values at nonzero lags never exceed the zero-lag value in magnitude.
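The computation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name `sample_acf` is hypothetical, the mean is removed first, and the 1/n ("biased") normalization is one common choice among several.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_0 .. r_max_lag of a 1-D sequence."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()                  # deviations from the sample mean
    c0 = np.dot(d, d) / n             # lag-0 autocovariance (1/n normalization)
    acf = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # inner product of the sequence with itself shifted by k positions
        ck = np.dot(d[: n - k], d[k:]) / n
        acf[k] = ck / c0              # normalize so that acf[0] == 1
    return acf

rng = np.random.default_rng(0)
noise = rng.standard_normal(500)      # synthetic data with no serial dependence
r = sample_acf(noise, 20)
print(r[:5])
```

With the 1/n normalization every value stays within [-1, 1], which matches the bound stated above.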
The NIST/SEMATECH Engineering Statistics Handbook defines the autocorrelation coefficient formally and describes its use in exploratory data analysis: it is the ratio of the autocovariance at lag k to the variance of the process. Autocorrelation plots, which graph the ACF value against lag, are standard diagnostic tools. A random process with no serial dependence shows ACF values near zero at all nonzero lags; a periodic process shows an ACF that oscillates with the same period (a cosine for a pure sinusoid); a process with slowly decaying memory shows ACF values that taper toward zero gradually.
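The first two patterns can be reproduced on synthetic data. The sketch below is illustrative only: the helper `acf`, the series length, and the 20-sample period are arbitrary choices, and the band of ±1.96/√n is the usual large-sample approximation for judging whether white-noise ACF values differ from zero.

```python
import numpy as np

def acf(x, max_lag):
    """Normalized sample ACF (mean removed) for lags 0..max_lag."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    n = d.size
    c0 = np.dot(d, d)
    return np.array([np.dot(d[: n - k], d[k:]) / c0 for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
n = 400
t = np.arange(n)

white = rng.standard_normal(n)                                         # no serial dependence
periodic = np.sin(2 * np.pi * t / 20) + 0.1 * rng.standard_normal(n)   # period of 20 samples

r_white = acf(white, 40)
r_periodic = acf(periodic, 40)

band = 1.96 / np.sqrt(n)                           # approximate 95% band under white noise
frac_inside = np.mean(np.abs(r_white[1:]) < band)  # share of nonzero lags inside the band

print("white noise, share of lags inside band:", frac_inside)
print("periodic, ACF at lags 10 and 20:", r_periodic[10], r_periodic[20])
```

The periodic series should show a strongly negative value at half the period and a strongly positive one at the full period, while most white-noise values should fall inside the band.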
Time Series Analysis and Stationarity
In time series analysis, the ACF is central to identifying appropriate statistical models. The Box-Jenkins methodology for autoregressive integrated moving average (ARIMA) models uses the sample ACF alongside the partial autocorrelation function (PACF) to determine model order. An AR(p) process has an ACF that decays exponentially or in a damped oscillatory pattern, while an MA(q) process has an ACF that cuts off after lag q. The PACF behaves in the complementary way: it cuts off after lag p for an AR(p) process and tails off for an MA(q) process.
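The contrast between the decaying AR pattern and the MA cutoff can be checked by simulation. The following is a sketch under assumed parameters (an AR(1) coefficient of 0.7 and an MA(1) coefficient of 0.8, both picked for illustration); the helper `acf` is hypothetical and the sample values only approximate the theoretical ones.

```python
import numpy as np

def acf(x, max_lag):
    """Normalized sample ACF (mean removed) for lags 0..max_lag."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    n = d.size
    c0 = np.dot(d, d)
    return np.array([np.dot(d[: n - k], d[k:]) / c0 for k in range(max_lag + 1)])

rng = np.random.default_rng(2)
n = 5000
e = rng.standard_normal(n + 1)        # white-noise innovations

# AR(1): x_t = phi * x_{t-1} + e_t, theoretical ACF is phi**k (geometric decay)
phi = 0.7
ar = np.zeros(n)
for i in range(1, n):
    ar[i] = phi * ar[i - 1] + e[i]

# MA(1): x_t = e_t + theta * e_{t-1}, theoretical ACF is
# theta / (1 + theta**2) at lag 1 and zero at every higher lag
theta = 0.8
ma = e[1:] + theta * e[:-1]

r_ar = acf(ar, 5)
r_ma = acf(ma, 5)

print("AR(1) sample ACF:", np.round(r_ar, 2))
print("MA(1) sample ACF:", np.round(r_ma, 2))
```

In a finite sample the MA(1) values beyond lag 1 are not exactly zero; they are judged against a confidence band rather than against zero itself.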
Stationarity is a prerequisite for most ACF-based analysis. As described in NIST's introduction to time series analysis, a stationary series has a constant mean, a constant variance, and an autocorrelation structure that does not change over time; in practice it looks flat, with no trend and no periodic fluctuations (seasonality). Non-stationary series, such as those with trends or unit roots, have sample ACFs that decay slowly and remain large at high lags. Transformations are applied to restore stationarity before model identification: differencing removes trends and unit roots, and a logarithmic transformation can stabilize a variance that grows with the level of the series.
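The effect of differencing on a unit-root series can be illustrated with a simulated random walk. This is a minimal sketch: the series length, the seed, and the helper `acf` are arbitrary choices for illustration.

```python
import numpy as np

def acf(x, max_lag):
    """Normalized sample ACF (mean removed) for lags 0..max_lag."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    n = d.size
    c0 = np.dot(d, d)
    return np.array([np.dot(d[: n - k], d[k:]) / c0 for k in range(max_lag + 1)])

rng = np.random.default_rng(3)
steps = rng.standard_normal(1000)
walk = np.cumsum(steps)          # random walk: a unit-root, non-stationary series
diffed = np.diff(walk)           # first difference recovers the white-noise steps

r_walk = acf(walk, 20)
r_diff = acf(diffed, 20)

print("random walk ACF, lags 1-5:", np.round(r_walk[1:6], 2))
print("differenced ACF, lags 1-5:", np.round(r_diff[1:6], 2))
```

The random walk's sample ACF stays close to 1 over the first lags, while the differenced series behaves like white noise.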
Signal Detection and Spectral Analysis
In signal processing, the autocorrelation function connects directly to the power spectral density through the Wiener-Khinchin theorem: the power spectral density of a wide-sense stationary process is the Fourier transform of its (unnormalized) autocorrelation function. This relationship makes autocorrelation foundational to spectral estimation and to the design of matched filters, whose response to the waveform they are matched to is that waveform's autocorrelation.
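A finite-sample analogue of this relationship can be verified numerically: the inverse FFT of a periodogram-style spectrum reproduces the biased sample autocorrelation. The sketch below assumes a zero-padded FFT of length 2n to avoid circular wrap-around; the variable names and the test signal are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 256
x = rng.standard_normal(n)
x = x - x.mean()

# Direct route: biased autocorrelation estimate for lags 0..n-1
direct = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(n)])

# Spectral route: zero-pad to length 2n so the circular correlation
# implied by the FFT equals the linear correlation for lags 0..n-1
X = np.fft.fft(x, 2 * n)
psd = np.abs(X) ** 2 / n              # periodogram-style power spectrum estimate
via_fft = np.fft.ifft(psd).real[:n]   # inverse transform returns the autocorrelation

print("max difference:", np.max(np.abs(direct - via_fft)))
```

The two routes agree to floating-point precision, and the FFT route is the usual choice for long sequences because it avoids the quadratic cost of the direct sum.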
Radar systems correlate received echoes with the transmitted waveform to detect targets and measure range, so the waveform's autocorrelation function determines the achievable range resolution. Pulse compression techniques exploit the narrow ACF peak of chirp signals to achieve high range resolution without high peak power. In speech processing, the pitch period of voiced speech is estimated from peaks in the short-time autocorrelation function, and linear predictive coding coefficients are obtained by solving Toeplitz systems built from autocorrelation values, typically with the Levinson-Durbin recursion.
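Pitch estimation from the short-time autocorrelation can be sketched on a synthetic frame. Everything specific here is an assumption made for illustration: the 8 kHz sample rate, the 40 ms frame, the three-harmonic test signal with a 200 Hz fundamental, and the 60 to 400 Hz search range. Real speech would also need voicing detection and guards against octave errors.

```python
import numpy as np

fs = 8000                                   # sample rate in Hz (assumed)
f0 = 200.0                                  # fundamental of the synthetic frame
t = np.arange(int(0.04 * fs)) / fs          # one 40 ms analysis frame
frame = (np.sin(2 * np.pi * f0 * t)
         + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
         + 0.3 * np.sin(2 * np.pi * 3 * f0 * t))

# Short-time autocorrelation; keep only non-negative lags
full = np.correlate(frame, frame, mode="full")
r = full[frame.size - 1:]                   # r[k] is the autocorrelation at lag k

# Look for the strongest peak inside a plausible pitch range
lag_min = int(fs / 400)                     # shortest period considered (400 Hz)
lag_max = int(fs / 60)                      # longest period considered (60 Hz)
peak_lag = lag_min + int(np.argmax(r[lag_min : lag_max + 1]))
pitch_hz = fs / peak_lag

print("estimated pitch (Hz):", pitch_hz)
```

The lag resolution is one sample, so the estimate is quantized to fs divided by an integer; interpolating around the peak is a common refinement.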
Applications
Autocorrelation has applications across a range of engineering and scientific fields, including:
- Time series model identification and forecasting in econometrics and finance
- Radar and sonar signal processing for target detection and range estimation
- Speech analysis for pitch detection and linear predictive coding
- Communications system design and channel estimation
- Turbulence and fluid dynamics research using correlation of velocity fields