Hamming Weight
What Is Hamming Weight?
Hamming weight is the count of non-zero symbols in a string, or equivalently, the number of 1-bits in a binary word. Also called the population count, popcount, or sideways sum, the measure takes its name from Richard W. Hamming, whose 1950s work on error-correcting codes depended on the ability to characterize strings by the density of their non-zero positions. For a binary string of length $n$, the Hamming weight ranges from 0 (the all-zeros word) to $n$ (the all-ones word). The concept is closely related to Hamming distance: the distance between two binary strings equals the Hamming weight of their bitwise exclusive-OR.
Hamming weight is not merely a combinatorial curiosity. It appears as a core operation in error-correcting code design, cryptographic analysis, computer chess, data compression, and hardware arithmetic, a breadth that reflects the importance of bit-counting across virtually every domain of digital computation.
Definition and Combinatorial Properties
For a binary vector $x = (x_1, x_2, \ldots, x_n)$ over the field GF(2), the Hamming weight $w_H(x)$ is defined as $\sum_{i=1}^{n} x_i$. Codewords with low Hamming weight tend to be sparse in a vector space sense; codes constructed to avoid low-weight codewords have better minimum-distance properties and thus stronger error-correction capability. The binomial coefficient $\binom{n}{k}$ counts the number of binary strings of length $n$ with Hamming weight exactly $k$, a fact central to the analysis of code parameters such as the Hamming bound and the Singleton bound. The IBM overview of Hamming weight and related coding concepts discusses how weight distributions of code families determine their theoretical limits.
Hardware and Software Implementations
Computing Hamming weight efficiently has been a recurring problem in computer architecture. The naive approach increments a counter for each bit in a loop, requiring $O(n)$ operations. Brian Kernighan's well-known loop reduces this to $O(w)$ where $w$ is the weight itself, by repeatedly clearing the lowest set bit with the identity $x \leftarrow x \,\&\, (x-1)$. Parallel prefix summation techniques achieve Hamming weight in $O(\log n)$ steps using bitwise arithmetic alone. Modern x86 processors expose this calculation directly through the POPCNT instruction introduced with the SSE4.2 extension, and GCC's __builtin_popcount function will emit this instruction when targeting compatible hardware. The Rosetta Code population count page documents efficient implementations across dozens of programming languages and architectures, illustrating both the algorithmic variety and the universality of the operation. Parallel SIMD implementations using AVX-512 can compute weight across 512-bit registers in a single operation, enabling throughput of tens of billions of bits per second on current server processors.
Applications in Coding and Cryptography
In coding theory, the weight enumerator of a linear block code, a polynomial whose coefficients count codewords at each Hamming weight, determines the code's error probability under various channel models. Cryptographic analyses of block ciphers depend on Hamming weight to measure the diffusion and avalanche properties of substitution-permutation networks; a strong cipher should flip roughly half the output bits for any single-bit input change, reflecting an output Hamming weight close to half the block size. Side-channel attacks on hardware implementations of cryptographic algorithms frequently model power consumption as proportional to Hamming weight, making weight a direct interface between algorithmic security and physical security. The NIST post-quantum cryptography standardization project evaluates candidate algorithms in part on the Hamming weight properties of their key and ciphertext spaces.
Applications
Hamming weight has applications across a range of fields, including:
- Error-correcting codes: determining minimum distances and code rate tradeoffs
- Cryptographic design: avalanche criterion analysis and side-channel power models
- Computer chess: counting pieces on a bitboard representation of the game state
- Network routing: computing the number of hosts in a CIDR subnet mask
- Bioinformatics: scoring sequence alignments and computing edit distances over genomic data