Concurrency Control

TOPIC AREA

What Is Concurrency Control?

Concurrency control is a discipline within computer science concerned with coordinating simultaneous access to shared resources in multi-process and multi-threaded systems. When two or more execution threads operate on the same data without coordination, the result is a race condition: the final state of the data depends on the unpredictable ordering of operations rather than on program logic alone. Concurrency control provides the formal mechanisms and practical algorithms that prevent such inconsistencies, ensuring that concurrent operations produce results equivalent to some serial execution.

Mutual Exclusion and Semaphores

Mutual exclusion is the property that only one process at a time may execute a critical section: a code region that accesses a shared resource. Edsger Dijkstra introduced the semaphore in 1965 as the first formal construct for enforcing mutual exclusion. A semaphore is an integer variable supporting two atomic operations: wait (decrement and block if the value is negative) and signal (increment and wake a blocked process). Binary semaphores enforce mutual exclusion directly; counting semaphores generalize the concept to resources with finite multiplicity, such as a pool of database connections. Modern operating systems layer higher-level abstractions, including mutexes and condition variables, on top of the same underlying atomic operations.

Deadlock Prevention and Processor Scheduling

A deadlock arises when a set of processes each holds a resource that another process in the set is waiting for, and none can proceed. Coffman et al. identified four necessary conditions for deadlock in 1971: mutual exclusion, hold-and-wait, no preemption, and circular wait. Prevention strategies eliminate one of these conditions: for example, requiring processes to request all needed resources at once eliminates hold-and-wait. Detection-and-recovery approaches instead allow deadlocks to form but identify them through resource-allocation graphs and resolve them by preempting one process. Processor scheduling is closely coupled to deadlock avoidance: the order in which the scheduler grants CPU time to processes determines which resource-acquisition sequences are possible, and algorithms such as the Banker's Algorithm use scheduled resource requests to prove that a proposed allocation leaves the system in a safe state.

Transaction Processing

Transaction processing applies concurrency control to database systems, where correctness requirements are formalized as the ACID properties: Atomicity, Consistency, Isolation, and Durability. The isolation property is the direct concern of concurrency control protocols. Two-phase locking (2PL), the most widely deployed protocol, requires a transaction to acquire all its locks before releasing any. In 2PL, execution proceeds in a growing phase (locks acquired) followed by a shrinking phase (locks released). Strict 2PL, a variant used in most commercial databases, holds all locks until the transaction commits or aborts, preventing dirty reads and non-repeatable reads. Multiversion concurrency control (MVCC), used in PostgreSQL and Oracle, instead maintains timestamped versions of data rows, allowing readers to see a consistent snapshot without blocking writers. The ACM SIGMOD community has produced decades of research formalizing these protocols and measuring their performance under contention.

Formal Verification and Standards

Formal methods provide mathematical proofs that a concurrency control mechanism satisfies its specification. Model checkers such as TLA+ and SPIN exhaustively explore the state space of a concurrent system to find violations of safety and liveness properties. Safety properties assert that a bad state is never reached; liveness properties assert that desired events eventually occur. The IEEE 1003.1 (POSIX) standard specifies threading and synchronization primitives for Unix-compatible systems, defining the semantics of mutexes, condition variables, and read-write locks that operating system implementations must satisfy.

Applications

Concurrency control has applications in a wide range of disciplines, including:

  • Database management: coordinating simultaneous queries and updates in relational and NoSQL systems
  • Operating systems: managing access to shared kernel data structures by multiple threads and interrupt handlers
  • Distributed systems: maintaining consistency across replicated data stores using protocols such as two-phase commit
  • Real-time embedded systems: arbitrating shared hardware peripherals among concurrent tasks with timing constraints
  • Cloud computing: orchestrating parallel workloads across multi-tenant infrastructure without data corruption

Topics in this Area