Uniform resource locators
What Are Uniform Resource Locators?
Uniform resource locators (URLs) are standardized string identifiers that specify where a resource is located on a network and which protocol is used to access it. A URL encodes, in a compact, human-readable form, the information a client application needs to retrieve a document, image, service endpoint, or any other addressable resource from a server. URLs form the addressing backbone of the World Wide Web and are the primary mechanism by which browsers, APIs, and networked applications reference and retrieve content. They are a subset of Uniform Resource Identifiers (URIs), whose generic syntax is standardized by the Internet Engineering Task Force (IETF).
The concept of the URL was introduced by Tim Berners-Lee as part of the early World Wide Web architecture around 1990, and the first formal specification appeared as RFC 1738, Uniform Resource Locators, published by the IETF in December 1994. That specification defined the syntax and semantics for a set of common URL schemes including HTTP, FTP, and mailto. Subsequent standards work consolidated URL addressing under the more general URI framework in RFC 3986, Uniform Resource Identifier: Generic Syntax, which remains the authoritative specification for URI and URL syntax.
Syntax and Components
A URL is composed of a sequence of components separated by reserved delimiter characters. The scheme, which names the protocol, appears first and is separated from the rest of the URL by a colon. In HTTP URLs, the colon is followed by two slashes and then the authority component, which identifies the host and optionally a port number. The path follows the authority and identifies the specific resource on the host. Query parameters, prefixed by a question mark, supply additional input to the resource handler. A fragment identifier, prefixed by a hash sign, refers to a specific section within the retrieved document.
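The component breakdown can be illustrated with a short sketch. It uses Python's standard urllib.parse module, which the text above does not mention, and the URL itself is a hypothetical example chosen only so that every component is present.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, chosen only so that every component is present.
url = "https://example.com:8443/docs/guide.html?lang=en&page=2#install"

parts = urlsplit(url)

components = {
    "scheme": parts.scheme,           # text before the first colon
    "host": parts.hostname,           # host part of the authority
    "port": parts.port,               # optional port, parsed as an int
    "path": parts.path,               # resource location on the host
    "query": parse_qs(parts.query),   # parameters after the question mark
    "fragment": parts.fragment,       # text after the hash sign
}

for name, value in components.items():
    print(f"{name:>8}: {value}")
```

urlsplit only separates the string at its delimiters; it does not check that the host exists or that the path resolves to anything.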
Character encoding within URLs follows the percent-encoding convention: a character that may not appear literally is written as a percent sign followed by two hexadecimal digits for each byte of its encoding, conventionally UTF-8, so a multi-byte character produces several such triplets. Unreserved characters (letters, digits, hyphen, period, underscore, and tilde) never need encoding. Reserved characters that appear as data rather than as delimiters must be percent-encoded, preventing ambiguity in parsing. The W3C Web Naming and Addressing overview places URL syntax within the broader framework of web architecture, explaining how the hierarchical naming scheme supports relative reference resolution, enabling a document to link to another resource by specifying only the path components that differ from its own base URL.
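Both ideas in this paragraph, percent-encoding and relative reference resolution, can be sketched with Python's urllib.parse. The sample string and the base URL are illustrative assumptions, not values taken from any specification.

```python
from urllib.parse import quote, unquote, urljoin

# Percent-encoding: with safe="" every character outside the unreserved
# set is encoded. The accented letters are two bytes each in UTF-8, so
# each one becomes two percent triplets.
text = "café & crème"
encoded = quote(text, safe="")
decoded = unquote(encoded)

# Relative reference resolution against a hypothetical base URL.
base = "https://example.com/docs/guide/intro.html"
sibling = urljoin(base, "setup.html")           # same directory as the base
parent = urljoin(base, "../api/index.html")     # climb one level, then descend
section = urljoin(base, "#usage")               # same document, new fragment

print(encoded)
print(sibling)
print(parent)
print(section)
```

Passing safe="" matters here: by default quote leaves the slash unencoded, which suits whole paths but not a single path segment or query value.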
URL Schemes and Protocols
The scheme component of a URL identifies the access protocol and therefore determines how the rest of the URL is interpreted. HTTP and HTTPS are the predominant schemes for web content retrieval, with HTTPS indicating that the transport is secured by TLS. The FTP scheme addresses files on file transfer protocol servers; the mailto scheme encodes an email address as a URL to be handled by a mail client. The file scheme references resources on the local file system. The data scheme allows small data objects such as inline images to be embedded directly in a URL string without requiring a network request.
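As a rough illustration of how the scheme governs interpretation, the sketch below builds a data URL that carries its payload inline and then recovers the payload without any network request. The payload and the text/plain media type are arbitrary choices for the example, and the manual split on the first comma is a simplification rather than a complete data URL parser.

```python
import base64
from urllib.parse import urlsplit

# Arbitrary payload for the example.
payload = b"Hello, URL!"

# A data URL carries a media type, an optional ";base64" marker, a comma,
# and then the data itself.
data_url = "data:text/plain;base64," + base64.b64encode(payload).decode("ascii")

scheme = urlsplit(data_url).scheme

# Simplified handling: everything before the first comma is the header,
# everything after it is the encoded body.
header, _, body = data_url.partition(",")
recovered = base64.b64decode(body)

print(scheme, header, recovered)
```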
Specialized URL schemes serve application-specific needs. The ws and wss schemes address WebSocket endpoints for full-duplex communication; the blob scheme addresses in-memory binary objects in web browser contexts; and custom application schemes registered with operating systems direct URLs to native applications. IANA maintains the official registry of URI schemes, so that new scheme names are recorded in one place and do not collide with existing ones.
URI, URL, and URN
The relationship between URI, URL, and URN is a common source of confusion. A URI is the most general form: any string that identifies a resource by location, name, or both. A URL is a URI that specifies how to locate the resource, providing an access protocol and network address. A Uniform Resource Name (URN) is a URI that identifies a resource by name in a persistent, location-independent way, such as an ISBN or a DOI. In practice, the term URL is used colloquially to refer to virtually all web addresses, and the IETF has acknowledged in RFC 3986 that the formal URL/URN distinction has less practical significance than the overarching URI framework.
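The structural difference between a locator and a name can be seen by splitting one of each. This is a minimal sketch: both identifiers are illustrative examples, and Python's urlsplit is used only as a generic URI splitter, not as a validator of URN namespaces.

```python
from urllib.parse import urlsplit

# Illustrative identifiers, not references to real catalogue records.
locator = urlsplit("https://example.com/books/0451450523")
name = urlsplit("urn:isbn:0451450523")

# The URL names an access protocol and a network host.
print(locator.scheme, locator.netloc, locator.path)

# The URN has a scheme but no authority: nothing in it says where to
# fetch the resource from.
print(name.scheme, repr(name.netloc), name.path)
```

Both strings are URIs under the generic syntax, which is why the same splitter handles them.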
Applications
URLs are used throughout networked computing, including:
- Web browsing and hypertext linking across documents and media
- REST API design, where URL paths encode resource identity and hierarchy
- Content delivery networks routing requests by URL path and query string
- Search engine indexing and crawling of web content at scale
- Single sign-on and OAuth authorization flows that encode state in redirect URLs
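The last item can be sketched as follows: an authorization request URL that carries a state value and a redirect URL in its query string. The endpoint, client identifier, and callback address are hypothetical, and the parameter names follow common OAuth 2.0 usage rather than any particular provider's API.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and client values, for illustration only.
authorize_endpoint = "https://auth.example.com/authorize"
params = {
    "response_type": "code",
    "client_id": "demo-client",
    "redirect_uri": "https://app.example.com/callback",
    "state": secrets.token_urlsafe(16),  # opaque value echoed back to the client
}

# urlencode percent-encodes each value, so the nested redirect URL cannot
# be confused with the delimiters of the outer URL.
auth_url = authorize_endpoint + "?" + urlencode(params)

# Parsing the query string recovers the original values.
round_trip = parse_qs(urlsplit(auth_url).query)

print(auth_url)
```

In a real flow the client would store the state value before redirecting and compare it with the value returned on the callback.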