I’ve done a fair amount of work with different forms of server push over the years, and I find that a lot of developers are curious about the topic. As such, I’d like to create a series of posts on the different ways one can send data to the client without the client initiating a request.
The Anatomy of a Web Request
Before we get into how push works, let’s quickly go over the way HTTP normally works.
The internet is built on TCP/IP (Transmission Control Protocol and Internet Protocol, respectively). TCP is a method of transferring data from one endpoint to another in a reliable fashion. Those endpoints can be different applications running on separate computers, though a connection can also “loop back” to the same computer, and even the same application. Error detection and retransmission are built in at a low level. Unlike UDP, TCP guarantees that data reaches the receiving application complete and in the order it was sent: lost packets are retransmitted, and packets that arrive out of order are put back in sequence. That adds a bit of overhead, but alleviates a lot of headaches for most applications.
Internet Protocol, on the other hand, is how data is addressed and routed to its intended recipient. We’re all familiar with IP addresses, which are indeed a part of the Internet Protocol.
Many applications use raw TCP sockets to communicate. For instance, when you connect an application server to a database server, they may very well be communicating over a raw TCP socket.
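To make the idea of a raw TCP socket a little more concrete, here is a minimal sketch in Python of two endpoints talking over the loopback interface. The function names (`read_all`, `echo_once`, `tcp_round_trip`) are made up for this illustration, and the "protocol" is just an echo: whatever bytes the client sends, the server sends back.

```python
import socket
import threading


def read_all(sock: socket.socket) -> bytes:
    """Read from the socket until the other side stops sending."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty bytes means the peer finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks)


def echo_once(server_sock: socket.socket) -> None:
    """Accept a single connection and send back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        conn.sendall(read_all(conn))


def tcp_round_trip(message: bytes) -> bytes:
    """Send `message` to a loopback echo server and return the reply."""
    # Port 0 asks the operating system to pick any free port.
    with socket.create_server(("127.0.0.1", 0)) as server_sock:
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server_sock,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            # Signal that we are done sending so the server sees end-of-stream.
            client.shutdown(socket.SHUT_WR)
            reply = read_all(client)
        worker.join()
    return reply
```

Calling `tcp_round_trip(b"hello")` should hand back the same bytes. Note that nothing here knows about URLs, headers, or requests; TCP only moves a stream of bytes, and any structure has to be layered on top.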
HTTP (Hypertext Transfer Protocol) is a request/response protocol that runs on top of TCP/IP. Simply put, a client makes a request to a server (consisting of a request line, headers, and possibly a body). The server handles the request in whatever way it deems appropriate and sends a response consisting of a status line, headers, and possibly a body.
Here’s a slightly more in-depth look at what happens when a user types a URL in their browser and hits enter:
- The client turns the host name in that URL into an IP address. If the address is cached, the cached value is used; if not, the client requests it from a DNS (Domain Name System) server.
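Since HTTP messages are just structured text sent over the TCP connection, a small sketch may help show what a request and response look like on the wire. The helpers below (`build_request` and `parse_response`) are hypothetical names invented for this post, and they deliberately ignore many details of the real protocol, such as chunked encoding and repeated headers.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    """Assemble a bare-bones HTTP/1.1 request: request line, headers, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the (optional) body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers dict, body bytes)."""
    head, _sep, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    status = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body
```

For example, `build_request("GET", "/index.html", "example.com")` produces the request line `GET /index.html HTTP/1.1` followed by a single `Host` header and a blank line.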
- The client initiates a TCP connection with the server.
- The client sends the request data, including a request line (e.g., GET /index.html HTTP/1.1), headers, and perhaps a body.
- The server processes the request.
- The server sends the response data, including a status line, headers, and possibly a body.
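- The TCP connection is closed (unless the client and server agree to keep it open for further requests).
In the simplest case, this happens for every single piece of data that needs to be requested. You’ll make one request for the HTML page, another for the CSS, a few for however many JavaScript files are included, one for every single image, and so on. You could easily make several dozen requests for a single page. Persistent (keep-alive) connections, which are the default in HTTP/1.1, and DNS caching let the browser skip the lookup and connection steps for many of those requests, but each one is still a full request/response exchange started by the client.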
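The steps above can be sketched with nothing but Python's standard `socket` module. This is an illustration rather than how a real browser works: the function name `fetch` is made up, it only handles plain HTTP (no TLS), and it sends a `Connection: close` header so that the server ends the connection once the response is complete.

```python
import socket


def fetch(host: str, port: int, path: str) -> bytes:
    """Perform one plain-HTTP GET and return the raw response bytes."""
    # Step 1: turn the host name into an address the OS can connect to.
    family, socktype, proto, _canon, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]

    # Step 2: open a TCP connection to the server.
    with socket.socket(family, socktype, proto) as sock:
        sock.connect(address)

        # Step 3: send the request line and headers (a GET has no body).
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))

        # Steps 4 and 5: the server processes the request and replies;
        # keep reading until it closes its side of the connection.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)

    # Step 6: leaving the `with` block closed our side of the connection.
    return b"".join(chunks)
```

Something like `fetch("example.com", 80, "/")` should return the status line, headers, and body as one block of bytes, assuming the host is reachable over plain HTTP on that port.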
When you think about it, it makes perfect sense that a server can’t normally push data to a client without a request. The server only ever speaks in reply to a request, and once the response has been sent and the TCP connection is closed, it has no way to reach the client and doesn’t even know if the client is still available.
Despite that, there are many means of getting realtime or semi-realtime data to users. Many years ago polling was the only option, followed by Java, Flash, ActiveX and other browser plugins. Eventually WebSockets and EventSource (Server-Sent Events) were created, which allow for realtime data push without plugins. That said, all of the aforementioned techniques are quite different. I’ll cover some of those options in later posts, including example applications.
So in summary:
- IP is how computers locate and route data to one another on the internet.
- TCP is how data is transferred between two endpoints in an orderly and fault-tolerant fashion.
- HTTP is how clients and servers speak to one another. It follows a strict request/response cycle in which the server only speaks in reply to a request.
- Polling, plugins, WebSockets, and EventSource allow us to send realtime (or mostly realtime) data to clients, but each has a unique set of pros and cons.