Understanding HTTP using sockets
Creating a simple HTTP server using sockets
With the ever-increasing popularity of web-based software (web apps, microservices, REST, SOAP, etc.), it's always good to think about what's going on behind the scenes. I find this particularly useful when venturing into frameworks like Flask or Django.
So, how low shall we go? As low as the main transport protocols used in the internet stack - TCP and UDP.
TCP and UDP
Without going into too much detail, the two main standards for transporting bytes across the internet are UDP and TCP. They differ in the nature of the connection: TCP requires a handshake (a formal connection between a server and a client process) and then delivers bytes reliably and in order, whilst a client can send a UDP message to a server without any prior connection and without any guarantee that it arrives.
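To make the difference a little more tangible, here is a rough sketch using Python's socket module (sockets themselves are covered in the next section). It is written for Python 3, runs both ends in one process over the loopback interface, and lets the OS pick the ports; the variable names are just for illustration.

```python
import socket

# UDP (SOCK_DGRAM): no connection is set up, each sendto() stands alone.
udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind(("localhost", 0))   # port 0 asks the OS for any free port
udp_receiver.settimeout(2)            # give up rather than wait forever for a lost datagram
udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"hello over UDP", udp_receiver.getsockname())
udp_data, _ = udp_receiver.recvfrom(1024)
udp_sender.close()
udp_receiver.close()

# TCP (SOCK_STREAM): connect() and accept() establish a connection first.
tcp_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_listener.bind(("localhost", 0))
tcp_listener.listen(1)
tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_client.connect(tcp_listener.getsockname())   # the handshake happens here
tcp_connection, _ = tcp_listener.accept()
tcp_client.sendall(b"hello over TCP")
tcp_data = tcp_connection.recv(1024)
for s in (tcp_client, tcp_connection, tcp_listener):
    s.close()

print(udp_data.decode())
print(tcp_data.decode())
```

Note that the UDP half never calls connect() or accept(): the datagram is simply addressed and sent, which is exactly why nothing guarantees its arrival.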
For the purposes of this post we'll focus on TCP, since internet traffic using HTTP nearly always runs over TCP (hence the typical labelling of the internet protocol stack as TCP/IP).
So how do we transfer messages between two processes over the network using TCP? One answer - sockets.
Sockets
Sockets are the magic that interfaces between your program and the operating system. A socket API is provided by the OS and can be accessed using libraries in practically all programming languages, so a developer can pick any - as long as it's Python.
Let's create client and server scripts that communicate over TCP using sockets.
# Server script
import socket
# socket.SOCK_STREAM indicates TCP
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 12345))
serversocket.listen(1)
# accept() blocks until a client connects
(clientsocket, address) = serversocket.accept()
# recv() returns raw bytes, so decode them before printing
msg = clientsocket.recv(1024)
print("server received: " + msg.decode())
# Client script
import socket
# socket.SOCK_STREAM indicates TCP
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("localhost", 12345))
msg = "Hello World from client"
print("client sending: " + msg)
# sockets carry bytes, so encode the text before sending
clientsocket.send(msg.encode())
Running the server script first and then the client script, each in its own process, results in the following:
client sending: Hello World from client
server received: Hello World from client
The above example can be modified to perform two-way communication by adding send and recv calls to both the server and client scripts:
# added to the end of the server script
print("server sending reply")
clientsocket.send(b"server received your message")
# added to the end of the client script
msg = clientsocket.recv(1024)
print("client received: " + msg.decode())
giving the output:
client sending: Hello World from client
client received: server received your message
server received: Hello World from client
server sending reply
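If juggling two terminals gets tedious, the same two-way exchange can be sketched in a single file by running the server side in a background thread. This is only a convenience for experimenting, written for Python 3; the serve_once helper, the log list and the OS-chosen port are my own assumptions rather than part of the scripts above.

```python
import socket
import threading

def serve_once(serversocket, log):
    """Accept one client, read its message, and send a reply."""
    clientsocket, address = serversocket.accept()
    msg = clientsocket.recv(1024)
    log.append("server received: " + msg.decode())
    clientsocket.sendall(b"server received your message")
    clientsocket.close()

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 0))      # port 0 asks the OS for any free port
serversocket.listen(1)
port = serversocket.getsockname()[1]

log = []                                 # collects what the server thread saw
server_thread = threading.Thread(target=serve_once, args=(serversocket, log))
server_thread.start()

clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("localhost", port))
clientsocket.sendall(b"Hello World from client")
reply = clientsocket.recv(1024).decode()
clientsocket.close()

server_thread.join()
serversocket.close()
print(log[0])
print("client received: " + reply)
```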
As may be clear from the above example, if we were to use this in anger we would soon need to define a standard way to communicate, i.e. a protocol. This would include the definition of the message type (plain text or something else?), message length and methods to handle server/client requests (including authentication and standard error messages). This is where HTTP comes in.
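To see why message length matters, here is a sketch of about the simplest home-made protocol imaginable: every message is preceded by its length. The 4-byte big-endian prefix and the helper names send_msg, recv_exact and recv_msg are assumptions made up for this illustration (this is not how HTTP does it), and the code is written for Python 3.

```python
import socket
import struct

def send_msg(sock, payload):
    """Send payload (bytes) preceded by its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Read the length prefix, then read that many bytes of payload."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# A connected pair of sockets stands in for the client and server.
a, b = socket.socketpair()
send_msg(a, b"Hello World from client")
send_msg(a, b"second message")
first = recv_msg(b)
second = recv_msg(b)
a.close()
b.close()
print(first.decode())
print(second.decode())
```

Without the prefix, the receiver has no way of knowing where the first message ends and the second begins, since TCP delivers a stream of bytes rather than discrete messages.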
Before reading on, run the server script and try opening the address in a web browser (http://localhost:12345/). The server prints its messages as before, only this time the text it received is the HTTP request sent by the browser.
HTTP
HTTP is a protocol for defining messages sent throughout the web. As suggested above, communication via HTTP is usually done using sockets and the TCP transport protocol. So if we were to modify the server example above, what would the message look like?
- Firstly a status line that includes the version of HTTP and, if the message is a response, a status code (e.g. HTTP/1.1 200 OK)
- Header fields, including:
  - Content-Type: text/html or application/json (message format)
  - Content-Length: (length of the message body in bytes)
- Empty line
- Message body
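The parts listed above can be assembled with a few lines of string handling. The following is a minimal sketch for Python 3; build_response is a hypothetical helper of my own, not something from a library, and it only handles the two headers mentioned here.

```python
def build_response(body, content_type="text/html", status="200 OK"):
    """Assemble an HTTP response: status line, headers, empty line, body."""
    body_bytes = body.encode("utf-8")
    lines = [
        "HTTP/1.1 " + status,                       # status line
        "Content-Type: " + content_type,            # header fields
        "Content-Length: " + str(len(body_bytes)),
        "",                                         # empty line ends the headers
        "",
    ]
    head = "\r\n".join(lines)                       # HTTP lines end with \r\n
    return head.encode("ascii") + body_bytes

response = build_response("<b>Hello World</b>")
print(response.decode())
```

Content-Length is computed from the encoded body, since it counts bytes rather than characters.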
Time to modify the server script:
import socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 12345))
serversocket.listen(1)
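# Status line and header field, each terminated by \r\n, then an empty line
headers = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
body = """<html>
<body>
<b>Hello World</b>
</body>
</html>
"""
msg = headers + body
(clientsocket, address) = serversocket.accept()
# read the incoming request before replying
clientsocket.recv(1024)
sent = clientsocket.send(msg.encode())
# closing the connection tells the client that the body is complete
clientsocket.close()
As you can see we have added some HTML to the body of the HTTP message. Running the server and client scripts gives:
client sending: Hello World from client
client received: HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body>
<b>Hello World</b>
</body>
</html>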
To give this a more authentic look, we can switch the Python client script for a web browser. Run the server script again but now use a browser to navigate to http://localhost:12345/. The browser now understands the HTTP message and renders the HTML, showing Hello World in bold.
The server script can be modified to parse 'GET' and 'POST' requests (along with other HTTP methods) but this is beyond the scope of this post. Python provides lots of libraries to simplify communications over HTTP, with vanilla Python itself containing SimpleHTTPServer (http.server in Python 3). As such there is seldom a need to develop web servers using sockets directly - but hey, it's interesting.
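For comparison, here is roughly what the same Hello World page might look like using the standard library's http.server module in Python 3. Treat it as a sketch: the HelloHandler class name, the OS-chosen port and the in-process http.client request at the end are my own choices so that the example runs and finishes on its own.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html>\n<body>\n<b>Hello World</b>\n</body>\n</html>\n"

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The library writes the status line and headers for us.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()                 # writes the empty line
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging

server = HTTPServer(("localhost", 0), HelloHandler)   # port 0: any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Fetch the page once, then shut the server down again.
connection = http.client.HTTPConnection("localhost", port, timeout=5)
connection.request("GET", "/")
response = connection.getresponse()
status = response.status
content_type = response.getheader("Content-Type")
fetched = response.read()
connection.close()

server.shutdown()
server.server_close()
print(status, content_type)
print(fetched.decode())
```

To try it from a browser instead, bind to a fixed port such as 12345 and call server.serve_forever() directly rather than in a background thread.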