The underlying HTTP architecture explained, with an implementation in Python
Prerequisites: This article assumes a basic understanding of networking, the TCP protocol, and the Python programming language.
Hello programmers 👋. In this article, we are going to go through the HTTP protocol. We will see what it is, how it works, and what its general architecture looks like. Finally, we are going to implement that architecture by making an HTTP server from scratch in Python. The introduction to HTTP comes first and the implementation next. If you already know the HTTP protocol, you can safely skip the introduction.
An overview of HTTP
HTTP stands for Hypertext Transfer Protocol. It is an application-layer protocol in the Internet protocol suite for distributed, collaborative, hypermedia information systems. It is the protocol used to access data on the World Wide Web (WWW), and it can transfer data in the form of plain text, hypertext, audio, video, and so on.
HTTP is a protocol for fetching resources such as HTML documents. It is the foundation of any data exchange on the Web and it is a client-server protocol, which means requests are initiated by the recipient, usually the Web browser. A complete document is reconstructed from the different sub-documents fetched, for instance, text, layout description, images, videos, scripts, and more. (Source from here)
How does it work?
As a request-response protocol, HTTP gives users a way to interact with web resources such as HTML files by transmitting hypertext messages between clients and servers. HTTP clients generally use Transmission Control Protocol (TCP) connections to communicate with servers.
Clients and servers communicate by exchanging individual messages (as opposed to a stream of data). The messages sent by the client, usually a Web browser, are called requests, and the messages sent by the server as an answer are called responses.
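To make the idea of individual messages concrete, here is a minimal sketch of what a request and a response look like as raw text. The host name `example.com` and the header values are assumptions chosen for illustration; a real browser sends many more headers.

```python
# A hypothetical GET request, roughly as a browser might send it.
# Lines end with \r\n, and an empty line marks the end of the headers.
request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)

# A hypothetical response: status line, headers, empty line, then the body.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<h1>Hello world</h1>"
)

# The first empty line separates the head of a message from its body.
head, body = response.split("\r\n\r\n", 1)
status_line = head.splitlines()[0]
print(status_line)
print(body)
```

Both messages share the same shape: a first line that says what is being asked or answered, zero or more headers, an empty line, and an optional body.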
Implementation
As we said earlier, HTTP works in a client-server way. So we have to write the server-side program ourselves, and for the client… we already have one: the web browser.
OK, let’s go… Let’s code the server program. We are going to use the Python programming language and its socket library. First, let’s import the library and create the socket object.
import socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
As you can see in the code above, we created a TCP socket object with the socket type SOCK_STREAM.
Well… every HTTP server has its own IP address and port number through which it is reached. So let’s give our HTTP server a port number, and we will use localhost (127.0.0.1) as the IP address.
ip = "127.0.0.1" # localhost
port = 4545 # arbitrary port number
server.bind((ip, port))
Great! Our HTTP server is now bound to an IP address and port number. The last thing needed to make the HTTP server work is listening. After binding the socket to the IP and port number, we have to start listening. You can simply copy-paste the code below.
server.listen()
Good… Our HTTP server is now reachable at the URL http://127.0.0.1:4545
Hmm… try to open this URL. The browser fails to load the page (for example with “This site can’t be reached”), right? The server socket is bound and listening, but the page does not load because our program never accepts the client's connection or sends anything back. Every time you access an HTTP server, a connection is made first, because HTTP runs on top of the TCP protocol. So… our server exists, but it is not yet accepting clients' connections or answering their requests. That’s why your browser has nothing to display.
Let’s overcome this problem! Well… we first need to accept clients' connections. In our case, we are going to handle only one client at a time (no multi-threading). To do that, let’s create a while loop (while True) and, inside the loop body, accept the client's connection and receive its request.
while True:
    client, addr = server.accept()
    request = client.recv(4096).decode()
In the above 3 lines of code, we are accepting a client, receiving its request (up to 4096 bytes), and decoding it. If the request is made from a browser, it contains the method (GET/POST/PUT), the path, the protocol version, and the headers. We could use a parsing library to pick these apart, but as this is an introductory article, we don’t need to worry about parsing the request.
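For the curious, here is a minimal sketch of how that parsing might look when done by hand. The function name `parse_request` and the sample request are hypothetical and not part of the server built in this article, and the sketch assumes a well-formed request (an empty or malformed one would raise an error).

```python
def parse_request(raw):
    """Split a raw HTTP request into method, path, version, and headers."""
    # Everything before the first empty line is the head of the request.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # The request line looks like: GET /index.html HTTP/1.1
    method, path, version = lines[0].split(" ", 2)

    # Each remaining line is a "Name: value" header.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


# A made-up request, similar to what a browser sends.
sample = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: 127.0.0.1:4545\r\n"
    "User-Agent: demo\r\n"
    "\r\n"
)
method, path, version, headers = parse_request(sample)
print(method, path, version)
print(headers["host"])
```

Header names are lowercased here because HTTP header names are case-insensitive, which makes lookups such as `headers["host"]` predictable.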
Let me explain how things work from the client side. Every time a client accesses our HTTP server from a browser, it first makes a connection, then sends the request data, and then waits for the server's response. We have accepted the client's connection and received its request, so what is left is sending our (the server's) response. We are going to send it in two parts. The first part is the status line, a short acknowledgment that tells the client how the request went (for example, 200 OK or 403 Forbidden), followed by a blank line. The second part is the HTML/web page itself.
page = "<h1>Hello world</h1>"
This is a pretty basic example page. Well, let’s send it to the client. Sockets send bytes, not strings, so we have to make sure we encode the data before sending it.
page = "<h1>Hello world</h1>"  # the page
while True:
    client, addr = server.accept()  # accepting the connection
    request = client.recv(4096).decode()  # receiving the request
    print(request)
    client.send("HTTP/1.1 200 OK\r\n\r\n".encode())  # sending the ack
    client.send(page.encode())  # sending the page
    client.close()  # closing the connection
In the above script, we created a page string, and then inside the while loop… we accept a client connection, receive the request, send the acknowledgment (the status line), then the page, and finally close the connection. The while loop keeps running and accepting clients until the program is stopped manually.
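The response in the script is deliberately bare: a status line, an empty line, and the page. Real servers usually add headers so the client knows what it is receiving. Below is a minimal sketch of that idea; the helper `build_response` and its parameters are hypothetical names introduced here, not part of the article's server.

```python
def build_response(body, status="200 OK", content_type="text/html; charset=utf-8"):
    """Build a complete HTTP/1.1 response as bytes (illustrative helper)."""
    body_bytes = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {content_type}\r\n"
        # Content-Length is counted in bytes, not characters.
        f"Content-Length: {len(body_bytes)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body_bytes


ok = build_response("<h1>Hello world</h1>")
forbidden = build_response("<h1>Forbidden</h1>", status="403 Forbidden")
print(ok.decode())
```

Inside the loop, the two `client.send(...)` calls could then become a single `client.sendall(build_response(page))`. `sendall` keeps sending until every byte has been handed to the operating system, whereas `send` may transmit only part of the data.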
The whole code
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

ip = "127.0.0.1"  # localhost
port = 4545  # arbitrary port number
server.bind((ip, port))
server.listen()

page = "<h1>Hello world</h1>"  # the page

while True:
    client, addr = server.accept()  # accepting the connection
    request = client.recv(4096).decode()  # receiving the request
    print(request)
    client.send("HTTP/1.1 200 OK\r\n\r\n".encode())  # sending the ack
    client.send(page.encode())  # sending the page
    client.close()  # closing the connection
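If you would like to check the server without opening a browser, the following sketch runs the same accept/receive/send steps for a single client in a background thread and fetches the page with Python's built-in `http.client`. The function `serve_once` and the use of port 0 (which asks the operating system for any free port) are assumptions made for this self-contained example; they are not part of the server above.

```python
import http.client
import socket
import threading


def serve_once(server, page):
    """Accept a single client, answer it, and close the connection."""
    client, addr = server.accept()
    client.recv(4096)  # read (and ignore) the request
    client.sendall(b"HTTP/1.1 200 OK\r\n\r\n" + page.encode())
    client.close()


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen()
port = server.getsockname()[1]

# Run the server side in a thread so the client can run in this one.
worker = threading.Thread(target=serve_once, args=(server, "<h1>Hello world</h1>"))
worker.start()

# Act as the client: connect, send a GET request, and read the response.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
response = conn.getresponse()
status = response.status
body = response.read().decode()
conn.close()

worker.join()
server.close()
print(status, body)
```

Because the response carries no Content-Length header, the client treats everything up to the moment the server closes the connection as the body, which is why closing the connection matters in this simple design.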
This concludes the explanation and implementation of the HTTP protocol. I hope you liked it. The HTTPS protocol doesn’t differ much from HTTP: the messages are the same, but they travel over a TLS-encrypted connection, so requests and responses cannot easily be read or modified by someone in between.
You can support me by following me on Medium, Twitter, and Github. If you have any questions or feedback, shoot it in the comment section.