Unfortunately, I don't really see how your code is supposed to work, so here are my thoughts on what a simple HTTP proxy should look like.
So, what should a basic proxy server do?
- Accept a connection from a client and receive an HTTP request.
- Parse the request and extract its destination.
- Forward requests and responses.
- (Optionally) support Connection: keep-alive.
Let's go step by step and write some very simplified code.
How does a proxy accept a client? A socket should be created and put into passive (listening) mode:
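    import socket, select

    sock = socket.socket()
    sock.bind((your_ip, port))  # your_ip and port are placeholders for the address you listen on
    sock.listen()
    while True:
        # accept() returns a (socket, address) pair, not just the socket
        client_sock, client_addr = sock.accept()
        do_stuff(client_sock)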
Once the TCP connection is established, it's time to receive a request. Let's assume we're going to get something like this:
    GET /?a=1&b=2 HTTP/1.1
    Host: localhost
    User-Agent: my browser details
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-gb,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
TCP doesn't preserve message boundaries, so we should keep reading until we have at least the first two lines (for a GET request), which is enough to know where the request has to go:
    def do_stuff(sock):
        data = receive_two_lines(sock)
        remote_host = parse_request(data)
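I left receive_two_lines and parse_request undefined above, so here is a rough sketch of how they might look. Treat it as an illustration only: I'm assuming lines end with CRLF, that the Host header is among the complete lines received so far (as in the example request), and that the host is not an IPv6 literal. The bufsize parameter is my own addition.

```python
import socket


def receive_two_lines(sock, bufsize=1024):
    """Read from sock until the data holds at least two complete CRLF lines."""
    data = b""
    while data.count(b"\r\n") < 2:
        chunk = sock.recv(bufsize)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError("peer closed before sending two full lines")
        data += chunk
    return data


def parse_request(data):
    """Return the host name from the Host header, without any port suffix."""
    # Skip the request line ([0]) and the trailing piece ([-1]), which is
    # either empty or an incomplete line that has not fully arrived yet.
    for line in data.split(b"\r\n")[1:-1]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii")
            # Assumption: "name" or "name:port", not an IPv6 literal.
            return host.partition(":")[0]
    raise ValueError("no Host header in the data received so far")
```

Note that receive_two_lines may return more than two lines, because recv hands back whatever has arrived. That is fine here, since the whole buffer is forwarded to the web server anyway.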
Once we have the remote hostname, it's time to forward the requests and responses:
    def do_stuff(client_sock):
        data = receive_two_lines(client_sock)
        remote_host = parse_request(data)
        # getaddrinfo returns a list of (family, type, proto, canonname, sockaddr);
        # take the IPv4 address from the first entry (see the docs for exact use)
        addrinfo = socket.getaddrinfo(remote_host, 80, socket.AF_INET, socket.SOCK_STREAM)
        remote_ip = addrinfo[0][4][0]
        webserver = socket.socket()
        webserver.connect((remote_ip, 80))
        webserver.sendall(data)
        while it_makes_sense():
            # a single select call on both sockets, so neither direction blocks the other
            ready = select.select([client_sock, webserver], [], [])[0]
            if client_sock in ready:
                webserver.sendall(client_sock.recv(1024))
            if webserver in ready:
                client_sock.sendall(webserver.recv(1024))
Note the use of select: this is how we know whether a remote peer has sent us data. I haven't run or tested this code, and there are things left to do:
- Chances are you will get several GET requests in a single client_sock.recv(1024) call because, again, TCP doesn't preserve message boundaries. You should probably look for additional GET requests each time you receive data.
- Requests look different for POST, HEAD, PUT, DELETE and other methods. Parse them accordingly.
- Browsers and servers usually reuse one TCP connection by setting the Connection: keep-alive option in the headers, but they may also decide to drop it. Be ready to detect disconnects and sockets closed by a remote peer (for simplicity's sake, this is called while it_makes_sense() in the code).
- bind, listen, accept, recv, send, sendall, getaddrinfo, select: all these functions can raise exceptions. It's better to catch them and act accordingly.
- The code currently serves one client at a time.
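To make the last few points more concrete, here is a rough sketch of one way to replace while it_makes_sense() and to serve more than one client. The names relay and handle_in_thread are my own, not part of any library. I'm assuming that an empty recv (or any OSError) means the connection is finished, and that one thread per client is acceptable for a simple proxy.

```python
import select
import socket
import threading


def relay(client_sock, webserver, bufsize=1024):
    """Copy bytes in both directions until either peer closes or fails."""
    peer = {client_sock: webserver, webserver: client_sock}
    while True:
        # One select() call watches both sockets at once.
        readable, _, _ = select.select([client_sock, webserver], [], [])
        for src in readable:
            try:
                chunk = src.recv(bufsize)
                if not chunk:
                    return  # empty read: the peer closed its side
                peer[src].sendall(chunk)
            except OSError:
                return  # reset, broken pipe, etc.: give up on this pair


def handle_in_thread(client_sock, webserver):
    """Run relay() in a daemon thread so the accept loop is free to continue."""
    def run():
        try:
            relay(client_sock, webserver)
        finally:
            # Always release both sockets, however relay() ended.
            client_sock.close()
            webserver.close()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

With something like this, the accept loop would call handle_in_thread for each new client instead of blocking inside do_stuff. The sketch stops as soon as one side closes, so data still in flight from the other side can be lost; a more careful version would drain it first.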
server() and you call server() again. Is that intentional?