I'm working on a slightly larger project of my own, and I need to write a localhost proxy in Python.
The way I wrote mine, there is a TCP server (using socket and SOCK_STREAM) listening on port 8080 on localhost. It accepts a request from the browser, extracts the target host name from the request line using slicing and string.find(), and resolves it to an IP address with gethostbyname(). It then opens a second TCP socket to that address, sends the request, and recv's a reply. After that, it hands the reply back to the local proxy socket, which in turn sends it back to the browser.
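To make the host-extraction step concrete, here is a minimal sketch of the same idea using the standard library's URL parser instead of manual slicing. It is written for Python 3 (my prototype is Python 2), `extract_host` is a hypothetical helper name of my own, and it assumes the browser sends an absolute `http://` URL in the request line, which is what my slicing with `host[7:]` also assumes. It would not handle CONNECT requests.

```python
from urllib.parse import urlsplit


def extract_host(request):
    """Return (hostname, port) from the first line of a raw proxy request.

    Assumes the request line carries an absolute URL, e.g.
    'GET http://example.com/index.html HTTP/1.1'.
    """
    # The request line is everything up to the first CRLF.
    request_line = request.split(b"\r\n", 1)[0].decode("latin-1")
    method, target, version = request_line.split()
    parts = urlsplit(target)
    # Fall back to the default HTTP port when the URL names none.
    return parts.hostname, parts.port or 80


example = (b"GET http://example.com:8000/img/logo.png HTTP/1.1\r\n"
           b"Host: example.com:8000\r\n\r\n")
host, port = extract_host(example)
```

Unlike the slicing approach, this also picks up an explicit port in the URL instead of always connecting to port 80.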
Here is the code, with ample debugging messages and a debug file that collects the browser's requests and the replies received back (note that this is just a prototype, hence the limited for loop instead of a while 1 loop):
import socket
local = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
f = open('test.txt', 'a')
local.bind(('localhost', 8080))
local.listen(5)
for i in xrange(20):
    print '=====%d=====\n' % i
    out = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    data, addr = local.accept()
    print 'Connection accepted'
    buffer = data.recv(4096)
    print 'data recieved'
    f.write('=============================================================\n')
    f.write(buffer)
    end = buffer.find('\n')
    print buffer
    #print buffer[:end]
    host = buffer[:end].split()[1]
    end = host[7:].find('/')
    print host[7:(end+7)]
    host_ip = socket.gethostbyname(host[7:(end+7)])
    #print 'remote host: ' + host + ' IP: ' + host_ip
    print 'sending buffer to remote host'
    out.connect((host_ip, 80))
    out.sendall(buffer)
    print 'recieving data from remote host'
    reply = out.recv(4096)
    out.close()
    print 'data recieved from remote host'
    f.write('+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++\n')
    f.write(reply)
    f.write('\n\n\n')
    print 'sending data back to local host'
    data.sendall(reply)
    print 'data sent'
local.close()
out.close()
f.close()
Now my problem is that it seems to work fine for the first few requests: it gets the HTML and a few images. But at some point it always stops at the "data received" point and quits, because it gets no data, i.e. the buffer is empty. The browser still shows that it's loading elements of the page, but when the script stops and I look at the text log file, I see that the buffer was empty. Does that mean the browser didn't submit anything to the proxy?
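As far as I understand the socket API, recv() returning an empty string does not mean "an empty request"; it means the other side closed its end of the connection. The sketch below (Python 3, using a local socketpair() as a stand-in for the browser connection; `recv_or_none` is a hypothetical helper name of my own) shows that behaviour in isolation. In my script, an empty buffer would then make `buffer[:end].split()[1]` raise an IndexError, which would explain why it quits.

```python
import socket


def recv_or_none(conn, bufsize=4096):
    """Read once from conn; return None if the peer has closed the connection."""
    data = conn.recv(bufsize)
    if not data:
        # An empty result from recv() on a stream socket signals end-of-stream.
        return None
    return data


# socketpair() gives two connected stream sockets without needing a real browser.
browser_side, proxy_side = socket.socketpair()

browser_side.sendall(b"GET / HTTP/1.1\r\n\r\n")
first = recv_or_none(proxy_side)    # the request bytes

browser_side.close()                # the "browser" hangs up without sending more
second = recv_or_none(proxy_side)   # recv() now returns b"", so we get None

proxy_side.close()
```

If that is what happens with the real browser, the loop could skip such a connection (close it and go back to accept()) instead of trying to parse it.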
I am guessing that the issue lies somewhere in how a browser submits requests and my script not reacting properly to this behavior.
I know I could use the Twisted framework, but I want to learn to write this kind of stuff myself. I've been reading about SocketServer and I might use that, but I have no clue whether it'll solve the issue because, frankly, I don't really understand what's causing the issue here. Is my script too slow for the browser? Do servers send more than one answer, so that my receiving socket should listen for more packets? Is my buffer size (4096) too small?
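On the buffer-size question: a single recv(4096) returns at most 4096 bytes and possibly fewer, so a reply larger than that would be cut off no matter how fast the script is. A minimal sketch of reading until the sender closes the connection follows (Python 3; `recv_until_closed` and `_send_and_close` are hypothetical names of my own, and a socketpair() plus a thread stands in for the remote server). This only terminates if the server actually closes the connection after replying; with keep-alive connections the loop would block, and the reply length would have to come from the response headers instead.

```python
import socket
import threading


def recv_until_closed(sock, bufsize=4096):
    """Keep calling recv() until it returns b"" (peer closed), then join the chunks."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def _send_and_close(sock, payload):
    # Stand-in for a remote server that sends its reply and then hangs up.
    sock.sendall(payload)
    sock.close()


server_side, client_side = socket.socketpair()
payload = b"x" * 10000   # larger than one 4096-byte recv()

sender = threading.Thread(target=_send_and_close, args=(server_side, payload))
sender.start()
reply = recv_until_closed(client_side)
sender.join()
client_side.close()
```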
I'd really appreciate a nudge in the right direction.
Thanks!