Consider this piece of code:

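const net = require("node:net");
const stream = require("node:stream");

// Assumption: `app` is a plain TCP server created with net.createServer().
const app = net.createServer();

app.on("connection", (clientToProxySocket) => {
    console.log("Client connected to proxy");

    // Only the first chunk is inspected to decide where to connect.
    clientToProxySocket.once("data", (data) => {
        let isConnectionTLS = data.toString().indexOf("CONNECT") !== -1;

        let port = 80;
        let host;

        if (isConnectionTLS) {
            // CONNECT host:port HTTP/1.1 (the port is assumed to be 443 here)
            port = 443;
            host = data
                .toString().split("CONNECT")[1]
                .split(" ")[1]
                .split(":")[0];
        } else {
            // Header lines end with CRLF, so split on "\r\n", not on a literal "\\n".
            host = data.toString().split("Host: ")[1].split("\r\n")[0];
        }

        let proxyToServerSocket = net.createConnection(
            {host, port},
            () => {}
        );

        if (isConnectionTLS) {
            // Tell the client the tunnel is ready; the CONNECT request itself is not forwarded.
            clientToProxySocket.write("HTTP/1.1 200 OK\r\n\r\n");
        } else {
            // Plain HTTP: the first chunk is the request, so forward it by hand.
            proxyToServerSocket.write(data);
        }

        // Note: nothing is piped into intermediatePipe, so this listener never fires.
        const intermediatePipe = new stream.PassThrough();
        intermediatePipe.on('data', (chunk) => {
            console.log(chunk.toString());
        });
        clientToProxySocket.pipe(proxyToServerSocket);
        proxyToServerSocket.pipe(clientToProxySocket);

        // ... more events handling
    });
});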

//... more events handling

app.listen(
    {
        host: "0.0.0.0",
        port: 8080,
    },
    () => {
        console.log("Server listening on 0.0.0.0:8080");
    }
);

My question is about the data event of the clientToProxySocket socket. What confuses me is that some data is consumed by the first listener (the one registered with once), yet this consumed data still seems to show up in the pipe (clientToProxySocket.pipe(proxyToServerSocket);).

I tried this curl command:

curl https://httpbin.org/headers -x http://127.0.0.1:8080 -kv

I paused the code in the debugger to see what is in data, and found:

CONNECT httpbin.org:443 HTTP/1.1
Host: httpbin.org:443
User-Agent: curl/8.7.1
Proxy-Connection: Keep-Alive

This makes me think that httpbin will not see these headers, because they are consumed in the first data event and so the piping will not include them. But curl shows this:

CONNECT httpbin.org:443 HTTP/1.1
Host: httpbin.org:443
User-Agent: curl/8.7.1
Proxy-Connection: Keep-Alive

So httpbin.org appears to have received all the headers (the API I used just prints the request headers) when it shouldn't have. It is fine for the CONNECT line not to be forwarded, since we act on it ourselves in the net.createConnection call.

My question has two parts:

  • How can the headers be consumed in the first event but still be included in the piping?
  • How is it possible to see the headers in plain text on an HTTPS connection, when only httpbin.org should be able to decrypt the traffic?
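To make the first point concrete, here is a minimal sketch that uses PassThrough streams as stand-ins for the two sockets (an assumption made so that it runs without a network; demo, source and sink are hypothetical names). It suggests that a chunk delivered to a once("data") listener is not replayed to a pipe() attached afterwards, so only later chunks reach the destination.

```javascript
const { PassThrough } = require("node:stream");

// Hypothetical stand-ins: `source` plays clientToProxySocket,
// `sink` plays proxyToServerSocket.
function demo() {
    return new Promise((resolve) => {
        const source = new PassThrough();
        const sink = new PassThrough();
        const seenByListener = [];
        const seenBySink = [];

        // Record everything that actually reaches the destination.
        sink.on("data", (chunk) => seenBySink.push(chunk.toString()));
        sink.on("end", () => resolve({ seenByListener, seenBySink }));

        // The first chunk goes to this listener only; the pipe is attached afterwards.
        source.once("data", (first) => {
            seenByListener.push(first.toString());
            source.pipe(sink);
        });

        // First chunk: the CONNECT request addressed to the proxy.
        source.write("CONNECT example.org:443 HTTP/1.1\r\n\r\n");
        // Later chunk: stands in for the TLS bytes the client sends after the 200 reply.
        setImmediate(() => source.end("tls-bytes"));
    });
}

demo().then((result) => console.log(result));
```

If the sketch behaves as described, seenBySink holds only the later chunk, which is why the question's code has to call proxyToServerSocket.write(data) by hand in the plain HTTP branch.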

EDIT

Now I understand that the first data is not for the final server but for the proxy server, which is why it is sent in plain text with no encryption. But should I always wait for the first request header to be received (and consume it up to \r\n\r\n) before piping the client request to the remote server? (This is for HTTPS; I know that for HTTP I can pipe it directly.)

  • This has nothing to do with how sockets in nodejs work, but it is only about the semantics of the HTTP CONNECT method when using a proxy. To understand how it is supposed to work I recommend that you study the related standard - see ietf.org/rfc/rfc2817.html#section-5.2. Or at least read the wikipedia article about the CONNECT method - en.wikipedia.org/wiki/HTTP_tunnel#HTTP_CONNECT_method Commented Jun 13, 2024 at 4:54
  • This is not how to write an HTTP proxy. This is how to write an HTTP proxy: 1. Connect to the host and port named in the CONNECT command. Whether or not it is TLS is none of your business. 2. If the connect fails, return an appropriate response code, otherwise 3. return 200 and then 4. start copying bytes in both directions simultaneously, i.e. in two threads, until you have received end of stream from both ends, after which you close both sockets. You don't have to be concerned about TLS or HTTP in the least beyond the initial CONNECT and its response. Commented Jun 13, 2024 at 8:08
  • I don't think I can just pipe client to server and server to client that easily, for the following reasons: 1 - I need to extract the headers to know the target and do some work based on that info. 2 - That said, I need to receive and store the first chunks of data myself; once I have what I need, I forward that data to the remote server, and at that point I can pipe both directly. 3 - If I start piping both sockets up/down, then while I am checking the first chunks and sending them up, more data may arrive from the client (as HTTP keeps streaming), and this data may arrive before or after the piping is set up... Commented Jun 13, 2024 at 18:26
  • Part 2 of the previous comment: now I need to be careful, because if I just write the received data to the remote socket, it may already have been sent by the piping as well. So I need to unregister the data listener, but in that case I need to consider every situation, to avoid duplicate data (piping plus sending manually with write) and to avoid missing data that arrives between unregistering the data listener and starting the piping... It is more an algorithmic problem than an HTTP-spec one. But still, with HTTPS we don't have this problem ... to be continued ... Commented Jun 13, 2024 at 18:30
  • ... because the client stops sending data until it has received the HTTP/1.1 200 OK header, or another response, before continuing the transfer. I guess this is the handshake ... Commented Jun 13, 2024 at 18:31
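The steps from the comments above (buffer until \r\n\r\n, connect to the host and port named in CONNECT, answer 200 or an error, then copy bytes both ways) could be sketched roughly as follows. This is only a sketch under assumptions: parseConnectRequest and createConnectProxy are hypothetical names, IPv6 targets are not handled, and the 400 and 502 status codes are one reasonable choice rather than a requirement. Pausing the client between removing the data listener and calling pipe() is what is meant to prevent both duplicated and lost bytes.

```javascript
const net = require("node:net");

// Hypothetical helper: returns null until the full header block has arrived,
// { error: true } for anything that is not a CONNECT request, otherwise the
// target plus any bytes that followed the header terminator.
function parseConnectRequest(buffer) {
    const headerEnd = buffer.indexOf("\r\n\r\n");
    if (headerEnd === -1) return null;
    const requestLine = buffer.subarray(0, headerEnd).toString("latin1").split("\r\n")[0];
    const match = /^CONNECT\s+([^:\s]+):(\d+)\s+HTTP\/1\.[01]$/.exec(requestLine);
    if (!match) return { error: true };
    return { host: match[1], port: Number(match[2]), rest: buffer.subarray(headerEnd + 4) };
}

// Hypothetical factory: a CONNECT-only proxy built on a plain TCP server.
function createConnectProxy() {
    return net.createServer((client) => {
        let pending = Buffer.alloc(0);
        let upstream = null;
        let tunnelOpen = false;

        client.on("error", () => { if (upstream) upstream.destroy(); });

        const onData = (chunk) => {
            pending = Buffer.concat([pending, chunk]);
            const request = parseConnectRequest(pending);
            if (request === null) return; // headers incomplete, keep buffering

            // Stop reading by hand; pause() makes later bytes wait for pipe().
            client.removeListener("data", onData);
            client.pause();

            if (request.error) {
                client.end("HTTP/1.1 400 Bad Request\r\n\r\n");
                return;
            }

            upstream = net.createConnection({ host: request.host, port: request.port });
            upstream.once("connect", () => {
                tunnelOpen = true;
                client.write("HTTP/1.1 200 OK\r\n\r\n");
                // Forward bytes that arrived after the headers, exactly once.
                if (request.rest.length > 0) upstream.write(request.rest);
                client.pipe(upstream);
                upstream.pipe(client);
            });
            upstream.on("error", () => {
                if (tunnelOpen) client.destroy();
                else client.end("HTTP/1.1 502 Bad Gateway\r\n\r\n");
            });
        };
        client.on("data", onData);
    });
}
```

To try it, something like createConnectProxy().listen({ host: "0.0.0.0", port: 8080 }) followed by the curl command from the question should exercise the tunnel path.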
