
I'm having trouble when I select large tables full of string data using Npgsql over a high-latency connection (200-300 ms). The total size of the field values is about 256 KB. I'm sure the problem is related to the transfer over the network, because if I execute the same query locally it completes very quickly (10-20 ms), whereas over the slow connection it takes 20-30 seconds. Also, if I only measure the data with length(fields), the query completes in a reasonable time (1-2 seconds). I'm seeing this on different PCs and with different PostgreSQL and Npgsql versions. I think the problem is related to the packet size; maybe increasing a buffer could solve it, but how do I do that? In PostgreSQL, or in Npgsql?

  • Is it latency, bandwidth, or both that's limited? I'd use Wireshark to take a packet trace, then use it to get statistics on the transfer and graph it. Commented Dec 3, 2015 at 13:22
  • It is latency; I'm sure of it. I also reproduced the problem by tunneling the connection to my home and back again to the work PC. The bandwidth is good. Commented Dec 3, 2015 at 13:31
  • As I expected, using Wireshark I can see, for a single query, over 200 packets of at most 1250 bytes each. A lot of packets over a high-latency connection is a bad thing. Isn't there a way to increase the packet size? Commented Dec 3, 2015 at 13:51
  • Lots of small packets is fine, and not a big concern for latency; that's normal, they get pipelined. 1250 bytes is a reasonable size for Internet traffic. Even on a local wired Ethernet the packet size limit is 1500 bytes unless your devices, switches, etc. all support jumbo frames. Small packets are only an issue if there's a forced round trip between each one, where the client waits until the server responds. TCP/IP uses smart ACKs, so this isn't required for the low-level TCP stream, only if the application forces it. Commented Dec 3, 2015 at 14:10
  • OK, then it's probably a problem of PostgreSQL not being optimized for high-latency connections. It probably has a lot of request/wait exchanges between server and client. Commented Dec 3, 2015 at 14:23
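A back-of-envelope calculation makes the comment thread above concrete. All numbers are assumptions taken from this thread (250 ms round trip, ~200 packets, 256 KB payload) plus an assumed modest bandwidth; it is illustrative only, not a measurement:

```python
# Rough latency model for the numbers reported in the comments above.
# All figures are assumptions from this thread, not measurements.

rtt = 0.25             # seconds: 200-300 ms latency, midpoint
packets = 200          # "over 200 packets" seen in Wireshark
payload = 256 * 1024   # bytes: total size of the field values
bandwidth = 1_000_000  # bytes/s: an assumed modest ~1 MB/s link

# Worst case: the client waits one full round trip per packet.
per_packet_round_trips = packets * rtt

# Pipelined case: the server streams the whole result and the
# client just keeps reading; one round trip plus transfer time.
pipelined = rtt + payload / bandwidth

print(f"forced round trip per packet: ~{per_packet_round_trips:.1f} s")
print(f"pipelined stream:             ~{pipelined:.2f} s")  # → ~50.0 s vs ~0.51 s
```

The serialized figure lands in the same ballpark as the observed 20-30 seconds, while the pipelined figure matches a healthy transfer, which is what points the finger at something forcing round trips (such as a tunnel) rather than at the raw data volume.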

1 Answer


Per comments, you're using an SSH tunnel.

TCP-in-TCP tunneling like SSH does is absolutely awful for carrying lots of data; expect it to perform terribly. Congestion control and window scaling don't work properly, retransmits are a problem, reordering and reassembly are inefficient, slow start doesn't work well, etc.

Just don't do that. Use direct TCP/IP with SSL, or use a UDP-based tunnel/VPN.
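As a sketch of the first option, a direct Npgsql connection with SSL only needs the SSL-related keywords in the connection string. Host, database, and credentials below are placeholders; `SSL Mode` and `Trust Server Certificate` are Npgsql connection-string keywords:

```
Host=db.example.com;Database=mydb;Username=me;Password=secret;SSL Mode=Require;Trust Server Certificate=true
```

Note that `Trust Server Certificate=true` skips certificate validation; it's convenient for testing, but in production you'd instead validate the server's certificate properly.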

PostgreSQL's protocol is quite friendly to pipelining: it requires zero round trips per field or per row when fetching results. The client just keeps reading until there's nothing more to read, so round-trip latency shouldn't be the issue here.

It's very likely to be problems caused by the tunnel.


