Over the past few days, I’ve been developing a project in C# (.NET 4.8) that scrapes and validates hundreds of HTTP/S proxies.
I’ve experimented with several ways to confirm whether a proxy is alive, but with my current implementation (shown below), I’m only seeing roughly a 50% success rate when I cross-check my results against popular online proxy checkers such as https://hide.mn/en/proxy-checker/ and https://proxy-seller.com/tools/proxy-checker/.
Currently, I’m using an HttpClient configured with each proxy to make a request to https://api.ipify.org?format=json, which returns the caller’s public IP address. If the returned IP matches the proxy’s IP, I consider the proxy alive. However, even with this approach, only about half of the proxies that my program marks as “alive” are confirmed by the online proxy checkers listed above. I have ruled out rate limiting by api.ipify.org as the cause.
I’m sharing my current method below and would appreciate any advice or suggestions for improving the accuracy.
    // Requires: System, System.Net, System.Net.Http, System.Threading.Tasks, Newtonsoft.Json.Linq (JToken).
    // retries, timeOut and delayMs are fields defined elsewhere in the class.
    private static async Task<bool> Check(string proxy)
    {
        // Expected format: "ip:port"
        string[] proxyParts = proxy.Split(':');
        if (proxyParts.Length < 2)
        {
            return false; // Invalid proxy format
        }

        string proxyIp = proxyParts[0];

        for (int attempt = 1; attempt <= retries; attempt++)
        {
            var handler = new HttpClientHandler
            {
                Proxy = new WebProxy(proxy),
                UseProxy = true
            };

            // Disposing the client also disposes the handler.
            using (var checkClient = new HttpClient(handler) { Timeout = TimeSpan.FromMilliseconds(timeOut) })
            {
                try
                {
                    string jsonResponse = await checkClient.GetStringAsync("https://api.ipify.org?format=json").ConfigureAwait(false);
                    var jsonObject = JToken.Parse(jsonResponse);
                    string publicIp = jsonObject["ip"]?.ToString();

                    if (!string.IsNullOrEmpty(publicIp) && publicIp == proxyIp)
                    {
                        return true; // Proxy IP matches public IP
                    }
                }
                catch
                {
                    // Any failure (timeout, connection error, bad JSON) counts as a failed attempt.
                    if (attempt == retries)
                    {
                        return false; // All retries failed
                    }
                }

                // Wait before the next attempt (also reached when the IPs did not match).
                await Task.Delay(delayMs).ConfigureAwait(false);
            }
        }

        return false;
    }
I’ve tried two ways to check the proxies. First, I make a request through each proxy to https://api.ipify.org?format=json and check whether the returned IP matches the proxy’s IP, which should indicate that it’s working. Second, I tried sending a request to a regular URL and checking for a 200 response, thinking that might be more reliable. In both cases, I expected most of the working proxies to be flagged correctly, but only about half of my results agree with the online proxy checkers.
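For reference, the second (status-code) variant can be sketched roughly as follows. This is a minimal illustration, not the exact code from the project: the class name ProxyStatusCheck, the method ReturnsOkAsync, and the parameters testUrl and timeoutMs are hypothetical, and it assumes the same "ip:port" proxy format that Check uses.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ProxyStatusCheck
{
    // Hypothetical helper: returns true when a GET sent through the proxy
    // comes back with HTTP 200, and false on any other status, error or timeout.
    public static async Task<bool> ReturnsOkAsync(string proxy, string testUrl, int timeoutMs)
    {
        var handler = new HttpClientHandler
        {
            Proxy = new WebProxy(proxy), // "ip:port", same format as in Check
            UseProxy = true
        };

        using (var client = new HttpClient(handler) { Timeout = TimeSpan.FromMilliseconds(timeoutMs) })
        {
            try
            {
                using (HttpResponseMessage response = await client.GetAsync(testUrl).ConfigureAwait(false))
                {
                    return response.StatusCode == HttpStatusCode.OK;
                }
            }
            catch (HttpRequestException)
            {
                return false; // Connection, proxy or TLS failure
            }
            catch (TaskCanceledException)
            {
                return false; // HttpClient surfaces a timeout as a cancelled task
            }
        }
    }
}
```

A call might look like `bool ok = await ProxyStatusCheck.ReturnsOkAsync("203.0.113.10:8080", "https://example.com/", 5000);`, where the address and URL are placeholders. Unlike the IP comparison in Check, this variant only shows that some response with status 200 came back through the proxy; it does not verify which exit IP the target server saw.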