Jon Gallant

How to set HtmlAgilityPack Timeout


HtmlAgilityPack is a great HTML parser library that I often use for scraping. It makes web requests on your behalf via the HtmlWeb.Load methods, but doesn’t expose the HttpWebRequest.Timeout property. I see a lot of people recommending that you use HttpWebRequest or WebClient to fetch the page and then HtmlAgilityPack to query the DOM, but there’s an easier way.

If you view the source for HtmlWeb, you’ll see that it exposes a PreRequest delegate:

```csharp
public delegate bool PreRequestHandler(HttpWebRequest request);
```
And they call that delegate right before making the GetResponse call:
```csharp
if (PreRequest != null)
{
    // allow our user to change the request at will
    if (!PreRequest(req))
    {
        return HttpStatusCode.ResetContent;
    }
}
HttpWebResponse resp;
try
{
    resp = req.GetResponse() as HttpWebResponse;
}
// ... (excerpt ends here; exception handling omitted)
```
So all you have to do is assign a delegate to PreRequest and set your timeout within that delegate:
```csharp
var web = new HtmlWeb();
web.PreRequest = delegate(HttpWebRequest webRequest)
{
    // Timeout is in milliseconds, so 4000 is 4 seconds
    webRequest.Timeout = 4000;
    return true;
};
var doc = web.Load("https://www.msn.com/");
```

Yep, it’s that easy.
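
If you want something closer to a complete program, here is a minimal sketch of the same idea with the timeout case handled. The names TimeoutExample, ToMilliseconds and LoadWithTimeout are hypothetical helpers made up for this example, not part of HtmlAgilityPack. It also assumes that a timed-out request surfaces from Load as a WebException with Status set to WebExceptionStatus.Timeout, which is how HttpWebRequest.GetResponse reports a timeout; check that against the HtmlAgilityPack version you are using.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class TimeoutExample
{
    // Hypothetical helper: HttpWebRequest.Timeout expects whole milliseconds.
    public static int ToMilliseconds(TimeSpan timeout)
    {
        return checked((int)timeout.TotalMilliseconds);
    }

    // Hypothetical helper: load a page, applying the timeout through PreRequest.
    public static HtmlDocument LoadWithTimeout(string url, TimeSpan timeout)
    {
        var web = new HtmlWeb();
        web.PreRequest = request =>
        {
            request.Timeout = ToMilliseconds(timeout);
            return true; // returning false would cancel the request
        };
        return web.Load(url);
    }

    public static void Main()
    {
        try
        {
            var doc = LoadWithTimeout("https://www.msn.com/", TimeSpan.FromSeconds(4));
            var title = doc.DocumentNode.SelectSingleNode("//title");
            Console.WriteLine(title != null ? title.InnerText : "(no title found)");
        }
        catch (WebException ex) when (ex.Status == WebExceptionStatus.Timeout)
        {
            // Assumption: the timeout propagates out of Load as a WebException.
            Console.WriteLine("The request timed out.");
        }
    }
}
```

Passing a TimeSpan and converting it in one place makes it harder to mix up seconds and milliseconds, which is an easy mistake with the raw int property.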

Jon
