[wcf] WCF timeout exception detailed investigation

We have an application that has a WCF service (*.svc) running on IIS7 and various clients querying the service. The server is running Win 2008 Server. The clients are running either Windows 2008 Server or Windows 2003 server. I am getting the following exception, which I have seen can in fact be related to a large number of potential WCF issues.

System.TimeoutException: The request channel timed out while waiting for a reply after 00:00:59.9320000. Increase the timeout value passed to the call to Request or increase the SendTimeout value on the Binding. The time allotted to this operation may have been a portion of a longer timeout. ---> System.TimeoutException: The HTTP request to 'http://www.domain.com/WebServices/myservice.svc/gzip' has exceeded the allotted timeout of 00:01:00. The time allotted to this operation may have been a portion of a longer timeout. 

I have increased the timeout to 30min and the error still occurred. This tells me that something else is at play, because the quantity of data could never take 30min to upload or download.

The error comes and goes. At the moment, it is more frequent. It does not seem to matter if I have 3 clients running simultaneously or 100, it still occurs once in a while. Most of the time, there are no timeouts but I still get a few per hour. The error comes from any of the methods that are invoked. One of these methods does not have parameters and returns a bit of data. Another takes in lots of data as a parameter but executes asynchronously. The errors always originate from the client and never reference any code on the server in the stack trace. It always ends with:

 at System.Net.HttpWebRequest.GetResponse()
  at System.ServiceModel.Channels.HttpChannelFactory.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout)

On the server: I've tried (and currently have) the following binding settings:

maxBufferSize="2147483647" maxReceivedMessageSize="2147483647" maxBufferPoolSize="2147483647"

It does not seem to have an impact.

I've tried (and currently have) the following throttling settings:

<serviceThrottling maxConcurrentCalls="1500"   maxConcurrentInstances="1500"    maxConcurrentSessions="1500"/>

It does not seem to have an impact.

I currently have the following settings for the WCF service.

[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Single)]

I ran with ConcurrencyMode.Multiple for a while, and the error still occurred.

I've tried restarting IIS, restarting my underlying SQL Server, restarting the machine. All of these don't seem to have an impact.

I've tried disabling the Windows firewall. It does not seem to have an impact.

On the client, I have these settings:

maxReceivedMessageSize="2147483647"

<system.net>
    <connectionManagement>
    <add address="*" maxconnection="16"/>
</connectionManagement> 
</system.net>

My client closes its connections:

var client = new MyClient();

try
{
    return client.GetConfigurationOptions();
}
finally
{
    client.Close();
}

I have changed the registry settings to allow more outgoing connections:

MaxConnectionsPerServer=24, MaxConnectionsPer1_0Server=32.

I have now just recently tried SvcTraceViewer.exe. I managed to catch one exception on the client end. I see that its duration is 1 minute. Looking at the server side trace, I can see that the server is not aware of this exception. The maximum duration I can see is 10 seconds.

I have looked at active database connections using exec sp_who on the server. I only have a few (2-3). I have looked at TCP connections from one client using TCPview. It usually is around 2-3 and I have seen up to 5 or 6.

Simply put, I am stumped. I have tried everything I could find, and must be missing something very simple that a WCF expert would be able to see. It is my gut feeling that something is blocking my clients at the low-level (TCP), before the server actually receives the message and/or that something is queuing the messages at the server level and never letting them process.

If you have any performance counters I should look at, please let me know. (please indicate what values are bad, as some of these counters are hard to decypher). Also, how could I log the WCF message size? Finally, are there any tools our there that would allow me to test how many connections I can establish between my client and server (independently from my application)

Thanks for your time!

Extra information added June 20th:

My WCF application does something similar to the following.

while (true)
{
   Step1GetConfigurationSettingsFromServerViaWCF(); // can change between calls
   Step2GetWorkUnitFromServerViaWCF();
   DoWorkLocally(); // takes 5-15minutes. 
   Step3SendBackResultsToServerViaWCF();
}

Using WireShark, I did see that when the error occurs, I have a five TCP retransmissions followed by a TCP reset later on. My guess is the RST is coming from WCF killing the connection. The exception report I get is from Step3 timing out.

I discovered this by looking at the tcp stream "tcp.stream eq 192". I then expanded my filter to "tcp.stream eq 192 and http and http.request.method eq POST" and saw 6 POSTs during this stream. This seemed odd, so I checked with another stream such as tcp.stream eq 100. I had three POSTs, which seems a bit more normal because I am doing three calls. However, I do close my connection after every WCF call, so I would have expected one call per stream (but I don't know much about TCP).

Investigating a bit more, I dumped the http packet load to disk to look at what these six calls where.

1) Step3
2) Step1
3) Step2
4) Step3 - corrupted
5) Step1
6) Step2

My guess is two concurrent clients are using the same connection, that is why I saw duplicates. However, I still have a few more issues that I can't comprehend:

a) Why is the packet corrupted? Random network fluke - maybe? The load is gzipped using this sample code: http://msdn.microsoft.com/en-us/library/ms751458.aspx - Could the code be buggy once in a while when used concurrently? I should test without the gzip library.

b) Why would I see step 1 & step 2 running AFTER the corrupted operation timed out? It seems to me as if these operations should not have occurred. Maybe I am not looking at the right stream because my understanding of TCP is flawed. I have other streams that occur at the same time. I should investigate other streams - a quick glance at streams 190-194 show that the Step3 POST have proper payload data (not corrupted). Pushing me to look at the gzip library again.

This question is related to wcf timeout timeoutexception

The answer is


If you havn't tried it already - encapsulate your Server-side WCF Operations in try/finally blocks, and add logging to ensure they are actually returning.

If those show that the Operations are completing, then my next step would be to go to a lower level, and look at the actual transport layer.

Wireshark or another similar packet capturing tool can be quite helpful at this point. I'm assuming this is running over HTTP on standard port 80.

Run Wireshark on the client. In the Options when you start the capture, set the capture filter to tcp http and host service.example.com - this will reduce the amount of irrelevant traffic.

If you can, modify your client to notify you the exact start time of the call, and the time when the timeout occurred. Or just monitor it closely.

When you get an error, then you can trawl through the Wireshark logs to find the start of the call. Right click on the first packet that has your client calling out on it (Should be something like GET /service.svc or POST /service.svc) and select Follow TCP Stream.

Wireshark will decode the entire HTTP Conversation, so you can ensure that WCF is actually sending back responses.


I'm having a very similar problem. In the past, this has been related to serialization problems. If you are still having this problem, can you verify that you can correctly serialize the objects you are returning. Specifically, if you are using Linq-To-Sql objects that have relationships, there are known serialization problems if you put a back reference on a child object to the parent object and mark that back reference as a DataMember.

You can verify serialization by writing a console app that serializes and deserializes your objects using the DataContractSerializer on the server side and whatever serialization methods your client uses. For example, in our current application, we have both WPF and Compact Framework clients. I wrote a console app to verify that I can serialize using a DataContractSerializer and deserialize using an XmlDesserializer. You might try that.

Also, if you are returning Linq-To-Sql objects that have child collections, you might try to ensure that you have eagerly loaded them on the server side. Sometimes, because of lazy loading, the objects being returned are not populated and may cause the behavior you are seeing where the request is sent to the service method multiple times.

If you have solved this problem, I'd love to hear how because I'm stuck with it too. I have verified that my issue is not serialization so I'm at a loss.

UPDATE: I'm not sure if it will help you any but the Service Trace Viewer Tool just solved my problem after 5 days of very similar experience to yours. By setting up tracing and then looking at the raw XML, I found the exceptions that were causing my serialization problems. It was related to Linq-to-SQL objects that occasionally had more child objects than could be successfully serialized. Adding the following to your web.config file should enable tracing:

<sharedListeners>
    <add name="sharedListener"
         type="System.Diagnostics.XmlWriterTraceListener"
         initializeData="c:\Temp\servicetrace.svclog" />
  </sharedListeners>
  <sources>
    <source name="System.ServiceModel" switchValue="Verbose, ActivityTracing" >
      <listeners>
        <add name="sharedListener" />
      </listeners>
    </source>
    <source name="System.ServiceModel.MessageLogging" switchValue="Verbose">
      <listeners>
        <add name="sharedListener" />
      </listeners>
    </source>
  </sources>

The resulting file can be opened with the Service Trace Viewer Tool or just in IE to examine the results.


Looks like this exception message is quite generic and can be received due to a variety of reasons. We ran into this while deploying the client on Windows 8.1 machines. Our WCF client runs inside of a windows service and continuously polls the WCF service. The windows service runs under a non-admin user. The issue was fixed by setting the clientCredentialType to "Windows" in the WCF configuration to allow the authentication to pass-through, as in the following:

      <security mode="None">
        <transport clientCredentialType="Windows" proxyCredentialType="None"
          realm="" />
        <message clientCredentialType="UserName" algorithmSuite="Default" />
      </security>

You will also receive this error if you are passing an object back to the client that contains a property of type enum that is not set by default and that enum does not have a value that maps to 0. i.e enum MyEnum{ a=1, b=2};


Are you closing the connection to the WCF service in between requests? If you don't, you'll see this exact timeout (eventually).


I've just solved the problem.I found that the nodes in the App.config file have configed wrong.

<client>
<endpoint name="WCF_QtrwiseSalesService" binding="wsHttpBinding" bindingConfiguration="ws" address="http://cntgbs1131:9005/MyService/TGE.ISupplierClientManager" contract="*">
</endpoint>
</client>

<bindings>
    <wsHttpBinding>
        <binding name="ws" maxBufferPoolSize="2147483647" maxReceivedMessageSize="2147483647" messageEncoding="Text">
            <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" maxArrayLength="2147483647" maxBytesPerRead="2147483647" maxNameTableCharCount="2147483647"/>
            <**security mode="None">**
                <transport clientCredentialType="None"></transport>
            </security>
        </binding>
    </wsHttpBinding>
</bindings>

Confirm your config in the node <security>,the attribute "mode" value is "None". If your value is "Transport",the error occurs.


Did you check the WCF traces? WCF has a tendency to swallow exceptions and only return the last exception, which is the timeout that you're getting, since the end point didn't return anything meaningful.


Did you try using clientVia to see the message sent, using SOAP toolkit or something like that? This could help to see if the error is coming from the client itself or from somewhere else.


from: http://www.codeproject.com/KB/WCF/WCF_Operation_Timeout_.aspx

To avoid this timeout error, we need to configure the OperationTimeout property for Proxy in the WCF client code. This configuration is something new unlike other configurations such as Send Timeout, Receive Timeout etc., which I discussed early in the article. To set this operation timeout property configuration, we have to cast our proxy to IContextChannel in WCF client application before calling the operation contract methods.


I'm not a WCF expert but I'm wondering if you aren't running into a DDOS protection on IIS. I know from experience that if you run a bunch of simultaneous connections from a single client to a server at some point the server stops responding to the calls as it suspects a DDOS attack. It will also hold the connections open until they time-out in order to slow the client down in his attacks.

Multiple connection coming from different machines/IP's should not be a problem however.

There's more info in this MSDN post:

http://msdn.microsoft.com/en-us/library/bb463275.aspx

Check out the MaxConcurrentSession sproperty.


Examples related to wcf

Create a asmx web service in C# using visual studio 2013 WCF Exception: Could not find a base address that matches scheme http for the endpoint WCF Service, the type provided as the service attribute values…could not be found WCF error - There was no endpoint listening at How can I pass a username/password in the header to a SOAP WCF Service The HTTP request is unauthorized with client authentication scheme 'Negotiate'. The authentication header received from the server was 'NTLM' Content Type application/soap+xml; charset=utf-8 was not supported by service The content type application/xml;charset=utf-8 of the response message does not match the content type of the binding (text/xml; charset=utf-8) maxReceivedMessageSize and maxBufferSize in app.config how to generate a unique token which expires after 24 hours?

Examples related to timeout

Waiting for Target Device to Come Online Spring Boot Java Config Set Session Timeout How to dispatch a Redux action with a timeout? Spring Boot REST API - request timeout? 5.7.57 SMTP - Client was not authenticated to send anonymous mail during MAIL FROM error How to set timeout in Retrofit library? How to set connection timeout with OkHttp How to modify the nodejs request default timeout time? How to handle ETIMEDOUT error? Timeout for python requests.get entire response

Examples related to timeoutexception

Simple timeout in java Changing the CommandTimeout in SQL Management studio WCF timeout exception detailed investigation