Setting TIME_WAIT TCP

Question

We're trying to tune an application that accepts messages via TCP and also uses TCP for some of its internal messaging. While load testing, we noticed that response time degrades significantly (and then stops altogether) as more simultaneous requests are made to the system. During this time, we see a lot of TCP connections in TIME_WAIT status, and someone suggested lowering the TIME_WAIT environment variable from its default 60 seconds to 30.

From what I understand, the TIME_WAIT setting essentially sets the time a TCP resource is made available to the system again after the connection is closed. I'm not a "network guy" and know very little about these things. I need a lot of what's in that linked post, but "dumbed down" a little.

I think I understand why the TIME_WAIT value can't be set to 0, but can it safely be set to 5? What about 10? What determines a "safe" setting for this value?

Why is the default for this value 60? I'm guessing that people a lot smarter than me had good reason for selecting this as a reasonable default.

What else should I know about the potential risks and benefits of overriding this value?

User · Answer

Usually, only the endpoint that issues an "active close" should go into TIME_WAIT state. So, if possible, have your clients issue the active close, which will leave the TIME_WAIT on the client and NOT on the server.

See here: http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html and http://www.isi.edu/touch/pubs/infocomm99/infocomm99-web/ for details (the latter also explains why it's not always possible, due to protocol design that doesn't take TIME_WAIT into consideration).
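
As a rough illustration of "let the client do the active close", here is a minimal Python sketch on the loopback interface. The function names (`serve_one`, `run_exchange`) and the upper-casing "protocol" are made up for the example. The server keeps reading until `recv()` returns an empty result, which means the client's FIN arrived first, and only then closes its own side.

```python
import socket
import threading


def serve_one(listener, events):
    """Handle one client and close only after the client has closed."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                # recv() returning b"" means the client's FIN arrived first,
                # so the client did the active close and our close is passive.
                events.append("client closed first")
                break
            conn.sendall(data.upper())


def run_exchange(payload):
    """Send payload to a loopback server; return (reply, server events)."""
    events = []
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_one, args=(listener, events))
        worker.start()
        reply = b""
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(payload)
            while len(reply) < len(payload):
                chunk = client.recv(4096)
                if not chunk:
                    break
                reply += chunk
        # Leaving the block above closes the client socket: the active close.
        worker.join(timeout=5)
    return reply, events
```

With this ordering, any TIME_WAIT entry for the connection should end up on the client's side of the connection rather than the server's.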

User · Answer

A TCP connection is specified by the tuple (source IP, source port, destination IP, destination port).

The reason why there is a TIME_WAIT state following session shutdown is because there may still be live packets out in the network on their way to you (or from you, which may solicit a response of some sort). If you were to re-create that same tuple and one of those packets showed up, it would be treated as a valid packet for your connection (and probably cause an error due to sequencing).

So the TIME_WAIT time is generally set to double the packet's maximum age. This value is the maximum age your packets will be allowed to get to before the network discards them.

That guarantees that, before you're allowed to create a connection with the same tuple, all the packets belonging to previous incarnations of that tuple will be dead.

That generally dictates the minimum value you should use. The maximum packet age is dictated by network properties, an example being that satellite lifetimes are higher than LAN lifetimes since the packets have much further to go.
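
To make the tuple and the "double the maximum age" rule concrete, here is a small Python sketch. The helper names are invented for the example. The 30-second packet lifetime in the usage note is only an assumed figure, picked because doubling it gives the 60-second default mentioned in the question; real networks may warrant a different value.

```python
import socket


def connection_tuple(sock):
    """Return (source IP, source port, destination IP, destination port)."""
    src_ip, src_port = sock.getsockname()[:2]
    dst_ip, dst_port = sock.getpeername()[:2]
    return (src_ip, src_port, dst_ip, dst_port)


def time_wait_seconds(max_packet_age):
    """TIME_WAIT is generally twice the maximum packet age."""
    return 2 * max_packet_age


def demo():
    """Open one loopback connection and return both endpoints' tuples."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            server_side, _addr = listener.accept()
            with server_side:
                return connection_tuple(client), connection_tuple(server_side)
```

The two tuples returned by `demo()` describe the same connection seen from opposite ends, so one is the other with source and destination swapped. `time_wait_seconds(30)` gives 60 under the assumed lifetime.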

User · Answer

In Windows, you can change it through the registry:

    ; Set the TIME_WAIT delay to 30 seconds (0x1E)
    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\TCPIP\Parameters]
    "TcpTimedWaitDelay"=dword:0000001E

User · Answer

TIME_WAIT might not be the culprit.

    int listen(int sockfd, int backlog);

According to Unix Network Programming Volume 1, backlog is defined to be the sum of the completed connection queue and the incomplete connection queue.

Let's say the backlog is 5. If you have 3 completed connections (ESTABLISHED state) and 2 incomplete connections (SYN_RCVD state), and there is another connect request with SYN, the TCP stack just ignores the SYN packet, knowing it'll be retransmitted some other time. This might be causing the degradation.

At least that's what I've been reading.
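
For reference, a hedged Python sketch of where the backlog value goes; Python's `socket.listen()` passes it to the same `listen()` call. The helper names are invented for the example. Note that exact queue accounting differs between systems: on Linux since 2.2, per the listen(2) manual page, the backlog counts only fully established connections waiting to be accepted, and incomplete ones are limited separately.

```python
import socket


def open_listener(backlog):
    """Create a loopback listener whose pending-connection queue uses backlog."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(backlog)
    return sock


def connect_without_accept(listener, count):
    """Open count connections while the server never calls accept()."""
    port = listener.getsockname()[1]
    clients = []
    for _ in range(count):
        # Each connection completes its handshake and then sits in the queue.
        clients.append(socket.create_connection(("127.0.0.1", port), timeout=5))
    return clients
```

Three connections fit comfortably in a backlog of 5 even though nothing accepts them. If a load test pushes well past the backlog while the server is slow to call `accept()`, new connection attempts can stall in the way this answer describes.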

User · Answer

I have been load testing a server application (on Linux) by using a test program with 20 threads.

In 959,000 connect/close cycles I had 44,000 failed connections and many thousands of sockets in TIME_WAIT.

I set SO_LINGER to 0 before the close call, and in subsequent runs of the test program had no connect failures and fewer than 20 sockets in TIME_WAIT.
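
A minimal Python sketch of that setting, assuming the Linux layout of `struct linger` (two C ints: `l_onoff`, `l_linger`); the helper names are invented for the example. Be aware of what it does: with lingering enabled and a zero timeout, `close()` aborts the connection with a reset instead of the normal FIN exchange. That is why no TIME_WAIT entry is left behind, and also why any unsent data is discarded and the peer may see a "connection reset" error.

```python
import socket
import struct


def enable_abortive_close(sock):
    """Turn SO_LINGER on with a zero timeout so close() resets the connection."""
    # Assumed layout: struct linger { int l_onoff; int l_linger; } as on Linux.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))


def linger_settings(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    size = struct.calcsize("ii")
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack("ii", raw)
```

Call `enable_abortive_close(sock)` before `sock.close()`. It is probably best kept to test clients or to protocols where both sides already know the exchange is complete.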

User · Answer

Setting tcp_reuse is more useful than changing time_wait, as long as you have the parameter (kernels 3.2 and above; unfortunately that disqualifies all versions of RHEL and XenServer).

Dropping the value, particularly for VPN-connected users, can result in constant recreation of proxy tunnels on the outbound connection. With the default Netscaler (XenServer) config, which is lower than the default Linux config, Chrome will sometimes have to recreate the proxy tunnel up to a dozen times to retrieve one web page. Applications that don't retry, such as Maven and Eclipse P2, simply fail.

The original motive for the parameter (avoid duplication) was made redundant by a TCP RFC that specifies timestamp inclusion on all TCP requests.
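
To check what a given Linux machine currently has configured, something like the following Python sketch could be used. It assumes that "tcp_reuse" above refers to the `net.ipv4.tcp_tw_reuse` sysctl and that the timestamp behaviour is the one controlled by `net.ipv4.tcp_timestamps`; both mappings are this example's assumption, not something the answer states. The helper names are invented too.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl value as text, or None if this system lacks it."""
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        # Not Linux, no such setting on this kernel, or not readable.
        return None


def tcp_reuse_report():
    """Collect the two settings this answer appears to refer to (assumed names)."""
    return {
        "net.ipv4.tcp_tw_reuse": read_sysctl("net.ipv4.tcp_tw_reuse"),
        "net.ipv4.tcp_timestamps": read_sysctl("net.ipv4.tcp_timestamps"),
    }
```

A value of `None` in the report means the setting is not available on that system, which is the situation the answer warns about for older kernels.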

User · Answer

Pax is correct about the reasons for TIME_WAIT, and why you should be careful about lowering the default setting.

A better solution is to vary the port numbers used for the originating end of your sockets. Once you do this, you won't really care about TIME_WAIT for individual sockets.

For listening sockets, you can use SO_REUSEADDR to allow the listening socket to bind despite the TIME_WAIT sockets sitting around.
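
Both suggestions can be sketched in a few lines of Python; the helper names are invented for the example. The listener sets SO_REUSEADDR before binding, and the clients simply do not bind to a fixed source port, so the operating system picks an ephemeral one for each connection.

```python
import socket


def open_reusable_listener(port=0):
    """Bind a loopback listener with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


def client_source_ports(listener, count):
    """Connect count times without binding; return the OS-chosen source ports."""
    port = listener.getsockname()[1]
    clients = []
    ports = []
    for _ in range(count):
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        clients.append(client)
        ports.append(client.getsockname()[1])
    for client in clients:
        client.close()
    return ports
```

Because the connections are open at the same time and share a destination, each one gets a different source port, so each has a different tuple and an old TIME_WAIT entry for one of them does not block the others.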
