How can I debug what is causing a connection refused or a connection time out?

Question

I have the following code that has worked for about a year:

    import urllib2

    req = urllib2.Request('https://somewhere.com', '<Request></Request>')
    data = urllib2.urlopen(req)
    print data.read()

Lately, there have been some random errors:

    urllib2.URLError: <urlopen error [Errno 111] Connection refused>
    <urlopen error [Errno 110] Connection timed out>

The trace of the failure is:

    Traceback (most recent call last):
      File "test.py", line 4, in <module>
        data = urllib2.urlopen(req).read()
      File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 400, in open
        response = self._open(req, data)
      File "/usr/lib/python2.7/urllib2.py", line 418, in _open
        '_open', req)
      File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
        return self.do_open(httplib.HTTPSConnection, req)
      File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
        raise URLError(err)
    urllib2.URLError: <urlopen error [Errno 111] Connection refused>

The above errors happen randomly: the script can run successfully the first time but then fails on the second run, and vice versa.

What should I do to debug and figure out where the issue is coming from? How can I tell if the endpoint has consumed my request and returned a response but never reached me?

With telnet

I just tested with telnet. Sometimes it succeeds, sometimes it doesn't, just like my Python.

On success:

    $ telnet somewhere.com 443
    Trying XXX.YY.ZZZ.WWW...
    Connected to somewhere.com.
    Escape character is '^]'.
    Connection closed by foreign host.

On a refused connection:

    $ telnet somewhere.com 443
    Trying XXX.YY.ZZZ.WWW...
    telnet: Unable to connect to remote host: Connection refused

On a timeout:

    $ telnet somewhere.com 443
    Trying XXX.YY.ZZZ.WWW...
    telnet: Unable to connect to remote host: Connection timed out

User · Answer

Use a packet analyzer to intercept the packets to/from somewhere.com. Studying those packets should tell you what is going on.

Time-outs or connections refused could mean that the remote host is too busy.
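
One way to see how often the remote host refuses or ignores connections is to probe the port repeatedly and tally the outcomes. The following is a minimal sketch, not a definitive tool: it is written for Python 3 (the question uses Python 2), and the names probe and survey, the number of attempts, and the timeouts are illustrative assumptions.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection; return (outcome, seconds_elapsed)."""
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # The peer (or a firewall) answered with a TCP reset.
        outcome = "refused"
    except (socket.timeout, TimeoutError):
        # No answer arrived within the time limit.
        outcome = "timed out"
    except OSError as exc:
        # Anything else, e.g. a name resolution failure.
        outcome = "error: %s" % exc
    else:
        sock.close()
        outcome = "connected"
    return outcome, time.monotonic() - start


def survey(host, port, attempts=20, pause=1.0, timeout=5.0):
    """Probe repeatedly, print one line per attempt, return a tally."""
    counts = {}
    for _ in range(attempts):
        outcome, elapsed = probe(host, port, timeout)
        print("%s  %-10s %.2fs" % (time.strftime("%H:%M:%S"), outcome, elapsed))
        counts[outcome] = counts.get(outcome, 0) + 1
        time.sleep(pause)
    return counts
```

For example, survey("somewhere.com", 443) prints one timestamped line per attempt and returns the tally. The timestamps make it easier to match failures against a packet capture taken at the same time.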

User · Answer

The problem

The problem is in the network layer. Here are the status codes explained:

Connection refused: The peer is not listening on the network port you're trying to connect to. This usually means that either a firewall is actively denying the connection, or the service is not started on the other site, or it is overloaded.

Connection timed out: During the attempt to establish the TCP connection, no response came from the other side within a given time limit. In the context of urllib this may also mean that the HTTP response did not arrive in time. This is sometimes also caused by firewalls, sometimes by network congestion or heavy load on the remote (or even local) site.

In context

That said, it is probably not a problem in your script, but on the remote site. If it occurs only occasionally, it indicates that the other site has load problems or that the network path to the other site is unreliable.

Also, as it is a problem with the network, you cannot tell what happened on the other side. It is possible that the packets travel fine in one direction but get dropped (or misrouted) in the other.

It is also not a (direct) DNS problem; that would cause another error ("Name or service not known" or something similar). It could, however, be the case that the DNS is configured to return different IP addresses on each request, which would connect you (DNS caching left aside) to different addresses/hosts on each connection attempt. It could in turn be the case that some of these hosts are misconfigured or overloaded and thus cause the aforementioned problems.

Debugging this

As suggested in the other answer, using a packet analyzer can help to debug the issue. You won't see much, however, except the packets reflecting exactly what the error message says.

To rule out network congestion as a problem, you could use a tool like mtr or traceroute or even ping to see if packets get lost on the way to the remote site. Note that, if you see loss in mtr (and any traceroute tool for that matter), you must always consider the first host where loss occurs (in the route from yours to the remote) as the one dropping packets, due to the way ICMP works. If the packets get lost only at the last hop over a long time (say, 100 packets), that host definitely has an issue. If you see that this behaviour is persistent (over several days), you might want to contact the administrator.

Loss in the middle of the route usually corresponds to network congestion (possibly due to maintenance), and there's nothing you can do about it (except complaining to the ISP about missing redundancy).

If network congestion is not a problem (i.e. not more than, say, 5% of the packets get lost), you should contact the remote server administrator to figure out what's wrong. They may be able to see relevant information in system logs. Running a packet analyzer on the remote site might also be more revealing than on the local site. Checking whether the port is open using netstat -tlp is definitely recommended then.
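
The DNS point above can be checked from the client side by connecting to each address the name resolves to separately, instead of letting the client pick one. This is a rough sketch under stated assumptions: it is written for Python 3 (the question uses Python 2), resolve_all, check_address, and check_all are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```python
import socket


def resolve_all(host, port):
    """Return the distinct (family, ip) pairs DNS currently gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for family, _type, _proto, _canonname, sockaddr in infos:
        entry = (family, sockaddr[0])
        if entry not in addresses:
            addresses.append(entry)
    return addresses


def check_address(family, ip, port, timeout=5.0):
    """Try one TCP connection to a specific IP and classify the result."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((ip, port))
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timed out"
    except OSError as exc:
        return "error: %s" % exc
    else:
        return "connected"
    finally:
        sock.close()


def check_all(host, port, timeout=5.0):
    """Map every resolved IP of host to the outcome of a connection attempt."""
    return {
        ip: check_address(family, ip, port, timeout)
        for family, ip in resolve_all(host, port)
    }
```

Calling check_all("somewhere.com", 443) a few times shows whether the failures follow a particular address. If one address consistently refuses or times out while the others connect, that points to a single misbehaving host behind the name rather than general congestion.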
