Strange urllib2.urlopen() behavior on Ubuntu 10.10 - python

I am experiencing strange behavior with urllib2.urlopen() on Ubuntu 10.10. The first request to a URL is fast, but the second takes a long time to connect, somewhere between 5 and 10 seconds. On Windows this works normally.
Does anybody have an idea what could cause this issue?
Thanks, Onno

5 seconds sounds suspiciously like the DNS resolving timeout.
A hunch: it may be cycling through the DNS servers in your /etc/resolv.conf. If one of them is broken, the default timeout on Linux is 5 seconds, after which it tries the next one, looping back to the top when it has tried them all.
If you have multiple DNS servers listed in resolv.conf, try removing all but one. If this fixes it, then investigate why you're being assigned broken resolving servers.
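A quick way to check this hunch is to time name resolution separately from the HTTP request. This is only a diagnostic sketch (the hostname is a placeholder); if the DNS lookup itself accounts for the delay on the second attempt, resolv.conf is the likely culprit.
import socket
import time
import urllib2

host = "www.example.com"  # placeholder: use the host you are actually fetching
for attempt in (1, 2):
    t0 = time.time()
    socket.gethostbyname(host)                       # name resolution only
    print "lookup %d:  %.2fs" % (attempt, time.time() - t0)
    t0 = time.time()
    urllib2.urlopen("http://%s/" % host).read()      # full request
    print "request %d: %.2fs" % (attempt, time.time() - t0)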

You can enable debugging in urllib2; maybe it can help you find out the problem:
import urllib2

# debuglevel=1 prints the outgoing request and incoming response headers
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
opener.open('http://www.google.com')

Related

Python3 Don't wait for socket response?

Thanks to everyone who helped with: Check If Port is Open in Python3?
But I think my problem is still unresolved. When I scan all of the nearly 64000 ports it takes days to finish, while tools like nmap can find all open ports in 3 seconds.
I think the problem is that nobody saw my requirement that a port should be considered open only if the connection isn't immediately closed by the server. How can I reflect this in the code?
Looking at this code: https://www.geeksforgeeks.org/port-scanner-using-python/
I have noticed the use of s.close(). Why do I need it? Is it necessary if I want my code to run fast?
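No answer is recorded here, but the usual way to close the gap with nmap is to probe many ports concurrently with a short per-connection timeout; a minimal sketch under those assumptions (the target address and worker count are placeholders) is below. Note that s.close() is still needed: each probe holds a file descriptor until it is released, and leaking thousands of them will eventually stall the scan.
import socket
from concurrent.futures import ThreadPoolExecutor

HOST = "192.0.2.1"   # placeholder target
TIMEOUT = 0.5        # seconds per connection attempt

def probe(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(TIMEOUT)
    try:
        s.connect((HOST, port))   # succeeds only if the server accepts the connection
        return port, True
    except OSError:               # timed out, refused, or otherwise unreachable
        return port, False
    finally:
        s.close()                 # release the file descriptor

with ThreadPoolExecutor(max_workers=200) as pool:
    for port, is_open in pool.map(probe, range(1, 65536)):
        if is_open:
            print(port, "open")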

socket.timeout; Explanation?

I am building a port scanning program (irrelevant to the question, just explaining the background), and I know the IP of the host, but not which ports are open. Hence, the scan.
It is in the early stages of development, so the error handling is bad, but not bad enough to explain why Python does this.
It tries to connect to, say, 123.456.7.8, port 1. Obviously that's a ridiculous port to be open, so it throws an error. The error is No route to host or some such, right? Wrong! It is instead Operation timed out!
Okay, let's increase the timeout in case my calculations were incorrect.
...All that did was rinse and repeat!
About 20 minutes later, the timeout is at 20 seconds, and it is still timing out. Really? Why does Python raise a timed-out error, though, instead of No route to host or similar?
I need to distinguish between timeouts and connection failures, because there is a difference between late and nowhere. This prevents me from doing so, creating an infinite loop of hurry up and wait.
Whatever shall I do? Wherever shall I go?
The Python socket module is a thin wrapper around your platform's socket API. The issue is unrelated to Python.
It is not guaranteed that you get a No route to host error. Moreover, it is common for a firewall to simply drop received packets (for a filtered port), which may manifest as a timeout error in your code. See Drop vs. Reject (ignore the conclusion but read the explanation of what is happening).
As a workaround, make multiple concurrent connections and set a fixed timeout, or use raw sockets and send the packets yourself (you could use scapy to investigate the behavior).
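For the raw-socket route, a rough sketch with scapy might look like the following (an illustration only; the target address is a placeholder and raw sockets generally require root). A SYN/ACK reply means the port is open, an RST means it is closed, and no reply before the timeout usually means a firewall is dropping the probe, which is exactly the case that looks like a timeout from a plain connect().
from scapy.all import IP, TCP, sr1

def syn_probe(host, port, timeout=1):
    # send a single SYN and wait at most `timeout` seconds for a reply
    reply = sr1(IP(dst=host) / TCP(dport=port, flags="S"),
                timeout=timeout, verbose=0)
    if reply is None:
        return "filtered (probe dropped, no reply)"
    if reply.haslayer(TCP) and reply[TCP].flags == 0x12:   # SYN/ACK
        return "open"
    return "closed (RST or other reply)"

print(syn_probe("192.0.2.1", 80))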

How to get rid of unkillable Freeswitch channels

After upgrading from Freeswitch 1.2.9 (1.2.9+git~20130506T233047Z~7c88f35451) to Freeswitch 1.4.21 (1.4.21-35~64bit), freeswitch stopped dropping channels after they were hung up, and when we try a manual uuid_kill, it gives us this lovely error:
-ERR No such channel!
Even though show channels clearly shows that the channel is there. From the bugs I've seen on jira.freeswitch.com, it looks like it may be a code problem. A little more info on our environment/code:
We have a Python Twisted loop that connects to the client so the client can run commands on the server and vice versa. As soon as that Twisted connection dies (the client is closed/disconnected), the channels are killed as well, but we need each channel to die before then: we're taking a lot of calls per second and need them to die when the other end disconnects. We can't close and reopen the client every time a call is done, or reconnect, as that would take way too much time and defeat the purpose of our use of the software.
Once again, this error only started happening when we switched to installing the freeswitch server with apt-get instead of directly from source. This gets a new server up and running much faster, and we would rather not take the extra time to go back to our previous method. Please tell me if there's any code you would like to look at, and ask for any clarification you need, but we would really like this to be fixed soon. Thanks in advance!
Edit: For more clarification, we're mainly using mod_callcenter, mod_conference, and mod_sofia with our software.
Edit 2: For a little more clarification, we're running this on Ubuntu 14.04 Server
We are using an ESL connection to connect to and run commands in freeswitch from Python, and we think that's the root of the problem. We tried exiting the connection, but that destroys both channels.
Also, all of the bugs already filed for this problem on Jira have been closed as not being bugs. I thought I might have a bit more success here, as it is a programming-type question.
You need to reproduce the issue in a test environment and file a bug report in Jira. Ideally, you should also try reproducing it with the latest master branch (only Debian 8 is supported):
https://freeswitch.org/confluence/display/FREESWITCH/Debian+8+Jessie
I had a similar problem when I used mod_perl: a Perl object was referring to a session and was not properly destroyed (if I remember right, I had two Perl objects attached to the same session). That resulted in channels which were impossible to kill.
I suppose you are using an ESL connection between your application and FreeSWITCH, right?
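Not from the original thread, but if the suspicion about the ESL connection is right, a minimal standalone check like the sketch below (assuming the FreeSWITCH ESL Python module is installed; host, password, and the UUID are placeholders) can show whether uuid_kill fails the same way outside the Twisted loop.
import ESL

con = ESL.ESLconnection("127.0.0.1", "8021", "ClueCon")    # placeholder credentials
if con.connected():
    print con.api("show", "channels").getBody()            # list the stuck channels
    result = con.api("uuid_kill", "replace-with-a-stuck-channel-uuid")
    print result.getBody()                                 # expect +OK, or -ERR No such channel!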

What is PyCurl's default timeout

Well, I think the title of the question is very self-explanatory, so you probably don't need to keep reading, but here it goes:
I have been working with PyCurl for a while, and I've always set my timeouts using:
curlConnector = pycurl.Curl()
curlConnector.setopt(pycurl.CONNECTTIMEOUT, 30)
but I have started wondering what the default timeout is, or how to find it, and I haven't seen any satisfactory answer so far. If I don't manually specify it, what is the default timeout? Whatever comes from the socket? (Just in case it's relevant, I work on Ubuntu 12.04 and Python 2.7.)
I downloaded PyCurl. In the doc/ directory of the tarball are a couple of doc files. One of these is doc/curlobject.html, which says setopt "Corresponds to curl_easy_setopt in libcurl". Following that link gets you to http://curl.haxx.se/libcurl/c/curl_easy_setopt.html, which, upon searching for 'CONNECTTIMEOUT', says:
CURLOPT_CONNECTTIMEOUT
Pass a long. It should contain the maximum time in seconds that you allow the connection to the server to take.
This only limits the connection phase, once it has connected, this option is of no more use.
Set to zero to switch to the default built-in connection timeout - 300 seconds.
See also the CURLOPT_TIMEOUT option.
So, I'd say the default timeout is 300 seconds.
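To make the two knobs concrete, here is a small illustrative sketch (the URL is a placeholder): CONNECTTIMEOUT bounds only the connection phase and falls back to libcurl's built-in 300-second default when set to 0, while TIMEOUT bounds the entire transfer.
import pycurl

c = pycurl.Curl()
c.setopt(pycurl.URL, "http://example.com/")
c.setopt(pycurl.CONNECTTIMEOUT, 0)    # 0 = libcurl's built-in 300 s connect timeout
c.setopt(pycurl.TIMEOUT, 60)          # abort the whole transfer after 60 s
c.perform()
c.close()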

set timeout to http response read method in python

I'm building a download manager in Python for fun, and sometimes the connection to the server stays up but the server doesn't send me any data, so the read method (of HTTPResponse) blocks forever. This happens, for example, when I download from a server located outside my country that limits the bandwidth to other countries.
How can I set a timeout for the read method (2 minutes for example)?
Thanks, Nir.
If you're stuck on some Python version < 2.6, one (imperfect but usable) approach is to do
import socket
socket.setdefaulttimeout(10.0) # or whatever
before you start using httplib. The docs are here, and clearly state that setdefaulttimeout is available since Python 2.3 -- every socket made from the time you do this call, to the time you call the same function again, will use that timeout of 10 seconds. You can use getdefaulttimeout before setting a new timeout, if you want to save the previous timeout (including none) so that you can restore it later (with another setdefaulttimeout).
These functions and idioms are quite useful whenever you need to use some older higher-level library which uses Python sockets but doesn't give you a good way to set timeouts (of course it's better to use updated higher-level libraries, e.g. the httplib version that comes with 2.6 or the third-party httplib2 in this case, but that's not always feasible, and playing with the default timeout setting can be a good workaround).
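The save-and-restore idiom described above might look roughly like this (the 10-second value is just an example):
import socket

previous = socket.getdefaulttimeout()    # may be None, meaning "no timeout"
socket.setdefaulttimeout(10.0)           # affects every socket created from now on
try:
    # ... create httplib / urllib2 connections here ...
    pass
finally:
    socket.setdefaulttimeout(previous)   # restore whatever was there before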
You have to set it during HTTPConnection initialization.
Note: in case you are using an older version of Python, you can install httplib2; many consider it a superior alternative to httplib, and it does support timeouts.
I've never used it, though, and I'm just reporting what documentation and blogs are saying.
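For the HTTPConnection route (Python 2.6 and later), the timeout is simply a constructor argument and applies to the underlying socket's blocking operations, including read. A minimal sketch with a placeholder host and path:
import httplib
import socket

conn = httplib.HTTPConnection("example.com", timeout=120)   # 2-minute socket timeout
conn.request("GET", "/some/large/file")
resp = conn.getresponse()
try:
    data = resp.read()        # raises socket.timeout if the server goes silent for 120 s
except socket.timeout:
    print "read timed out"
conn.close()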
Setting the default timeout might abort a download early if it's large, as opposed to only aborting if it stops receiving data for the timeout value. HTTPlib2 is probably the way to go.
5 years later but hopefully this will help someone else...
I was wracking my brain trying to figure this out. My problem was a server returning corrupt content and thus giving back less data than it claimed it had.
I came up with a nasty solution that seems to be working properly. Here it goes:
# NOTE: directly disabling blocking is not necessary, but it represents
# an important piece of the problem, so I am leaving it here.
# http_response.fp._sock.socket.setblocking(0)
http_response.fp._sock.settimeout(read_timeout)
http_response.read(chunk_size)
NOTE: This solution also works for the Python requests library, and presumably for any library built on normal Python sockets (which should be all of them?). You just have to go a few levels deeper:
# resp.raw._fp.fp._sock.socket.setblocking(0)   # again, not strictly necessary
resp.raw._fp.fp._sock.settimeout(read_timeout)
resp.raw.read(chunk_size)
As of this writing, I have not tried the following but in theory it should work:
import requests

resp = requests.get(some_url, stream=True)
# resp.raw._fp.fp._sock.socket.setblocking(0)   # again, not strictly necessary
resp.raw._fp.fp._sock.settimeout(read_timeout)
for chunk in resp.iter_content(chunk_size):
    # do stuff with each chunk
    pass
Explanation
I stumbled upon this approach while reading this SO question about setting a timeout on socket.recv.
At the end of the day, any HTTP request has a socket. For httplib that socket is located at resp.raw._fp.fp._sock.socket. The resp.raw._fp.fp._sock is a socket._fileobject (which I honestly didn't look far into), and I imagine its settimeout method internally sets it on the socket attribute.
