I am building a port scanning program (irrelevant to the question, just explaining the background), and I know the IP of the host, but not which ports are open. Hence, the scan.
It is in the early stages of development, so the error handling is poor, but not so poor that it would explain why Python behaves this way.
It tries to connect to, say, 123.456.7.8 on port 1. Obviously it's a ridiculous port to be open, so it throws an error. The error should be No Route to Host or the like, right? Wrong! It is instead Operation Timed Out!
Okay, let's increase the timeout in case my calculations were incorrect.
...All that did was rinse and repeat!
About 20 minutes later, the timeout is at 20 seconds, and it is still timing out. Really? Why does Python raise a timeout error, though, instead of No Route to Host or something similar?
I need to distinguish between timeouts and connection failures, because there is a difference between late and nowhere. This prevents me from doing so, creating an infinite loop of hurry up and wait.
Whatever shall I do? Wherever shall I go?
Python's socket module is a thin wrapper around your platform's socket API. The issue is unrelated to Python.
You are not guaranteed to get a No Route to Host error. It is common for a firewall to simply drop the incoming packets for a filtered port, which manifests as a timeout in your code. See Drop vs. Reject (ignore the conclusion but read the explanation of what is happening).
To work around this, make multiple concurrent connections with a fixed timeout, or use raw sockets and send the packets yourself (you could use scapy to investigate the behavior).
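For illustration, here is a minimal Python 3 sketch of the first workaround: many concurrent connection attempts with a fixed timeout, classifying the result by exception type. The target address is a placeholder and the port range is arbitrary.

import socket
from concurrent.futures import ThreadPoolExecutor

HOST = '198.51.100.7'  # placeholder target address

def probe(port, timeout=3.0):
    # Attempt one TCP connect and classify the outcome.
    try:
        with socket.create_connection((HOST, port), timeout=timeout):
            return port, 'open'
    except socket.timeout:
        # No reply at all -- typically a firewall silently dropping packets.
        return port, 'filtered (timed out)'
    except ConnectionRefusedError:
        # The host answered with RST, so it is reachable but the port is closed.
        return port, 'closed (refused)'
    except OSError as exc:
        return port, 'error: %s' % exc

with ThreadPoolExecutor(max_workers=100) as pool:
    for port, state in pool.map(probe, range(1, 1025)):
        print(port, state)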
Pseudo-code to better explain the question:
#!/usr/bin/env python2.7
import pycurl, threading

def threaded_work():
    conn = pycurl.Curl()
    conn.setopt(pycurl.TIMEOUT, 10)
    # Make a request to host #1 just to open the connection to it.
    conn.setopt(pycurl.URL, 'https://host1.example.com/')
    conn.perform_rs()
    while not condition_that_may_take_very_long:
        conn.setopt(pycurl.URL, 'https://host2.example.com/')
        print 'Response from host #2: ' + conn.perform_rs()
    # Now, after what may be a very long time, we must request host #1 again
    # with a (hopefully) already established connection.
    conn.setopt(pycurl.URL, 'https://host1.example.com/')
    print 'Response from host #1, hopefully with an already established connection from above: ' + conn.perform_rs()
    conn.close()

for _ in xrange(30):
    # Multiple threads must work with host #1 and host #2 individually.
    threading.Thread(target = threaded_work).start()
I am omitting extra, unnecessary details for brevity so that the main problem stays in focus.
As you can see, I have multiple threads that must work with two different hosts, host #1 and host #2. Mostly, the threads will be working with host #2 until a certain condition is met. That condition may take hours or even longer to be met, and will be met at different times in different threads. Once the condition (condition_that_may_take_very_long in the example) is met, I would like host #1 to be requested as fast as possible with the connection that I have already established at the start of the threaded_work method. Is there an efficient way to accomplish this (I'm open to the suggestion of using two PycURL handles, too)?
Pycurl uses libcurl. libcurl keeps connections alive by default after use, so as long as you keep the handle alive and use that for the subsequent transfer, it will keep the connection alive and ready for reuse.
However, due to modern networks and network equipment (NATs, firewalls, web servers), connections without traffic are often killed off relatively soon, so the chance that an idle connection will still work after "hours" is very slim; it's a rare occurrence. Typically, libcurl will then discover that the connection has been killed in the meantime and create a new one at the next use.
Additionally, and in line with what I've described above, since libcurl 7.65.0 it defaults to not reusing connections that are older than 118 seconds; this is changeable with the CURLOPT_MAXAGE_CONN option. The reason is that such connections barely ever work, so skipping the cycle of keeping them around, detecting them to be dead, and reissuing the request is an optimization.
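For illustration, a minimal pycurl sketch of the reuse pattern described above. The URL is a placeholder, and the commented MAXAGE_CONN line assumes your pycurl build exposes that constant (it maps to CURLOPT_MAXAGE_CONN, available with libcurl 7.65.0+).

import pycurl

conn = pycurl.Curl()
conn.setopt(pycurl.URL, 'https://host1.example.com/')
conn.perform_rs()  # first transfer opens the connection and caches it in the handle

# If supported by your libcurl/pycurl build, raise the 118-second reuse limit
# (this rarely helps in practice, for the reasons given above):
# conn.setopt(pycurl.MAXAGE_CONN, 7200)

# Much later: reuse the same handle. libcurl reuses the cached connection if it
# is still alive, and transparently reconnects if it was killed in the meantime.
conn.setopt(pycurl.URL, 'https://host1.example.com/')
conn.perform_rs()
conn.close()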
Using Airflow worker and webserver/scheduler as Docker images running on Kubernetes Engine on EC2.
We have a task using KubernetesPodOperator which is resource intensive and runs every 15 minutes.
Got this error as an email from airflow-worker:
Try 2 out of 3
Exception:
('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
Log: Link
Host: airflow-worker-deployment-123456789
Log file: /usr/local/airflow/logs/DAG_NAME/TASK_NAME/2019-03-14T10:50:00+00:00.log
Mark success: Link
Any idea what it can be?
So, better late than never:
It is because of a known bug in KubernetesPodOperator.
To avoid this behavior you have to set the operator's get_logs parameter to False (the default value is True); see the sketch after the links below.
Details here:
https://issues.apache.org/jira/browse/AIRFLOW-3534
https://issues.apache.org/jira/browse/AIRFLOW-5571
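For reference, a minimal sketch of the workaround; the import path matches Airflow 1.10.x, and the task id, image, and dag object are placeholders.

from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

heavy_task = KubernetesPodOperator(
    task_id='heavy_task',                  # placeholder task id
    name='heavy-task',
    namespace='default',
    image='my-registry/heavy-job:latest',  # placeholder image
    get_logs=False,  # work around AIRFLOW-3534 / AIRFLOW-5571
    dag=dag,         # your existing DAG object
)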
Well, it means that the program logic didn't get the data it expected to receive from some socket. This can mean anything, from an intermittent network problem to the data simply not arriving in time and the logic not being programmed to wait for it. If the task is automatically retried, you may not even need to worry about intermittent problems.
If you wish to diagnose this further, you need to gather some diagnostic information. Problems are always diagnosed by the same scenario:
Identify the exact place in the program when the problem manifests itself.
Examine the program's state at that moment and find out which of the values is wrong.
Trace the erroneous value back to its origin.
The first can be identified from the stack trace and/or by searching the codebase for the relevant logic. The second, by debugging or debug printing. The third is usually done by rerunning the program with breakpoints set earlier, at the step in the logic that produces the erroneous value; in your case you can only do that very slowly, by waiting for the problem to happen again, so you are forced to make educated guesses from the codebase.
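As an illustration of steps 1 and 2, a small hypothetical wrapper (names are made up) that records the full traceback and whatever state you care about at the moment of failure, so the next occurrence does not go to waste:

import logging
import traceback

log = logging.getLogger(__name__)

def read_with_diagnostics(read_response, **context):
    # read_response is whatever call raises the IncompleteRead; context holds
    # the values you want to inspect (URL, bytes expected, retry number, ...).
    try:
        return read_response()
    except Exception as exc:
        log.error('read failed: %r, context=%r\n%s',
                  exc, context, traceback.format_exc())
        raise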
From time to time I suddenly have a need to connect to a device's console via its serial port. The problem is, I never remember what port settings (baud rate, data bits, stop bits, etc...) to use with each particular device, and documentation never seems to be lying around when it's really needed.
I wrote a Python script which uses a simple brute-force method (i.e. iterates over all possible settings, sends some test input, and displays the response for a human to decide whether it makes sense), but:
it takes a long time to complete
does not always work (perhaps port reset/timeout issues)
just does not seem like a proper way to do this :)
So the question is: does anyone know of a procedure to auto-detect what port settings the remote device is using?
Although part 1 is not a direct answer to your question:
There are devices which have an autodetection method (called auto-bauding) built in; that means: send a character using your current settings (9k6, 115k2, ...) to the device, and chances are high that the device will answer with your (!) settings. I've seen this on HP switches.
Second approach: try to reorder the connection possibilities. E.g. chances are high that the other end uses 9k6 with no hardware handshake, but it is less likely to use 38k4 with software XON/XOFF.
If you break your tries down to just a few likely combinations, the "brute force" method will be much more efficient.
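A minimal pySerial (3.x) sketch of that reduced brute force, trying the most likely 8N1 combinations first; the device path and the probe string are placeholders, and a human still judges which reply looks sane.

import serial

PORT = '/dev/ttyUSB0'  # placeholder device
# Most consoles use 8 data bits, no parity, 1 stop bit; only the baud rate varies.
LIKELY_BAUDS = [9600, 115200, 19200, 38400, 57600]

for baud in LIKELY_BAUDS:
    with serial.Serial(PORT, baud, bytesize=serial.EIGHTBITS,
                       parity=serial.PARITY_NONE, stopbits=serial.STOPBITS_ONE,
                       timeout=1) as port:
        port.write(b'\r\n')        # nudge the device to print a prompt
        reply = port.read(64)      # returns whatever arrived within the timeout
        print(baud, repr(reply))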
I'm using Python 3.2.3 on Windows 7, and one piece of code I have connects to a server with a blocking socket, with a user-specified timeout value. The code is simply:
testconn = socket.create_connection((host, port), timeout)
The code works fine, apart from the odd fact that timing out seems to take longer than it should on invalid requests. I deliberately tried connecting to www.google.com:59855 (a random port should mean it keeps trying to connect until it reaches the timeout), with a timeout of 5 seconds, but it seemed to take at least 15 seconds to time out.
Are there any possible reasons for this, and/or any fixes? (It's not a huge problem if it's not fixable, but a solution would be appreciated nevertheless.) Thanks in advance.
This isn't an issue specific to Python 3 or Windows. Take a look at the docs for create_connection(): http://docs.python.org/library/socket.html#socket.create_connection
The important snippet is:
if host is a non-numeric hostname, it will try to resolve it for both AF_INET and AF_INET6, and then try to connect to all possible addresses in turn until a connection succeeds.
It resolves the name using socket.getaddrinfo. If you run
socket.getaddrinfo('google.com', 59855, 0, socket.SOCK_STREAM)
You'll probably get a few results returned. When you call socket.create_connection, it will iterate over all of those results, each waiting for timeout seconds until it fails. Because it waits timeout seconds for EACH result, the total time is obviously going to be greater than timeout.
If you call create_connection with an IP address rather than host name, e.g.
testconn = socket.create_connection(('74.125.226.201', 59855), timeout=5)
you should get your 5 second timeout.
And if you're really curious, take a look at the source for create_connection. It's pretty simple and you can see the loop that is causing your problems:
https://github.com/python/cpython/blob/3.2/Lib/socket.py#L408
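If you want to keep passing a hostname but still have one overall limit, a hand-rolled sketch that spreads a single deadline across all resolved addresses (instead of giving each its own full timeout) could look like this. create_connection_with_deadline is a hypothetical helper, not part of the socket module.

import socket
import time

def create_connection_with_deadline(host, port, timeout):
    # Like socket.create_connection(), but with ONE overall deadline shared
    # by every address that getaddrinfo() returns.
    deadline = time.time() + timeout
    last_err = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        remaining = deadline - time.time()
        if remaining <= 0:
            break
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(remaining)  # only the time left until the deadline
        try:
            sock.connect(addr)
            return sock
        except socket.error as err:
            last_err = err
            sock.close()
    if last_err is not None:
        raise last_err
    raise socket.timeout('connection timed out')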
I have to filter and modify network traffic using the Linux kernel's libnetfilter_queue (precisely, the Python binding) and dpkt, and I'm trying to implement delayed packet forwarding.
Normal filtering works really well, but if I try to delay packets with a function like this:
def setVerdict(pkt, nf_payload):
    nf_payload.set_verdict_modified(nfqueue.NF_ACCEPT, str(pkt), len(pkt))

t = threading.Timer(10, setVerdict, [pkt, nf_payload])
t.start()
it crashes without throwing any exception (it is surely a low-level crash). Can I implement the delay using libnetfilter directly like this, or must I copy pkt, drop it, and send the copy using a standard socket.socket.send()?
Thank you
Sorry for the late reply, but I needed to do something like this, although slightly more complicated. I used the C version of the library, copied packets to a buffer inside my program, and then issued a DROP verdict. After a timeout corresponding to your delay, I reinject the packet using a raw socket. This works fine and seems quite efficient.
I think the reason for your crash was that you didn't issue a verdict fast enough.
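A rough Python sketch of that drop-and-reinject idea, assuming the same nfqueue binding as in the question (get_data()/set_verdict()); the byte offsets assume an IPv4 packet with a standard 20-byte header, and the queue/callback wiring is left out.

import socket
import threading
import nfqueue

# Raw socket for re-injection; IP_HDRINCL because the queued payload already
# contains the full IP header.
raw_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
raw_sock.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)

def reinject_later(raw_bytes, delay=10):
    dst_ip = socket.inet_ntoa(raw_bytes[16:20])  # destination address from the IPv4 header
    threading.Timer(delay, raw_sock.sendto, [raw_bytes, (dst_ip, 0)]).start()

def callback(nf_payload):
    data = nf_payload.get_data()             # copy the packet out of the queue
    nf_payload.set_verdict(nfqueue.NF_DROP)  # answer the kernel immediately
    reinject_later(data)                     # send the copy after the delay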
I can't answer your question, but why not use the "netem" traffic-queue module on the outgoing interface to delay the packet?
It is possible to configure tc queues to apply different policies to packets which are "marked" in some way; the normal way to mark such packets is with a netfilter module (e.g. iptables or nfqueue).