How can I ignore server response to save bandwidth? - python

I am using a server to send a piece of information to another server every second. The problem is that the other server's response is a few kilobytes, and this consumes bandwidth on the first server (about 2 GB in an hour). I would like to send the request and ignore the response (not even receive it, to save bandwidth).
I use a small Python script for this task, using urllib. I don't mind using any other tool, or even another language, if it lets me make the request only.

A 5K reply is small and is probably below the standard TCP window size of your OS. This means that even if you close the connection right after sending the request and checking just the first bytes of the reply (to be sure the request was really received), the server has probably already sent the whole answer, and the packets are already on the wire or on your computer.
If you cannot control (i.e. trim down) the server's reply to your notification, the only alternative I can think of is to add another server on the remote machine that waits for a simple command, performs the real request locally, and sends back only the result code. This can be done very easily, maybe even with just bash/perl/python, using for example netcat/wget locally.
By the way, there is something strange in your math, as Glenn Maynard correctly wrote in a comment: one 5 KB reply per second adds up to about 18 MB in an hour, not 2 GB.

For HTTP, you can send a HEAD request instead of GET or POST:
import urllib2
request = urllib2.Request('https://stackoverflow.com/q/5049244/')
request.get_method = lambda: 'HEAD' # override get_method
response = urllib2.urlopen(request) # make request
print response.code, response.url
Output
200 https://stackoverflow.com/questions/5049244/how-can-i-ignore-server-response-to-save-bandwidth
See How do you send a HEAD HTTP request in Python?
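The snippet above is Python 2. On Python 3 the same request can be written with urllib.request, which accepts the method directly. This is a minimal sketch; head_status is a hypothetical helper name, and it only helps if the remote server actually processes the notification on a HEAD request.

```python
import urllib.request


def head_status(url, timeout=10):
    """Send a HEAD request and return (status, final URL, body length)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A HEAD response carries headers only, so nothing is left to read.
        body = response.read()
        return response.status, response.url, len(body)
```

Calling head_status with the URL from the example above would return the status code and the URL that answered, with a body length of zero.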

Sorry, but this does not make much sense and is likely a violation of the HTTP protocol. I consider such an idea weird and broken by design. Either make the remote server shut up, or have your application (or whatever is running on the remote server) talk at a different protocol level, using a smarter protocol with less bandwidth usage. Anything else can hardly be considered anything but nonsense.

Related

When is the content of a GET request received in python?

I am fairly new to computer networking and want to use the python requests library for downloading large files from an external FTP server. I have a conceptual question as to when the content of a large file is received and how the client tells the server when to send over the content.
My code looks somewhat like
import requests
...
response = requests.get(url_to_very_large_file, stream=True)
...
with open(save_path, "wb") as file:
    # requests exposes chunked reading as iter_content (there is no iter_chunks)
    for chunk in response.iter_content(chunk_size=chunk_size):
        file.write(chunk)
Now response arrives back from the server very quickly (less than a second), but the content of the file (say 2 GB heavy for the sake of argument) surely cannot arrive that fast. I'm also confused that response already has a content attribute. What happens under the hood?
More precisely:
1. What is in response.content?
2. Does the server now bombard my client with the 2 GB content right away, or is another request sent to the server when I ask for response.iter_content or response.content.read()? At which point does the server start sending over the 2 GB of content?
3. Does the server know in which chunk_size I am reading/expecting the files?
4. Where are the chunks stored in the meantime, if they are received by the client but not read into memory?
1. The response.content attribute contains the bytes returned by the remote server. It is a property, so if you sent the request with stream=True, it does not hold the content when the response object is created. The data is pulled from the server at the moment you first access it.
2. When you send a request to a server, you establish a connection through which the server sends data. This doesn't have to happen all at once: if your client is not reading the data into its RAM, the server will wait for you for a while. By iterating over the response in chunks (iter_content in requests) you pull the data from the server one chunk at a time.
3. It doesn't, and considering how a TCP connection works, it doesn't need to.
4. Data that has arrived but has not yet been read by your program sits in the operating system's socket receive buffer. Once that buffer is full, TCP flow control stops the server from sending more until you read some, so the bulk of the file is not on your machine until you ask for it.
If you have already learnt other languages like Java, you could think of a property as a getter/setter, but more integrated into the language. Check the post I linked above for a better explanation.
It might be helpful to learn how TCP connections and sockets work, since those are what do all the work under the hood.
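To see the buffering for yourself, here is a small sketch with plain sockets on localhost (no requests involved). The helper names connected_pair, send_until_blocked, drain and wait_writable are hypothetical, and the exact number of bytes accepted depends on your OS buffer sizes. The sender pushes data while the receiver reads nothing; the kernel accepts only as much as its buffers hold, then refuses more until the receiver drains some.

```python
import select
import socket

CHUNK = 64 * 1024         # size of each send attempt
CAP = 256 * 1024 * 1024   # safety limit so the send loop always terminates


def connected_pair():
    """Return (sender, receiver): the two ends of a TCP connection on localhost."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    listener.close()
    return sender, receiver


def send_until_blocked(sender):
    """Send without blocking until the kernel refuses data; return bytes accepted."""
    sender.setblocking(False)
    payload = b"x" * CHUNK
    accepted = 0
    while accepted < CAP:
        try:
            accepted += sender.send(payload)
        except BlockingIOError:
            break   # buffers are full and the receiver is not reading
    return accepted


def drain(receiver, nbytes, timeout=5):
    """Read and discard up to nbytes from the receiver; return bytes read."""
    receiver.settimeout(timeout)
    remaining = nbytes
    while remaining > 0:
        data = receiver.recv(min(CHUNK, remaining))
        if not data:
            break
        remaining -= len(data)
    return nbytes - remaining


def wait_writable(sender, timeout=5):
    """Return True once the sender can accept more data again."""
    _, writable, _ = select.select([], [sender], [], timeout)
    return bool(writable)
```

Calling send_until_blocked on a fresh pair stops well short of CAP, and wait_writable only succeeds again after drain has emptied the buffers. A streamed download behaves the same way: the server is held back until your code reads.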

Is it possible to recreate a request from the packets programmatically?

For a script I am making, I need to be able to see the parameters that are sent with a request.
This is possible through Fiddler, but I am trying to automate the process.
Here are some screenshots to start with. As you can see in the first picture of Fiddler, I can see the URL of a request and the parameters sent with that request.
I tried to do some packet sniffing with scapy with the code below to see if I can get a similar result, but what I get is in the second picture. Basically, I can get the source and destination of a packet as ip addresses, but the packets themselves are just bytes.
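import time
from scapy.all import AsyncSniffer

def sniffer():
    # Capture up to 10 packets in the background, printing a one-line summary of each
    t = AsyncSniffer(prn=lambda x: x.summary(), count=10)
    t.start()
    time.sleep(8)
    results = t.results
    print(len(results))
    print(results)
    print(results[0])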
From my understanding, after we establish a TCP connection, the request is broken down into several IP packets and then sent over to the destination. I would like to be able to replicate the functionality of Fiddler, where I can see the url of the request and then the values of parameters being sent over.
Would it be feasible to recreate the information of a request through only the information gathered from the packets?
Or is this difference because the sniffing is done on Layer 2, and then maybe Fiddler operates on Layer 3/4 before/after the translation into IP packets is done, so it actually sees the content of the original request itself and the result of the combination of packets? If my understanding is wrong, please correct me.
Basically, my question boils down to: "Is there a python module I can use to replicate the features of Fiddler to identify the destination url of a request and the parameters sent along with that request?"
The sniffed traffic is HTTPS traffic - therefore just by sniffing you won't see any details on the HTTP request/response because it is encrypted via SSL/TLS.
Fiddler is a proxy with HTTPS interception, which is something totally different from sniffing traffic at the network level. To the client application Fiddler mimics the server, and to the server Fiddler mimics the client. This allows Fiddler to decrypt the requests/responses and show them to you.
If you want to perform request interception at the Python level, I would recommend using mitmproxy instead of Fiddler. This proxy can also perform HTTPS interception, but it is written in Python and is therefore much easier to integrate into your Python environment.
Alternatively if you just want to see the request/response details of a Python program it may be easier to do so by setting the log-level in an appropriate way. See for example this question: Log all requests from the python-requests module
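For the logging route, a minimal sketch might look like this. It assumes the program makes its requests through the requests library (which uses urllib3) or through http.client; enable_http_debug_logging is a hypothetical helper name.

```python
import http.client
import logging


def enable_http_debug_logging():
    """Turn on verbose output for outgoing HTTP requests made by this process."""
    # http.client prints the request line and headers to stdout at debuglevel 1.
    http.client.HTTPConnection.debuglevel = 1

    # urllib3 reports connection and request details through the logging module.
    logging.basicConfig(level=logging.DEBUG)
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    urllib3_logger.propagate = True
    return urllib3_logger
```

Call it once at startup, before the first request is sent. This shows what your own program sends, so it does not help with traffic from other applications.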

Connection reset when server doesn't fully consume request body

I'm now familiar with the general cause of this problem from another SO answer and from the uWSGI documentation, which states:
If an HTTP request has a body (like a POST request generated by a
form), you have to read (consume) it in your application. If you do
not do this, the communication socket with your webserver may be
clobbered.
However, I don't understand what exactly is happening at the TCP level for this problem to occur. Not knowing the details of this process, I would assume the server can simply discard what remains in the stream, but that's obviously not the case.
If I consume only part of the request body in my application and ultimately return a 200 response, a web browser will report a connection reset error. Who reset the connection? The webserver or the client? It seems like all the data has been sent by the client already, but the application has just not exhausted the stream. Is there something that happens when the stream is exhausted in the application that triggers the webserver to indicate it has finished reading?
My application is Python/Flask, but I've seen questions about this from several languages and frameworks. For example, this fails if exhaust() is not called on the request stream:
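from flask import Flask, jsonify, request
import pandas

app = Flask(__name__)

@app.route('/upload', methods=['POST'])
def handle_upload():
    file = request.stream
    pandas.read_csv(file, nrows=100)  # reads only part of the body
    response = {}  # Do stuff (placeholder result)
    file.exhaust()  # consume whatever is left of the request body
    return jsonify(response)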
While there is some buffering throughout the chain, large file transfers are not going to complete until the receiver has consumed them. The buffers will fill up, and packets will be dropped until the buffers are drained. Eventually, the browser will give up trying to send the file and drop the connection.
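Exhausting the stream amounts to reading and discarding whatever is left of the request body before replying. A minimal sketch with a generic file-like object follows; this exhaust function is a hypothetical stand-alone helper for illustration, not the framework's own implementation.

```python
import io


def exhaust(stream, chunk_size=64 * 1024):
    """Read and discard the rest of a file-like stream; return bytes discarded."""
    discarded = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty read: nothing left to consume
            return discarded
        discarded += len(chunk)


# Usage: the application reads the first 100 bytes, then drains the rest.
body = io.BytesIO(b"a" * 150000)
head = body.read(100)
leftover = exhaust(body)
```

Reading in fixed-size chunks keeps memory use flat no matter how large the unread part of the upload is.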

Redirecting HTTP requests to device without static/public IP

I'm using a service that sends me data from users over webhooks. On any user interaction, the service hits my URL with an HTTP request carrying the data in POST/GET, and then expects a text/JSON response to show back to the user. The response has to arrive within a few seconds; otherwise the HTTP request times out and the service has no way of finding out what the response to the user should be.
The problem is that I'm no longer processing this data on my server with a public IP. I need to do it on my RPi, which keeps moving, which means it gets a different IP every few hours, and mostly not a public one.
I'm sure I will still need to use the server with the public IP to redirect these requests to my RPi, and I have a few ideas, but I don't know which are reliable or whether they would even work.
1. Let the API talk to my server and save the data. Then have the RPi constantly ask my server whether there is any new data. Probably the dumbest idea: not ideal over a metered connection, probably a slower reply, and it will be harder to return the RPi's reply in the HTTP request made by the API.
2. Have a (Python) script running on my server that will a) serve as a socket server that the RPi connects to, and b) run a SimpleHTTPRequestHandler to process requests from the API, send them to the socket, and then reply with the RPi's reply. Probably an easy way to keep a connection between my server and the RPi, allowing me to pass data in both directions.
3. Open an SSH tunnel between the RPi and my server. This way, I could process the requests from the service directly on my RPi. But how reliable is this solution? (Keeping it alive, opening the tunnel automatically, etc.; probably a question for the Super User forum.)
I'm thinking of going with choice 3 if that is possible, but first I'd like to hear what you guys think. Is this a good and reliable idea? Are there any better ways I don't know about? Has anybody already faced this problem?
To sum it up:
Something sends an HTTP request to a public IP. I need to process this request (and reply to it) in a Python script on a device without a public IP. I have a server with a public IP that could be used as a bridge. I don't much care what runs on the server, as long as it can redirect these requests.
Thanks

Send TCP messages at certain rate with Python

I am trying to generate some traffic to a server by sending TCP messages to it.
For this, I am using a Python script which opens a TCP socket and then sends some data over it. After receiving a reply, the TCP connection gets closed.
Question: I would like to be able to predefine a rate with which the script will be sending the requests to the server, eg: 5 messages per second. However, I do not have a clue how to script this via Python :(.
Does anyone have an idea how to do this? (A short example would be super! ;)
Thanks in advance.
Note: I might need to add an extra difficulty: since the server has to reply, I guess I have to make the script work asynchronously. That way, I can send the requests out without having to wait for a reply to the previous request.
What you're looking for is an implementation of the token bucket algorithm. It's analogous to a bucket with a fixed capacity, where each consumer can't perform the action until it gets a token, and the bucket is refilled at a fixed rate.
The algorithm is easy to implement, but the link below has an example:
http://code.activestate.com/recipes/511490-implementation-of-the-token-bucket-algorithm/
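In case the link goes stale, here is a minimal sketch of the idea. It is not the linked recipe: TokenBucket, send_at_rate, and their parameters are hypothetical names, and it is single-threaded (no locking). The clock and sleep functions are injectable only to make it easy to test.

```python
import time


class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def _refill(self):
        """Add the tokens earned since the last call, up to the capacity."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, n=1):
        """Take n tokens if available; return True on success, False otherwise."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

    def wait_time(self, n=1):
        """Seconds until n tokens will be available (0.0 if they already are)."""
        self._refill()
        return max(0.0, (n - self.tokens) / self.rate)


def send_at_rate(send, messages, bucket, sleep=time.sleep):
    """Call send(message) for each message, never faster than the bucket allows."""
    for message in messages:
        while not bucket.try_consume():
            sleep(bucket.wait_time())
        send(message)
```

For 5 messages per second you would create TokenBucket(rate=5, capacity=1) and pass your own function that opens the socket and sends one message. A larger capacity allows short bursts above the average rate. Waiting for each reply inside send still limits throughput, so a slow server would need the sends moved to threads or an async loop.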
