Does Python requests session keep page active?

I am writing a script that waits on a website's queue page before I can access the site's contents.
The queue page lets people in at random. To increase my chance of getting in sooner, I am writing a multithreaded script in which each thread waits in line.
The first thing that came to mind: would session.get() work in this case?
If I send a GET request through the session every 10 seconds, would I hold my position in the queue, or would I end up at the back?
Some information about the website: it lets people in at random. I am not sure whether refreshing the page resets your chance. The best approach would probably be to leave the page open and let it do its thing.
I could use PhantomJS, but I would rather not have over 100 headless browsers open, slowing down my program and computer.

You don't need to keep re-sending requests through the session. As long as the Python application keeps running, the session object and the cookies it holds stay alive, so you should be good.
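If you do decide to poll, a minimal sketch could look like the one below. It assumes admission can be detected from the response; the is_admitted callback is hypothetical and site-specific. It reuses one session object so that cookies set by the queue page are sent back on every request. Whether the site keeps or resets your place on each request is something only the site's behavior can tell you.

```python
import time


def wait_in_queue(session, url, is_admitted, interval=10.0, max_attempts=30,
                  sleep=time.sleep):
    """Poll `url` through one shared session until `is_admitted(response)` is true.

    `session` is anything with a requests-style get(url, timeout=...) method,
    for example a requests.Session. Returns the admitted response, or None if
    `max_attempts` polls went by without being admitted.
    """
    for attempt in range(1, max_attempts + 1):
        # The same session is used every time, so its cookies are re-sent.
        response = session.get(url, timeout=30)
        if is_admitted(response):
            return response
        if attempt < max_attempts:
            sleep(interval)  # wait before the next poll
    return None
```

With requests installed, this might be called as wait_in_queue(requests.Session(), "https://example.com/queue", lambda r: "queue" not in r.url). Both the URL and the admission check are placeholders.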

Related

Django : Stop calling same view if it's already called

I have a view that gets called when a user clicks a button. But if the user clicks the button twice, the view gets called again even if the first call is still executing.
This produces a problem (if I am correct): it stops execution of the first call and starts executing the second. How can I stop the view from being called twice?

When a request is sent to the server, Django picks it up and sends a response once processing finishes. If no one is there to receive the response, nothing more happens, but the processing has already been done. If it matters that the result is actually received, you should ask for confirmation.
When the user sends a second request, both requests are processed and both return a response. If you are working with an API and your frontend isn't rendered by the backend server, the user will probably receive both responses, and what happens next depends on the frontend code: they might see both responses, just one, or the first one might be replaced as soon as the second arrives.
This can lead to problems depending on what the view does. For example, if you are updating a database record and the update can only happen once, you will probably get an error on the second attempt. It is very rare for the update step of both requests to run at exactly the same moment, although that depends on your database, your code, and many other things.
There are many ways to handle this, most of which depend on your situation, but here are a few options:
1 - Set the button's display to none so the user can't click it again. This is very simple and works most of the time, unless users are intentionally trying to hurt your system.
2 - Redirect the user to another page and wait for the response there.
3 - If your view does heavy processing that is expensive to run, create a queue system with per-user limits. This is usually done in large projects with many users.
4 - Use a rate-limiting system to deny too many requests at once or to block abnormal traffic.
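On the server side, one way to sketch the "only one call at a time per user" idea is a small decorator that refuses a second call while the first is still running. Everything named here (single_flight, InProcessLocks, key_func, on_busy) is hypothetical and not part of Django. The in-process store only works within a single process; in a multi-process deployment a shared store would be needed, and Django's cache.add(), which stores a key only if it is absent and reports whether it did, has the same add/delete shape.

```python
import functools
import threading


class InProcessLocks:
    """Stand-in for a shared cache: add() succeeds only if the key is absent."""

    def __init__(self):
        self._keys = set()
        self._mutex = threading.Lock()

    def add(self, key, value=None, timeout=None):
        # `value` and `timeout` are accepted for API compatibility but unused.
        with self._mutex:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True

    def delete(self, key):
        with self._mutex:
            self._keys.discard(key)


def single_flight(store, key_func, on_busy):
    """Reject a call while another call with the same key is still running."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request, *args, **kwargs):
            key = "inflight:%s" % key_func(request)
            if not store.add(key, 1, 300):
                return on_busy(request)  # first call has not finished yet
            try:
                return view(request, *args, **kwargs)
            finally:
                store.delete(key)  # always release, even if the view raises
        return wrapper
    return decorator
```

In a Django view, key_func could return request.user.pk and on_busy could return an HTTP 429 response; those choices are assumptions, not something the answer above prescribes.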

My advice: use jQuery.
$("#btnSubmit").click(function (event) {
    event.preventDefault();
    // disable the submit button
    $("#btnSubmit").prop("disabled", true);
    // call your view using AJAX here
});
Then, once the AJAX call has finished, re-enable the button:
$("#btnSubmit").prop("disabled", false);

How to make Selenium WebDriver not to wait for whole page to load

I am using Python 2.7 with Selenium WebDriver for Firefox, and I have a problem I can't solve or find solved on the internet.
My task is to open around 10k web pages (ADSL router web interfaces, reached via IP address) and upload new firmware. I wrote the code, but to finish it I need to make Selenium WebDriver stop waiting indefinitely for the page to load. Instead it should wait 2 minutes (the time needed for the new firmware to upload) and then proceed to the next step.
I considered letting it wait indefinitely for the router to reconnect (much slower, but something I could do without help). The catch is that after I click the upload button, uploading the new firmware takes 2 minutes, then the router reboots to apply the changes (less than 2 minutes), then it tries to reconnect (around 10 seconds), and if it then gets a different IP address the page never loads and my program waits forever.
So I want to skip all of that and have the program proceed to the next router after the first 2 minutes. Can it be done? I read something about "pageLoadingStrategy" but couldn't understand how to use it.
Please tell me if anything is unclear, because English is not my native language. Below is the code sample; after button.submit() it should wait 2 minutes and then proceed instead of waiting forever:
def firmware_upload():
    global ip
    br.get("http://" + ip + "/upload.html")
    button = br.find_element_by_xpath('//input[@type="file" and @name="filename"]')
    button.send_keys("/home/djura/Downloads/KW5815A_update_140417")
    button.submit()
    # Message means "Software UPDATE done!"
    print ("Odradjen UPDATE SOFTWARE-a!")
    return

See if this works:
from selenium.common.exceptions import TimeoutException

def firmware_upload():
    global ip
    # Give up on any page load after 2 minutes (120 seconds)
    br.set_page_load_timeout(120)
    try:
        br.get("http://" + ip + "/upload.html")
        button = br.find_element_by_xpath('//input[@type="file" and @name="filename"]')
        button.send_keys("/home/djura/Downloads/KW5815A_update_140417")
        button.submit()
        # Message means "Software UPDATE done!"
        print("Odradjen UPDATE SOFTWARE-a!")
    except TimeoutException:
        print("2min over")

The problem is probably that you are using button.submit(), which, if I'm not mistaken, waits for the action to return. Instead, find the actual submit button and click it with click(), e.g.
submit_button = br.find_element_by_id('SUBMIT_BTN_ID')
submit_button.click()
P.S. the fact that in your example code your button variable actually refers to an input element is misleading.

Trying to get a loop running concurrently with web.py

At this stage in my little project, I want to get web.py to display some information retrieved from some sensors attached to my Raspberry Pi. I want the information to be retrieved regularly, so I put a loop at the bottom of my code that refreshes the sensors using my library, sticks the readings in a dict and sleeps for 10 seconds.
while True:
    onewiretemp.refreshSensor()
    temperatures = onewiretemp.sensorTemp
    time.sleep(10)
Unfortunately, the side effect of this is that when I run my program and try loading a page, my loop runs and the page never loads. I also have to manually kill the process to stop it. I assume this is because the loop is running indefinitely.
The rest of my code is basically the tutorial and some templating stuff from the web.py website. That part works when I disable the loop and manually add in the data formatted the same as the output from my onewiretemp module.
Is there a better way of doing this that doesn't break everything?

Run the sensor poll loop in a separate thread. Protect temperatures with a lock when you read or write it. Your main web.py script can grab the lock and get the latest data.
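A minimal sketch of that approach, using only the standard library. The SensorPoller class is hypothetical, and read_sensor stands in for whatever refreshes the sensors and returns the readings (in the question, calling onewiretemp.refreshSensor() and then reading onewiretemp.sensorTemp).

```python
import threading


class SensorPoller:
    """Poll a sensor in a background thread and keep the latest reading."""

    def __init__(self, read_sensor, interval=10.0):
        self._read_sensor = read_sensor
        self._interval = interval
        self._lock = threading.Lock()
        self._latest = None
        self._stop = threading.Event()
        # daemon=True so the thread does not keep the process alive on exit
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def latest(self):
        # Called from the web handler: take the lock, return the last reading.
        with self._lock:
            return self._latest

    def _run(self):
        while not self._stop.is_set():
            reading = self._read_sensor()
            with self._lock:
                self._latest = reading
            # Sleeps for the interval, but wakes up early if stop() is called.
            self._stop.wait(self._interval)
```

The poller would be started once before the web.py application runs, and each request handler would call latest() instead of touching the sensors directly.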

long time running python script

I have an application made up of the following parts:
client -> nginx -> uwsgi (python)
Some of the Python scripts can run for a long time (2-6 minutes). After a script finishes I should return its content to the client, but the connection breaks with the error "504 Gateway Timeout". What can I use to avoid this error?

So is your goal to reduce the run time of the scripts, or to not have them time out? Browsers are going to give up on a 6 minute request no matter what you try.
Perhaps try doing the work on the server, and then polling for progress with AJAX requests?
Or, if possible, try optimizing the scripts. For example, if you have some horribly slow SQL stuff going on, try cleaning that up.
Otherwise, without more information, a more specific answer is hard to give.
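The "do the work on the server and poll for progress" idea could be sketched as follows. JobRegistry is a hypothetical in-process helper: one request handler would call submit() and return the job id immediately, and a second handler (the one the AJAX poll hits) would return status(job_id). It assumes a single server process with threads enabled; with several worker processes the job state would have to live somewhere shared, such as a database or a task queue.

```python
import threading
import uuid


class JobRegistry:
    """Run long tasks in background threads and expose their status for polling."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, func, *args):
        """Start func(*args) in the background and return a job id right away."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"state": "running", "result": None}

        def run():
            try:
                update = {"state": "done", "result": func(*args)}
            except Exception as exc:
                update = {"state": "failed", "result": repr(exc)}
            with self._lock:
                self._jobs[job_id] = update

        threading.Thread(target=run, daemon=True).start()
        return job_id

    def status(self, job_id):
        """Return a copy of the job's state, or None for an unknown id."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job else None
```

Because every HTTP request now returns quickly, neither nginx nor the browser sees a request long enough to time out.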

I once set up a system where the "main page" contained an iframe that showed the output of the long-running program as text/plain. I think the handler for the iframe content was a Python CGI script which emitted all headers and then the program output line by line, running under an Apache server.
I don't know whether this would work under your configuration.

This depends heavily on your server setup (i.e. how easy it is to push data back to the client), but is it possible, while your lengthy application is running, to periodically send some "null" content (e.g. plain newlines, assuming your output is HTML) so that the browser thinks this is just a slow connection and not a stalled one?
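A rough sketch of that idea as a generator that a WSGI application could return as its response body. The name stream_with_heartbeat and its parameters are hypothetical. It assumes the server streams each yielded chunk as it is produced; if nginx buffers the upstream response, the filler bytes may not reach the client, so that setting is worth checking.

```python
import threading


def stream_with_heartbeat(work, interval=5.0, heartbeat=b"\n"):
    """Yield `heartbeat` every `interval` seconds until work() returns,
    then yield the value work() returned (expected to be bytes)."""
    outcome = {}

    def run():
        try:
            outcome["value"] = work()
        except Exception as exc:
            outcome["error"] = exc

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    while True:
        worker.join(interval)  # wait up to `interval` seconds for the work
        if not worker.is_alive():
            break
        yield heartbeat  # still running: send filler so the link looks alive
    if "error" in outcome:
        raise outcome["error"]
    yield outcome["value"]
```

Newlines are harmless filler in HTML output, which is why they are the default here; a different response format would need a different heartbeat.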

How to check if remote file exists behind a proxy

I am writing an app that connects to a web server (I own the server) and sends information provided by the user; the server processes that information and sends the result back to the application. The time needed to produce the result depends on the user's request (from a few seconds to a few minutes).
I use an infinite loop to check whether the file exists (maybe there is a smarter approach; maybe I could estimate the maximum time a request could take and avoid the infinite loop).
The important part of the code looks like this:
import time
import mechanize
br = mechanize.Browser()
br.set_handle_refresh(False)
proxy_values = {'http': 'proxy:1234'}
br.set_proxies(proxy_values)
while True:
    try:
        result = br.open('http://www.example.com/sample.txt').read()
        break
    except Exception:
        # file not available yet; retry after the sleep below
        pass
    time.sleep(10)
Behind a proxy the loop never ends, but if I change the code to something like this,
time.sleep(200)
result=br.open('http://www.example.com/sample.txt').read()
i.e. I wait long enough to ensure the file has been created before trying to read it, I do get the file :-)
It seems that once mechanize has asked for a file that does not exist, every later request for it also fails to return the file...
I replicated the same behavior using Firefox: I ask for a non-existent file, then I create that file (remember, I own the server...), and I still cannot get it.
And with both mechanize and Firefox I can still get deleted files...
I think the problem is related to the proxy cache. I don't think I can clear that cache, but maybe there is some way to tell the proxy that it needs to recheck whether the file exists...
Any other suggestions to fix this problem?

The simplest solution could be to add an (unused) GET parameter so that the request isn't served from the cache.
For example:
i = 0
while True:
    try:
        result = br.open('http://www.example.com/sample.txt?r=%d' % i).read()
        break
    except Exception:
        # change the parameter so the next attempt is a new URL for the cache
        i += 1
    time.sleep(10)
The extra parameter should be ignored by the web application.
An HTTP HEAD request is probably the correct way to do this; see this question for an example.
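A hedged sketch of the HEAD approach combined with the cache-busting parameter. It uses Python 3's urllib rather than mechanize, which is a swap made for illustration. The helper names are hypothetical, and the no-cache request headers only help if the proxy honors them.

```python
import time
import urllib.error
import urllib.request


def build_probe_request(url, token=None):
    """Build a HEAD request for `url` with a throwaway query parameter."""
    token = time.time_ns() if token is None else token
    separator = "&" if "?" in url else "?"
    return urllib.request.Request(
        "%s%sr=%s" % (url, separator, token),
        method="HEAD",
        # Ask caches along the way to revalidate instead of answering
        # from a stored copy.
        headers={"Cache-Control": "no-cache", "Pragma": "no-cache"},
    )


def remote_file_exists(url, opener, timeout=10):
    """Return True on HTTP 200, False on 404; other HTTP errors are re-raised."""
    try:
        with opener.open(build_probe_request(url), timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

An opener that goes through the proxy from the question could be built with urllib.request.build_opener(urllib.request.ProxyHandler({"http": "http://proxy:1234"})), and the polling loop would then call remote_file_exists() every 10 seconds before downloading the file.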
