I am trying to write code that downloads all the data from a server that holds .rar files describing imaginary cadastral particles for student projects. What I have so far is the query for the server: it only needs a specific particle number inserted, and can then be accessed as a URL to download the .rar file.
url = 'http://www.pg.geof.unizg.hr/geoserver/wfs?request=getfeature&version=1.0.0&service=wfs&&propertyname=broj,naziv_ko,kc_geom&outputformat=SHAPE-ZIP&typename=gf:katastarska_cestica&filter=<Filter+xmlns="http://www.opengis.net/ogc"><And><PropertyIsEqualTo><PropertyName>broj</PropertyName><Literal>1900/1</Literal></PropertyIsEqualTo><PropertyIsEqualTo><PropertyName>naziv_ko</PropertyName><Literal>Suma Striborova Stara (9997)</Literal></PropertyIsEqualTo></And></Filter>'
This is the URL I want to open with the webbrowser module for particle "1900/1", but this way I get an error:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
When I enter this URL manually in the browser, it downloads the file without a problem.
How can I make this Python web application work?
I used webbrowser.open_new(url), which does not work.
You're using the wrong tool. webbrowser is for controlling a native web browser. If you just want to download a file, use the requests module (or urllib.request if you can't install Requests).
import requests
r = requests.get('http://www.pg.geof.unizg.hr/geoserver/wfs', params={
'request': 'getfeature',
...
'filter': '<Filter xmlns=...>'
})
print(r.content) # or write it to a file, or whatever
Note requests will handle encoding GET parameters for you -- you don't need to worry about escaping the request yourself.
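For the case where Requests cannot be installed, here is a minimal standard-library sketch of the same idea. The helper names (build_filter, build_url, download) are made up for illustration, and the query parameters are simply copied from the URL in the question, so treat this as a starting point rather than a tested client for that server.

```python
import shutil
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape

BASE_URL = "http://www.pg.geof.unizg.hr/geoserver/wfs"


def build_filter(broj, naziv_ko):
    """Build the OGC filter XML from the question for one particle."""
    return (
        '<Filter xmlns="http://www.opengis.net/ogc"><And>'
        "<PropertyIsEqualTo><PropertyName>broj</PropertyName>"
        f"<Literal>{escape(broj)}</Literal></PropertyIsEqualTo>"
        "<PropertyIsEqualTo><PropertyName>naziv_ko</PropertyName>"
        f"<Literal>{escape(naziv_ko)}</Literal></PropertyIsEqualTo>"
        "</And></Filter>"
    )


def build_url(broj, naziv_ko):
    """Return the full request URL with every parameter percent-encoded."""
    params = {
        "request": "getfeature",
        "version": "1.0.0",
        "service": "wfs",
        "propertyname": "broj,naziv_ko,kc_geom",
        "outputformat": "SHAPE-ZIP",
        "typename": "gf:katastarska_cestica",
        "filter": build_filter(broj, naziv_ko),
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def download(broj, naziv_ko, out_path):
    """Stream the server's response straight into a local file."""
    with urllib.request.urlopen(build_url(broj, naziv_ko)) as resp, \
            open(out_path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Calling download("1900/1", "Suma Striborova Stara (9997)", "1900_1.zip") would then fetch one particle; the output filename here is only an example.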
Related
My goal for this Python code is to collect job information into a folder. The first step is failing. When I run the code, I want it to print the URL https://www.indeed.com/, but instead it prints https://secure.indeed.com/account/login. I am open to using urllib or cookielib to resolve this issue.
import requests
import urllib
data = {
'action':'Login',
'__email':'email@gmail.com',
'__password':'password',
'remember':'1',
'hl':'en',
'continue':'/account/view?hl=en',
}
response = requests.get('https://secure.indeed.com/account/login',data=data)
print(response.url)
If you're trying to scrape information from Indeed, you should use the Selenium library for Python.
https://pypi.python.org/pypi/selenium
You can then write your program within the context of a real user browsing the site normally.
I use a webapp that can generate a PDF report of some data stored in the app. To get to that report, however, requires several clicks and monkeying around with the app.
I support a group of users of this app (we use the app, we don't create the app) and I'd like them to be able to generate and view this report with as few clicks as possible. Thankfully, this web app provides a lot of data via a RESTful API. So I did some scripting.
I have a Python script that makes an HTTP GET request, processes the JSON results, and uses that resultant data to dynamically build a URL. Here's a simplified version of my python code:
#!/usr/bin/env python
import requests
app_id="12345"
secret="67890"
api_url='https://api.webapp.example/some_endpoint'
resp = requests.get(api_url, auth=(app_id,secret))
json_data = resp.json()
# Simplification of the data processing I'm doing
my_data = json_data['attr1']['attr2'] + my_data_processing
# Result of the script is a link to a dynamically generated PDF
pdf_url = 'https://pdf.webapp.example/items/' + my_data
The above is a simplification of the code I actually have, but it shows the relevant points. In my actual script, I continue on by doing another GET with the dynamically built URL. The webapp generates a PDF based on the my_data portion of the URL, and I write that PDF to file. This works very well today.
Currently, this is a python script that runs on my local machine on-demand. However, I'd like to host this somewhere on the web so that when a user hits a URL in their browser it runs and generates the pdf_url, instead of having to install this script on each user's local machine, and so that the PDF can be generated and viewed on a mobile device.
The thought is that the user can open http://example.com/report-shortcut, the python script would run server-side, dynamically build the URL, and redirect the user to that URL, which would then show the PDF in the browser (assuming the user is using a browser that shows PDFs like Chrome, Safari, etc). Alternately, if a redirect is problematic, going to http://example.com/report-shortcut could just show an HTML page with a link to the URL generated by the Python script.
I'm looking for a solution on how to host this Python script and have it run when a user accesses a webpage. I've looked into AWS Lambda and Django, but both seem like overkill for such a simple script (~20 lines of code, plus comments and whitespace). I've also looked at Python CGI scripting, which looks promising, but I have no experience setting up something like that.
Looking for suggestions on how best to host and run this code when a user goes to the example URL.
PS: I thought about just re-implementing in Javascript, but I'd rather the API key not be publicly accessible.
I suggest building the script in AWS Lambda and using the API Gateway to invoke it.
You could create the pdf, store it in S3 and generate a pre-signed URL. Then return a response 302 to the user to redirect them to the pre-signed URL. This will display the PDF in their browser.
This is very quick to set up, and with Boto3, getting the PDF into S3 and generating the URL is simple.
It will be much simpler than some of your other suggestions.
See the API Gateway and Boto3 documentation.
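A rough sketch of what such a Lambda handler could look like, assuming an API Gateway Lambda proxy integration. The bucket name, object key, and the build_pdf_bytes placeholder are all hypothetical; the real PDF-fetching logic from the question's script would go where the placeholder is.

```python
BUCKET = "my-report-bucket"   # hypothetical bucket name
KEY = "reports/latest.pdf"    # hypothetical object key


def redirect_response(location):
    """Response shape for an API Gateway Lambda proxy integration: a 302 redirect."""
    return {"statusCode": 302, "headers": {"Location": location}, "body": ""}


def build_pdf_bytes():
    """Placeholder for the existing script logic that fetches the PDF bytes."""
    raise NotImplementedError("plug in the existing report-generation code here")


def lambda_handler(event, context):
    # boto3 is available in the Lambda Python runtime; it is imported here so
    # the helper above can be used without it installed.
    import boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=BUCKET,
        Key=KEY,
        Body=build_pdf_bytes(),
        ContentType="application/pdf",
    )
    # Pre-signed URL that stays valid for five minutes
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": KEY},
        ExpiresIn=300,
    )
    return redirect_response(url)
```

Keeping the API key inside the Lambda function (for example in an environment variable) means it is never exposed to the browser, which was the concern in the question.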
I'm a developer for a big GUI app and we have a web site for bug tracking. Anybody can submit a new bug to the bug tracking site. We can detect certain failures from our desktop app (e.g. an unhandled exception), and in such cases we would like to open the submit-new-bug form in the user's default browser, adding whatever information we can gather about the failure to some form fields. We can retrieve the submit-new-bug form using either the GET or the POST HTTP method, and we can provide default field values to that form. So from the HTTP server side everything is pretty much OK.
So far we can successfully open a URL, passing the default values as GET parameters in the URL, using the webbrowser module from the Python Standard Library. There are, however, some limitations to this method, such as the maximum allowed length of the URL in some browsers (especially MS IE). The webbrowser module doesn't seem to have a way to request the URL using POST. OTOH there's the urllib2 module, which provides the type of control we want, but AFAIK it lacks the possibility of opening the retrieved page in the user's preferred browser.
Is there a way to get the mixed behavior we want (the fine control of urllib2 with the higher-level functionality of webbrowser)?
PS: We have thought about retrieving the URL with urllib2, saving its content to a temp file and opening that file with webbrowser. This is a somewhat nasty solution, and in this case we would have to deal with other issues such as relative URLs. Is there a better solution?
This is not a proper answer, but it also works:
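import os
import webbrowser
import requests
url = "https://www.facebook.com/login/device-based/regular/login/?login_attempt=1&lwv=110"
myInput = {'email':'mymail@gmail.com','pass':'mypaass'}
x = requests.post(url, data = myInput)
# Overwrite ("w") rather than append, so output from earlier runs is not kept
with open("home.html", "w", encoding="utf-8") as f:
    f.write(x.text)
# Open the file that was just written, wherever the script is run from
webbrowser.open('file://' + os.path.abspath("home.html"))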
I don't know of any way you can open the result of a POST request in a web browser without saving the result to a file and opening that.
What about taking an alternative approach and temporarily storing the data on the server? Then the page can be opened in the browser with a simple id parameter, and the saved, partially filled form would be shown.
You could use tempfile.NamedTemporaryFile():
import tempfile
import webbrowser
from pathlib import Path
import jinja2
t = jinja2.Template('hello {{ name }}!') # you could load template from a file
# Text mode so a str can be written; delete=False keeps the file until the browser has read it
f = tempfile.NamedTemporaryFile('w', suffix='.html', delete=False)
f.write(t.render(name='abc'))
f.close()
webbrowser.open_new_tab(Path(f.name).as_uri()) # returns immediately
A better approach, if the server can be easily modified, is to make a POST request with the partial parameters using urllib2 and then open the URL generated by the server using webbrowser, as suggested by @Acorn.
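A minimal Python 3 sketch of that server-assisted approach, using urllib.request in place of urllib2. It assumes a hypothetical server endpoint that accepts the form fields via POST and answers with the URL of the saved, prefilled form as plain text in the response body; the function names and that response format are illustrative assumptions, not something the bug tracker is known to provide.

```python
import urllib.parse
import urllib.request
import webbrowser


def build_post_request(url, fields):
    """Encode the form fields as a standard form-urlencoded POST body."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def open_prefilled_form(post_url, fields):
    """POST the fields, then open the URL the server returns in the browser."""
    with urllib.request.urlopen(build_post_request(post_url, fields)) as resp:
        # Assumption: the body is just the URL of the stored form
        form_url = resp.read().decode("utf-8").strip()
    webbrowser.open(form_url)
```

Because the data travels in the POST body, the URL-length limit only applies to the short URL that is finally opened in the browser.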
I have a link like this, pointing directly to an MP3 file. When I put it in my browser, it simply asks whether I want to download the file. However, when I do the same thing in Python with the following code:
data = urllib2.urlopen("http://www23.zippyshare.com/d/44123087/497548/Lil%20Wayne%20ft.%20Eminem%20-%20Drop%20The%20World.mp3").read()
I am redirected to another link like this. Therefore, instead of the MP3 data, I get the HTML code for
'http://www23.zippyshare.com/v/44123087/file.html'
any ideas ?
thanks
urllib2 handles redirection transparently. You might want to see what the server is actually doing when it presents such a redirection as well as allowing you to download. You could subclass the redirect handler, see which header property gives you the URL, and use urlretrieve to download that.
Explicitly setting cookies might be worth a try as well.
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.open('yourmp3filelink')
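In Python 3 the same modules live under urllib.request and http.cookiejar. A minimal sketch that combines both suggestions, a cookie-aware opener plus a redirect handler subclass that prints each hop, might look like this. The names LoggingRedirectHandler and build_opener_with_cookies are made up for illustration, and whether cookies are enough to get the MP3 from this particular site is not something this sketch can confirm.

```python
import http.cookiejar
import urllib.request


class LoggingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Print every redirect before following it as usual."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        print(f"{code} redirect: {req.full_url} -> {newurl}")
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def build_opener_with_cookies():
    """Return an opener that keeps cookies across requests and logs redirects."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar),
        LoggingRedirectHandler(),
    )
    return opener, jar
```

Calling opener.open(...) on the download link would then show where the server sends the request, and the jar can be inspected afterwards to see which cookies were set along the way.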
Your link redirects to an HTML webpage, most likely because your download request is timing out. That's often how these download websites work: you never get a static link to the download, only a temporarily assigned link.
My guess is that there's no way to get that static link using that website. You'd have to know where that file was actually coming from.
So no, nothing is wrong with your python code; just your sources.
I am writing a script that will run on my server. Its purpose is to let users download a document: anyone who hits a particular URL should be able to download it. I am using urllib.urlretrieve, but it downloads the document on the server side, not on the client. How can I make the download happen on the client side in Python?
If the script runs on your server, its purpose is to serve a document, not to download it (the latter would be the urllib solution).
Depending on your needs you can:
Set up static file serving with e.g. Apache
Make the script execute on a certain URL (e.g. with mod_wsgi), then the script should set the Content-Type (provides document type such as "text/plain") and Content-Disposition (provides download filename) headers and send the document data
As your question is not more specific, this answer can't be either.
Set the appropriate Content-type header, then send the file contents.
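As a rough illustration of setting those headers, here is a small WSGI application using only the standard library. The factory name make_download_app, the file path, and the port are hypothetical; under mod_wsgi or another WSGI server, the returned callable would be exposed as the application object instead of using wsgiref.

```python
from wsgiref.simple_server import make_server


def make_download_app(path, download_name, content_type):
    """Return a WSGI app that sends the file at `path` as a download."""

    def application(environ, start_response):
        with open(path, "rb") as f:
            body = f.read()
        headers = [
            # Tells the browser what kind of document this is
            ("Content-Type", content_type),
            # Tells the browser to save it, and under which filename
            ("Content-Disposition", f'attachment; filename="{download_name}"'),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return application


def serve(path, download_name, content_type, port=8000):
    """Serve the document locally for testing; blocks until interrupted."""
    with make_server("", port, make_download_app(path, download_name, content_type)) as httpd:
        httpd.serve_forever()
```

For example, serve("report.pdf", "report.pdf", "application/pdf") would make the file downloadable from port 8000 on the local machine. For large files, reading the whole document into memory as done here would be worth replacing with chunked streaming.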
If the document is on your server and your intention is that the user should be able to download this file, couldn't you just serve the url to that resource as a hyperlink in your HTML code. Sorry if I have been obtuse but this seems the most logical step given your explanation.
You might want to take a look at the SocketServer module (renamed socketserver in Python 3).