I am trying to download a file from an API; the file will be in CSV format.
I generate a link from user inputs and store it in a variable, exportLink:
import requests
#getProjectName
projectName = raw_input('ProjectName')
#getApiToken
apiToken = "mytokenishere"
#getStartDate
startDate = raw_input('Start Date')
#getStopDate
stopDate = raw_input('Stop Date')
url = "https://api.awrcloud.com/get.php?action=export_ranking&project=%s&token=%s&startDate=%s&stopDate=%s" % (projectName,apiToken,startDate,stopDate)
exportLink = requests.get(url).content
exportLink stores the generated link,
which I must then call to download the CSV file using another
requests.get() call on exportLink.
When I click the link, it opens the download in a browser.
Is there any way to automate this so that the script opens the zip and I can begin
to edit the CSV using Python, i.e. removing some of its contents?
If you have a bytes object zipdata that you got from requests.get(url).content, you can extract it file by file into other bytes objects:
import zipfile
import io
import csv
with zipfile.ZipFile(io.BytesIO(zipdata)) as z:
    for f in z.filelist:
        csvdata = z.read(f)
and then do something with csvdata
reader = csv.reader(io.StringIO(csvdata.decode()))
...
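The extraction step above can be sketched end to end in Python 3. This is only a minimal sketch, assuming the archive fits in memory and that its CSV members are UTF-8 encoded; the helper name read_csv_members is hypothetical and not part of any library.

```python
import csv
import io
import zipfile


def read_csv_members(zipdata, encoding="utf-8"):
    """Return {member name: list of rows} for every .csv file in a zip archive.

    zipdata is the raw bytes of the archive, e.g. requests.get(url).content.
    The default encoding is an assumption; adjust it to match the export.
    """
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zipdata)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue  # skip folders and non-CSV members
            text = archive.read(info).decode(encoding)
            # newline="" lets the csv module handle line endings itself
            tables[info.filename] = list(csv.reader(io.StringIO(text, newline="")))
    return tables
```

Each value is a plain list of rows, so removing unwanted rows or columns is ordinary list manipulation before writing the result back out with csv.writer.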
Related
I want to download text files using Python. How can I do so?
I used urllib's urlopen(url).read(), but it gives me the bytes representation of the file.
For me, I had to do the following (Python 3):
from urllib.request import urlopen
data = urlopen("[your url goes here]").read().decode('utf-8')
# Do what you need to do with the data.
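Hard-coding 'utf-8' works when you know the encoding. As a hedged variation, the sketch below asks the response headers for the charset first and only falls back to a default when none is declared. The function name fetch_text and the UTF-8 fallback are assumptions for illustration.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Download url and return its body as str.

    Uses the charset from the Content-Type header when one is declared,
    otherwise falls back to default_encoding (an assumption on our part).
    """
    with urlopen(url) as response:
        raw = response.read()
        charset = response.headers.get_content_charset()
    return raw.decode(charset or default_encoding)
```

If the server declares the wrong charset, decoding can still fail or produce garbage, so treat the header as a hint rather than a guarantee.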
You can use multiple options:
For the simpler solution you can use this:
import urllib.request
file_url = 'https://someurl.com/text_file.txt'
for line in urllib.request.urlopen(file_url):
    print(line.decode('utf-8'))
For a solution based on the requests library:
import requests
file_url = 'https://someurl.com/text_file.txt'
response = requests.get(file_url)
if response.ok:  # True for status codes below 400
    data = response.text
    for line in data.splitlines():
        print(line)
When downloading text files with Python, I like to use the wget module:
import wget
remote_url = 'https://www.google.com/test.txt'
local_file = 'local_copy.txt'
wget.download(remote_url, local_file)
If that doesn't work, try using urllib:
from urllib import request
remote_url = 'https://www.google.com/test.txt'
file = 'copy.txt'
request.urlretrieve(remote_url, file)
When you read the file directly from the internet, what you get back is the raw bytes of the response, which is why you see the text in byte format. Try writing the data to a file with the requests module, then view it by opening the file on your desktop:
import requests
remote_url = 'https://test.com/test.txt'
local_file = 'local_file.txt'
data = requests.get(remote_url)
with open(local_file, 'wb') as file:
    file.write(data.content)
The code below parses JSON from the URL to retrieve 10 URLs and puts them in an output.txt file.
import json
import urllib.request
response = urllib.request.urlopen('https://json-test.com/test').read()
jsonResponse = json.loads(response.decode('utf-8'))
# Open the output file once instead of once per link
with open("C:\\Users\\test\\Desktop\\test\\output.txt", "a") as out:
    for child in jsonResponse['results']:
        print(child['content'], file=out)
Now that there are 10 links to CSV files in output.txt, I am trying to figure out how to download and save the 10 files. I tried something like this, but it is not working:
urllib.request.urlretrieve(['content'], "C:\\Users\\test\\Desktop\\test\\test1.csv")
Even if I get the above working, it handles just 1 file, and there are 10 file links in output.txt. Any ideas?
Here is an exhaustive guide on how to download files over HTTP.
If the text file contains one link per line, you can iterate through the lines like this:
import urllib.request
with open('path/to/file.ext', 'r') as file:
    for index, line in enumerate(file):
        # ... some regex checking if the text is actually a valid url
        url = line.strip()  # drop the trailing newline before requesting
        urllib.request.urlretrieve(url, 'path/to/file' + str(index) + '.ext')
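The "valid url" check left as a comment above could be done without a regex. Here is one possible sketch using urllib.parse, which only builds the list of (url, local path) pairs and does not download anything. The name plan_downloads, the "file<N>" naming scheme, and the .csv fallback extension are all assumptions for illustration.

```python
import os
from urllib.parse import urlparse


def plan_downloads(lines, dest_dir):
    """Pair each usable URL in lines with a numbered local path in dest_dir."""
    plan = []
    for line in lines:
        url = line.strip()
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip blank lines and anything that is not an http(s) URL
        # Reuse the extension from the URL path; fall back to .csv (assumption)
        extension = os.path.splitext(parts.path)[1] or ".csv"
        name = "file%d%s" % (len(plan), extension)
        plan.append((url, os.path.join(dest_dir, name)))
    return plan
```

Each pair can then be passed to urllib.request.urlretrieve(url, path) in a loop, which covers all 10 links rather than just one.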
I was trying to do this task with MATLAB using:
url = 'the url of the file';
file_name = 'data.mat';
outfilename = websave(filename,url);
load(outfilename);
but it didn't work. How can I do that using Python 3? Kindly note that I want the .mat file as it is, not as HTML, CSV, or any other format; I just want the file downloaded. (I can do it manually, but I have hundreds of files, which is why I need this.)
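Using urllib2 (Python 2 only; in Python 3 the same urlopen function lives in urllib.request):
import urllib2
response = urllib2.urlopen("the url")
file = open("filename.mat", 'wb')  # binary mode, so the .mat bytes are written unchanged
file.write(response.read())
file.close()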
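Since the question asks for Python 3, the same idea might look like the sketch below. It assumes each file is small enough to read into memory in one go; the function name download_binary is hypothetical.

```python
from urllib.request import urlopen


def download_binary(url, local_path):
    """Save the resource at url to local_path without altering any bytes."""
    with urlopen(url) as response:
        payload = response.read()
    # "wb" writes the bytes exactly as received, which a .mat file requires
    with open(local_path, "wb") as out:
        out.write(payload)
    return len(payload)
```

For hundreds of files, you would call download_binary once per URL in a loop, giving each one its own local file name.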
I'm working on a script that will automatically update an installed version of Calibre. Currently I have it downloading the latest portable version. I seem to be having trouble saving the zipfile. Currently my code is:
import urllib2
import re
import zipfile
#tell the user what is happening
print("Calibre is Updating")
#download the page
url = urllib2.urlopen ( "http://sourceforge.net/projects/calibre/files" ).read()
#determine current version
result = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', url).groups()[0][:-1]
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
urllib2.urlopen( download )
#save
output = open('install.zip', 'w')
output.write(zipfile.ZipFile("install.zip", ""))
output.close()
You don't need to use zipfile.ZipFile for this (and the way you're using it, as well as urllib2.urlopen, has problems as well). Instead, you need to save the urlopen result in a variable, then read it and write that output to a .zip file. Try this code:
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
request = urllib2.urlopen( download )
#save
output = open("install.zip", "wb")  # binary mode, so the archive bytes are written unchanged
output.write(request.read())
output.close()
There is also a one-liner:
open('install.zip', 'wb').write(urllib.urlopen('http://status.calibre-ebook.com/dist/portable/' + result).read())
which reads the whole download into memory before writing it, so it is not memory-efficient, but it still works.
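If memory use matters, the download can be copied in fixed-size pieces instead. The following is a minimal sketch of that idea for any pair of binary streams; the name copy_in_chunks and the 64 KiB default are arbitrary choices, not something from the answers above.

```python
def copy_in_chunks(source, target, chunk_size=64 * 1024):
    """Copy a readable binary stream to a writable one; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # an empty read signals the end of the stream
        target.write(chunk)
        total += len(chunk)
    return total
```

In Python 3, source could be the object returned by urllib.request.urlopen(download) and target a file opened with open('install.zip', 'wb'). The standard library's shutil.copyfileobj does the same job if you would rather not write the loop yourself.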
If you just want to download a file from the net, you can use urllib.urlretrieve:
Copy a network object denoted by a URL to a local file ...
Example using requests instead of urllib2:
import requests, re, urllib
print("Calibre is updating...")
content = requests.get("http://sourceforge.net/projects/calibre/files").content
# determine current version
v = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', content).groups()[0][:-1]
download_url = "http://status.calibre-ebook.com/dist/portable/{0}".format(v)
print("Downloading {0}".format(download_url))
urllib.urlretrieve(download_url, 'install.zip')
# file should be downloaded at this point
Have you tried:
output = open('install.zip', 'wb')  # note the "b" flag, which means "binary file"
I'm trying to return a CSV from an action in my webapp, and give the user a prompt to download the file or open it from a spreadsheet app. I can get the CSV to spit out onto the screen, but how do I change the type of the file so that the browser recognizes that this isn't supposed to be displayed as HTML? Can I use the csv module for this?
import csv
def results_csv(self):
    data = ['895', '898', '897']
    return data
To tell the browser the type of content you're giving it, you need to set the Content-type header to 'text/csv'. In your Pylons function, the following should do the job:
response.headers['Content-type'] = 'text/csv'
PAG is correct; furthermore, if you want to suggest a name for the downloaded file, you can also set response.headers['Content-disposition'] = 'attachment; filename=suggest.csv'
Yes, you can use the csv module for this:
import csv
from cStringIO import StringIO
...
def results_csv(self):
    response.headers['Content-Type'] = 'text/csv'
    s = StringIO()
    writer = csv.writer(s)
    writer.writerow(['header', 'header', 'header'])
    writer.writerow([123, 456, 789])
    return s.getvalue()
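cStringIO exists only in Python 2. A framework-neutral Python 3 sketch of the same approach, combining the Content-Type and Content-Disposition headers mentioned above, might look like this. The function name build_csv_download and the default file name are assumptions; how the headers and body are attached to the response depends on your web framework.

```python
import csv
import io


def build_csv_download(rows, filename="export.csv"):
    """Return (headers, body) for serving rows as a downloadable CSV file."""
    buffer = io.StringIO(newline="")  # let csv.writer control line endings
    csv.writer(buffer).writerows(rows)
    headers = {
        "Content-Type": "text/csv; charset=utf-8",
        # "attachment" asks the browser to offer a download with this name
        "Content-Disposition": 'attachment; filename="%s"' % filename,
    }
    return headers, buffer.getvalue().encode("utf-8")
```

In a Pylons-style action you would copy each entry of headers into response.headers and return the body.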