Unable to download the converted file from the Zamzar API using a Python program, as specified in the docs at https://developers.zamzar.com/docs. Even though I am using the code correctly, along with my API key, it only shows error code 20. I have wasted four hours on this error; someone please help.
import requests
from requests.auth import HTTPBasicAuth
file_id = 291320
local_filename = 'afzal.txt'
api_key = 'my_key_of_zamzar_api'
endpoint = "https://sandbox.zamzar.com/v1/files/{}/content".format(file_id)
response = requests.get(endpoint, stream=True, auth=HTTPBasicAuth(api_key, ''))
try:
    with open(local_filename, 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                f.flush()
    print("File downloaded")
except IOError:
    print("Error")
This is the code I am using to download the converted file.
This code converts files into different formats without any problem:
import requests
from requests.auth import HTTPBasicAuth
#--------------------------------------------------------------------------#
api_key = 'Put_Your_API_KEY' #your Api_key from developer.zamzar.com
source_file = "tmp/armash.pdf" #source_file_path
target_file = "results/armash.txt" #target_file_path_and_name
target_format = "txt" #targeted Format.
#-------------------------------------------------------------------------#
def check(job_id, api_key):
    check_endpoint = "https://sandbox.zamzar.com/v1/jobs/{}".format(job_id)
    response = requests.get(check_endpoint, auth=HTTPBasicAuth(api_key, ''))
    #print(response.json())
    checked_data = response.json()
    value_list = checked_data['target_files']
    #print(value_list[0]['id'])
    return value_list[0]['id']

def download(file_id, api_key, local_filename):
    download_endpoint = "https://sandbox.zamzar.com/v1/files/{}/content".format(file_id)
    download_response = requests.get(download_endpoint, stream=True, auth=HTTPBasicAuth(api_key, ''))
    try:
        with open(local_filename, 'wb') as f:
            for chunk in download_response.iter_content(chunk_size=1024):
                if chunk:
                    f.write(chunk)
                    f.flush()
        print("File downloaded")
    except IOError:
        print("Error")

endpoint = "https://sandbox.zamzar.com/v1/jobs"
file_content = {'source_file': open(source_file, 'rb')}
data_content = {'target_format': target_format}
res = requests.post(endpoint, data=data_content, files=file_content, auth=HTTPBasicAuth(api_key, ''))
print(res.json())
data = res.json()
#print(data)
print("=========== Job ID ============\n\n")
print(data['id'])
target_id = check(data['id'], api_key)
print("\n================= target_id ===========\n\n")
print(target_id)
download(target_id, api_key, target_file)
Hope this helps somebody!
I'm the lead developer for the Zamzar API.
So the Zamzar API docs contain a section on error codes (see https://developers.zamzar.com/docs#section-Error_codes). The relevant code for your error is:
{
  "message" : "API key was missing or invalid",
  "code" : 20
}
This can mean either that you did not specify an API key at all or that the API key used was invalid for the file you are attempting to download. It seems more likely to be the latter, since your code contains an api_key variable.
Looking at your code, it's possible that you have used the job ID (291320) to try to download your file, when in fact you should be using a file ID.
Each conversion job can output 1 or more converted files, and you need to specify the file ID of the one you wish to grab. You can see a list of all converted file IDs for your job by querying /jobs/ID and looking at the target_files array. This is outlined in the API docs at https://developers.zamzar.com/docs#section-Download_the_converted_file
So if you change your code to use the file ID from the target_files array of your Job your download should spring into life.
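To illustrate, here's a rough sketch of that flow (it assumes the job JSON reports a status field that becomes 'successful' once conversion finishes, and it uses placeholder values for the API key and output filename):
import time
import requests
from requests.auth import HTTPBasicAuth

api_key = 'your_api_key'  # placeholder
job_id = 291320           # the ID returned when the job was created

# Poll the job until conversion has finished
job_endpoint = "https://sandbox.zamzar.com/v1/jobs/{}".format(job_id)
while True:
    job = requests.get(job_endpoint, auth=HTTPBasicAuth(api_key, '')).json()
    if job['status'] == 'successful':
        break
    time.sleep(1)

# Download using the *file* ID from target_files, not the job ID
file_id = job['target_files'][0]['id']
file_endpoint = "https://sandbox.zamzar.com/v1/files/{}/content".format(file_id)
response = requests.get(file_endpoint, stream=True, auth=HTTPBasicAuth(api_key, ''))
with open('converted_output.txt', 'wb') as f:  # placeholder filename
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)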
I'm sorry you wasted time on this. Clearly, if it has reached S.O., our docs haven't done a good enough job of explaining this distinction, so we'll look at what we can do to make them clearer.
Happy converting!
I have this code on the server:
@app.route('/get', methods=['GET'])
def get():
    return send_file("token.jpg", attachment_filename="token.jpg", mimetype='image/jpg')
and this code for getting the response:
r = requests.get(url + '/get')
And I need to save the file from the response to the hard drive, but I can't use r.files. What do I need to do in this situation?
Assuming the GET request is valid, you can use Python's built-in function open to open a file in binary mode and write the returned content to disk. Example below.
file_content = requests.get('http://yoururl/get')
save_file = open("sample_image.png", "wb")
save_file.write(file_content.content)
save_file.close()
As you can see, to write the image to disk we use open and write the returned content to 'sample_image.png'. Since your server-side code seems to return only one file, the example above should work for you.
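As a side note, the same logic can be written with a context manager, so the file is closed automatically even if the write fails (a sketch using the same hypothetical URL as above):
import requests

file_content = requests.get('http://yoururl/get')
# the with-block closes the file for us, even if write() raises
with open("sample_image.png", "wb") as save_file:
    save_file.write(file_content.content)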
You can set the stream parameter and extract the filename from the HTTP headers. Then the raw data from the undecoded body can be read and saved chunk by chunk.
import os
import re
import requests
resp = requests.get('http://127.0.0.1:5000/get', stream=True)
name = re.findall('filename=(.+)', resp.headers['Content-Disposition'])[0]
dest = os.path.join(os.path.expanduser('~'), name)
with open(dest, 'wb') as fp:
    while True:
        chunk = resp.raw.read(1024)
        if not chunk:
            break
        fp.write(chunk)
I am connecting to an API that returns a JSON object with several pieces of data and use that data to build an HTML page. I am having trouble downloading a local copy of the image using Python and including the image tag linking to the image. When I run the code, I receive the error AttributeError: 'tuple' object has no attribute 'content'. I have the following code:
import urllib.request
import json
out = open('outfile.txt','w')
link = "https://api.nasa.gov/planetary/apod?api_key="
print(link)
resp = urllib.request.urlopen(link)
data = resp.read()
print(str(data, 'utf-8'))
returnJson = json.loads(data)
img_url = returnJson['url']
title = returnJson['title']
current_date = returnJson['date']
print(img_url)
print(title)
print(current_date)
resp = urllib.request.urlretrieve(img_url)
img_file_name = img_url.split('/')[-1]
with open(img_file_name, 'wb') as f:
    f.write(resp.content)
urllib.request.urlretrieve returns a tuple, which doesn't have a content attribute. Instead, it copies the content to a local file. Moreover, this function is legacy and may be deprecated in the future, according to the docs. I would recommend following the advice in the urllib.request docs, which is:
The Requests package is recommended for a higher-level HTTP client interface.
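To illustrate, the download portion of your script rewritten with requests might look like this (a sketch; it reuses the img_url you already extracted from the JSON response):
import requests

# img_url comes from returnJson['url'], as in the original code
resp = requests.get(img_url)
resp.raise_for_status()  # raise an exception on a bad HTTP status

img_file_name = img_url.split('/')[-1]
with open(img_file_name, 'wb') as f:
    f.write(resp.content)  # requests responses do have a .content attribute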
Firstly, your API key is in your question - you might want to edit that out so no one else uses it!
The error is in the last line of the sample you've given us:
f.write(resp.content)
At this point, resp is set to the response of urllib.request.urlretrieve(img_url). However, urllib.request.urlretrieve actually returns a tuple - (filename, headers). The filename is where the downloaded resource is stored on the system, and headers is the response headers for the request.
Modifying your code, I believe this might be more what you want?
import os
#rest of your code here
(filename, headers) = urllib.request.urlretrieve(img_url)
img_file_name = img_url.split('/')[-1]
os.replace(filename, img_file_name)
EDIT: os.rename doesn't seem to like existing files, however os.replace does!
I am scraping a website, which is accessible from this link, using Beautiful Soup. The idea is to download every href that contains the string .pdf, using requests.get.
The code below demonstrates the procedure and works as intended:
import requests

filename = 'new_name.pdf'
url_to_download_pdf = 'https://bradscholars.brad.ac.uk/https://www.brad.ac.uk/library/additional-help/bradford-scholars-faqs/digital_preservation_policy.pdf'
with open(filename, 'wb') as f:
    f.write(requests.get(url_to_download_pdf).content)
However, there are instances where a URL such as the one given above (i.e., the variable url_to_download_pdf) leads to a "Page not found" page. As a result, an unusable and unreadable PDF is downloaded.
Opening the file with a PDF reader in Windows gives a warning.
I am curious whether there is any way to avoid accessing and downloading an invalid PDF file.
Instead of directly accessing the contents of the file with
f.write(requests.get(url_to_download_pdf).content)
You can first check the status of the request, and only save to file if it is a valid response:
filename = 'new_name.pdf'
url_to_download_pdf='https://bradscholars.brad.ac.uk/https://www.brad.ac.uk/library/additional-help/bradford-scholars-faqs/digital_preservation_policy.pdf'
response = requests.get(url_to_download_pdf)
if response.status_code != 404:
    with open(filename, 'wb') as f:
        f.write(response.content)
You have to validate that the file you are requesting actually exists. If the file exists, the response code of the request will be 200. So here is an example of how to do that:
filename = 'new_name.pdf'
url_to_download_pdf='https://bradscholars.brad.ac.uk/https://www.brad.ac.uk/library/additional-help/bradford-scholars-faqs/digital_preservation_policy.pdf'
with open(filename, 'wb') as f:
    response = requests.get(url_to_download_pdf)
    if response.status_code == 200:
        f.write(response.content)
    else:
        print("Error, the file doesn't exist")
Thanks for the suggestions.
As per @Nicolas, do the save-as-PDF only if the response returns 200:
if response.status_code == 200:
In the previous version, an empty file would be created regardless of the response, because with open(filename, 'wb') as f: ran before the status_code check.
To mitigate this, with open(filename, 'wb') as f: should only be executed if the condition is met.
The complete code then is as below:
import requests
filename = 'new_name.pdf'
url_to_download_pdf='https://bradscholars.brad.ac.uk/https://www.brad.ac.uk/library/additional-help/bradford-scholars-faqs/digital_preservation_policy.pdf'
my_req = requests.get(url_to_download_pdf)
if my_req.status_code == 200:
    with open(filename, 'wb') as f:
        f.write(my_req.content)
Please correct me if I am wrong, as I am a beginner in Python.
I have a web service URL which returns an XML file:
http://abc.tch.xyz.edu:000/patientlabtests/id/1345
I have a list of values, and I want to append each value in the list to the URL, download the file for each value, and name each downloaded file after the value appended from the list.
It is possible to download one file at a time, but I have thousands of values in the list, so I was trying to write a function with a for loop, and I am stuck.
import requests

x = [1345, 7890, 4729]
for i in x:
    url = "http://abc.tch.xyz.edu:000/patientlabresults/id/{}".format(i)
    response = requests.get(url)
    # ****** Missing part of the code ********
    with open('.xml', 'wb') as file:
        file.write(response.content)
    file.close()
The files downloaded from the URL should be named like:
"1345patientlabresults.xml"
"7890patientlabresults.xml"
"4729patientlabresults.xml"
I know there is a part of the code that is missing, and I am unable to fill in that missing part. I would really appreciate it if anyone could help me with this.
Accessing your web service URL does not seem to work. Check this:
import requests
x = [1345, 7890, 4729]
for i in x:
    url2 = "http://abc.tch.xyz.edu:000/patientlabresults/id/"
    response = requests.get(url2 + str(i))  # i must be converted to a string
Note: when you use with to open a file, you do not have to close the file, since it will be closed automatically:
with open(filename, mode) as file:
    file.write(data)
Since the URL you provided is not working, I am going to use a different URL. I hope you get the idea and see how to write to a file using a custom name:
import requests
categories = ['fruit', 'car', 'dog']
for category in categories:
    url = "https://icanhazdadjoke.com/search?term="
    response = requests.get(url + category)
    file_name = category + "_JOKES_2018"  # files will be saved as e.g. fruit_JOKES_2018
    data = response.status_code  # storing the status code in the 'data' variable
    with open(file_name + ".txt", 'w+') as f:
        f.write(str(data))  # writing the status code of each URL to the file
After running this code, the status code will be written into each of the files, and the files will be named as follows:
car_JOKES_2018.txt
dog_JOKES_2018.txt
fruit_JOKES_2018.txt
I hope this gives you an understanding of how to name the files and write into the files.
I think you just want to create a path using str.format, as you (almost) do for the URL. Maybe something like the following:
import os.path

import requests

x = [1345, 7890, 4729]
for i in x:
    path = '{}patientlabresults.xml'.format(i)
    # ignore this file if we've already got it
    if os.path.exists(path):
        continue
    # try and get the file, raising an exception on failure
    url = 'http://abc.tch.xyz.edu:000/patientlabresults/id/{}'.format(i)
    res = requests.get(url)
    res.raise_for_status()
    # write the successful file out (binary mode, since res.content is bytes)
    with open(path, 'wb') as fd:
        fd.write(res.content)
I've added some error handling and better behaviour on re-running (files already downloaded are skipped).
I am downloading multiple CSV files from a website using Python. I would like to be able to check the response code on each request.
I know how to download the file using wget, but not how to check the response code:
os.system('wget http://example.com/test.csv')
I've seen a lot of people suggesting using requests, but I'm not sure that's quite right for my use case of saving CSV files.
r = requests.get('http://example.com/test.csv')
r.status_code # 200
# Pipe response into a CSV file... hm, seems messy?
What's the neatest way to do this?
You can use the stream argument - along with iter_content() it's possible to stream the response contents right into a file (docs):
import requests
r = None
try:
    r = requests.get('http://example.com/test.csv', stream=True)
    with open('test.csv', 'wb') as f:
        for data in r.iter_content():
            f.write(data)
finally:
    if r is not None:
        r.close()
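Since the question was also about checking the response code, here is a variant of the same idea that inspects status_code before writing anything to disk (a sketch against the same example URL; requests responses also work as context managers, which closes them for you):
import requests

# stream=True defers downloading the body until we iterate over it
with requests.get('http://example.com/test.csv', stream=True) as r:
    if r.status_code == 200:
        with open('test.csv', 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                f.write(chunk)
    else:
        print('Download failed with status', r.status_code)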