I was trying to download some images from Amazon to my local computer, using their API, for data analysis.
This is my code, and I am getting the error below. I don't know what to do; please suggest a solution.
Code
from PIL import Image
import requests
from io import BytesIO
for index, row in images.iterrows():
    url = row['large_image_url']
    response = requests.get(url)
    img = Image.open(BytesIO(response.content))
    img.save('images/15k_images/'+row['asin']+'.jpeg')
Error
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-36-ce8803db815e> in <module>()
9 from io import BytesIO
10
---> 11 for index, row in images.iterrows():
12 url = row['large_image_url']
13 response = requests.get(url)
NameError: name 'images' is not defined
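The traceback says that the name images does not exist in the notebook session, so the loop fails before any request is made. Below is a minimal sketch of defining it first, assuming the table is a pandas DataFrame with the asin and large_image_url columns used in the loop. The two rows, the example.com URLs, and the helper name image_targets are hypothetical placeholders, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the real product table. In practice it might be
# loaded from a file (for example with pd.read_csv) before the loop runs.
images = pd.DataFrame(
    {
        "asin": ["B000000001", "B000000002"],
        "large_image_url": [
            "https://example.com/a.jpg",
            "https://example.com/b.jpg",
        ],
    }
)


def image_targets(frame, out_dir="images/15k_images"):
    """Pair each image URL with the path it would be saved to."""
    return [
        (row["large_image_url"], f"{out_dir}/{row['asin']}.jpeg")
        for _, row in frame.iterrows()
    ]


targets = image_targets(images)
```

Once images is defined in the same session (and the cell that defines it has been run), the download loop from the question can iterate over it as written.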
Related
I want to extract and save images as .png, from a pdf file. I use the following Python code and PyMuPDF:
import fitz
import io
from PIL import Image
file = "pdf1.pdf"
pdf_file = fitz.open(file)
for page_index in range(len(pdf_file)):
    page = pdf_file[page_index]
    image_list = page.getImageList()
    if image_list:
        print(f"[+] Found a total of {len(image_list)} images in page {page_index}")
    else:
        print("[!] No images found on page", page_index)
    for image_index, img in enumerate(page.getImageList(), start=1):
        xref = img[0]
        base_image = pdf_file.extractImage(xref)
        image_bytes = base_image["image"]
        image_ext = base_image["ext"]
        image = Image.open(io.BytesIO(image_bytes))
        image.save(open(f"image{page_index+1}_{image_index}.{image_ext}", "wb"))
But I get the following error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-bb8715bc185b> in <module>()
10 # get the page itself
11 page = pdf_file[page_index]
---> 12 image_list = page.getImageList()
13 # printing number of images found in this page
14 if image_list:
AttributeError: 'Page' object has no attribute 'getImageList'
Is it related to the PDF file structure (a non-dictionary type)? How could I solve it in that case?
You did not mention which PyMuPDF version you are using. The method name getImageList has been deprecated for a long time; the new name, page.get_images(), should have been used. In the most recent version, 1.20.x, the old name was finally removed.
If you have a lot of old code using those old names you can either use a utility to make a global change, or execute fitz.restore_aliases() after import fitz.
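The "global change" option can be sketched as a small text substitution over old source code. This is only an illustration, not an official PyMuPDF tool: the mapping covers just the two old names that appear in the question, and the names RENAMES and modernize are hypothetical.

```python
import re

# Old camelCase names from the question mapped to their snake_case replacements.
# Only these two are covered; other renamed methods would need their own entries.
RENAMES = {
    "getImageList": "get_images",
    "extractImage": "extract_image",
}

# Match any of the old names as whole words only.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def modernize(source):
    """Replace deprecated PyMuPDF method names in a string of source code."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_code = "image_list = page.getImageList()\nbase_image = pdf_file.extractImage(xref)"
new_code = modernize(old_code)
```

Applying the same function to the text of each script, and reviewing the result before saving it, would update the calls without needing restore_aliases().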
I am trying to use this code to download an image from the given URL
import urllib.request
resource = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
output = open("file01.jpg","wb")
output.write(resource)
output.close()
However, I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-39-43fe4522fb3b> in <module>()
41 resource = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
42 output = open("file01.jpg","wb")
---> 43 output.write(resource)
44 output.close()
TypeError: a bytes-like object is required, not 'tuple'
I get that it's the wrong data type for the .write() method, but I don't know how to feed resource into output.
Right. You can use urllib.request.urlretrieve like this:
import urllib.request
resource, headers = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
# resource is the path of the temporary file that urlretrieve wrote
with open(resource, "rb") as src:
    image_data = src.read()
with open("file01.jpg", "wb") as f:
    f.write(image_data)
PS: urllib.request.urlretrieve returns a tuple whose first element is the path of a temporary file, so you can read the bytes from that temporary file and save them to a new file.
From the official documentation:
The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.
So I would recommend using urllib.request.urlopen instead. Try the code below:
import urllib.request
resource = urllib.request.urlopen("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
output = open("file01.jpg", "wb")
output.write(resource.read())
output.close()
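A slightly more defensive variant of the urlopen approach, as a sketch: it streams the body to disk instead of reading it all into memory, and uses context managers so both handles are closed even if the copy fails. The function name download is hypothetical.

```python
import shutil
import urllib.request


def download(url, dest_path):
    """Stream the response body at url into dest_path and return dest_path."""
    # Both context managers close their handles even if the copy fails.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as output:
        shutil.copyfileobj(response, output)
    return dest_path
```

For the image in the question, this would be called as download("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg", "file01.jpg").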
I am trying to read the content of an XML file from S3 with the boto3 library so that I can parse it, and I am getting the error below.
I am using the following Python code.
import xml.etree.ElementTree as et
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket_name')
key = 'audit'
for obj in bucket.objects.filter(Prefix="Folder/XML.xml"):
    key = obj.key
    body = obj.get()['Body'].read()
    parsed_xml = et.fromstring(body)
I get the error below when I print the parsed_xml variable or body.
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
in ()
----> 1 parsed
NameError: name 'parsed_xml' is not defined
If I print body in the code above, I expect it to show the XML tags.
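You have to define parsed_xml before the for loop. The name is only assigned inside the loop body, so it never exists if the loop does not run (for example, when no object matches the prefix).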
parsed_xml = ''
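An alternative to the empty-string placeholder, as a sketch: collect the parsed documents in a dict that is created before the loop, so an empty result is visible instead of raising NameError. The function name parse_xml_bodies and the in-memory sample are hypothetical; with boto3, the pairs would come from obj.key and obj.get()['Body'].read() as in the question.

```python
import xml.etree.ElementTree as et


def parse_xml_bodies(items):
    """Parse (key, xml_bytes) pairs into a dict of key -> root element."""
    parsed = {}  # defined before the loop, so it exists even with no items
    for key, body in items:
        parsed[key] = et.fromstring(body)
    return parsed


# Hypothetical in-memory sample standing in for objects read from the bucket.
sample = [("Folder/XML.xml", b"<audit><entry id='1'/></audit>")]
result = parse_xml_bodies(sample)

# With no matching objects the result is simply an empty dict.
empty = parse_xml_bodies([])
```

If the dict comes back empty for the real bucket, the prefix passed to bucket.objects.filter probably does not match any key.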
I am new to Python, but I have been given a task in which I need to get and display images from a URL.
I have been using Jupyter notebook with python to try to do this.
import sys
print(sys.version)
3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
I was trying to do it as in this post but none of the answers work.
With
import urllib, cStringIO
file = cStringIO.StringIO(urllib.urlopen(URL).read())
img = Image.open(file)
I get:
ImportError Traceback (most recent call last)
<ipython-input-33-da63c9426dad> in <module>()
1 url='http://images.mid-day.com/images/2017/feb/15-Justin-Bieber.jpg'
2 print(url)
----> 3 import urllib, cStringIO
4
5 file = cStringIO.StringIO(urllib.urlopen(URL).read())
ImportError: No module named 'cStringIO'
With:
from PIL import Image
import requests
from io import BytesIO
response = requests.get(url)
img = Image.open(BytesIO(response.content))
I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-168cd6221ea3> in <module>()
1 #response = requests.get("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg")
----> 2 response.read("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg").decode('utf-8')
3 img = Image.open(StringIO(response.content))
AttributeError: 'Response' object has no attribute 'read'
With:
from PIL import Image
import requests
from StringIO import StringIO
response = requests.get(url)
img = Image.open(StringIO(response.content))
I get:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-37-5716207ad35f> in <module>()
3 from PIL import Image
4 import requests
----> 5 from StringIO import StringIO
6
7 response = requests.get(url)
ImportError: No module named 'StringIO'
Etc....
I thought it was going to be an easy task, but so far I haven't been able to find an answer.
I really hope someone can help me
This worked for me
from PIL import Image
import requests
from io import BytesIO
url = "https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
img.show()
You are getting an error because you used the line response.read("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg").decode('utf-8') instead. I'd switch back to using response = requests.get(url).
Additionally, about your error ImportError: No module named 'cStringIO': you are using Python 3. The StringIO and cStringIO modules from Python 2 were removed in Python 3. Use from io import StringIO for text, or from io import BytesIO for binary data such as images. See StringIO in Python3 for more details.
This might be a duplicate of https://stackoverflow.com/a/46954931/4010864.
For your third option with PIL you can try this:
from PIL import Image
import requests
import matplotlib.pyplot as plt
response = requests.get(url, stream=True)
img = Image.open(response.raw)
plt.imshow(img)
plt.show()
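The reason the working answers use BytesIO can be shown without any network access. In Python 3, response.content is a bytes object, and only io.BytesIO can wrap bytes. The four-byte payload below is a hypothetical stand-in for a real response body.

```python
import io

# Stand-in for response.content: the first bytes of a JPEG file (hypothetical sample).
payload = b"\xff\xd8\xff\xe0"

# BytesIO gives a file-like object over bytes, which is what Image.open expects.
buffer = io.BytesIO(payload)
header = buffer.read(2)

# StringIO only accepts str, so passing bytes raises TypeError.
try:
    io.StringIO(payload)
    string_io_accepts_bytes = True
except TypeError:
    string_io_accepts_bytes = False
```

So any Python 2 snippet that wraps downloaded image data in StringIO needs BytesIO when ported to Python 3.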
Using pyedflib to import edf files, is it possible to import datasets directly from their source? Or is it always necessary to download data and import locally?
For example, I would like to do this:
pyedflib.EdfReader("https://www.physionet.org/pn6/chbmit/chb02/chb02_02.edf")
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-64-d123ce671a2f> in <module>()
----> 1 pyedflib.EdfReader("https://www.physionet.org/pn6/chbmit/chb02/chb02_02.edf")
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.__init__()
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.open()
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.check_open_ok()
IOError: can not open file, no such file or directory
Answer received on GitHub page
import pyedflib
import os
url = "https://www.physionet.org/pn6/chbmit/chb01/chb01_01.edf"
filename = "./chb.edf"
try:
    from urllib import urlretrieve # Python 2
except ImportError:
    from urllib.request import urlretrieve # Python 3
urlretrieve(url,filename)
pyedflib.EdfReader(filename)
os.remove(filename)
https://github.com/holgern/pyedflib/issues/22#issuecomment-341649760
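The download, open, and delete steps above can be wrapped so that the temporary file is removed even if reading fails. This is a Python 3 only sketch using the standard library; the context manager name downloaded is hypothetical and is not part of pyedflib.

```python
import contextlib
import os
import tempfile
from urllib.request import urlretrieve


@contextlib.contextmanager
def downloaded(url, suffix=""):
    """Download url to a temporary file, yield its path, then delete it."""
    handle, path = tempfile.mkstemp(suffix=suffix)
    os.close(handle)  # urlretrieve reopens the path itself
    try:
        urlretrieve(url, path)
        yield path
    finally:
        # Runs whether or not the body of the with block raised.
        os.remove(path)
```

It would be used as with downloaded(url, suffix=".edf") as path: followed by pyedflib.EdfReader(path) inside the block. The reader should be finished with the file before the block exits, because the file is deleted at that point.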