NameError: name 'images' is not defined - python

I am trying to download some images from Amazon to my local computer, using their API, for data analysis.
This is my code, and I am getting the error below. I don't know what to do. Please suggest a solution.
Code
from PIL import Image
import requests
from io import BytesIO
for index, row in images.iterrows():
    url = row['large_image_url']
    response = requests.get(url)
    img = Image.open(BytesIO(response.content))
    img.save('images/15k_images/'+row['asin']+'.jpeg')
Error
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-36-ce8803db815e> in <module>()
9 from io import BytesIO
10
---> 11 for index, row in images.iterrows():
12 url = row['large_image_url']
13 response = requests.get(url)
NameError: name 'images' is not defined
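The traceback points at the loop itself: nothing named images had been created when it ran. In a notebook this often means the cell that loads the data was skipped, or the kernel was restarted. Below is a minimal sketch of binding the name first. It uses the standard csv module and made-up in-memory rows so that it runs anywhere, whereas the original loop evidently expects a pandas DataFrame (with pandas, a call such as pd.read_csv would play the same role). The CSV contents and the helper name build_save_paths are assumptions, not part of the question.

```python
import csv
import io

# Hypothetical stand-in for the product data; the real source is not shown
# in the question. Column names match the ones used in the loop above.
SAMPLE_CSV = """asin,large_image_url
B000000001,https://example.com/a.jpg
B000000002,https://example.com/b.jpg
"""


def build_save_paths(rows, folder="images/15k_images/"):
    """Return (url, destination path) pairs, one per row."""
    return [(row["large_image_url"], folder + row["asin"] + ".jpeg") for row in rows]


# The name must be bound BEFORE the loop that uses it.
images = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

for url, path in build_save_paths(images):
    print(url, "->", path)
```

Once images exists, the download loop from the question can run unchanged on the real data.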

Related

Problems extracting files from a pdf with PyM

I want to extract and save images as .png, from a pdf file. I use the following Python code and PyMuPDF:
import fitz
import io
from PIL import Image
file = "pdf1.pdf"
pdf_file = fitz.open(file)
for page_index in range(len(pdf_file)):
    page = pdf_file[page_index]
    image_list = page.getImageList()
    if image_list:
        print(f"[+] Found a total of {len(image_list)} images in page {page_index}")
    else:
        print("[!] No images found on page", page_index)
    for image_index, img in enumerate(page.getImageList(), start=1):
        xref = img[0]
        base_image = pdf_file.extractImage(xref)
        image_bytes = base_image["image"]
        image_ext = base_image["ext"]
        image = Image.open(io.BytesIO(image_bytes))
        image.save(open(f"image{page_index+1}_{image_index}.{image_ext}", "wb"))
But I get the following error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-bb8715bc185b> in <module>()
10 # get the page itself
11 page = pdf_file[page_index]
---> 12 image_list = page.getImageList()
13 # printing number of images found in this page
14 if image_list:
AttributeError: 'Page' object has no attribute 'getImageList'
Is it related to the PDF file structure (a non-dictionary type)? How could I solve it in that case?
You did not mention the PyMuPDF version you used. The method name getImageList has been deprecated for a long time; the new name, page.get_images(), should have been used. In the most recent version, 1.20.x, the old name has finally been removed.
If you have a lot of old code using those old names you can either use a utility to make a global change, or execute fitz.restore_aliases() after import fitz.
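As a sketch, the question's loop could be rewritten with the current snake_case names as follows. It assumes a PyMuPDF release in which page.get_images() and Document.extract_image() are available. The helper names image_filename and extract_images are made up for illustration. The image bytes are written directly instead of being re-encoded through PIL, so each file keeps the format reported under the "ext" key, which is not necessarily .png.

```python
def image_filename(page_index, image_index, ext):
    """Build the output name used in the question, e.g. image1_1.png."""
    return f"image{page_index + 1}_{image_index}.{ext}"


def extract_images(pdf_path):
    """Save every embedded image of a PDF in the current directory; return the file names."""
    import fitz  # PyMuPDF; imported here so image_filename works without it

    saved = []
    pdf_file = fitz.open(pdf_path)
    try:
        for page_index in range(len(pdf_file)):
            page = pdf_file[page_index]
            # get_images() replaces the removed getImageList()
            for image_index, img in enumerate(page.get_images(), start=1):
                xref = img[0]
                # extract_image() replaces the old extractImage()
                base_image = pdf_file.extract_image(xref)
                name = image_filename(page_index, image_index, base_image["ext"])
                with open(name, "wb") as out:
                    out.write(base_image["image"])
                saved.append(name)
    finally:
        pdf_file.close()
    return saved
```

With the file from the question, the call would be extract_images("pdf1.pdf").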

Using urllib.request to write an image

I am trying to use this code to download an image from the given URL
import urllib.request
resource = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
output = open("file01.jpg","wb")
output.write(resource)
output.close()
However, I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-39-43fe4522fb3b> in <module>()
41 resource = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
42 output = open("file01.jpg","wb")
---> 43 output.write(resource)
44 output.close()
TypeError: a bytes-like object is required, not 'tuple'
I understand that it's the wrong data type for .write(), but I don't know how to feed resource into output.
Right. Use urllib.request.urlretrieve like this:
import urllib.request
resource, headers = urllib.request.urlretrieve("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
image_data = open(resource, "rb").read()
with open("file01.jpg", "wb") as f:
    f.write(image_data)
PS: urllib.request.urlretrieve returns a tuple whose first element is the path of a temporary file. You can read the bytes from that temporary file and save them to a new file.
From the official documentation:
The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.
So I would recommend using urllib.request.urlopen instead. Try the code below:
import urllib.request
resource = urllib.request.urlopen("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg")
output = open("file01.jpg", "wb")
output.write(resource.read())
output.close()
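A slightly more defensive variant of the urlopen approach, as a sketch: the with statement closes both the response and the output file even if an error occurs, and shutil.copyfileobj streams the body in chunks instead of reading it into memory all at once. The helper name download is made up for illustration.

```python
import shutil
import urllib.request


def download(url, filename):
    """Stream the body of `url` into `filename` and return the file name."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as output:
        shutil.copyfileobj(response, output)
    return filename
```

For the image in the question, the call would be download("http://farm2.static.flickr.com/1184/1013364004_bcf87ed140.jpg", "file01.jpg").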

Reading XML file's content from AWS S3 bucket using boto3 library

I am trying to read the content of an XML file for parsing using the boto3 library, and I get the error below.
I am using the following Python code.
import xml.etree.ElementTree as et
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket_name')
key = 'audit'
for obj in bucket.objects.filter(Prefix="Folder/XML.xml"):
    key = obj.key
    body = obj.get()['Body'].read()
    parsed_xml = et.fromstring(body)
I get the error below when printing the parsed_xml variable or body.
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
in ()
----> 1 parsed
NameError: name 'parsed_xml' is not defined
If I print body in the above code, it should show the XML tags.
You have to define 'parsed_xml' before the 'for' statement, because the loop body never runs when the filter matches no objects, and then the name is never assigned.
parsed_xml = ''
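The same idea, sketched without S3 so that it runs anywhere: the name is bound before the loop, and the caller checks the result afterwards. The list of byte strings stands in for the bodies returned by obj.get()['Body'].read(), and the helper name parse_first is hypothetical. None is used here instead of the empty string, so that "nothing matched" is easy to tell apart from parsed content.

```python
import xml.etree.ElementTree as et


def parse_first(bodies):
    """Parse the first XML body, or return None if there is nothing to parse."""
    parsed_xml = None  # bound before the loop, so it exists even for an empty iterable
    for body in bodies:
        parsed_xml = et.fromstring(body)
        break  # only the first match is needed
    return parsed_xml


# Empty iterable: the prefix matched no objects, so there is nothing to parse.
print(parse_first([]))
# One body: the root element is available after the loop.
print(parse_first([b"<audit><item/></audit>"]).tag)
```

If the first call's result shows up in the real code, the Prefix passed to bucket.objects.filter is the first thing to check.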

Displaying/getting Images from an URL in Python

I am new to Python, but I have a task in which I need to get and display images from a URL.
I have been using a Jupyter notebook with Python to try to do this.
import sys
print(sys.version)
3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
I was trying to do it as in this post but none of the answers work.
With
import urllib, cStringIO
file = cStringIO.StringIO(urllib.urlopen(URL).read())
img = Image.open(file)
I get:
ImportError Traceback (most recent call last)
<ipython-input-33-da63c9426dad> in <module>()
1 url='http://images.mid-day.com/images/2017/feb/15-Justin-Bieber.jpg'
2 print(url)
----> 3 import urllib, cStringIO
4
5 file = cStringIO.StringIO(urllib.urlopen(URL).read())
ImportError: No module named 'cStringIO'
With:
from PIL import Image
import requests
from io import BytesIO
response = requests.get(url)
img = Image.open(BytesIO(response.content))
I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-168cd6221ea3> in <module>()
1 #response = requests.get("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg")
----> 2 response.read("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg").decode('utf-8')
3 img = Image.open(StringIO(response.content))
AttributeError: 'Response' object has no attribute 'read'
With:
from PIL import Image
import requests
from StringIO import StringIO
response = requests.get(url)
img = Image.open(StringIO(response.content))
I get:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-37-5716207ad35f> in <module>()
3 from PIL import Image
4 import requests
----> 5 from StringIO import StringIO
6
7 response = requests.get(url)
ImportError: No module named 'StringIO'
Etc....
I thought it was going to be an easy task, but so far I haven't been able to find an answer.
I really hope someone can help me.
This worked for me:
from PIL import Image
import requests
from io import BytesIO
url = "https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
img.show()
You are getting an error because you used the line response.read("https://baobab-poseannotation-appfile.s3.amazonaws.com/media/project_5/images/images01/01418849d54b3005.o.1.jpg").decode('utf-8') instead. I'd switch back to using response = requests.get(url).
Additionally, regarding your error ImportError: No module named 'cStringIO': you are using Python 3. The StringIO and cStringIO modules from Python 2 were removed in Python 3. Use from io import StringIO instead (or from io import BytesIO for binary data such as image bytes). See StringIO in Python3 for more details.
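To illustrate the difference in Python 3, here is a small sketch that uses only the io module. Bytes, which is what response.content holds, go into BytesIO, while StringIO accepts only str. The sample byte string is made up.

```python
from io import BytesIO, StringIO

payload = b"\xff\xd8\xff\xe0 not a real JPEG"  # stand-in for response.content

# BytesIO wraps bytes in a file-like object, which is what Image.open expects.
buffer = BytesIO(payload)
first_bytes = buffer.read(3)

# StringIO is for text; passing bytes raises TypeError.
try:
    StringIO(payload)
    accepted = True
except TypeError:
    accepted = False

print(first_bytes, accepted)
```

This is why the BytesIO version in the question is the right starting point for images, and the StringIO version is not.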
This might be a duplicate of https://stackoverflow.com/a/46954931/4010864.
For your third option with PIL you can try this:
from PIL import Image
import requests
import matplotlib.pyplot as plt
response = requests.get(url, stream=True)
img = Image.open(response.raw)
plt.imshow(img)
plt.show()

Import .edf file directly from online archive in python

Using pyedflib to import edf files, is it possible to import datasets directly from their source? Or is it always necessary to download data and import locally?
For example, I would like to do this:
pyedflib.EdfReader("https://www.physionet.org/pn6/chbmit/chb02/chb02_02.edf")
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-64-d123ce671a2f> in <module>()
----> 1 pyedflib.EdfReader("https://www.physionet.org/pn6/chbmit/chb02/chb02_02.edf")
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.__init__()
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.open()
pyedflib/_extensions/_pyedflib.pyx in pyedflib._extensions._pyedflib.CyEdfReader.check_open_ok()
IOError: can not open file, no such file or directory
Answer received on the GitHub page:
import pyedflib
import os
url = "https://www.physionet.org/pn6/chbmit/chb01/chb01_01.edf"
filename = "./chb.edf"
try:
    from urllib import urlretrieve # Python 2
except ImportError:
    from urllib.request import urlretrieve # Python 3
urlretrieve(url,filename)
pyedflib.EdfReader(filename)
os.remove(filename)
https://github.com/holgern/pyedflib/issues/22#issuecomment-341649760
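One caveat with the snippet above: if EdfReader raises, os.remove is never reached and the downloaded file is left behind. Below is a sketch of a Python 3 only variant that downloads into a temporary file and always cleans up. The context manager name downloaded_file is hypothetical and not part of pyedflib.

```python
import contextlib
import os
import shutil
import tempfile
import urllib.request


@contextlib.contextmanager
def downloaded_file(url, suffix=""):
    """Download `url` to a temporary file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, out)
        yield path
    finally:
        os.remove(path)  # runs even if the body of the `with` block raises
```

Usage would be with downloaded_file(url, suffix=".edf") as filename:, followed by pyedflib.EdfReader(filename) inside the block. All reading has to happen inside the block, because the file is deleted on exit.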
