Python - read array and download each image locally - python

Hello, my favourite helpful people.
I'm trying to download the OpenWeatherMap weather icons locally and am looking for a way to save them all in a loop.
I have code that saves a single icon, but I would like to use a for-each loop over a list/array to get all the others. Basic stuff, I know; I'm just having a brain freeze right now.
This is the single-download code:
import urllib.request
imgURL = "http://openweathermap.org/img/w/03d.png"
urllib.request.urlretrieve(imgURL, "weathericons/03d.png")
03d.png is one of the files. I would like to iterate through an array or list of image names and download each one by changing the last part of the URL.
Hope someone can help, many thanks

You can use a for loop with some format strings to generate the desired URLs:
import urllib.request
for i in range(1, 5):
    imgURL = f"http://openweathermap.org/img/w/0{i}d.png"
    urllib.request.urlretrieve(imgURL, f"weathericons/0{i}d.png")
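If the icon names do not all follow one numeric pattern, looping over an explicit list of names (which is what the question asks for) may be easier. The following is a minimal sketch, not a definitive implementation: the entries in ICON_NAMES are only examples, and the helper names icon_targets and download_icons are made up for illustration.

```python
import os
import urllib.request

BASE_URL = "http://openweathermap.org/img/w/"  # taken from the question
ICON_NAMES = ["01d", "02d", "03d", "04d"]       # example names only; extend as needed


def icon_targets(names, folder="weathericons"):
    """Return (url, local_path) pairs for the given icon names."""
    return [
        (f"{BASE_URL}{name}.png", os.path.join(folder, f"{name}.png"))
        for name in names
    ]


def download_icons(names, folder="weathericons"):
    """Download every icon in names into folder, creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)
    for url, path in icon_targets(names, folder):
        urllib.request.urlretrieve(url, path)
```

Calling download_icons(ICON_NAMES) would then fetch each listed icon into the weathericons folder.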

Related

How do you download Google Image Search images with Python in 2022?

I'm working on a project which requires downloading the first n results from Google Images given a query term, and I'm wondering how to do this. It seems that they deprecated their API recently, and I haven't been able to find a good up-to-date answer. Ultimately, I want to:
1. Enter query term
2. Save the URLs for the first n images in a txt file (example)
3. Download the images from those URLs
I have seen similar solutions that use Selenium but I was hoping to use Requests instead. I'm not very familiar with HTML parsing, but have used beautifulsoup before! Any help is greatly appreciated. I'm currently using Python 3.8.

How to break "select different image" CAPTCHA with Python

I have some Python 3 code that scrapes data from websites using object recognition (I used this to automate the download process inside a Flash-player-based website) and Selenium. I'm stuck on a website that has a custom-made CAPTCHA where the user has to select the image that differs from the rest of the group, and I don't know how to download or get these images from the site in order to identify the different one. Has anyone solved a problem like this, or does anyone have an idea of how to solve this CAPTCHA with another technique or method?
This is the login that has the CAPTCHA
And here's the link to the site, which is in Spanish. The CAPTCHA basically says "Select the different image"
https://portalempresas.sb.cl/login.php
Thanks!
To download those images as png files you could do:
from io import BytesIO
from PIL import Image
from selenium.webdriver.common.by import By
# Save a screenshot of a single web element as an image file
def downloadImage(element, imgName):
    img = element.screenshot_as_png
    stream = BytesIO(img)
    image = Image.open(stream).convert("RGB")
    image.save(imgName)
# Find all the web elements of the captcha images
# (assumes `driver` is an existing Selenium WebDriver already on the login page)
image_elements = driver.find_elements(By.XPATH, "//div[contains(@class,'captcha-image')]")
# Output name for the images
image_base_name = "Imagen_[idx].png"
# Download each image
for i, element in enumerate(image_elements):
    downloadImage(element, image_base_name.replace("[idx]", str(i)))
Edit 1:
If you want to compare 2 images to see if they are equal you could try with this post
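As a rough illustration of the comparison idea, the sketch below finds the one saved file whose bytes differ from all the others. It assumes the matching images are pixel-identical and were saved the same way, so that their file contents are byte-for-byte equal; if the captcha adds noise, a pixel-based comparison would be needed instead. The function names are hypothetical.

```python
import hashlib
from collections import Counter


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def find_odd_one_out(paths):
    """Return the single path whose content differs from all others, or None."""
    digests = {path: file_digest(path) for path in paths}
    counts = Counter(digests.values())
    # Keep only files whose digest occurs exactly once
    unique = [path for path, digest in digests.items() if counts[digest] == 1]
    return unique[0] if len(unique) == 1 else None
```

With the files saved by the loop above, this could be called with the list of Imagen_0.png, Imagen_1.png, and so on.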
Edit 2:
Using the solution edited above, these are the results:

Python: How to download images with the URLs in the excel and replace the URLs with the pictures?

As shown in the picture below, there's an Excel sheet with about 2,000 URLs of cover images in column F.
What I want to do is download the pictures from the URLs and replace each URL with the corresponding image.
That is: download the pictures, insert them into column F, and remove the URLs, all automatically.
How can I accomplish this with Python? Any suggestion or code is welcome. Thanks.
I hope this answers your question:
1. Write a loop over the rows using the Pandas library; you might find https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.read_excel.html and "How to iterate over rows in a DataFrame in Pandas?" interesting.
2. Within every iteration, save the corresponding picture into a folder (maybe name the files with your Pandas index); refer to "python save image from url" to learn how to save a picture from a URL.
3. Use the XlsxWriter library to put each picture in its respective cell; see an example at https://xlsxwriter.readthedocs.io/example_images.html
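The steps above could be sketched roughly as follows. This is only an outline under several assumptions: the first sheet row is a header, the URLs sit in column F (zero-based column 5), and only the image column is written to the new file, since XlsxWriter creates new workbooks rather than editing existing ones. The function names and the "covers" folder are made up for illustration, and urllib from the standard library is used for the download step.

```python
import os
import urllib.request
from urllib.parse import urlparse


def image_path(folder, index, url):
    """Build a local file name from the row index, keeping the URL's extension."""
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"  # ".jpg" is a guess
    return os.path.join(folder, f"cover_{index}{ext}")


def insert_images(source_xlsx, target_xlsx, url_col=5, folder="covers"):
    """Download each URL in column url_col and place the image in a new workbook."""
    # Third-party packages, imported here so image_path works without them
    import pandas as pd
    import xlsxwriter

    os.makedirs(folder, exist_ok=True)
    urls = pd.read_excel(source_xlsx).iloc[:, url_col].dropna()

    workbook = xlsxwriter.Workbook(target_xlsx)
    worksheet = workbook.add_worksheet()
    for index, url in urls.items():
        path = image_path(folder, index, url)
        urllib.request.urlretrieve(url, path)
        # index + 1 skips the header row of the original sheet
        worksheet.insert_image(index + 1, url_col, path)
    workbook.close()
```

Row heights and image scaling are left at their defaults here; with 2,000 covers you would probably want to adjust both.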

Python Reddit API have URLs in variable but need to sort them

I'm using PRAW, a Python wrapper for the Reddit API (http://praw.readthedocs.io/en to find out more), and I have managed to print the URLs of my latest upvoted posts.
# start subreddit instance
subreddit = reddit.subreddit('dankmemes')
print("\n -> Current Subreddit: ")
print(subreddit.display_name)
redditor2 = reddit.redditor('me')
for upvoted in reddit.redditor('Kish_v').upvoted():
    print(upvoted.url)
This outputs a long list of imgur URLs etc. However, I want to be able to download those images to a folder and then reupload them, as a sort of scraper.
I have got to the point where upvoted.url holds my URLs, but would the best way to do the above be to put the links into an "array" and then download those images individually? How would I go about doing this? Sorry, I am fairly new to Python; I came from PHP to use this well-documented API.
Thank you,
Kish
Check out the requests module; this SO question should point you in the right direction for downloading the URLs you have on hand. I can't comment on the reuploading half.
How to download image using requests
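As a rough sketch of the "collect the links in a list, then download each one" idea: the example below uses urllib from the standard library instead of requests so that it runs without extra packages; requests would work in much the same way. The function names and the "downloads" folder are hypothetical, and it assumes each URL points directly at an image file.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def download_all(urls, folder="downloads"):
    """Download each URL into folder and return the list of saved paths."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in urls:
        name = filename_from_url(url)
        if not name:  # skip URLs without a file name, e.g. ones ending in "/"
            continue
        path = os.path.join(folder, name)
        urllib.request.urlretrieve(url, path)
        saved.append(path)
    return saved
```

The list itself could be built from the loop in the question, for example by appending each upvoted.url to a list instead of printing it, and then passing that list to download_all.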

Downloading a file from a page

I would like to make a script (in any language, but preferably Python or Perl) that downloads a specific type of file being streamed by a web page. However, I do not know this file's location, so I will have to find it by listing all the files being streamed by the page and selecting the one I want based on file type.
A similar example would be wanting to download a video off YouTube, where there is no pattern or way to find the URL except by finding the files being streamed to my computer.
The part I cannot figure out is how to find all the files being streamed by the page. The rest I can do myself. The file name is not mentioned anywhere in the source of the HTML page.
Example of the problem...
This works fine:
import urllib
urllib.urlretrieve ("http://example.com/anything.mp3", "a.mp3")
However this does not:
import urllib
urllib.urlretrieve ("http://example.com/page-where-the-mp3-file-is-being-streamed.html", "a.mp3")
If someone can help me figure out how to download all the files from a page, or how to find the files being streamed, I would really appreciate it. All I need to know is which language/library/method can accomplish this. Thanks
