The extension in the image URL is different from the actual format - Python

I downloaded an image from a URL such as "https://www.xxxx.com/filename.jpeg". I expected it to be a JPEG image, a format that the Computer Vision Annotation Tool (CVAT) accepts. However, it was saved as filename.heif or filename.jpeg.heif, which causes an error when I try to create a task with that image, because CVAT does not accept the HEIF format. (CVAT automatically downloads the images and creates a task once I enter the image URLs and submit them.)
I usually submit more than 1000 image URLs to create a task, and it is really hard to find the invalid URLs or images among them.
Is there any way to find the "actual format" only by looking at the image URL? Or can I just skip invalid URLs in CVAT?
Thank you.
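The URL by itself cannot reveal the real format, because a server is free to send any bytes under any file name. One way to check is to look at the first few bytes of each download (the file signature). The following is a minimal sketch, not a CVAT feature: sniff_image_format and fetch_header are hypothetical helper names, and the HEIF brand list is a common subset rather than an exhaustive one.

```python
import urllib.request
from typing import Optional

# Common HEIF/HEIC "ftyp" brands (assumption: a subset, not exhaustive)
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


def sniff_image_format(header: bytes) -> Optional[str]:
    """Guess the image format from the first bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # ISO base media files carry an "ftyp" box; the brand follows it
    if header[4:8] == b"ftyp" and header[8:12] in HEIF_BRANDS:
        return "heif"
    return None


def fetch_header(url: str, n: int = 32, timeout: float = 10.0) -> bytes:
    """Download only the first n bytes of the resource at url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read(n)
```

With these two helpers you could filter a URL list before handing it to CVAT, keeping only the URLs for which sniff_image_format(fetch_header(url)) returns a format that CVAT accepts.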

Related

Loading a .tiff dataset in FiftyOne through browser

I have a .tiff image dataset that I want to load in FiftyOne. I've gone through the docs and only found the GeoTIFF dataset type, so I loaded it as a fiftyone.types.ImageDirectory.
I got: Type image/tiff may not be supported.
Searching Stack Overflow for a solution, I came across this answer from Eric, https://stackoverflow.com/a/73775999/19902725, which suggests using a browser extension, or Safari, since it natively supports loading .tiff.
1 - The extensions work by intercepting the URL and checking whether it ends with .TIFF, so that the extension itself can handle the request. FiftyOne loads the dataset using a URL but loads the individual images in it dynamically, which doesn't trigger the extension (at least in the Brave browser).
2 - I switched to Safari after giving up on the extension route, but the loaded images are cropped to less than a quarter of the original image (1440 × 1080).
Any other solutions?
An alternative is to use the newly added support for multiple media fields per sample. With this, you could generate a png or jpg for each tiff image and store these alternate filepaths on your samples in a new field, then toggle between tiff and png/jpg media in the App.
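import fiftyone as fo
# dataset is assumed to be an existing fo.Dataset
sample = fo.Sample(filepath="/path/to/img.tiff")
sample["jpg_filepath"] = "/path/to/img.jpg"
dataset.add_sample(sample)
# Expose the new field as a selectable media field in the App
dataset.app_config.media_fields.append("jpg_filepath")
dataset.save()  # must save after edits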

How to break "select different image" CAPTCHA with Python

So I have this code in Python 3 that scrapes data from websites through object recognition (I used this to automate the download process inside a Flash player based website) and Selenium. The problem is that I'm stuck on a website that has a custom-made CAPTCHA where the user has to select the different image from a group, and I don't know how to download or get these images from the site in order to identify the different one. Has anyone solved a problem like this, or does anyone have an idea of how to solve this CAPTCHA with any other technique or method?
This is the login that has the CAPTCHA
And here's the link to the site, which is in Spanish. The CAPTCHA basically says "Select the different image"
https://portalempresas.sb.cl/login.php
Thanks!
To download those images as PNG files you could do:
from io import BytesIO
from PIL import Image
from selenium.webdriver.common.by import By
# Save a screenshot of a single web element as an image file
def downloadImage(element, imgName):
    img = element.screenshot_as_png
    stream = BytesIO(img)
    image = Image.open(stream).convert("RGB")
    image.save(imgName)
# Find all the web elements of the captcha images
# (driver is the Selenium WebDriver that already opened the page)
image_elements = driver.find_elements(By.XPATH, "//div[contains(@class,'captcha-image')]")
# Output name for the images
image_base_name = "Imagen_[idx].png"
# Download each image
for i, element in enumerate(image_elements):
    downloadImage(element, image_base_name.replace("[idx]", str(i)))
Edit 1:
If you want to compare 2 images to see if they are equal you could try with this post
Edit 2:
Using the solution edited above, these are the results:

How can i get someone's profile pic on Discord to edit it using PIL?

I'm trying to write some Python code to edit someone's profile picture, but all I've got so far is this:
image = ctx.message.author.avatar_url
background = Image.open(image)
Apparently that just gets the URL, but I need the image itself to edit it with PIL. Any insight on how to get it?
import requests
# Download the avatar bytes, then write them to a local file
with requests.get(ctx.message.author.avatar_url) as r:
    img_data = r.content
with open('image_name.jpg', 'wb') as handler:
    handler.write(img_data)
So I played about with this link a bit:
https://cdn.discordapp.com/avatars/190434822328418305/6a56d4edf2a82409ffc8253f3afda455.png
And I was able to save my own avatar image (the one I use for my accounts everywhere). I was then able to open the file normally with the image viewer in PyCharm.
After that, it is simply a matter of opening the newly saved file with PIL or Pillow instead of trying to open anything from a website, if that makes sense.
Bear in mind that this saves a file onto the server running your Discord bot, so it is a crude approach: a malformed or maliciously crafted image file could lead to some sort of remote vulnerability.
Further to your comment, if you want the downloaded image to be bigger, for example, use the amended link below:
https://cdn.discordapp.com/avatars/190434822328418305/6a56d4edf2a82409ffc8253f3afda455.png?size=<Number from list [16,32,64,128,256,512,1024,2048]>
Hope this helps :)
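As a small illustration of the size parameter described above, the query string can be built with the standard library instead of by hand. This is only a sketch: avatar_url_with_size is a hypothetical helper name, and the allowed sizes are simply the list quoted in this answer, not a value checked against Discord's documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Sizes taken from the list in the answer above (assumption: still current)
ALLOWED_SIZES = (16, 32, 64, 128, 256, 512, 1024, 2048)


def avatar_url_with_size(url: str, size: int) -> str:
    """Return url with its size query parameter set to size."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {size}")
    parts = urlsplit(url)
    # Drop any existing size parameter, keep everything else
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "size"]
    query.append(("size", str(size)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The resulting URL can then be passed to requests.get in the same way as the original avatar URL.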

Is there a way to resize jpgs via http in python?

I am writing a webcrawler that finds and saves the urls of all the images on a website. I can get these without problem. I need to upload these urls, along with a thumbnail version of them, to a server via http request, which will render the image and collect feature information to use in various AI applications.
For some urls this works no problem.
http://images.asos-media.com/products/asos-waxed-parka-raincoat-with-zip-detail/7260214-1-khaki
resizes into
http://images.asos-media.com/products/asos-waxed-parka-raincoat-with-zip-detail/7260214-1-khaki?wid=200
but for actual .jpg images this method doesn't work, like for this one:
https://cdn-images.farfetch-contents.com/11/85/29/57/11852957_8811276_480.jpg
How can I resize the jpgs via url?
Resizing the image via the URL only works if the site you're hitting uses a dynamic media service or tool in its stack. That's why ASOS lets you append a query with the dimensions for the resize; different dynamic media tools, however, use different query parameters.
If you want this to be robust, you're best off downloading the image, resizing it with Python and then uploading it.

how to save image using python?

I want to save a .jsp image from a web page onto my computer using Python.
I have tried many methods including
retrieve function in mechanize and
urllib.urlretrieve('http://example.com/img.jsp', 'img.jsp')
but the problem is that when I try to open the image using the image library, it throws the following error
  File "code.py", line 71, in extract_image
    im = Image.open(image_file)
  File "/usr/lib/python2.6/dist-packages/PIL/Image.py", line 1980, in open
    raise IOError("cannot identify image file")
I have even tried saving the image in .png format, but it's not working.
But I can save it manually by going to the image URL and then saving the image.
Please help!
You haven't provided enough information, but my guess is that the web server isn't responding the way you think it is. Did you inspect the HTTP traffic with Fiddler or Firebug, or look at what's in the file?
Can you get a copy of the image some other way? If so, compare that to what you downloaded programmatically.
Finally, I am not sure what a JSP image is. If img.jsp is a JavaServer Page that responds with an image, that doesn't make the image a JSP image; it's still in the format that corresponds to its Content-Type.
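To make the "look at what's in the file" advice concrete, here is a rough sketch that inspects the Content-Type header and the start of the response body. diagnose_download is a hypothetical helper, and the HTML check is a simple heuristic rather than a complete parser.

```python
def diagnose_download(content_type: str, body: bytes) -> str:
    """Describe what a server actually returned for an image URL."""
    # Strip parameters such as "; charset=utf-8" from the header value
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("image/"):
        return f"server says this is {media_type}"
    # A login or error page saved to disk usually starts with HTML markup
    snippet = body[:200].lstrip().lower()
    if snippet.startswith((b"<!doctype html", b"<html")):
        return "HTML page, not an image (possibly a login or error page)"
    return f"unexpected content type: {media_type or 'missing'}"
```

If the result reports an HTML page, the script is probably missing something the browser sends, such as cookies or a referer, which would explain why saving manually works.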
