I am currently trying to take a screenshot of a specific web element using Selenium and Pillow on Python 2.7. The code I have seems to take the screenshot and then crop the image, but it does not crop at the exact location of the element. After researching, I found that this could be caused by the DPI of the MacBook Pro Retina display, but I haven't found a workaround for it.
What could be the best way to handle this "issue" with the retina display and PIL?
This is the code I am using:
try:
    # Wait until the page has finished loading
    waitElement = WebDriverWait(driver, 60).until(
        EC.presence_of_element_located((By.XPATH, "//*[@id='someElement']/span/input[2]"))
    )
    try:
        # Go to trends report
        driver.get("https://somewebsite.com/trends")
        waitElement = WebDriverWait(driver, 60).until(
            EC.presence_of_element_located((By.ID, "Totals"))
        )
        try:
            summary = driver.find_element_by_id("Totals")
            driver.save_screenshot("sc.png")
            location = summary.location
            size = summary.size
            x = location['x']
            y = location['y']
            w = size['width']
            h = size['height']
            width = x + w
            height = y + h
            im = Image.open('sc.png')
            im = im.crop((int(x), int(y), int(width), int(height)))
            im.save('summary.png')
        except NoSuchElementException:
            print("Something")
    except NoSuchElementException:
        print("Something")
finally:
    driver.close()
I ran into this (screenshots cropped in the wrong place on a Mac Retina display) and solved it by scaling the element's dimensions by window.devicePixelRatio.
This is the snippet I used (Java):
// Basically, I use 'devicePixelRatio' to calculate the proper width/height
WebElement ele = findElement(By.xpath(webElementXpath));
Point point = ele.getLocation();
int devicePixelRatio = Integer.parseInt(returnExecuteJavaScript("return window.devicePixelRatio"));
int eleWidth = ele.getSize().getWidth() * devicePixelRatio;
int eleHeight = ele.getSize().getHeight() * devicePixelRatio;
// ....... for cropping stuff , you can use your own code
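For the Python code in the question, the same idea might look like the sketch below. The helper name retina_crop_box is made up for illustration; it scales both the position and the size of the element by the pixel ratio, on the assumption that the screenshot is captured in device pixels while Selenium reports the element in CSS pixels.

```python
def retina_crop_box(location, size, pixel_ratio):
    """Convert an element's CSS-pixel rectangle into a screenshot crop box.

    location and size are the dicts returned by WebElement.location and
    WebElement.size; pixel_ratio is window.devicePixelRatio (typically 2
    on a Retina display, 1 on a standard one).
    """
    left = int(location['x'] * pixel_ratio)
    top = int(location['y'] * pixel_ratio)
    right = int((location['x'] + size['width']) * pixel_ratio)
    bottom = int((location['y'] + size['height']) * pixel_ratio)
    return (left, top, right, bottom)


# Assumed usage with the question's variables (driver, summary, Image):
#   ratio = driver.execute_script("return window.devicePixelRatio")
#   box = retina_crop_box(summary.location, summary.size, ratio)
#   Image.open('sc.png').crop(box).save('summary.png')
```

With a ratio of 1 the box is identical to the one the question already computes, so the same code should keep working on non-Retina screens.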
I would like to get a screenshot of a full web page with Selenium python. To do that I used this answer.
But now I would like to divide the screenshot and limit the height of the resulting screenshot to 30 000px. For example, if a webpage height is 40 000px, I would like a first 30 000px screenshot and then a 10 000 px screenshot.
The solution should not be to save the full page and then crop the image, because I need it to work for very long webpages. Indeed it is not possible to get a screenshot with 120 000px height or you get this kind of error :
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: [Exception... "Failure" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://marionette/content/capture.js :: capture.canvas :: line 154" data: no]
I tried this, but it does not work at all:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from time import sleep
def save_screenshot(d, path):
    max_size = 30000
    original_size = d.get_window_size()
    required_width = d.execute_script('return document.body.parentNode.scrollWidth')+74
    required_height = d.execute_script('return document.body.parentNode.scrollHeight')+74
    if required_height <= max_size:
        d.set_window_size(required_width, required_height)
        d.find_element_by_tag_name('body').screenshot(path) # avoids scrollbar
    else:
        for i in range(0,int(required_height/max_size)):
            d.set_window_position(0,max_size*i)
            d.set_window_size(required_width, max_size)
            d.find_element_by_tag_name('body').screenshot(path.replace(".png",str(i+1)+".png"))
        d.set_window_position(0,max_size*int(required_height/max_size))
        d.set_window_size(required_width, required_height%max_size)
        d.find_element_by_tag_name('body').screenshot(path.replace(".png",str(int(required_height/max_size)+1)+".png"))
        d.set_window_position(0,0)
    d.set_window_size(original_size['width'], original_size['height'])
if __name__ == '__main__':
    options = Options()
    options.headless = True
    driver = webdriver.Firefox(options=options,executable_path=r"C:/Users/path/geckodriver.exe")
    driver.get("http://www.lwb.com/")
    sleep(3)
    save_screenshot(driver,"C:/Users/path/test.png")
    driver.close()
Can someone help me here please ?
Thank you,
Mylha
You can crop the saved screenshot using Pillow (pip install pillow):
def crop_image(img, img_limit):
    img_width, img_height = img.size
    crop_dim = (0, 0, img_width, img_limit) # left, top, right, bottom
    cropped_img = img.crop(crop_dim)
    return cropped_img
After calling save_screenshot(driver, "path/to/img"), do the following:
from PIL import Image
img_limit = 30000 # your image size limit
img = Image.open("path/to/img")
img = crop_image(img, img_limit)
img.save("path/to/img")
If you don't want to save the image before manipulating it, you can use get_screenshot_as_png, which returns the binary data instead of saving it:
from PIL import Image
from io import BytesIO
img_limit = 30000 # your image size limit
img_binary = driver.get_screenshot_as_png()
img = Image.open(BytesIO(img_binary))
img = crop_image(img, img_limit)
img.save("path/to/save")
Make sure to run del img and del img_binary when you're done, to release the binary data from memory.
To take one screenshot of the entire page, do this:
from selenium import webdriver
DRIVER_PATH = "path/to/chrome/driver"
URL = "site.url"
options = webdriver.ChromeOptions()
options.add_argument("headless")
driver = webdriver.Chrome(executable_path = DRIVER_PATH, chrome_options = options)
# setting a long window height to take one full screenshot
driver.set_window_size(1920, 90000) # width , height
driver.get(URL)
driver.maximize_window()
img_binary = driver.get_screenshot_as_png()
PS: If you use this method to take the screenshot, you won't need Pillow. Simply use set_window_size to set the window height and width you want, and the screenshot will come out at that same size.
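The crop_image helper above keeps only the top of the page. To cover the whole page in pieces of at most 30 000 px, as the question asks, the slice boundaries could be computed first. The following is only a sketch: chunk_ranges is a hypothetical helper, and how each slice is then captured (a Pillow crop of one large image, or resizing and scrolling the window per slice) is left open because it depends on the driver.

```python
def chunk_ranges(total_height, max_height=30000):
    """Split a page of total_height pixels into (top, height) slices.

    Every slice is at most max_height pixels tall; the last one holds
    whatever remains.
    """
    ranges = []
    top = 0
    while top < total_height:
        height = min(max_height, total_height - top)
        ranges.append((top, height))
        top += height
    return ranges


print(chunk_ranges(40000))  # [(0, 30000), (30000, 10000)]
```

For a Pillow image that is already in memory, each pair would map to img.crop((0, top, img.width, top + height)).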
Sample Image
Hello,
I created an application in Python that selects a Region of Interest (ROI) in an image, then records and labels it. However, it is limited to one ROI per image. Does anyone know how to select multiple ROIs per image? Also, as you can see in the attached image, I have multiple windows; I would like a single window with different options. What packages are used for this kind of application?
Here's my code in Python using OpenCV (cv2). Thank you in advance for the help.
for image in filelist:
    img = cv2.imread(image)
    fromCenter = False
    r = cv2.selectROI(img, fromCenter)
    lbl = simpledialog.askstring("Image Label", "Please Enter Label")
    result = eTree.SubElement(results, "Image")
    path = eTree.SubElement(result, 'Path')
    roi = eTree.SubElement(result, 'ROI')
    label = eTree.SubElement(result, 'Label')
    path.text = str(image)
    roi.text = str(r)
    label.text = str(lbl)
    tree = eTree.ElementTree(results)
    i = i + 1
    if i == count:
        format = [('XML Files', '*.xml'), ('All Files', '*.*')]
        save = filedialog.asksaveasfilename(filetype=format, defaultextension='*.xml')
        tree.write(save, xml_declaration=True, encoding='utf-8', method="xml")
Well, at least for the first part of the question: have you considered using cv2.selectROIs() instead of cv2.selectROI()? When the image window opens, you select your first ROI and press Enter, then the second and press Enter, and so on. When you are finished, press the Escape key. It returns the x, y, w, h of each ROI. Note that you will have to change your code accordingly, but it will allow you to select multiple ROIs.
Input image:
Example:
import cv2
img = cv2.imread('rois.png')
fromCenter = False
ROIs = cv2.selectROIs('Select ROIs', img, fromCenter)
ROI_1 = img[ROIs[0][1]:ROIs[0][1]+ROIs[0][3], ROIs[0][0]:ROIs[0][0]+ROIs[0][2]]
ROI_2 = img[ROIs[1][1]:ROIs[1][1]+ROIs[1][3], ROIs[1][0]:ROIs[1][0]+ROIs[1][2]]
ROI_3 = img[ROIs[2][1]:ROIs[2][1]+ROIs[2][3], ROIs[2][0]:ROIs[2][0]+ROIs[2][2]]
cv2.imshow('1', ROI_1)
cv2.imshow('2', ROI_2)
cv2.imshow('3', ROI_3)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
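The example hard-codes three ROIs. When the number of selections is not known in advance, a loop over whatever selectROIs returned may be easier. The sketch below uses a made-up helper, roi_to_slices, that turns one (x, y, w, h) row into the pair of slices used to index the image array.

```python
def roi_to_slices(roi):
    """Turn one (x, y, w, h) rectangle into (row_slice, col_slice)."""
    x, y, w, h = (int(v) for v in roi)
    return (slice(y, y + h), slice(x, x + w))


# Assumed usage with the img and ROIs from the example above:
#   for n, roi in enumerate(ROIs, start=1):
#       cv2.imshow(str(n), img[roi_to_slices(roi)])

print(roi_to_slices((10, 20, 30, 40)))
# (slice(20, 60, None), slice(10, 40, None))
```

In the question's labelling loop, the same iteration could write one ROI element per rectangle instead of a single one per image.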
For custom ROIs you can use EasyROI. It supports rectangles, lines, circles and polygons.
To use it:
pip install EasyROI
from EasyROI import EasyROI
roi_helper = EasyROI()
roi = roi_helper.draw_rectangle(frame, quantity=2)
I am working on a project that I want to run invisibly. I therefore used this website, http://pytoexe.com/, to create a window-based exe file from my Python script, which means it does not use the Windows console.
Unfortunately, since my code uses the PhantomJS driver, it opens a PhantomJS console window, which gets in my way.
To solve this problem I need to add a line or a script that prevents the PhantomJS console from appearing (changing something in my Selenium files or something like that would not work, because the problem is probably in their files and I cannot do anything about that).
Does anyone know what to do?
This is my exe file
and this is my code:
from selenium import webdriver
import time
from PIL import Image
from constants import *
import operator
import os
#Email Constants
DEFAULT_CONTENT = 'example email stuff here'
HOST = 'smtp.gmail.com'
PORT = 587
EMAIL = 'freeadsandspam#gmail.com'
CUSTOMERS = []
SUBJECTS = ['new ad were found', 'ad were found by SMADS', 'best ad for you']
COMMASPACE = ', '
#Getting History
OUTPUT_FILE_PATH = r'C:\search_logger.txt'
COPY_OF_THE_HISTORY_PATH = r'C:\history'
NEW_OUTPUT_FILE_PATH = r'C:\last_search_logger.txt'
#PhantomJs And Ads Finding
PHANTOM_JS_PATH = r'C:\phantomjs-2.1.1-windows\bin\phantomjs.exe'
OUTPUT_AD_PATH = 'ad.png'
DEFAULT_WINDOW_SIZE = (1024, 768)
AD_DATABASE = 'https://www.findads.com.au/'
KEYWORD_BUTTON_XPATH = '//*[@id="txtSearch"]'
SEARCH_BUTTON_XPATH = '/html/body/div[1]/div/form/button'
CONSTANT_TIME_SLEEPING = 3
AD_XPATH = '/html/body/div[1]/section/div/div[1]/div[4]/div[1]/div[1]/section[24]'
COMPARE_ELEMENT_XPATH = '//*[@id="fSearch"]'
CATAGORY_SORT_XPATH = '/html/body/div[1]/section/div/div[1]/div[5]/div/div[3]/form/div[1]/div[1]'
class PhantomJsDriver:
    """
    A class that takes care of finding ads
    on the internet, using PhantomJS as a
    background driver
    """
    def __init__(self, ad_keyword, window_size=DEFAULT_WINDOW_SIZE, phantom_js_path=PHANTOM_JS_PATH, output_ad_path=OUTPUT_AD_PATH):
        """
        this function initializes our object
        so that the other functions the object
        offers can be used.
        Parameters
        ----------
        phantom_js_path : str
            path of the PhantomJS executable ( this is essential because
            we cannot find the PhantomJS file otherwise )
        output_ad_path : str
            where you want to save the ad that the system
            found, and the name of the ad
            file ( eg: ad.png )
        ad_keyword : str
            the keyword that defines which ad the system brings
            ( eg: dog will bring a dog ad )
        window_size : pair of ints (int1, int2)
            defines the window size of the browser ( mainly for the
            screenshot )
        """
        self.phantom_js_path = phantom_js_path
        self.output_ad_path = output_ad_path
        self.ad_keyword = ad_keyword
        self.window_size = window_size
        self.list_of_images = []
        self.dict = {}
    def get_ad(self):
        """
        this function saves the ad by searching the internet
        ( on a specific website ) for the keyword that the user chose
        and copies it into the output_ad_path.
        """
        for i in range(0, 5):
            driver = webdriver.PhantomJS(self.phantom_js_path)
            driver.set_window_size(self.window_size[0], self.window_size[1])
            driver.get(AD_DATABASE)
            keyword = driver.find_element_by_xpath(KEYWORD_BUTTON_XPATH)
            keyword.send_keys(self.ad_keyword)
            search_button = driver.find_element_by_xpath(SEARCH_BUTTON_XPATH)
            search_button.click()
            driver.save_screenshot("ad" + str(i) + ".png")
            element = driver.find_element_by_xpath(AD_XPATH) # find part of the page you want image of
            self.crop_image(i, element)
    def crop_image(self, i, element):
        """
        this function crops the screenshot of the ads website from
        the previous function into one single ad.
        """
        im = Image.open("ad" + str(i) + ".png") # uses PIL library to open image in memory
        location = element.location
        size = element.size
        left = location['x']
        top = location['y']
        right = location['x'] + size['width'] + 50
        bottom = location['y'] + size['height']
        width, height = im.size
        print height
        im = im.crop((left, top, right, bottom)) # defines crop points
        im.save('test' + str(i) + '.png') # saves new cropped image
        self.list_of_images.append('test' + str(i) + '.png')
        self.dict['test' + str(i) + '.png'] = 0
    def choose_the_best_ad(self):
        for img1 in self.list_of_images:
            for img2 in self.list_of_images:
                im1 = Image.open(img1)
                im2 = Image.open(img2)
                if list(im1.getdata()) == list(im2.getdata()):
                    self.dict[img1] += 1
                    self.dict[img2] += 1
        print self.dict
        BestImage = max(self.dict.iteritems(), key=operator.itemgetter(1))[0]
        print BestImage
        if os.path.exists("TheImage.png"):
            os.remove("TheImage.png")
        os.rename(BestImage, "TheImage.png")
driver = PhantomJsDriver("dog")
driver.get_ad()
driver.choose_the_best_ad()
Well, while I don't consider myself an experienced programmer, especially when it comes to multimedia applications, I had to write an image viewer of sorts that displays images fullscreen, with the image changing when the user presses the <- or -> arrow keys.
The general idea was to make a listbox showing all the images contained in a certain folder (imgs). When someone double-clicks or hits RETURN, a new fullscreen frame is generated containing, at first, the image selected in the listbox; after the user hits the arrows, the images change in consecutive order, just like in a regular image viewer.
At the beginning, while the first 15 - 20 images are generated, everything goes well. After that, although the program still works, a very unpleasant and highly unwanted intermediate effect appears between image generations: some previously generated images are displayed quickly on the screen, and only then does the right, consecutive image appear. At first the effect is barely noticeable, but after a while it takes longer and longer between generations.
Here's the code that runs when someone double-clicks a listbox entry:
def lbclick(self, eve):
    frm = wx.Frame(None, -1, '')
    frm.ShowFullScreen(True)
    self.sel = self.lstb.GetSelection() # getting the selection from the listbox
    def pressk(eve):
        keys = eve.GetKeyCode()
        if keys == wx.WXK_LEFT:
            self.sel = self.sel - 1
            if self.sel < 0:
                self.sel = len(self.chk) - 1
            imgs() # invoking the function made for displaying fullscreen images when the left arrow key is pressed
        elif keys == wx.WXK_RIGHT:
            self.sel = self.sel + 1
            if self.sel > len(self.chk) - 1:
                self.sel = 0
            imgs() # doing the same for the right arrow
        elif keys == wx.WXK_ESCAPE:
            frm.Destroy()
        eve.Skip()
    frm.Bind(wx.EVT_CHAR_HOOK, pressk)
    def imgs(): # building the function
        imgsl = self.chk[self.sel]
        itm = wx.Image(str('imgs/{0}'.format(imgsl)), wx.BITMAP_TYPE_JPEG).ConvertToBitmap() # obtaining the name of the image stored in the list self.chk
        mar = itm.Size # Because not all images are landscape, I had to figure out a way to rescale them by height, which is common to all images
        frsz = frm.GetSize()
        marx = float(mar[0])
        mary = float(mar[1])
        val = frsz[1] / mary
        vsize = int(mary * val)
        hsize = int(marx * val)
        panl = wx.Panel(frm, -1, size = (hsize, vsize), pos = (frsz[0] / 2 - hsize / 2, 0)) # making a panel container
        panl.SetBackgroundColour('#000000')
        imag = wx.Image(str('imgs/{0}'.format(imgsl)), wx.BITMAP_TYPE_JPEG).Scale(hsize, vsize, quality = wx.IMAGE_QUALITY_NORMAL).ConvertToBitmap()
        def destr(eve): # unprofessionally trying to destroy the panel container when a new image is generated, hoping the unwanted effect with previously generated images will disappear. But it doesn't.
            keycd = eve.GetKeyCode()
            if keycd == wx.WXK_LEFT or keycd == wx.WXK_RIGHT:
                try:
                    panl.Destroy()
                except:
                    pass
            eve.Skip()
        panl.Bind(wx.EVT_CHAR_HOOK, destr) # the end of my futile tries
        if vsize > hsize: # if the image is portrait instead of landscape I have to put a black image as a container, otherwise the previous image will remain in the background, even if I use Refresh() on the container (the black bitmap named fundal)
            intermed = wx.Image('./res/null.jpg', wx.BITMAP_TYPE_JPEG).Scale(frsz[0], frsz[1]).ConvertToBitmap()
            fundal = wx.StaticBitmap(frm, 101, intermed)
            stimag = wx.StaticBitmap(fundal, -1, imag, size = (hsize, vsize), pos = (frsz[0] / 2 - hsize / 2, 0))
            fundal.Refresh()
            stimag.SetToolTip(wx.ToolTip('Esc = exits fullscreen\n<- -> arrows = quick navigation'))
            def destr(eve): # the same lame attempt to destroy the container in the portrait images situation
                keycd = eve.GetKeyCode()
                if keycd == wx.WXK_LEFT or keycd == wx.WXK_RIGHT:
                    try:
                        fundal.Destroy()
                    except:
                        pass
                eve.Skip()
            frm.Bind(wx.EVT_CHAR_HOOK, destr)
        else: # the case when the images are landscape
            stimag = wx.StaticBitmap(panl, -1, imag)
            stimag.Refresh()
            stimag.SetToolTip(wx.ToolTip('Esc = exits fullscreen\n<- -> arrows = quick navigation'))
    imgs() # invoking the function imgs for the situation when someone double-clicks
    frm.Center()
    frm.Show(True)
Thanks for any advice in advance.
Added later:
The catch is that I'm trying to make an autorun presentation for a DVD with lots of images on it. Anyway, it's not necessary to make the above piece of code work properly if there are other options. I've already tried to use the Windows image viewer, but strangely enough it doesn't recognize relative paths, and when I do this
path = os.getcwd() # getting the path of the current working directory
sel = listbox.GetSelection() # getting the value of the current selection from the list box
imgname = memlist[sel] # retrieving the name of the image stored in a list, using the listbox selection, which uses the same list as its choices
os.popen(str('rundll32.exe C:\WINDOWS\System32\shimgvw.dll,ImageView_Fullscreen {0}/imgsdir/{1}'.format(path, imgname))) # opening the images up with bloody windows image viewer
it opens the images only when my program is on the hard disk; if it's on a CD / image drive it doesn't do anything.
There's an image viewer included with the wxPython demo package. I also wrote a really simple one here. Either one should help you in your journey. I suspect that you're not reusing your panels/frames and instead you're seeing old ones that were never properly destroyed.
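If the suspicion about panels piling up is right, one way to reuse widgets might be to create a single wx.StaticBitmap once, when the fullscreen frame is built, and only swap its bitmap on each key press. The sketch below is an assumption-laden illustration rather than the demo's code: wrap_index, fit_to_height and show_image are invented names, and show_image expects a StaticBitmap that already exists.

```python
def wrap_index(index, count):
    """Wrap index into the range 0..count-1 (left of the first goes to the last)."""
    return index % count


def fit_to_height(img_w, img_h, frame_w, frame_h):
    """Scale an image to the frame height; return (width, height, x_offset)."""
    scale = float(frame_h) / img_h
    new_w = int(img_w * scale)
    new_h = int(img_h * scale)
    return new_w, new_h, (frame_w - new_w) // 2


def show_image(static_bitmap, path, frame_size):
    """Load path and display it in an EXISTING wx.StaticBitmap."""
    import wx  # imported here so the helpers above work without wxPython
    image = wx.Image(path, wx.BITMAP_TYPE_ANY)
    w, h, x = fit_to_height(image.GetWidth(), image.GetHeight(),
                            frame_size[0], frame_size[1])
    bitmap = image.Scale(w, h, wx.IMAGE_QUALITY_NORMAL).ConvertToBitmap()
    static_bitmap.SetBitmap(bitmap)    # swap the picture, keep the widget
    static_bitmap.SetSize((w, h))
    static_bitmap.SetPosition((x, 0))  # centre horizontally
```

The key handler would then only update self.sel with wrap_index and call show_image, so no new panel or bitmap widget is created per image. Giving the frame a black background and calling Refresh() on it after each swap may be needed to clear the area a wider previous image occupied.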
Currently I'm using rsvg to load the svg (from a string, not from a file) and drawing to cairo. Anyone know a better way? I use PIL elsewhere in my application, but I don't know of a way to do this with PIL.
Here's what I currently have:
import cairo
import rsvg
def convert(data, ofile, maxwidth=0, maxheight=0):
    svg = rsvg.Handle(data=data)
    x = width = svg.props.width
    y = height = svg.props.height
    print "actual dims are " + str((width, height))
    print "converting to " + str((maxwidth, maxheight))
    yscale = xscale = 1
    # the first test checks maxwidth (the original checked maxheight twice)
    if (maxwidth != 0 and width > maxwidth) or (maxheight != 0 and height > maxheight):
        x = maxwidth
        y = float(maxwidth)/float(width) * height
        print "first resize: " + str((x, y))
        if y > maxheight:
            y = maxheight
            x = float(maxheight)/float(height) * width
            print "second resize: " + str((x, y))
        xscale = float(x)/svg.props.width
        yscale = float(y)/svg.props.height
    surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, int(x), int(y)) # ImageSurface needs integer dimensions
    context = cairo.Context(surface)
    context.scale(xscale, yscale)
    svg.render_cairo(context)
    surface.write_to_png(ofile)
How about ImageMagick? - http://www.imagemagick.org/script/magick-vector-graphics.php It can read from stdin and write to stdout, so you can integrate it with your app even if you don't want to use files.
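A minimal sketch of that stdin/stdout integration from Python follows. It assumes the ImageMagick convert binary is on the PATH and was built with SVG support (newer releases name the binary magick, and on Windows an unrelated system convert.exe can shadow it). The function names are made up for illustration.

```python
import subprocess


def build_convert_command(maxwidth=0, maxheight=0):
    """Build an ImageMagick command that reads SVG on stdin and writes PNG on stdout."""
    cmd = ['convert', 'svg:-']
    if maxwidth and maxheight:
        # "WxH>" shrinks to fit the box, keeps the aspect ratio, never enlarges
        cmd += ['-resize', '%dx%d>' % (maxwidth, maxheight)]
    cmd.append('png:-')
    return cmd


def svg_to_png(svg_data, maxwidth=0, maxheight=0):
    """Return PNG bytes for the given SVG bytes (needs ImageMagick installed)."""
    proc = subprocess.Popen(build_convert_command(maxwidth, maxheight),
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    png_data, _ = proc.communicate(svg_data)
    if proc.returncode != 0:
        raise RuntimeError('convert exited with status %d' % proc.returncode)
    return png_data
```

Note that -resize scales the image after it has been rasterized, so the result may be softer than rendering the vector data directly at the target size, as the rsvg/cairo code does.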
I have inkscape installed so I am just farming out the process to the inkscape command with inkscape -f file.svg -e file.png
Using this code:
import os
import subprocess
inkscape_dir=r"C:\Program Files (x86)\Inkscape"
assert os.path.isdir(inkscape_dir)
os.chdir(inkscape_dir)
# fname and fname_png are the paths of the input SVG and the output PNG
subprocess.Popen(['inkscape.exe',"-f",fname,"-e",fname_png])
I am on Windows 7, and got Windows Error 5 [Access Denied] (or something like that) until I switched to the Inkscape directory.
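One side effect of os.chdir is that relative file names afterwards resolve against the Inkscape directory. A possible alternative, sketched below, is to call the executable by its full path and pass absolute file paths; whether this also avoids the Access Denied error is untested. The helper name is hypothetical, and it keeps the -f / -e flags used above, which belong to older Inkscape releases (newer ones changed the command-line options, so check inkscape --help).

```python
import os


def inkscape_command(inkscape_dir, svg_path, png_path):
    """Build the Inkscape call with absolute paths, so no os.chdir() is needed."""
    exe = os.path.join(inkscape_dir, 'inkscape.exe')
    return [exe, '-f', os.path.abspath(svg_path), '-e', os.path.abspath(png_path)]


# Assumed usage (file names are placeholders):
#   import subprocess
#   cmd = inkscape_command(r"C:\Program Files (x86)\Inkscape", "file.svg", "file.png")
#   subprocess.check_call(cmd)   # waits for Inkscape and raises on failure
```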
You can also use PhantomJS for this (see http://phantomjs.org/screen-capture.html)
From a shell:
phantomjs rasterize.js http://ariya.github.com/svg/tiger.svg tiger.png
Or from python using selenium:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.set_window_size(1024, 768)
driver.get('http://ariya.github.com/svg/tiger.svg')
driver.save_screenshot('tiger.png')