I'm looking for a library that enables me to "create pictures" (or even videos) with the following functions:
Accepting picture inputs
Resizing said inputs to fit given template / scheme
Positioning the pictures in pre-set up layers or coordinates
A rather schematic way to look at this (the schematic itself is not shown here): the red spots are supposed to represent e.g. text, picture (or, if possible, video) elements placed inside a black bounding rectangle.
The end goal would be to give the .py script multiple input pictures and have it create a finished composition as described above.
I tried looking into Python's PIL, but I wasn't able to find what I was looking for.
Yes, it is possible to do this with Python.
The library you are looking for is OpenCV (https://opencv.org/).
Some basic OpenCV python tutorials (https://docs.opencv.org/master/d9/df8/tutorial_root.html).
1) You can use the imread() function to read images from files.
2) You can use the resize() function to resize the images.
3) You can create an empty master numpy array matching the size and depth (color depth) of the black rectangle in the figure you have shown, resize your image, and copy the contents into the empty array starting from the position you want.
Below is sample code that does something close to what you might need; you can modify it to suit your actual needs. (Since your requirements are not entirely clear, I have written the code this way so that it can at least guide you.)
import numpy as np
import cv2
import matplotlib.pyplot as plt
# You can store most of these values in another file and load them.
# You can modify this to set the dimensions of the background image.
BG_IMAGE_WIDTH = 100
BG_IMAGE_HEIGHT = 100
BG_IMAGE_COLOR_DEPTH = 3
# This will act as the black bounding box you have shown in your figure.
# You can also load another image instead of creating empty background image.
empty_background_image = np.zeros(
    (BG_IMAGE_HEIGHT, BG_IMAGE_WIDTH, BG_IMAGE_COLOR_DEPTH),
    dtype=np.uint8  # np.int was removed from recent NumPy; uint8 is the usual image dtype
)
# Loading an image.
# This will be copied later into one of those red boxes you have shown.
IMAGE_PATH = "./image1.jpg"
foreground_image = cv2.imread(IMAGE_PATH)
# Setting the resize target and top left position with respect to bg image.
X_POS = 4
Y_POS = 10
RESIZE_TARGET_WIDTH = 30
RESIZE_TARGET_HEIGHT = 30
# Resizing
foreground_image = cv2.resize(
    src=foreground_image,
    dsize=(RESIZE_TARGET_WIDTH, RESIZE_TARGET_HEIGHT),
)
# Copying this into background image
empty_background_image[
    Y_POS: Y_POS + RESIZE_TARGET_HEIGHT,
    X_POS: X_POS + RESIZE_TARGET_WIDTH
] = foreground_image
plt.imshow(empty_background_image)
plt.show()
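One caveat that is not part of the snippet above: cv2.imread() returns channels in BGR order, while matplotlib's imshow() expects RGB, so the preview colors may look swapped. A minimal fix, reusing the variables from the sample:
# Convert OpenCV's BGR channel order to RGB before displaying with matplotlib.
rgb_preview = cv2.cvtColor(empty_background_image, cv2.COLOR_BGR2RGB)
plt.imshow(rgb_preview)
plt.show()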
I am trying to create a pipeline in which I first render an image using the Blender Python API (I am using Blender 2.90) and then perform some image processing in Python. I want to fetch the image directly from Blender without first writing the rendered image to disk and then loading it again. I ran the following code within the Blender GUI to do so:
import bpy
import numpy as np
import PIL.Image as Image
from skimage.util import img_as_ubyte
resolution_x = 512
resolution_y = 512
# render settings
scene = bpy.context.scene
scene.render.engine = 'BLENDER_EEVEE'
scene.render.resolution_x = resolution_x
scene.render.resolution_y = resolution_y
scene.render.image_settings.file_format = 'PNG'
scene.render.filepath = "path/to/good_image.png"
# create Viewer Layer in Compositor
scene.use_nodes = True
tree = scene.node_tree
nodes = tree.nodes
links = tree.links
for node in nodes:
    nodes.remove(node)
render_layer_node = nodes.new('CompositorNodeRLayers')
viewer_node = nodes.new('CompositorNodeViewer')
links.new(viewer_node.inputs[0], render_layer_node.outputs[0])
# render scene and get pixels from Viewer Node
bpy.ops.render.render(write_still=True)
pixels = bpy.data.images['Viewer Node'].pixels
# do some processing and save
img = np.flip(img_as_ubyte(np.array(pixels[:]).reshape((resolution_y, resolution_x, 4))), axis=0)
Image.fromarray(img).save("path/to/bad_image.png")
Problem: The image I get from the Viewer Node is much darker (bad image) than the image saved in the conventional way (good image). Does anyone have an idea why this happens and how to fix it? Does blender maybe treat pixel values differently than I expect?
Some additional information:
Before conversion to uint8, the values of the alpha channel within the dark image are 1.0 (as they actually should be). Background values in the dark image are not 0.0 or negative (as one might guess from appearance), but 0.05...
What I tried:
I thought that pixels might be scaled within range -1 to 1, so I rescaled the pixels to range 0 to 1 before transforming to uint8... Did not lead to the correct image either :(
It's because the image you get from the Viewer Node is the one "straight from compositing", before color management takes place. As the documentation explains, this image is still in linear color space.
Your good_image.png, on the other hand, is obtained after transformation into the "Display Space" (see the diagram in the documentation). Hence it has been transformed into a log space, maybe gamma-corrected, etc.
Finally, you can get an image that is close to (though slightly different from) the good image by calling bpy.data.images['Viewer Node'].save_render(filepath) instead, but there is no way to extract the color-managed version directly without rendering to a file first. You could probably do it yourself by adding PyOpenColorIO to your script and applying the color management from that module.
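If an approximation is good enough, one option (my own sketch, not something the answer above spells out) is to apply the standard linear-to-sRGB transfer function to the Viewer Node pixels before the uint8 conversion. Note that Blender 2.90 defaults to the Filmic view transform, so a plain sRGB transfer will only roughly match good_image.png. The names below reuse the variables from the question's snippet:
import numpy as np

def linear_to_srgb(x):
    # Standard sRGB transfer function; only an approximation of Blender's
    # full color management (which defaults to Filmic in 2.90).
    x = np.maximum(x, 0.0)
    return np.where(x <= 0.0031308,
                    12.92 * x,
                    1.055 * np.power(x, 1.0 / 2.4) - 0.055)

pixels = np.array(bpy.data.images['Viewer Node'].pixels[:])
pixels = pixels.reshape((resolution_y, resolution_x, 4))
pixels[..., :3] = np.clip(linear_to_srgb(pixels[..., :3]), 0.0, 1.0)
img = np.flip(img_as_ubyte(pixels), axis=0)
Image.fromarray(img).save("path/to/approx_image.png")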
I'm trying to make a plugin for gimp that opens two images as separate layers and transforms one of them (more on that below). I'm using GIMP 2.10.12.
I've been struggling to find proper, complete documentation for GIMP's Python interface and am mostly working from whatever code snippets I've been able to find. This is what I have so far:
#!/usr/bin/env python2
import os
from gimpfu import *
def load_pair(img_f):
    mask_f = img_f.replace(IMG_DIR, PRED_DIR)
    result_f = os.path.splitext(img_f.replace(IMG_DIR, SAVE_DIR))[0]
    result_dir = os.path.dirname(result_f)
    if not os.path.isdir(result_dir):
        os.makedirs(result_dir)
    img = gimp.Image(100, 100)
    pdb.gimp_display_new(img)
    for f, name, pos in ((img_f, "Image", 0), (mask_f, "Mask", 1)):
        layer = pdb.gimp_file_load_layer(img, f)
        pdb.gimp_layer_set_name(layer, name)
        pdb.gimp_image_insert_layer(img, layer, None, pos)

register(
    "python_fu_open_image_pair",
    ...,
    "<Toolbox>/Image/Open Image Pair",
    "",
    [(PF_FILE, "img_f", "Image:", None)],
    [],
    load_pair
)

main()
This kind of does what I want but with a couple of problems.
Question 1
Currently I'm using gimp.Image(100, 100) to open a new image. This means I have to then Fit Canvas to Layers and adjust the zoom and position every time I load a new image pair.
Is there a way to find an image's size from pdb before opening it or do I have to use another library (like PIL) for this? I'm trying to keep my plugin's dependencies to a minimum.
The two images are guaranteed to have the same size.
Since File->Open automatically adjusts the canvas to the image size, I would hope there'd be a nice way to achieve this.
Question 2
I would like to automatically create and set the current working file to result_f + '.xcf' (see above code) - such that File -> Save would automatically save to this file. Is this possible in pdb?
Question 3
Most importantly, I currently have the Mask images saved as black-and-white images. Upon loading a mask as a new layer, I'd like to transform the black colour to transparent and white colour to green (0,255,0). Additionally, since they are saved as .jpg images, the white and black aren't necessarily exactly 255 and 0 intensities but can be off by a bit.
How do I do this automatically in my plugin?
The right way is to load the first image normally and the rest as additional layers (see the sketch below). Otherwise, you can reset the canvas size (pdb.gimp_image_resize(...)) once you have loaded all the layers, and then create the display.
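A minimal, untested sketch of that first approach, reusing the img_f/mask_f naming from the question (the PDB calls are the standard ones, but double-check them in the procedure browser):
def load_pair(img_f):
    mask_f = img_f.replace(IMG_DIR, PRED_DIR)
    # Load the base image normally; the canvas takes its size automatically.
    img = pdb.gimp_file_load(img_f, img_f)
    img.layers[0].name = "Image"
    # Load the mask as an extra layer on top.
    mask_layer = pdb.gimp_file_load_layer(img, mask_f)
    mask_layer.name = "Mask"
    pdb.gimp_image_insert_layer(img, mask_layer, None, 0)
    pdb.gimp_display_new(img)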
You can give a name and a default file to the image by setting image.name and image.filename.
To convert the white to green, use pdb.plug_in_colors_channel_mixer(...) and set all the gains to 0., except green in green. To make the black transparent, use pdb.plug_in_colortoalpha(...).
PS: For color2alpha:
import gimpcolor
color=gimpcolor.RGB(0,255,0) # green, integer args (0 -> 255)
# or
color=gimpcolor.RGB(0.,1.,0) # green, floating point args (0.->1.)
pdb.plug_in_colortoalpha(image, layer, color)
The Python doc is a direct copy of the Scheme one. In Python, the RUN-INTERACTIVE parameter is not positional, so it doesn't appear in most calls; if you need it, it is a keyword parameter.
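For the white-to-green plus black-to-transparent steps, here is a hedged sketch reusing img and mask_layer from the sketch above. The argument order for the channel mixer (monochrome flag, then the nine gains) is my reading of the PDB browser entry, so treat it as an assumption and verify it in your GIMP version:
import gimpcolor

# Keep only the green-from-green gain so white becomes (0, 255, 0) and black stays (0, 0, 0).
# Assumed order: (image, drawable, monochrome, rr, rg, rb, gr, gg, gb, br, bg, bb).
pdb.plug_in_colors_channel_mixer(
    img, mask_layer, 0,
    0.0, 0.0, 0.0,   # red output gains (from r, g, b)
    0.0, 1.0, 0.0,   # green output gains (from r, g, b)
    0.0, 0.0, 0.0)   # blue output gains (from r, g, b)

# Then turn black into transparency (the layer needs an alpha channel first).
mask_layer.add_alpha()
pdb.plug_in_colortoalpha(img, mask_layer, gimpcolor.RGB(0, 0, 0))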
I am new to PsychoPy, having previously worked with Pygame for several months (I switched to enable stimuli to be presented on multiple screens).
I am trying to figure out how to use PsychoPy to display an animation created using a sequence of images. I previously achieved this in Pygame by saving the entire sequence of images in a single large png file (a spritesheet) and then flipping only a fraction of that image (eg. 480 x 480 pixels) per frame, while moving onto the next equally sized section of the image in the next frame. This is roughly what my code looked like in Pygame. I would be really keen to hear if there is an equivalent way of generating animations in PsychoPy by selecting only parts of an image to be displayed with each frame. So far, googling this has not provided any answers!
import time
import pygame

pygame.init()
gameDisplay = pygame.display.set_mode((800, 480))
sequence = pygame.image.load('C:\Users\...\image_sequence.png')
# This image contains 10 images in a row which I cycle through to get an animation
image_width = 480
image_height = 480
start = time.time()
frame_count = 0
refresh = 0
while time.time() <= start + 15:
    gameDisplay.blit(sequence, (160, 0),
                     (frame_count * image_width, 0, image_width, image_height))
    if time.time() >= start + (refresh * 0.25):  # Flip a new image say every 250 msec
        pygame.display.update()
        frame_count += 1
        refresh += 1
    if frame_count == 10:
        frame_count = 0
You could use a square aperture to restrict what's visible and then move the image. So something like this (untested, but could give you some ideas):
from psychopy import visual

win = visual.Window(units='pix')  # easiest to use pixels as unit
aperture = visual.Aperture(win, shape='rect', size=(480, 480))
image = visual.ImageStim(win, 'C:\Users\...\image_sequence.png')  # ImageStim needs the window as first argument

# Move through x positions
for x in range(10):
    image.pos = [(-10.0 / 2 + 0.5 + x) * 480, 0]  # not sure this is right, but it should move through the x-positions
    image.draw()
    win.flip()
If you have the original images, I think that it would be simpler to just display the original images in sequence.
import glob
from psychopy import visual

image_names = glob.glob('C:\Users\...\*.png')

# Create psychopy objects
win = visual.Window()
image_stims = [visual.ImageStim(win, image) for image in image_names]

# Display images one by one
for image in image_stims:
    image.draw()
    win.flip()
    # add more flips here if you want a lower frame rate
Perhaps it is even fast enough to load them during runtime without dropping frames, which would simplify the code and load memory less:
# Imports, glob, and win here

# Create an ImageStim and update the image each frame
stim = visual.ImageStim(win)
for name in image_names:
    stim.image = name
    stim.draw()
    win.flip()
Actually, given a spritesheet you might be able to do something funky and more efficient using the GratingStim. This loads an image as a texture and then allows you to set the spatial frequency (sf) and phase of that texture. If 1.0/sf (in both dimensions) is greater than the size of the stimulus (in both dimensions), only a fraction of the texture will be shown, and the phase determines which fraction that will be. It isn't designed for this purpose - it's usually used to create more than one cycle of a texture, not less than one - but I think it will work.
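A rough, untested sketch of that idea, assuming a spritesheet of 10 frames of 480x480 px laid out in a single row (the filename and frame counts are placeholders, and the phase sign or the internal rescaling of textures to power-of-two sizes may need adjustment):
from psychopy import visual

win = visual.Window(size=(800, 480), units='pix')

N_FRAMES = 10      # frames laid out in one row of the spritesheet (assumption)
FRAME_SIZE = 480   # width/height of a single frame in pixels

# One horizontal cycle of the texture spans the whole 10-frame sheet and one
# vertical cycle spans a single frame, so only 1/10 of the sheet is visible at a time.
sprite = visual.GratingStim(
    win,
    tex='image_sequence.png',  # the spritesheet (placeholder name)
    size=(FRAME_SIZE, FRAME_SIZE),
    sf=(1.0 / (N_FRAMES * FRAME_SIZE), 1.0 / FRAME_SIZE),
)

for frame in range(N_FRAMES):
    sprite.phase = (float(frame) / N_FRAMES, 0.0)  # shift to the next frame
    sprite.draw()
    win.flip()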
I'm writing code, part of which reads an image from a source file and displays it on the screen for the user to interact with. I also need the sharpened image data. I use the following to read the data and display it in pyGame:
import pygame
from scipy import misc, ndimage

def image_and_sharpen_array(file_name):
    # read the image data and return it, with the sharpened image
    image = misc.imread(file_name)
    blurred = ndimage.gaussian_filter(image, 3)
    edge = ndimage.gaussian_filter(blurred, 1)
    alpha = 20
    out = blurred + alpha * (blurred - edge)
    return image, out

# get image data
scan, sharpen = image_and_sharpen_array('foo.jpg')
w, h, c = scan.shape

# setting up pygame
pygame.init()
screen = pygame.display.set_mode((w, h))
pygame.surfarray.blit_array(screen, scan)
pygame.display.update()
The image is displayed on the screen, but rotated and mirrored. Is this due to differences between misc.imread and pyGame, or is something wrong in my code?
Is there another way to do this? The majority of solutions I read involved saving the figure and then reading it back with pyGame.
I often use the numpy swapaxes() method. In this case we only need to swap the x and y axes (axes 0 and 1) before displaying the array:
return image.swapaxes(0,1),out
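With the swapped return value, the rest of the question's code works largely as-is; note that scan.shape then yields width first, which is what set_mode expects. A short usage sketch reusing the question's names (the sharpened array would presumably need the same swap before blitting):
scan, sharpen = image_and_sharpen_array('foo.jpg')
w, h, c = scan.shape  # width now comes first, matching set_mode((w, h))
screen = pygame.display.set_mode((w, h))
pygame.surfarray.blit_array(screen, scan)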
I thought technico provided a good solution - just a little lean on info. Assuming get_arr() is a function that returns the pixel array:
pixl_arr = get_arr()
pixl_arr = numpy.swapaxes(pixl_arr, 0, 1)
new_surf = pygame.pixelcopy.make_surface(pixl_arr)
screen.blit(new_surf, (dest_x, dest_y))
Alternatively, if you know that the image will always be of the same dimensions (as in iterating through frames of a video or gif file), it would be more efficient to reuse the same surface:
pixl_arr = get_arr()
pixl_arr = numpy.swapaxes(pixl_arr, 0, 1)
pygame.pixelcopy.array_to_surface(old_surf, pixl_arr)
screen.blit(old_surf, (dest_x, dest_y))
YMMV, but so far this is working well for me.
Every lib has its own way of interpreting image arrays. By 'rotated' I suppose you mean transposed; that's the way PyGame displays numpy arrays. There are many ways to make it look 'correct'. Actually, there are many ways even to display the array, which give you full control over channel representation and so on. In pygame version 1.9.2, this is the fastest array rendering that I could ever achieve. (Note: for earlier versions this will not work!)
This function will fill the surface with array:
def put_array(surface, myarr):  # put array into surface
    bv = surface.get_view("0")
    bv.write(myarr.tostring())
If that is not working, use this instead; it should work everywhere:
# put array data into a pygame surface
def put_arr(surface, myarr):
    bv = surface.get_buffer()
    bv.write(myarr.tostring(), 0)
You will probably still not get what you want: the result may be transposed or have swapped color channels. The idea is to keep your arrays in the form that suits this surface buffer. To find out the correct channel order and axis order, use the openCV library (cv2.imread(filename)). With openCV you open images in BGR order as standard, and it has a lot of conversion functions. If I remember correctly, when writing directly to a surface buffer, BGR is the correct order for a 24-bit surface and BGRA for a 32-bit surface. So you can try to put the image array which you get out of the file with this function and blit it to the screen.
There are other ways to draw arrays, e.g. the whole set of helper functions at http://www.pygame.org/docs/ref/surfarray.html. But I would not recommend using them, since surfaces are not meant for direct pixel manipulation; you will probably get lost in references.
Small tip: to do a 'signalling test', use a distinctive test picture, so you will immediately see if something is wrong; just load it as an array and try to render it.
My suggestion is to use the pygame.transform module. It has flip and rotate functions, which you can combine to match whatever your transformation is; look up the docs on this. My recommendation is to copy the output image to a new Surface, then apply the transformations, and blit to the display.
temp_surf = pygame.Surface((w, h))
pygame.surfarray.blit_array(temp_surf, scan)
# transform temp_surf here, e.g. with pygame.transform.flip / pygame.transform.rotate
screen.blit(temp_surf, (0, 0))
I have no idea why this is. It is probably something to do with the order in which the axes are transferred from a 2d array to a pygame Surface.
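For instance, the transform step above might look like the following; the exact flags and angle are a guess, since they depend on how the source array comes out, so adjust until the image looks right:
temp_surf = pygame.transform.rotate(temp_surf, -90)        # rotate 90 degrees clockwise
temp_surf = pygame.transform.flip(temp_surf, True, False)  # then mirror horizontally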
Any ideas how to use Python with the PIL module to "shrink select all", i.e. auto-crop an image to its content? I know this can be achieved with GIMP. I'm trying to package my app as small as possible, and a GIMP install is not an option for the end user.
Say you have 2 images; one is 400x500, the other is 200x100. Both are white with a 100x100 text block somewhere within each image's boundaries. What I'm trying to do is automatically strip the whitespace around that text and load the 100x100 text block into a variable for further text extraction.
It's obviously not this simple, so just running the text extraction on the whole image won't work! I just wanted to query about the basic process. There is not much available on Google about this topic. If solved, perhaps it could help someone else as well...
Thanks for reading!
If you put the image into a numpy array, it's simple to find the edges, which you can then use PIL to crop to. Here I'm assuming that the whitespace is the color (255,255,255); you can adjust to your needs:
from PIL import Image
import numpy as np

im = Image.open("test.png")
pix = np.asarray(im)

pix = pix[:, :, 0:3]            # Drop the alpha channel
idx = np.where(pix - 255)[0:2]  # Drop the color when finding edges
# Wrap map() in list() so this also works on Python 3
box = list(map(min, idx))[::-1] + list(map(max, idx))[::-1]

region = im.crop(box)
region_pix = np.asarray(region)
To show what the results look like, I've left the axis labels on so you can see the size of the box region:
from pylab import *
subplot(121)
imshow(pix)
subplot(122)
imshow(region_pix)
show()
The general algorithm would be to find the color of the top-left pixel, then scan inwards in a spiral until you find a pixel that is not that color. That will define one edge of your bounding box. Keep scanning until you have found one of each edge. A simpler variant of the same idea is sketched after this paragraph.
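A minimal sketch of that simpler variant (not a literal spiral): compare every pixel against the top-left pixel's color and take the bounding box of everything that differs by more than a small tolerance, which also absorbs the JPEG noise around "white" and "black". The file name and tolerance are placeholders:
import numpy as np
from PIL import Image

def content_bbox(im, tol=10):
    # Bounding box of all pixels that differ from the top-left pixel's color.
    pix = np.asarray(im.convert("RGB")).astype(int)
    bg = pix[0, 0]
    mask = np.abs(pix - bg).sum(axis=2) > tol
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    if rows.size == 0:
        return None  # image is entirely background
    return (int(cols.min()), int(rows.min()), int(cols.max()) + 1, int(rows.max()) + 1)

im = Image.open("test.png")
box = content_bbox(im)
if box:
    region = im.crop(box)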
http://blog.damiles.com/2008/11/basic-ocr-in-opencv/ might be of some help. You can use the simple bounding box method described in that tutorial, or @Tyler Eaves' spiral suggestion, which works equally well.