Superimpose objects on a video stream using Python and POV-Ray

I am using Vapory, a Python wrapper library for POV-Ray. It lets you drive typical POV-Ray operations from Python functions.
I want to superimpose 3D models on every frame of my video stream. The way to do this in Vapory is the following:
from vapory import *
from moviepy.editor import VideoFileClip                       # needed for VideoFileClip below
from moviepy.video.io.ffmpeg_writer import ffmpeg_write_image

light = LightSource([10, 15, -20], [1.3, 1.3, 1.3])
wall = Plane([0, 0, 1], 20, Texture(Pigment('color', [1, 1, 1])))
ground = Plane([0, 1, 0], 0,
               Texture(Pigment('color', [1, 1, 1]),
                       Finish('phong', 0.1,
                              'reflection', 0.4,
                              'metallic', 0.3)))
sphere1 = Sphere([-4, 2, 2], 2.0, Pigment('color', [0, 0, 1]),
                 Finish('phong', 0.8,
                        'reflection', 0.5))
sphere2 = Sphere([4, 1, 0], 1.0, Texture('T_Ruby_Glass'),
                 Interior('ior', 2))
scene = Scene(Camera("location", [0, 5, -10], "look_at", [1, 3, 0]),
              objects=[ground, wall, sphere1, sphere2, light],
              included=["glass.inc"])

def embed_in_scene(image):
    # Write the current frame to disk so POV-Ray can map it onto a box
    ffmpeg_write_image("__temp__.png", image)
    image_ratio = 1.0 * image.shape[1] / image.shape[0]
    screen = Box([0, 0, 0], [1, 1, 0],
                 Texture(Pigment(ImageMap('png', '"__temp__.png"', 'once')),
                         Finish('ambient', 1.2)),
                 'scale', [10, 10 / image_ratio, 1],
                 'rotate', [0, 20, 0],
                 'translate', [-3, 1, 3])
    new_scene = scene.add_objects([screen])
    return new_scene.render(width=800, height=480, antialiasing=0.001)

clip = (VideoFileClip("bunny.mp4")   # File containing the original video
        .subclip(23, 47)             # cut between t=23 and 47 seconds
        .fl_image(embed_in_scene)    # <= The magic happens
        .fadein(1).fadeout(1)
        .audio_fadein(1).audio_fadeout(1))
clip.write_videofile("bunny2.mp4", bitrate='8000k')
which results in a video stream like this:
What I want, however, is for the movie box to fill the whole scene, with the spheres remaining where they are. My first thought was to remove the rotation from the code, and that did work, but I still cannot stretch the movie frame to the corners of the rendered scene.
Any thoughts?
EDIT: I was able to move the camera and get the object to the center, but I still could not get the movie full screen. This is because the camera object is told to look towards given coordinates, and I don't know what coordinates the camera should be directed at in order to get the picture full screen. See:
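One direction that might help (a sketch under assumptions, not from the original post): rather than hunting for a look_at point, keep the camera where it is and size and place the screen box so that it covers the camera's field of view at a chosen distance. The helper name fullscreen_screen and the fov_deg/distance parameters are made up for illustration; fov_deg has to match the POV-Ray camera's angle (and the render's aspect ratio) for the fit to be exact.
import numpy as np

# Hypothetical helper (not part of Vapory): build a screen Box that covers
# the camera's view at a given distance along the viewing direction.
def fullscreen_screen(image_ratio, camera_pos, look_at, fov_deg=60.0, distance=8.0):
    cam = np.array(camera_pos, dtype=float)
    forward = np.array(look_at, dtype=float) - cam
    forward /= np.linalg.norm(forward)

    # A plane at `distance` fills a camera with horizontal field of view
    # `fov_deg` when it is 2*d*tan(fov/2) wide.
    width = 2.0 * distance * np.tan(np.radians(fov_deg) / 2.0)
    height = width / image_ratio
    centre = cam + distance * forward

    # The unit box spans [0,1]x[0,1] in the z=0 plane; scale it up and move
    # its centre onto the view axis. It stays axis-aligned, so this is only
    # exact when the camera looks straight down the z axis.
    return Box([0, 0, 0], [1, 1, 0],
               Texture(Pigment(ImageMap('png', '"__temp__.png"', 'once')),
                       Finish('ambient', 1.2)),
               'scale', [width, height, 1],
               'translate', [centre[0] - width / 2.0,
                             centre[1] - height / 2.0,
                             centre[2]])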

Related

Using Multiple Instances of KalmanFilter

I want to implement a Kalman filter for multi-object tracking.
Therefore I want to call a different instance of the Kalman filter class for every object in a frame.
I am having trouble calling different instances of the Kalman filter that work independently of each other for every object tracked in a frame.
Here is the KalmanFilter class that I am working with:
import cv2
import numpy as np

class KalmanFilter:
    kf = cv2.KalmanFilter(4, 2)
    kf.measurementMatrix = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], np.float32)
    kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)

    def predict(self, coordX, coordY):
        '''This function estimates the position of the object'''
        measured = np.array([[np.float32(coordX)], [np.float32(coordY)]])
        self.kf.correct(measured)
        predicted = self.kf.predict()
        x, y = int(predicted[0]), int(predicted[1])
        return x, y
Please help me modify this code so that whenever I call this class I can give it an object_id, and it creates a new instance of the filter only for that object.
Thanks.
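A minimal sketch of one common pattern (not from the question): move the cv2.KalmanFilter construction into __init__ so that every instance owns its own filter, then keep a dictionary mapping each object_id to its instance. The MultiObjectTracker name is made up for illustration.
import cv2
import numpy as np

class KalmanFilter:
    def __init__(self):
        # A fresh cv2.KalmanFilter for every instance, instead of one shared class attribute
        self.kf = cv2.KalmanFilter(4, 2)
        self.kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                              [0, 1, 0, 0]], np.float32)
        self.kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                             [0, 1, 0, 1],
                                             [0, 0, 1, 0],
                                             [0, 0, 0, 1]], np.float32)

    def predict(self, coordX, coordY):
        measured = np.array([[np.float32(coordX)], [np.float32(coordY)]])
        self.kf.correct(measured)
        predicted = self.kf.predict()
        return int(predicted[0]), int(predicted[1])

class MultiObjectTracker:
    """Hypothetical wrapper: lazily creates one filter per object_id."""
    def __init__(self):
        self.filters = {}

    def predict(self, object_id, x, y):
        if object_id not in self.filters:
            self.filters[object_id] = KalmanFilter()
        return self.filters[object_id].predict(x, y)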

Can you translate a flashing light into morse code?

I have made a Morse code translator and I want it to be able to record a flashing light and turn it into Morse code. I think I will need OpenCV or a light sensor, but I don't know how to use either of them. I haven't got any code for it yet, as I couldn't find any solutions anywhere else.
The following is just a concept of what you could try. Yes, you could also train a neural network for this, but if your setup is simple enough, some engineering will do.
We first create a "toy-video" to work with:
import numpy as np
import matplotlib.pyplot as plt

# Create a toy "video"
image = np.asarray([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 2, 2, 1],
    [0, 0, 2, 4, 4, 2],
    [0, 0, 2, 4, 4, 2],
    [0, 0, 1, 2, 2, 1],
])
signal = np.asarray([0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0])
x = list(range(len(signal)))
signal = np.interp(np.linspace(0, len(signal), 100), x, signal)[..., None]
# Outer product: frame t is the toy image scaled by signal[t]
frames = np.einsum('tk,xy->txyk', signal, image)[..., 0]
Plot a few frames:
fig, axes = plt.subplots(1, 12, sharex='all', sharey='all')
for i, ax in enumerate(axes):
    ax.matshow(frames[i], vmin=0, vmax=1)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    ax.set_title(i)
plt.show()
Now that you have this kind of toy video, it's fairly straightforward to convert it back into some sort of binary signal: simply compute the average brightness of each frame:
reconstructed = frames.mean(1).mean(1)
reconstructed_bin = reconstructed > 0.5
plt.plot(reconstructed, label='original')
plt.plot(reconstructed_bin, label='binary')
plt.title('Reconstructed Signal')
plt.legend()
plt.show()
From here we only have to determine the length of each flash.
# This is ugly, I know. Just for understanding though:
# 1. Splits the binary signal on zero-values
# 2. Filters out the garbage (accept only lists where len(e) > 1)
# 3. Gets the length of the remaining list == the duration of each flash
tmp = np.split(reconstructed_bin, np.where(reconstructed_bin == 0)[0][1:])
flashes = list(map(len, filter(lambda e: len(e) > 1, tmp)))
We can now take a look at how long flashes take:
print(flashes)
gives us
[5, 5, 5, 10, 9, 9, 5, 5, 5]
So... "short" flashes seem to take about 5 frames and "long" ones around 10. With this we can classify each flash as either "long" or "short" by defining a sensible threshold of 7, like so:
# Classify each flash-duration
flashes_classified = list(map(lambda f: 'long' if f > 7 else 'short', flashes))
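As a small follow-up sketch (not in the original answer), the classified flashes map directly onto dots and dashes; for the toy signal above this comes out as "...---...", i.e. SOS:
# Map each classified flash to a Morse symbol
morse = ''.join('.' if f == 'short' else '-' for f in flashes_classified)
print(morse)   # '...---...'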
And let's repeat for pauses
# Repeat for pauses
tmp = np.split(reconstructed_bin, np.where(reconstructed_bin != False)[0][1:])
pauses = list(map(len, filter(lambda e: len(e) > 1, tmp)))
pauses_classified = np.asarray(list(map(lambda f: 'w' if f > 6 else 'c', pauses)))
pauses_indices, = np.where(np.asarray(pauses_classified) == 'w')
Now we can visualize the results.
fig = plt.figure()
ax = fig.gca()
ax.bar(range(len(flashes)), flashes, label='Flash duration')
ax.set_xticks(list(range(len(flashes_classified))))
ax.set_xticklabels(flashes_classified)
[ax.axvline(idx-0.5, ls='--', c='r', label='Pause' if i == 0 else None) for i, idx in enumerate(pauses_indices)]
plt.legend()
plt.show()
It somewhat depends on your environment. You might try this inexpensively with a Raspberry Pi Zero (£9), or even a Pico (£4) or Arduino, and an attached LDR (light-dependent resistor) for £1, rather than a £100 USB camera.
Your program would then come down to repeatedly measuring the resistance (which depends on the light intensity) and converting it into long and short pulses.
This has the benefit of being cheap and not requiring you to learn OpenCV, but Stefan's idea is far more fun and has my vote!
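A rough sketch of what that polling loop could look like (hedged: read_light is a placeholder for whatever ADC/GPIO call your board provides, and collect_pulses, THRESHOLD and DOT_MAX are made-up names and assumed constants):
import time

# Assumed constants (tune for your sensor and flash speed)
THRESHOLD = 0.5    # readings above this count as "light on"
DOT_MAX = 0.25     # pulses shorter than this many seconds are dots

def collect_pulses(read_light, n_pulses=20):
    """read_light is a placeholder: pass in whatever function returns a
    brightness reading on your board (ADC value, GPIO level, ...)."""
    pulses = []
    on_since = None
    while len(pulses) < n_pulses:
        lit = read_light() > THRESHOLD
        now = time.monotonic()
        if lit and on_since is None:
            on_since = now                       # pulse just started
        elif not lit and on_since is not None:
            duration = now - on_since            # pulse just ended
            pulses.append('.' if duration < DOT_MAX else '-')
            on_since = None
        time.sleep(0.01)                         # poll ~100 times a second
    return pulses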

When rotating shapes, vertices are just slightly off

I'm working on a manim challenge from this video: https://youtu.be/HKPm8FZYaqI?t=700. The challenge is to code the animation which starts at 11:40 and ends at 11:49.
I got up to the point where the triangles are rotated and copied over to the second square, but for some reason the ones that I had to rotate are ever so slightly off, while the ones I didn't have to rotate seem to be perfect.
Look at this image:
The triangles fit perfectly inside the square on the left. But in the right square, the ones that were rotated (1 and 4) do not. Below is a closeup of what I mean for triangle number 1:
Of course, this is how I want it to look:
The dimensions of the shapes and maybe the colours are a little different, but that is because this is the video author's solution, while the previous image was my attempt. I don't care about that; I only care about why the triangles don't fit perfectly in my attempt like they do here.
Zooming in on this picture, we see that the triangles do indeed fit perfectly:
Any insight into why this is happening would be very much appreciated!
The source code for my animation is this:
import math
from manimlib.imports import *   # 3b1b-era manim (TextMobject, ShowCreation); adjust to your manim install

class Pythagoras(Scene):
    def construct(self):
        title = TextMobject("Pythagorean Theorem")
        title.to_edge(UL)
        pre_square = Polygon(
            [-2, 2, 0],
            [2, 2, 0],
            [2, -2, 0],
            [-2, -2, 0],
            color=WHITE
        )
        self.wait()
        square2 = Polygon(
            [-1.41, 1.41, 0],
            [1.41, 1.41, 0],
            [1.41, -1.41, 0],
            [-1.41, -1.41, 0]
        )
        square2.rotate(PI/6)
        triangle1 = Polygon(
            [-2, 2, 0],
            [-2 + math.sqrt(6), 2, 0],
            [-2, 2 - math.sqrt(2), 0],
            color=YELLOW
        )
        triangle2 = Polygon(
            [2, 2, 0],
            [-2 + math.sqrt(6), 2, 0],
            [2, 2 - math.sqrt(6), 0],
            color=YELLOW
        )
        triangle3 = Polygon(
            [2, 2 - math.sqrt(6), 0],
            [2, -2, 0],
            [2 - math.sqrt(6), -2, 0],
            color=YELLOW
        )
        triangle4 = Polygon(
            [-2, 2 - math.sqrt(2), 0],
            [-2, -2, 0],
            [2 - math.sqrt(6), -2, 0],
            color=YELLOW
        )
        triangles = [triangle1, triangle2, triangle3, triangle4]
        for triangle in triangles:
            triangle.set_fill(YELLOW, 0.6)
        self.play(Write(title), ShowCreation(pre_square), ShowCreation(triangle1), ShowCreation(triangle2), ShowCreation(triangle3), ShowCreation(triangle4))
        self.wait()
        group = VGroup(pre_square, triangle1, triangle2, triangle3, triangle4)
        self.play(ApplyMethod(group.to_edge, LEFT, {"buff": 1.6}))
        self.wait()
        square3 = pre_square.copy()
        self.play(ApplyMethod(square3.shift, RIGHT * 7))
        triangle2.generate_target()
        triangle2.target.shift(RIGHT * (7 - math.sqrt(6)))
        triangle1.generate_target()
        triangle1.target = triangle2.target.copy().rotate(PI)
        triangle3.generate_target()
        triangle3.target.shift(RIGHT * 7)
        triangle4.generate_target()
        triangle4.target = triangle3.target.copy().rotate(PI)
        self.play(MoveToTarget(triangle1.copy()), MoveToTarget(triangle2.copy()), MoveToTarget(triangle3.copy()), MoveToTarget(triangle4.copy()))
        self.wait()
The problem is the stroke thickness of the VMobjects: by default it is 4. If you reduce it to 2 or 1 (the solution below uses 1.5), those corner artifacts go away. Add this to your for loop:
for triangle in triangles:
    triangle.set_fill(YELLOW, 0.6)
    triangle.set_stroke(None, 1.5)
    # or, equivalently:
    # triangle.set_stroke(width=1.5)
There is actually nothing wrong with the code, just with how these triangles are drawn. The border around each triangle has width, which causes the artifact. If you remove the border, or treat the border as part of the triangle's extent, the problem will go away.

Semantic Segmentation to Bounding Boxes

Suppose you are performing semantic segmentation. For simplicity, let's assume this is 1D segmentation rather than 2D (i.e. we only care about finding objects with width).
So the desired output of our model might be something like:
[
    [0, 0, 0, 0, 1, 1, 1], # label channel 1
    [1, 1, 1, 0, 0, 1, 1], # label channel 2
    [0, 0, 0, 1, 1, 1, 0], # label channel 3
    #...
]
However, our trained imperfect model might be more like
[
    [0.1, 0.1, 0.1, 0.4, 0.91, 0.81, 0.84],   # label channel 1
    [0.81, 0.79, 0.85, 0.1, 0.2, 0.61, 0.91], # label channel 2
    [0.3, 0.1, 0.24, 0.87, 0.62, 1, 0],       # label channel 3
    #...
]
What would be a performant way, using Python, of getting the boundaries of the labels (or bounding boxes)?
E.g. (zero-indexed):
[
    [[4, 6]],         # "objects" of label 1
    [[0, 2], [5, 6]], # "objects" of label 2
    [[3, 5]],         # "objects" of label 3
]
if it helps, perhaps transforming it to a binary mask would be of more use?
import numpy as np

def binarize(arr, cutoff=0.5):
    return (arr > cutoff).astype(int)
with a binary mask we just need to find the consecutive integers of the indices of nonzero values:
def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0] + 1)
find "runs" of labels:
def binary_boundaries(labels, cutoff=0.5):
    return [consecutive(channel.nonzero()[0]) for channel in binarize(labels, cutoff)]
name objects according to channel name:
def binary_objects(labels, cutoff=0.5, channel_names=None):
    if channel_names is None:
        channel_names = ['channel {}'.format(i) for i in range(labels.shape[0])]
    return dict(zip(channel_names, binary_boundaries(labels, cutoff)))
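A quick usage check (not in the original answer), run against the "imperfect model" output from the question:
preds = np.array([
    [0.1, 0.1, 0.1, 0.4, 0.91, 0.81, 0.84],
    [0.81, 0.79, 0.85, 0.1, 0.2, 0.61, 0.91],
    [0.3, 0.1, 0.24, 0.87, 0.62, 1, 0],
])
print(binary_objects(preds))
# {'channel 0': [array([4, 5, 6])],
#  'channel 1': [array([0, 1, 2]), array([5, 6])],
#  'channel 2': [array([3, 4, 5])]}
Note that each "object" comes back as the full run of indices; taking the first and last element of each run gives the [start, end] pairs from the question.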
Your trained model returned a float image rather than the int image you were looking for (and it's not 'imperfect' just because decimals bother you), and yes, you do need to threshold it to get a binary image.
Once you do have the binary image, let's do some work with skimage.
from skimage import measure

label_mask = measure.label(mask)
props = measure.regionprops(label_mask)
Here mask is your binary image, and props holds the properties of all the labelled regions, which are the detected objects.
Among these properties, there is a bounding box!
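For example (a short sketch building on the lines above; for a 2D mask each regionprops entry exposes a bbox attribute as (min_row, min_col, max_row, max_col)):
boxes = [p.bbox for p in props]   # one bounding box per detected object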

Scikit image: proper way of counting cells in the objects of an image

Say you have an image in the form of a numpy.array:
vals=numpy.array([[3,24,25,6,2],[8,7,6,3,2],[1,4,23,23,1],[45,4,6,7,8],[17,11,2,86,84]])
And you want to compute how many cells are inside each object, given a threshold value of 17 (example):
from scipy import ndimage
from skimage.measure import regionprops
blobs = numpy.where(vals>17, 1, 0)
labels, no_objects = ndimage.label(blobs)
props = regionprops(blobs)
If you check, this gives an image with 4 distinct objects over the threshold:
In [1]: blobs
Out[1]:
array([[0, 1, 1, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0],
       [1, 0, 0, 0, 0],
       [0, 0, 0, 1, 1]])
In fact:
In[2]: no_objects
Out[2]: 4
I want to compute the number of cells (or area) of each object. The intended outcome is a dictionary with the object ID: number of cells format:
size={0:2,1:2,2:1,3:2}
My attempt:
size = {}
for label in props:
    size[label] = props[label].area
Returns an error:
Traceback (most recent call last):
  File "<ipython-input-76-e7744547aa17>", line 3, in <module>
    size[label]=props[label].area
TypeError: list indices must be integers, not _RegionProperties
I understand I am using label incorrectly, but the intent is to iterate over the objects. How to do this?
A bit of testing and research sometimes goes a long way.
The problem is both with blobs, which does not carry the different labels but only 0/1 values, and with label, which needs to be replaced by an iterator looping over range(0, no_objects).
This solution seems to be working:
import skimage.measure as measure
import numpy
from scipy import ndimage
from skimage.measure import regionprops

vals = numpy.array([[3, 24, 25, 6, 2],
                    [8, 7, 6, 3, 2],
                    [1, 4, 23, 23, 1],
                    [45, 4, 6, 7, 8],
                    [17, 11, 2, 86, 84]])
blobs = numpy.where(vals > 17, 1, 0)
labels, no_objects = ndimage.label(blobs)

# blobs is not in an amicable type to be processed right now, so:
labelled = ndimage.label(blobs)
resh_labelled = labelled[0].reshape((vals.shape[0], vals.shape[1]))  # labelled is a tuple: only the first element matters

# here come the props
props = measure.regionprops(resh_labelled)

# here come the sought-after areas
size = {i: props[i].area for i in range(0, no_objects)}
Result:
In [1]: size
Out[1]: {0: 2, 1: 2, 2: 1, 3: 2}
And if anyone wants to check for the labels:
In [2]: labels
Out[2]:
array([[0, 1, 1, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 2, 2, 0],
       [3, 0, 0, 0, 0],
       [0, 0, 0, 4, 4]])
And if anyone wants to plot the 4 objects found:
import matplotlib.pyplot as plt
plt.set_cmap('OrRd')
plt.imshow(labels,origin='upper')
To answer the original question:
You have to apply regionprops to the labeled image: props = regionprops(labels)
You can then construct the dictionary using:
size = {r.label: r.area for r in props}
which yields
{1: 2, 2: 2, 3: 1, 4: 2}
Note that regionprops will generate a lot more information than just the area of each blob. So, if you are just looking to get the pixel count for the blobs, as an alternative with a focus on performance, we can use np.bincount on the labels obtained with ndimage.label, like so:
np.bincount(labels.ravel())[1:]
Thus, for the given sample -
In [53]: labeled_areas = np.bincount(labels.ravel())[1:]
In [54]: labeled_areas
Out[54]: array([2, 2, 1, 2])
To have these results in a dictionary, one additional step would be -
In [55]: dict(zip(range(no_objects), labeled_areas))
Out[55]: {0: 2, 1: 2, 2: 1, 3: 2}
