I have a numpy array of size 55 x 10 x 10, which represents 55 10 x 10 grayscale images. I'm trying to make them RGB by duplicating the 10 x 10 images 3 times.
From what I've understood, I first need to add a new dimension to house the duplicated data. I've done this using:
array_4d = np.expand_dims(array_3d, 1)
so I now have a 55 x 1 x 10 x 10 array. How do I now duplicate the 10 x 10 images and add them back into this array?
Quick edit: In the end I want a 55 x 3 x 10 x 10 array
Let us first create a 3D array of size 55x10x10:
from matplotlib import pyplot as plt
import numpy as np
original_array = np.random.randint(10,255, (55,10,10))
print(original_array.shape)
>>>(55, 10, 10)
Visual of first image in array:
first_img = original_array[0,:,:]
print(first_img.shape)
>>>(10, 10)
plt.imshow(first_img, cmap='gray')
Now you can get your desired array in a single step.
stacked_img = np.stack(3*(original_array,), axis=1)
print(stacked_img.shape)
>>>(55, 3, 10, 10)
Use axis=-1 if you want the channels last.
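For example, a quick sketch of the channels-last variant:
stacked_last = np.stack(3*(original_array,), axis=-1)
print(stacked_last.shape)
>>> (55, 10, 10, 3)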
Now let us verify that the values are correct by extracting the first image from this array and taking the average of the 3 channels:
new_img = stacked_img[0,:,:,:]
print(new_img.shape)
>>> (3, 10, 10)
new_img_mean = new_img.mean(axis=0)
print(new_img_mean.shape)
>>> (10, 10)
np.allclose(new_img_mean, first_img) # If this is True then the two arrays are the same
>>> True
For visual verification, you'll have to move the channel axis to the last position, because that is what matplotlib expects. This is a 3-channel image, so we are not using cmap='gray' here:
print(np.moveaxis(new_img, 0, -1).shape)
>>> (10, 10, 3)
plt.imshow(np.moveaxis(new_img, 0, -1))
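Alternatively, if you want to keep the expand_dims step from the question, np.repeat along the new axis gives the same array; a small sketch:
array_4d = np.expand_dims(original_array, 1)
print(array_4d.shape)
>>> (55, 1, 10, 10)
repeated = np.repeat(array_4d, 3, axis=1)
print(repeated.shape)
>>> (55, 3, 10, 10)
np.array_equal(repeated, stacked_img)
>>> True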
I have an image of dimension 155 x 240, like the following:
I want to extract patches of a certain shape (25 x 25).
I don't want to extract patches from the whole image.
I want to extract N patches from the non-zero (not background) area of the image. How can I do that? Any idea, suggestion, or implementation will be appreciated. You can use either Matlab or Python.
Note:
I have generated a random image so that you can process it for patching. The image_process variable in this code is that image.
import numpy as np
from scipy.ndimage import convolve
import matplotlib.pyplot as plt
background = np.ones((155,240))
background[78,120] = 2
n_d = 50
y,x = np.ogrid[-n_d: n_d+1, -n_d: n_d+1]
mask = x**2+y**2 <= n_d**2
mask = 254*mask.astype(float)
image_process = convolve(background, mask)-sum(sum(mask))+1
image_process[image_process==1] = 0
image_process[image_process==255] = 1
plt.imshow(image_process)
Let's assume that the pixel values you want to omit are 0.
In this case, what you could do is first find the indices of the non-zero values, then slice the image at the min/max positions to get only the desired area, and then simply apply extract_patches_2d with the desired window size and number of patches.
For example, given the dummy image you supplied:
import numpy as np
from scipy.ndimage import convolve
import matplotlib.pyplot as plt
background = np.ones((155,240))
background[78,120] = 2
n_d = 50
y,x = np.ogrid[-n_d: n_d+1, -n_d: n_d+1]
mask = x**2+y**2 <= n_d**2
mask = 254*mask.astype(float)
image_process = convolve(background, mask)-sum(sum(mask))+1
image_process[image_process==1] = 0
image_process[image_process==255] = 1
plt.figure()
plt.imshow(image_process)
plt.show()
from sklearn.feature_extraction.image import extract_patches_2d
x, y = np.nonzero(image_process)
xl,xr = x.min(),x.max()
yl,yr = y.min(),y.max()
only_desired_area = image_process[xl:xr+1, yl:yr+1]
window_shape = (25, 25)
B = extract_patches_2d(only_desired_area, window_shape, max_patches=100) # B shape will be (100, 25, 25)
If you plot only_desired_area you will get the following image:
This is the main logic. If you want an even tighter bound, you should adjust the slicing accordingly.
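For example, one possible refinement (a sketch, not part of the original logic) is to over-sample candidate patches and keep only those containing no background pixels, reusing only_desired_area and window_shape from the snippet above:
candidates = extract_patches_2d(only_desired_area, window_shape, max_patches=500, random_state=0)
B_tight = candidates[np.all(candidates != 0, axis=(1, 2))][:100] # keep at most 100 fully non-zero patches
print(B_tight.shape)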
Below is my code:
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.image as mp_image
filename = "abc.jpeg"
input_image = mp_image.imread(filename)
my_image=tf.placeholder("uint8",[None, None, 3])
myimage=tf.placeholder("uint8",[None, None, 3])
slice1=tf.slice(my_image,[0,100,0],[300,400,-1]) #[x,y,?],[x,y,?]
with tf.Session() as sess:
result = sess.run(slice1,feed_dict={my_image: input_image})
print(result.shape)
plt.imshow(result)
plt.show()
In slice1, what do the parameters passed as lists, [x,y,?],[x,y,?], indicate?
In tf.slice(image_tensor, [0,0,0], [100,200,-1]), what do the 0 and -1 stand for, and why can't I change them?
Looking at the docstring for tf.slice, the parameters are input_, begin, and size respectively. Your code is doing:
slice1=tf.slice(my_image,[x_begin,y_begin,channel_begin],[x_size,y_size,channel_size])
Note that the third parameter describes the size and not the absolute end index (#XMANX is mistaken). The size parameter accepts a sentinel value of -1, which means that all remaining elements in the dimension are included in the slice.
For example, if you had a tensor t with shape [X, Y, Z]
tf.slice(t, [x_begin, y_begin, z_begin], [x_size, y_size, z_size])
is equivalent to doing
t[x_begin : x_begin+x_size, y_begin : y_begin+y_size, z_begin : z_begin+z_size]
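A minimal sketch verifying this equivalence on a small tensor (TF 1.x, matching the question's style):
import numpy as np
import tensorflow as tf
t = np.arange(24).reshape(2, 3, 4)
with tf.Session() as sess:
    result = sess.run(tf.slice(t, [0, 1, 2], [2, 2, -1]))
print(np.array_equal(result, t[0:2, 1:3, 2:]))
>>> True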
In order to extract just the R channel from an image, you would do something like:
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.image as mp_image
filename = "abc.jpeg"
input_image = mp_image.imread(filename)
my_image=tf.placeholder("uint8",[None, None, 3])
# Doesn't slice along the x and y dimensions, but takes only one channel
sliced = tf.slice(my_image,[0, 0, 0], [-1, -1, 1])
squeezed = tf.squeeze(sliced) # Removes the size-1 channel dimension
with tf.Session() as sess:
result = sess.run(squeezed,feed_dict={my_image: input_image})
plt.imshow(result)
plt.show()
In [x,y,?],[x,y,?], the third parameter of the shape is the number of image channels in your case.
To answer the second question, let's take a look at how tf.slice works: in the case of an image with RGB channels it looks like tf.slice(image, [from_x, from_y, from_channel], [to_x, to_y, to_channel]). In the shape definition you can also use -1; this way you are telling TensorFlow to slice up to the maximum available value. Your code sample slices the input image with [0,100,0], [300,400,3]; you can change the third parameter and it is still valid code, but remember that matplotlib only allows you to show pictures of shape (M, N), (M, N, 3), or (M, N, 4).
Code explanation:
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import matplotlib.image as mp_image
filename = "abc.png"
input_image = mp_image.imread(filename)
my_image = tf.placeholder("float32",[None, None, 3])
myimage = tf.placeholder("float32",[None, None, 3])
# this way you slicing RGB image
slice1=tf.slice(my_image,[0,100,0],[300, 400, -1])
with tf.Session() as sess:
result = sess.run(slice1, feed_dict={my_image: input_image})
plt.subplot(1, 2, 1)
plt.imshow(result)
# this way you slicing image and keep just R channel
slice1=tf.slice(my_image,[0,100,0],[300, 400, 1]) #[x,y,?],[x,y,?]
with tf.Session() as sess:
result = sess.run(slice1, feed_dict={my_image: input_image})
plt.subplot(1, 2, 2)
# matplotlib imshow needs extra options to render a single-channel image
plt.imshow(np.reshape(result, (result.shape[0], result.shape[1])), cmap='gray')
plt.show()
Result image:
I have a 3D array and I would like to use Dask to chunk up my 3D array into blocks of traces of a certain window size around each trace. A trace is just one vector of size (1, 1, z). I can do this using the numpy as_strided tricks as follows:
import numpy as np
from numpy.lib.stride_tricks import as_strided
input_volume = np.linspace(1, 1000, 1000, dtype=int).reshape((10, 10, 10))
window_size = 5
x, y, z = input_volume.shape
# Create a view on the volume of sub-cubes window_size traces wide overlapping by 1 trace in each direction
half_w = (window_size - 1) // 2
padded = np.pad(input_volume[...], [(half_w, half_w), (half_w, half_w), (0, 0)], 'edge')
x_str, y_str, z_str = padded.strides
blocks = as_strided(padded, (x, y, window_size, window_size, z), (x_str, y_str, x_str, y_str, z_str))
averaged_volume = np.mean(blocks, (2, 3))
First I pad my 3D cube in the x and y dimensions by the half window. I get the average trace from each block, so in this case a block of (5, 5, z) gets reduced to a single trace. I then end up with a volume the same size as the original that has been averaged over the window size. This effectively gives me a "view" of my 3D array with a shape of (10, 10, 5, 5, 10).
This works, but if the volume is large it will load the whole volume into memory.
I have been trying to achieve the same thing with a chunked array in dask but I'm having trouble getting the depth and boundaries correct to give me the same answer. How can I achieve the same thing in dask so it only loads each block of traces into memory at a time and writes back out to the average cube?
EDIT:
This is the dask code I have been trying so far but when this runs I get an IndexError: tuple index out of range when it's trying to do the average calculation:
def average(block):
    return np.mean(block, axis=(0, 1))
import dask.array as da
dask_volume = da.from_array(da.pad(input_volume, [(half_w, half_w), (half_w, half_w), (0, 0)], 'edge'), chunks=(window_size, window_size, -1))
dask_overlapping = da.overlap.overlap(dask_volume, depth={0: window_size - 1, 1: window_size -1}, boundary={0: 'none', 1: 'none'})
dask_average = dask_overlapping.map_blocks(average, chunks=(1, 1, z)).compute()
Thanks,
Mike
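EDIT 2: One direction that might work (a sketch under assumptions, reusing input_volume, window_size and half_w from above): averaging each (window_size, window_size, z) block of traces is equivalent to running a (window_size, window_size, 1) mean filter over the volume, so map_overlap can handle the halo exchange:
import dask.array as da
from scipy.ndimage import uniform_filter
vol = da.from_array(input_volume.astype(float), chunks=(window_size, window_size, -1))
smoothed = vol.map_overlap(
    lambda b: uniform_filter(b, size=(window_size, window_size, 1), mode='nearest'),
    depth={0: half_w, 1: half_w, 2: 0},
    boundary='nearest', # replicates edge values, like np.pad(..., 'edge')
)
result = smoothed.compute() # should match averaged_volume from the numpy version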
I have a set of data records like this:
(s1, t1), (u1, v1), color1
(s2, t2), (u2, v2), color2
.
.
.
(sN, tN), (uN, vN), colorN
In any record, the first two values are the end-points of a line segment, and the third value is the color of that line segment. More specifically, (sn, tn) are the x-y coordinates of the first end-point and (un, vn) are the x-y coordinates of the second end-point. Also, color is an RGB value with an alpha channel.
In general, any two line segments are disconnected (meaning that their end-points do not necessarily coincide).
How can I plot this data using matplotlib with a single plot call (or as few as possible), given that there could be potentially thousands of records?
Attempts
Preparing the data in one big list and calling plot against it is way too slow. For example, the following code couldn't finish in a reasonable amount of time:
import numpy as np
import matplotlib.pyplot as plt
data = []
for _ in xrange(60000):
    data.append((np.random.rand(), np.random.rand()))
    data.append((np.random.rand(), np.random.rand()))
    data.append('r')
print 'now plotting...' # from now on, takes too long
plt.plot(*data)
print 'done'
#plt.show()
I was able to speed up the plot rendering by using the None insertion trick as follows:
import numpy as np
import matplotlib.pyplot as plt
from timeit import timeit
N = 60000
_s = np.random.rand(N)
_t = np.random.rand(N)
_u = np.random.rand(N)
_v = np.random.rand(N)
x = []
y = []
for s, t, u, v in zip(_s, _t, _u, _v):
    x.append(s)
    x.append(u)
    x.append(None)
    y.append(t)
    y.append(v)
    y.append(None)
print timeit(lambda:plt.plot(x, y), number=1)
This executes in under a second on my machine. I still have to figure out how to embed the color values (RGB with alpha channel).
Use LineCollection:
import numpy as np
import pylab as pl
from matplotlib import collections as mc
lines = [[(0, 1), (1, 1)], [(2, 3), (3, 3)], [(1, 2), (1, 3)]]
c = np.array([(1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)])
lc = mc.LineCollection(lines, colors=c, linewidths=2)
fig, ax = pl.subplots()
ax.add_collection(lc)
ax.autoscale()
ax.margins(0.1)
here is the output:
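To adapt the question's data layout to LineCollection, a small sketch (assuming colors is an (N, 4) array of RGBA values):
segments = np.stack([np.column_stack([_s, _t]), np.column_stack([_u, _v])], axis=1) # (N, 2, 2)
lc = mc.LineCollection(segments, colors=colors)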
The plot function allows drawing multiple lines in one call; if your data is just in a list, unpack it when passing it to plot:
In [315]: data=[(1, 1), (2, 3), 'r', # assuming points are (1,2) (1,3) actually,
    ...:        # here they are in the form (x1, x2), (y1, y2)
    ...:        (2, 2), (4, 5), 'g',
    ...:        (5, 5), (6, 7), 'b',]
In [316]: plot(*data)
Out[316]:
[<matplotlib.lines.Line2D at 0x8752870>,
<matplotlib.lines.Line2D at 0x8752a30>,
<matplotlib.lines.Line2D at 0x8752db0>]
OK, I ended up rasterising the lines on a PIL image before converting it to a numpy array:
from PIL import Image
from PIL import ImageDraw
import random as rnd
import numpy as np
import matplotlib.pyplot as plt
N = 60000
s = (500, 500)
im = Image.new('RGBA', s, (255,255,255,255))
draw = ImageDraw.Draw(im)
for i in range(N):
    x1 = rnd.random() * s[0]
    y1 = rnd.random() * s[1]
    x2 = rnd.random() * s[0]
    y2 = rnd.random() * s[1]
    alpha = rnd.random()
    color = (int(rnd.random() * 256), int(rnd.random() * 256), int(rnd.random() * 256), int(alpha * 256))
    draw.line(((x1,y1),(x2,y2)), fill=color, width=1)
plt.imshow(np.asarray(im),
origin='lower')
plt.show()
This is by far the fastest solution and it fits my real-time needs perfectly. One caveat, though: the lines are drawn without anti-aliasing.
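A possible workaround (a sketch I have not benchmarked): supersample and downscale, i.e. draw onto a larger image and resize with a high-quality filter:
scale = 4
big = Image.new('RGBA', (s[0]*scale, s[1]*scale), (255, 255, 255, 255))
draw = ImageDraw.Draw(big)
# ... draw the same lines with coordinates and width multiplied by scale ...
im = big.resize(s, Image.LANCZOS) # downscaling smooths the jagged edges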
I have tried a good few 2D rendering engines available for Python 3 while looking for a fast solution for an output stage in image-oriented deep learning & GANs.
The benchmark: time to render 99 lines into a 256x256 off-screen image (or whatever is more efficient), with and without anti-aliasing.
The results, in order of efficiency on my oldish x301 laptop:
PyGtk2: ~2500 FPS (Python 2, GTK 2, not sure how to get AA)
PyQt5: ~1200 FPS, ~350 with Antialias
PyQt4: ~1100 FPS, ~380 with AA
Cairo: ~750 FPS, ~250 with AA (only slightly faster with 'FAST' AA)
PIL: ~600 FPS
The baseline is a loop which takes ~0.1 ms (10,000 FPS) retrieving random numbers and calling the primitives.
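The harness was roughly of this shape (a reconstruction, not the exact benchmark code):
import time
def fps(render_frame, frames=1000):
    # render_frame draws the 99 random lines into the off-screen image
    t0 = time.time()
    for _ in range(frames):
        render_frame()
    return frames / (time.time() - t0)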
Basic code for PyGtk2:
from gtk import gdk
import random
WIDTH = 256
def r255(): return int(256.0*random.random())
cmap = gdk.Colormap(gdk.visual_get_best_with_depth(24), True)
black = cmap.alloc_color('black')
white = cmap.alloc_color('white')
pixmap = gdk.Pixmap(None, WIDTH, WIDTH, 24)
pixmap.set_colormap(cmap)
gc = pixmap.new_gc(black, line_width=2)
pixmap.draw_rectangle(gc, True, -1, -1, WIDTH+2, WIDTH+2);
gc.set_foreground(white)
for n in range(99):
    pixmap.draw_line(gc, r255(), r255(), r255(), r255())
gdk.Pixbuf(gdk.COLORSPACE_RGB, False, 8, WIDTH, WIDTH
).get_from_drawable(pixmap, cmap, 0,0, 0,0, WIDTH, WIDTH
).save('Gdk2-lines.png','png')
And here is for PyQt5:
from PyQt5.QtCore import Qt
from PyQt5.QtGui import *
import random
WIDTH = 256
def r255(): return int(WIDTH*random.random())
image = QImage(WIDTH, WIDTH, QImage.Format_RGB16)
painter = QPainter()
image.fill(Qt.black)
painter.begin(image)
painter.setPen(QPen(Qt.white, 2))
#painter.setRenderHint(QPainter.Antialiasing)
for n in range(99):
    painter.drawLine(r255(), r255(), r255(), r255())
painter.end()
image.save('Qt5-lines.png', 'png')
And here is Python3-Cairo for completeness:
import cairo
from random import random as r0to1
WIDTH, HEIGHT = 256, 256
surface = cairo.ImageSurface(cairo.FORMAT_A8, WIDTH, HEIGHT)
ctx = cairo.Context(surface)
ctx.scale(WIDTH, HEIGHT) # Normalizing the canvas
ctx.set_line_width(0.01)
ctx.set_source_rgb(1.0, 1.0, 1.0)
ctx.set_antialias(cairo.ANTIALIAS_NONE)
#ctx.set_antialias(cairo.ANTIALIAS_FAST)
ctx.set_operator(cairo.OPERATOR_CLEAR)
ctx.paint()
ctx.set_operator(cairo.OPERATOR_SOURCE)
for n in range(99):
    ctx.move_to(r0to1(), r0to1())
    ctx.line_to(r0to1(), r0to1())
    ctx.stroke()
surface.write_to_png('Cairo-lines.png')
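To feed the rendered surface into numpy (a sketch, assuming the surface object above; FORMAT_A8 rows may be padded to the stride, although 256 is already a multiple of 4):
import numpy as np
surface.flush()
arr = np.ndarray((HEIGHT, surface.get_stride()), dtype=np.uint8, buffer=surface.get_data())[:, :WIDTH]
print(arr.shape)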