How to manipulate image bands as arrays of numbers - python

I'm new to Python, and I'm trying to deconstruct image bands as arrays of numbers by applying the Singular Value Decomposition (SVD) to them and then putting them back together with matplotlib.image and the Image module from PIL. An SVD may also be written as a sum of dyads s1u1v1T + ... + sKuKvKT, and the point in decomposing it in this way is that a near-perfect approximation of the image can be made from just a few of those dyads, so less data is required.
There must be something wrong with the calculation, though because result_r, result_g, and result_b look like this when converted to Images, and new_image looks like this.
For an example of what this should look like, here are the first dyads of the layers of this image. The image that I'm using (April23.jpg) is this.
import matplotlib.image as image
import numpy.linalg as la
import numpy as np
from PIL import Image
def getcolumn(j, m):
col = []
for i in range(len(m)):
col.append(m[i][j])
return col
def extractCols(U):
Ucols = []
for j in range(len(U[0])):
Ucols.append(getcolumn(j, U))
return np.asarray(Ucols)
def vectorMultiply(u, v):
matrix = []
for i in range(len(u)):
newVec = []
for j in range(len(v)):
newVec.append(u[i] * v[j])
matrix.append(newVec)
return np.asarray(matrix)
im = Image.open('C:/Users/<user>/Desktop/img/April23.jpg')
im.load()
sim = Image.Image.split(im)
rsim = sim[0].save("rsim.jpg") # image bands as images
gsim = sim[1].save("gsim.jpg")
bsim = sim[2].save("bsim.jpg")
# image bands as arrays of numbers
arsim = image.imread('C:/Users/<user>/Desktop/img/rsim.jpg')
agsim = image.imread('C:/Users/<user>/Desktop/img/gsim.jpg')
absim = image.imread('C:/Users/<user>/Desktop/img/bsim.jpg')
ur, sr, vhr = la.svd(arsim, False) # SVD on each band
ug, sg, vhg = la.svd(agsim, False)
ub, sb, vhb = la.svd(absim, False)
urcols = extractCols(ur)
ugcols = extractCols(ug)
ubcols = extractCols(ub)
# calculating the first dyads
result_r = np.multiply(sr[0], vectorMultiply(urcols[0], vhr[0]))
result_g = np.multiply(sg[0], vectorMultiply(ugcols[0], vhg[0]))
result_b = np.multiply(sb[0], vectorMultiply(ubcols[0], vhb[0]))
r = Image.fromarray(result_r, "L")
g = Image.fromarray(result_g, "L")
b = Image.fromarray(result_b, "L")
new_image = Image.merge("RGB", (r, g, b))
What am I missing, here? It seems to be something with the calculations. I figured for a matrix one would have to extract the columns, say the column [1, 2, 3] from a matrix [[1,...], [2,...], [3,...]], since each element of the matrix is a row. So, I wrote extractCols() for that. numpy's matrix add and multiply seem to be fine. I wrote vectorMultiply because np.dot(), np.multiply(), and np.matmul() didn't seem to realize that u was a column and kept saying the dimensions didn't match up. I tested it and it seemed to do what I wanted it to. I was also thinking that maybe the "rows" of U are actually the columns already and don't need to be extracted, but that didn't work either. I've also tried not using np.asarray() without any luck.
Any advice is appreciated.

Related

Sort image as NP array

I'm trying to sort an image by luminosity using NumPy, which I'm new to. I've managed to create a random image and sort it.
def create_image(output, width, height, arr):
array = np.zeros([height, width, 3], dtype=np.uint8)
numOfSwatches = len(arr)
swatchWidth = int(width/ numOfSwatches)
for i in range (0, numOfSwatches):
m = i * swatchWidth
r = (i+1) * swatchWidth
array[:, m:r] = arr[i]
img = Image.fromarray(array)
img.save(output)
Which creates this image:
So far so good. Only now I want to switch from creating random images to loading them and then sorting them.
#!/usr/bin/python3
import numpy as np
from PIL import Image
# --------------------------------------------------------------
def load_image( infilename ) :
img = Image.open( infilename )
img.load()
data = np.asarray( img, dtype = "int32" )
return data
# --------------------------------------------------------------
def lum (r,g,b):
return math.sqrt( .241 * r + .691 * g + .068 * b )
myImageFile = "random_colours.png"
imageNP = load_image(myImageFile)
imageNP.sort(key=lambda rgb: lum(*rgb) )
The image should look like this:
The error I get is TypeError: 'key' is an invalid keyword argument for this function I may have created the NP array incorrectly as it worked when it was a random NP array.
Have not ever used PIL, but the following approach hopefully works (I'm not sure as I can't reproduce your exact examples), and of course there might be more efficient ways to do so.
I'm using your functions, having changed the math.sqrt function to np.sqrt in the lum function - as it is better for vector calculations. By the way, I believe this won't work with an int32 type array (as in your load_image function).
The key part is Numpy's argsort function (last line), which gives the indices that would sort the given array; this is applied to a row of the luminosity array (exploiting simmetry) and later used as indexer of img_array.
# Create random image
np.random.seed(4)
img = create_image('test.png', 75, 75, np.random.random((25,3))*255)
# Convert to Numpy array and calculate luminosity
img_array = np.array(img, dtype = np.uint8)
luminosity = lum(img_array[...,0], img_array[...,1], img_array[...,2])
# Sort by luminosity and convert to image again
img_sorted = Image.fromarray(img_array[:,luminosity[0].argsort()])
The original picture:
And the luminosity-sorted one:

cv2.resize with Python : what exactly does the interpolation methods?

Given a 9x9 matrix representing an image (its entries are a [R, G, B]), I want to create a new resized image with size 3x3 which each entry is computed as follows :
divide the 9x9 matrix into 9 blocks of 3x3 matrices
compute the mean (component-wise) of each 3x3 matrix bloc
create the 3x3 image with these means.
So far I have used the cv2 library with Python 3.6
image_blurred = cv2.resize(original_image, (3,3), interpolation=cv2.INTER_AREA)
But I am not sure about what precisely cv2.INTER_AREA does.
Could you give me some information about this ? (There are some information here but they do not give so many details.)
Many thanks.
It seems that the interpolation cv2.INTER_AREA does this averaging. I wrote a test below if you are interested.
import cv2
import numpy as np
n = 9
grid_colors = []
for _ in range(n):
column = []
for _ in range(n):
colors = []
for k in range(3):
colors.append(np.random.randint(256))
column.append(colors)
grid_colors.append(column)
moy = []
for a in range(3):
col = []
for b in range(3):
colors = []
for c in range(3):
colors.append(round(sum([grid_colors[i+3*a][j+3*b][c] for i in range(3) for j in range(3)]) / 9))
col.append(colors)
moy.append(col)
image_blurred = cv2.resize(np.array(grid_colors, dtype = np.uint8), (len(grid_colors[0]) // 3, len(grid_colors) // 3), interpolation=cv2.INTER_AREA)
print("image blurred: ")
print(image_blurred)
print("grid_colors: ")
print(grid_colors)

How to optimize for loops for generating a new random Poisson array in python?

I want to read an grayscale image, say something with (248, 480, 3) shape, then use each element of it as the lam value for making a Poisson random value and do this for each element and make a new data set with the same shape. I want to do this as much as nscan, then I want to add them all together and put them in a new data set and plot it again to get something that is similar to the first image that I put in the beginning. This code is working but it is extremely slow, I was wondering if there is any way to make it faster?
import numpy as np
import matplotlib.pyplot as plt
my_image = plt.imread('myimage.png')
def genP(data):
new_data = np.zeros(data.shape)
for i in range(data.shape[0]):
for j in range(data.shape[1]):
for k in range(data.shape[2]):
new_data[i, j, k] = np.random.poisson(lam = data[i, j, k])
return new_data
def get_total(data, nscan = 1):
total = genP(data)
for i in range(nscan):
total += genP(data)
total = total/nscan
plt.imshow(total)
plt.show()
get_total(my_image, 100)
numpy.random.poisson can entirely replace your genP() function... This is basically guaranteed to be much faster.
If size is None (default), a single value is returned if lam is a scalar. Otherwise, np.array(lam).size samples are drawn
def get_total(data, nscan = 1):
total = np.random.poisson(lam=data)
for i in range(nscan):
total += np.random.poisson(lam=data)
total = total/nscan
plt.imshow(total)
plt.show()

How to vectorize a code with python numpy.bincount, using apply along axis

I'm trying to vectorize a code with numpy, to run it using multiprocessing, but i can't understand how numpy.apply_along_axis works. This is an example of the code, vectorized using map
import numpy
from scipy import sparse
import multiprocessing
from matplotlib import pyplot
#first i build a matrix of some x positions vs time datas in a sparse format
matrix = numpy.random.randint(2, size = 100).astype(float).reshape(10,10)
x = numpy.nonzero(matrix)[0]
times = numpy.nonzero(matrix)[1]
weights = numpy.random.rand(x.size)
#then i define an array of y positions
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
#now i build an image using x-y-times coordinates and x-times weights
def mapIt(ithStep):
ncolumns = 80
image = numpy.zeros(ncolumns)
yTimed = y[ithStep]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
values = numpy.bincount(positions,weights)
values = values[numpy.nonzero(values)]
positions = numpy.unique(positions)
image[positions] = values
return image
image = list(map(mapIt, range(nStepsY)))
image = numpy.array(image)
a = pyplot.imshow(image, aspect = 10)
Here the output plot
I tried to use numpy.apply_along_axis, but this function allows me to iterate only along the rows of image, while i need to iterate along the ithStep index too. E.g.:
#now i build an image using x-y-times coordinates and x-times weights
nrows = nStepsY
ncolumns = 80
matrix = numpy.zeros(nrows*ncolumns).reshape(nrows,ncolumns)
def applyIt(image):
image = numpy.zeros(ncolumns)
yTimed = y[ithStep]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
values = numpy.bincount(positions,weights)
values = values[numpy.nonzero(values)]
positions = numpy.unique(positions)
image[positions] = values
return image
imageApplied = numpy.apply_along_axis(applyIt,1,matrix)
a = pyplot.imshow(imageApplied, aspect = 10)
It obviously return only the firs row nrows times, since nothing iterates ithStep:
And here the wrong plot
There is a way to iterate an index, or to use an index while numpy.apply_along_axis iterates?
Here the code with only matricial operations: it's quite faster than map or apply_along_axis but uses so much memory.
(in this function i use a trick with scipy.sparse, which works more intuitively than numpy arrays when you try to sum numbers on a same element)
def fullmatrix(nRows, nColumns):
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nRows, nColumns))
yTimed = numpy.outer(y,times)
x3d = numpy.outer(numpy.ones(nStepsY),x)
weights3d = numpy.outer(numpy.ones(nStepsY),weights)
y3d = numpy.outer(y,numpy.ones(x.size))
positions = (numpy.round(x3d-yTimed)+50).astype(int)
matrix = sparse.coo_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions)))).todense()
return matrix
image = fullmatrix(nStepsY, 80)
a = pyplot.imshow(image, aspect = 10)
This way is simplier and very fast! Thank you so much.
nStepsY = 5
nRows = nStepsY
nColumns = 80
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nRows, nColumns))
fakeRow = numpy.zeros(positions.size)
def itermatrix(ithStep):
yTimed = y[ithStep]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
matrix = sparse.coo_matrix((weights, (fakeRow, positions))).todense()
matrix = numpy.ravel(matrix)
missColumns = (nColumns-matrix.size)
zeros = numpy.zeros(missColumns)
matrix = numpy.concatenate((matrix, zeros))
return matrix
for i in numpy.arange(nStepsY):
image[i] = itermatrix(i)
#or, without initialization of image:
imageMapped = list(map(itermatrix, range(nStepsY)))
imageMapped = numpy.array(imageMapped)
It feels like attempting to use map or apply_along_axis is obscuring the essentially iteration of the problem.
I rewrote your code as an explicit loop on y:
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nStepsY, 80))
for i, yi in enumerate(y):
yTimed = yi*times
positions = (numpy.round(x-yTimed)+50).astype(int)
values = numpy.bincount(positions,weights)
values = values[numpy.nonzero(values)]
positions = numpy.unique(positions)
image[i, positions] = values
a = pyplot.imshow(image, aspect = 10)
pyplot.show()
Looking at the code, I think I could calculate positions for all y values making a (y.shape[0],times.shape[0]) array. But the rest, the bincount and unique still have to work row by row.
apply_along_axis when working with a 2d array, and axis=1 essentially does:
res = np.zeros_like(arr)
for i in range....:
res[i,:] = func1d(arr[i,:])
If the input array has more dimensions it constructs a more elaborate indexing object [i,j,k,:]. And it can handle cases where func1d returns a different size array than the input. But in any case it is just a generalized iteration tool.
Moving the initial positions creation outside the loop:
yTimed = y[:,None]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
image = numpy.zeros((positions.shape[0], 80))
for i, pos in enumerate(positions):
values = numpy.bincount(pos,weights)
values = values[numpy.nonzero(values)]
pos = numpy.unique(pos)
image[i, pos] = values
Now I can cast this as an apply_along_axis problem, with an applyIt that takes a positions vector (with all the yTimed information) rather than blank image vector.
def applyIt(pos, size, weights):
acolumn = numpy.zeros(size)
values = numpy.bincount(pos,weights)
values = values[numpy.nonzero(values)]
pos = numpy.unique(pos)
acolumn[pos] = values
return acolumn
image = numpy.apply_along_axis(applyIt, 1, positions, 80, weights)
Timing wise I expect it's a bit slower than my explicit iteration. It has to do more setup work, including a test call applyIt(positions[0,:],...) to determine the size of its return array (i.e image has different shape than positions.)
def csrmatrix(y, times, x, weights):
yTimed = numpy.outer(y,times)
n=y.shape[0]
x3d = numpy.outer(numpy.ones(n),x)
weights3d = numpy.outer(numpy.ones(n),weights)
y3d = numpy.outer(y,numpy.ones(x.size))
positions = (numpy.round(x3d-yTimed)+50).astype(int)
#print(y.shape, weights3d.shape, y3d.shape, positions.shape)
matrix = sparse.csr_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions))))
#print(repr(matrix))
return matrix
# one call
image = csrmatrix(y, times, x, weights)
# iterative call
alist = []
for yi in numpy.arange(1,nStepsY+1):
alist.append(csrmatrix(numpy.array([yi]), times, x, weights))
def mystack(alist):
# concatenate without offset
row, col, data = [],[],[]
for A in alist:
A = A.tocoo()
row.extend(A.row)
col.extend(A.col)
data.extend(A.data)
print(len(row),len(col),len(data))
return sparse.csr_matrix((data, (row, col)))
vimage = mystack(alist)

python numpy: array of arrays

I'm trying to build a numpy array of arrays of arrays with the following code below.
Which gives me a
ValueError: setting an array element with a sequence.
My guess is that in numpy I need to declare the arrays as multi-dimensional from the beginning, but I'm not sure..
How can I fix the the code below so that I can build array of array of arrays?
from PIL import Image
import pickle
import os
import numpy
indir1 = 'PositiveResize'
trainimage = numpy.empty(2)
trainpixels = numpy.empty(80000)
trainlabels = numpy.empty(80000)
validimage = numpy.empty(2)
validpixels = numpy.empty(10000)
validlabels = numpy.empty(10000)
testimage = numpy.empty(2)
testpixels = numpy.empty(10408)
testlabels = numpy.empty(10408)
i=0
tr=0
va=0
te=0
for (root, dirs, filenames) in os.walk(indir1):
print 'hello'
for f in filenames:
try:
im = Image.open(os.path.join(root,f))
Imv=im.load()
x,y=im.size
pixelv = numpy.empty(6400)
ind=0
for i in range(x):
for j in range(y):
temp=float(Imv[j,i])
temp=float(temp/255.0)
pixelv[ind]=temp
ind+=1
if i<40000:
trainpixels[tr]=pixelv
tr+=1
elif i<45000:
validpixels[va]=pixelv
va+=1
else:
testpixels[te]=pixelv
te+=1
print str(i)+'\t'+str(f)
i+=1
except IOError:
continue
trainimage[0]=trainpixels
trainimage[1]=trainlabels
validimage[0]=validpixels
validimage[1]=validlabels
testimage[0]=testpixels
testimage[1]=testlabels
Don't try to smash your entire object into a numpy array. If you have distinct things, use a numpy array for each one then use an appropriate data structure to hold them together.
For instance, if you want to do computations across images then you probably want to just store the pixels and labels in separate arrays.
trainpixels = np.empty([10000, 80, 80])
trainlabels = np.empty(10000)
for i in range(10000):
trainpixels[i] = ...
trainlabels[i] = ...
To access an individual image's data:
imagepixels = trainpixels[253]
imagelabel = trainlabels[253]
And you can easily do stuff like compute summary statistics over the images.
meanimage = np.mean(trainpixels, axis=0)
meanlabel = np.mean(trainlabels)
If you really want all the data to be in the same object, you should probably use a struct array as Eelco Hoogendoorn suggests. Some example usage:
# Construction and assignment
trainimages = np.empty(10000, dtype=[('label', np.int), ('pixel', np.int, (80,80))])
for i in range(10000):
trainimages['label'][i] = ...
trainimages['pixel'][i] = ...
# Summary statistics
meanimage = np.mean(trainimages['pixel'], axis=0)
meanlabel = np.mean(trainimages['label'])
# Accessing a single image
image = trainimages[253]
imagepixels, imagelabel = trainimages[['pixel', 'label']][253]
Alternatively, if you want to process each one separately, you could store each image's data in separate arrays and bind them together in a tuple or dictionary, then store all of that in a list.
trainimages = []
for i in range(10000):
pixels = ...
label = ...
image = (pixels, label)
trainimages.append(image)
Now to access a single images data:
imagepixels, imagelabel = trainimages[253]
This makes it more intuitive to access a single image, but because all the data is not in one big numpy array you don't get easy access to functions that work across images.
Refer to the examples in numpy.empty:
>>> np.empty([2, 2])
array([[ -9.74499359e+001, 6.69583040e-309],
[ 2.13182611e-314, 3.06959433e-309]]) #random
Give your images a shape with the N dimensions:
testpixels = numpy.empty([96, 96])

Categories