I have performed mean-shift segmentation on an image and got the labels array, where each point value corresponds to the segment it belongs to.
labels = [[0,0,0,0,1],
[2,2,1,1,1],
[0,0,2,2,1]]
On the other hand, I have the corresponding grayscale image, and want to perform operations on each region independently.
img = [[100,110,105,100,84],
[ 40, 42, 81, 78,83],
[105,103, 45, 52,88]]
Let's say I want the sum of the grayscale values for each region, and if it's <200, I want to set those points to 0 (in this case, all the points in region 2). How would I do that with numpy? I'm sure there's a better way than the implementation I have started, which involves many, many for loops and temporary variables...
Look into numpy.bincount and numpy.where; that should get you started. For example:
import numpy as np
labels = np.array([[0,0,0,0,1],
[2,2,1,1,1],
[0,0,2,2,1]])
img = np.array([[100,110,105,100,84],
[ 40, 42, 81, 78,83],
[105,103, 45, 52,88]])
# Sum the regions by label:
sums = np.bincount(labels.ravel(), img.ravel())
# Create a new image by applying the threshold
final = np.where(sums[labels] < 200, -1, img)
print(final)
# [[100 110 105 100 84]
# [ -1 -1 81 78 83]
# [105 103 -1 -1 88]]
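If you literally want zeros (as in the question) rather than -1, the same sums[labels] lookup works; and since bincount also gives per-region pixel counts, per-region means come almost for free. A small sketch along those lines:
import numpy as np

labels = np.array([[0, 0, 0, 0, 1],
                   [2, 2, 1, 1, 1],
                   [0, 0, 2, 2, 1]])
img = np.array([[100, 110, 105, 100, 84],
                [ 40,  42,  81,  78, 83],
                [105, 103,  45,  52, 88]])

# Per-region sums and pixel counts via bincount
sums = np.bincount(labels.ravel(), weights=img.ravel())
counts = np.bincount(labels.ravel())
means = sums / counts

# Zero out pixels whose region sum is below the threshold
final = np.where(sums[labels] < 200, 0, img)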
You're looking for the numpy function where. Here's how to get started:
import numpy as np
labels = np.array([[0, 0, 0, 0, 1],
                   [2, 2, 1, 1, 1],
                   [0, 0, 2, 2, 1]])
img = np.array([[100, 110, 105, 100, 84],
                [ 40,  42,  81,  78, 83],
                [105, 103,  45,  52, 88]])
# to sum the pixels with label 0 (note: these must be numpy arrays,
# not plain lists, for the element-wise comparison to work):
px_sum = np.sum(img[np.where(labels == 0)])
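To extend this to every region you could loop over np.unique(labels). A minimal sketch (the bincount answer above avoids this explicit loop entirely):
import numpy as np

labels = np.array([[0, 0, 0, 0, 1],
                   [2, 2, 1, 1, 1],
                   [0, 0, 2, 2, 1]])
img = np.array([[100, 110, 105, 100, 84],
                [ 40,  42,  81,  78, 83],
                [105, 103,  45,  52, 88]])

out = img.copy()
for lab in np.unique(labels):      # iterate over the region labels
    mask = labels == lab
    if img[mask].sum() < 200:      # threshold on the region's sum
        out[mask] = 0              # zero out the whole region
print(out)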
I need to make subtractions inside the red frames (each frame in my figure covers a non-overlapping pair of consecutive values), as in [20-10, 60-40, 100-70], which results in [10, 20, 30].
My current code makes subtractions, but I don't know how to define the red frames:
seq = [10, 20, 40, 60, 70, 100]
window_size = 2
for i in range(len(seq) - window_size + 1):
    x = seq[i: i + window_size]
    y = x[1] - x[0]
    print(y)
You can build a quick solution using the fact that seq[0::2] gives you every other element of seq, starting at index zero. With numpy arrays you can then compute seq[1::2] - seq[0::2] to get this result.
Without using any packages you could do:
seq = [10, 20, 40, 60, 70, 100]
sub_seq = [0]*(len(seq)//2)
for i in range(len(sub_seq)):
    sub_seq[i] = seq[1::2][i] - seq[0::2][i]
print(sub_seq)
Instead, you could use numpy. With the numpy array object you can subtract the slices directly rather than explicitly looping:
import numpy as np
seq = np.array([10, 20, 40, 60, 70, 100])
sub_seq = seq[1::2] - seq[0::2]
print(sub_seq)
# [10 20 30]
Here's a solution using numpy which might be useful if you have to process large amounts of data in a short time. We select values based on whether their index is even (index % 2 == 0) or odd (index % 2 != 0).
import numpy as np
seq = [10, 20, 40, 60, 70, 100]
seq = np.array(seq)
index = np.arange(len(seq))
print(seq[index % 2 != 0] - seq[index % 2 == 0])
# [10 20 30]
I have a numpy array with 1000 RGB images with shape (1000, 90, 90, 3), and I need to work on each image, but sliced into 9 blocks. I've found many solutions for slicing a single image, but how can I obtain a (9000, 30, 30, 3) array and then iteratively send 9 contiguous blocks at a time to a function?
I would do something like the code below. In my example I used parts of images from skimage.data to illustrate the method, and made the shapes and sizes different so that it looks prettier, but you can do the same for your data by adjusting those parameters.
from skimage import data
from matplotlib import pyplot as plt
import numpy as np
astronaut = data.astronaut()
coffee = data.coffee()
arr = np.stack([coffee[:400, :400, :], astronaut[:400, :400, :]])
plt.imshow(arr[0])
plt.title('arr[0]')
plt.figure()
plt.imshow(arr[1])
plt.title('arr[1]')
arr_blocks = arr.reshape(arr.shape[0], 4, 100, 4, 100, 3).swapaxes(2, 3)
arr_blocks = arr_blocks.reshape(-1, 100, 100, 3)
for i, block in enumerate(arr_blocks):
    plt.figure(10 + i//16, figsize=(10, 10))
    plt.subplot(4, 4, i % 16 + 1)
    plt.imshow(block)
    plt.title(f'block {i}')
# batch_size = 9
# some_outputs_list = []
# for i in range(arr_blocks.shape[0]//batch_size + ((arr_blocks.shape[0] % batch_size) > 0)):
#     some_outputs_list.append(some_function(arr_blocks[i*batch_size:(i+1)*batch_size]))
Output: plots of arr[0] and arr[1], plus grids of the individual 100×100 blocks.
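For the shapes in your question, the same reshape/swapaxes trick would look like the sketch below, assuming each 90×90 image splits into a 3×3 grid of 30×30 blocks (process is a placeholder for your own function):
import numpy as np

# dummy batch matching the question's shape: 1000 RGB images of 90x90
imgs = np.zeros((1000, 90, 90, 3), dtype=np.uint8)

# split each image into a 3x3 grid of 30x30 blocks -> (9000, 30, 30, 3)
blocks = imgs.reshape(-1, 3, 30, 3, 30, 3).swapaxes(2, 3).reshape(-1, 30, 30, 3)
print(blocks.shape)  # (9000, 30, 30, 3)

# feed the 9 blocks of each image to a function, 9 at a time:
# for i in range(0, blocks.shape[0], 9):
#     process(blocks[i:i + 9])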
I have a 2d numpy array that I want to plot so I can see how each category is positioned on the grid. The matrix (mat) looks something like this:
156 138 156
1300 137 156
138 138 1300
137 137 137
I plotted this as follows:
plt.imshow(mat, cmap='tab20', interpolation='none')
However, I want to have custom colors. I have a csv where the id's correspond with the values in the matrix:
id,R,G,B
156,200,200,200
138,170,255,245
137,208,130,40
1300,63,165,76
Is there a way I can have the values in the matrix correspond with the R, G, B values in the csv file?
Edit: someone asked for a clarification but the entire answer was deleted.
Each row has an ID and 3 columns, representing the respective R, G, and B values. So the first row has ID 156 (a domain-specific code) with R 200, G 200 and B 200 (which is grey).
Now I have a 2d matrix that I want to plot, and at each coordinate where the value is 156 I want that pixel to be grey. Same with ID 1300, where the values 63, 165 and 76 represent a green color that I want to use in the matrix.
Using a colormap
In principle the matrix of RGB values is some kind of colormap, so it makes sense to use a matplotlib colormap to get the colors for the plot. What makes this a little more complicated here is that the values are not evenly spaced. So one idea would be to first map them to integers starting at 0. Then creating a colormap from those values and using it with a BoundaryNorm allows for an equidistant colorbar. Finally, one may set the tick labels of the colorbar back to the initial values.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
a = np.array([[156, 138, 156],
              [1300, 137, 156],
              [138, 138, 1300],
              [137, 137, 137]])
ca = np.array([[156, 200, 200, 200],
               [138, 170, 255, 245],
               [137, 208, 130, 40],
               [1300, 63, 165, 76]])
u, ind = np.unique(a, return_inverse=True)
b = ind.reshape(a.shape)
colors = ca[ca[:,0].argsort()][:,1:]/255.
cmap = matplotlib.colors.ListedColormap(colors)
norm = matplotlib.colors.BoundaryNorm(np.arange(len(ca)+1)-0.5, len(ca))
plt.imshow(b, cmap=cmap, norm=norm)
cb = plt.colorbar(ticks=np.arange(len(ca)))
cb.ax.set_yticklabels(np.unique(ca[:,0]))
plt.show()
Plotting RGB array
You may create an RGB array from your data to directly plot as imshow. To this end you may index the original array with the colors from the color array and reshape the resulting array such that it is in the correct shape to be plotted with imshow.
import numpy as np
import matplotlib.pyplot as plt
a = np.array([[156, 138, 156],
              [1300, 137, 156],
              [138, 138, 1300],
              [137, 137, 137]])
ca = np.array([[156, 200, 200, 200],
               [138, 170, 255, 245],
               [137, 208, 130, 40],
               [1300, 63, 165, 76]])
u, ind = np.unique(a, return_inverse=True)
c = ca[ca[:, 0].argsort()][:, 1:] / 255.
b = np.moveaxis(c[ind][:, :, np.newaxis], 1, 2).reshape((a.shape[0], a.shape[1], 3))
plt.imshow(b)
plt.show()
The result is the same as above, but without colorbar (as there is no quantity to map here).
It's not particularly elegant, but it is simple:
In [72]: import numpy as np
In [73]: import matplotlib.pyplot as plt
In [74]: a = np.array([[156, 138, 156], [1300, 137, 156], [138, 138, 1300], [137, 137, 137]])
In [75]: d = {156: [200, 200, 200],
    ...:      138: [170, 255, 245],
    ...:      137: [208, 130, 40],
    ...:      1300: [63, 165, 76]}
In [76]: image = np.array([[d[val] for val in row] for row in a], dtype='B')
In [77]: plt.imshow(image);
The point is to generate an array of the correct dtype ('B' encodes unsigned 8-bit integers) containing the correct (and unpacked) RGB triples. Note that a plain ndarray is used rather than np.mat: iterating over an ndarray's rows yields hashable scalars, which the dict lookup needs.
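If the lookup needs to be vectorized for a large matrix, one possible sketch replaces the nested comprehension with np.searchsorted over the sorted ids:
import numpy as np

a = np.array([[156, 138, 156],
              [1300, 137, 156],
              [138, 138, 1300],
              [137, 137, 137]])
d = {156: [200, 200, 200], 138: [170, 255, 245],
     137: [208, 130, 40], 1300: [63, 165, 76]}

keys = np.array(sorted(d))                         # sorted ids: [137, 138, 156, 1300]
vals = np.array([d[k] for k in keys], dtype='B')   # RGB rows in the same order
image = vals[np.searchsorted(keys, a)]             # shape (4, 3, 3), ready for imshow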
Addendum
Following an exchange of comments on the original question, in this addendum I'll propose a possible solution to the problem of plotting the same type of data using plt.scatter() (the problem was a bit tougher than I expected...).
import numpy as np
import matplotlib.pyplot as plt
from random import choices, randrange
######## THIS IS FOR IMSHOW ######################################
# the like of my previous answer
values = [20,150,900,1200]
rgb = lambda x=255:(randrange(x), randrange(x), randrange(x))
colord = {v:rgb() for v in values}
nr, nc = 3, 5
data = np.array(choices(values, k=nr*nc)).reshape((nr,nc))
c = np.array([[colord[v] for v in row] for row in data], dtype='B')
######## THIS IS FOR SCATTER ######################################
# This is for having the coordinates of the scattered points, note that rows' indices
# map to y coordinates and columns' map to x coordinates
y, x = np.array([(i,j) for i in range(nr) for j in range(nc)]).T
# Scatter does not expect a 3D array of uints but a 2D array of RGB floats
c1 = (c/255.0).reshape(nr*nc,3)
######## THIS IS FOR PLOTTING ######################################
# two subplots, plot immediately the imshow
f, (ax1, ax2) = plt.subplots(nrows=2)
ax1.imshow(c)
# to make a side by side comparison we set the boundaries and aspect
# of the second plot to mimic imshow's
ax2.set_ylim(ax1.get_ylim())
ax2.set_xlim(ax1.get_xlim())
ax2.set_aspect(1)
# and finally plot the data --- the size of dots `s=900` was by trial and error
ax2.scatter(x, y, c=c1, s=900)
plt.show()
Pandas can help you collect the data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
im = pd.read_clipboard(header=None)                # the matrix, copied from your post
colours = pd.read_clipboard(index_col=0, sep=',')  # the csv, copied from your post
Pandas also helps with the colormap:
colordf = colours.reindex(np.arange(1301)).fillna(0).astype(np.uint8)
And numpy's take builds the image:
rgbim = colordf.values.take(im, axis=0)
plt.imshow(rgbim)
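Since read_clipboard depends on what happens to be on your clipboard, here is a self-contained sketch of the same take-based idea with the data from the post inlined:
import numpy as np
import matplotlib.pyplot as plt

mat = np.array([[156, 138, 156],
                [1300, 137, 156],
                [138, 138, 1300],
                [137, 137, 137]])
colours = np.array([[156, 200, 200, 200],
                    [138, 170, 255, 245],
                    [137, 208, 130, 40],
                    [1300, 63, 165, 76]])

# build a (1301, 3) lookup table indexed directly by id
lut = np.zeros((mat.max() + 1, 3), dtype=np.uint8)
lut[colours[:, 0]] = colours[:, 1:]

rgbim = lut.take(mat, axis=0)  # shape (4, 3, 3)
plt.imshow(rgbim)
plt.show()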
Using pandas and numpy (edited for an n x m matrix):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
n = 2
m = 2
df = pd.read_csv('matrix.txt')
id = df.id.values
id = np.reshape(id, (n, m))
R = df.R.values
R = np.reshape(R/255, (n, m))
G = df.G.values
G = np.reshape(G/255, (n, m))
B = df.B.values
B = np.reshape(B/255, (n, m))
img = []
for i in range(n):
    img.append([])
    for j in range(m):
        img[i].append((R[i][j], G[i][j], B[i][j]))
plt.imshow(img)
plt.show()
I have a list of numbers:
[10,20,30]
What I need is to expand it according to a predefined increment. Thus, calling the increment x, with x = 2 my result should be:
[10,12,14,16,18,20,22,24,.....,38]
Right now I am using a for loop, but it is very slow and I am wondering if there is a faster way.
EDIT:
newA = []
for n in array:
    newA = newA + generateNewNumbers(n, p, t)
The function generateNewNumbers simply generates the new numbers to add to the list.
EDIT2:
To better define the problem the first array contains some timestamps:
[10,20,30]
I have two parameters, the sampling rate and the sampling time; what I need is to expand the array by adding, between each pair of timestamps, the correct number of timestamps according to the sampling rate.
For example, if I have a sampling rate 3 and a sampling time 3 the result should be:
[10,13,16,19,20,23,26,29,30,33,36,39]
You can add the same set of increments to each time stamp using np.add.outer and then flatten the result using ravel.
import numpy as np
a = [10,20,35]
inc = 3
ninc = 4
np.add.outer(a, inc * np.arange(ninc)).ravel()
# array([10, 13, 16, 19, 20, 23, 26, 29, 35, 38, 41, 44])
You can use list comprehensions, but I'm not sure I understand the stopping condition for the last point's inclusion:
a = [10, 20, 30, 40]
t = 3
sum([[x for x in range(y, z, t)] for y, z in zip(a[:-1], a[1:])], []) + [a[-1]]
will give
[10, 13, 16, 19, 20, 23, 26, 29, 30, 33, 36, 39, 40]
Using range and itertools.chain
from itertools import chain
l = [10, 20, 30]
x = 3
# note the hard-coded 10 assumes consecutive values are 10 apart
print(list(chain(*[range(i, i + 10, x) for i in l])))
# Output:
# [10, 13, 16, 19, 20, 23, 26, 29, 30, 33, 36, 39]
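If the gaps between timestamps are not all equal to 10, a possible variant chains a range between each consecutive pair and appends the final timestamp explicitly (a sketch, not a drop-in replacement):
from itertools import chain

l = [10, 20, 35]
x = 3

# one range per consecutive pair, then the last value itself
expanded = list(chain(*(range(a, b, x) for a, b in zip(l, l[1:])))) + [l[-1]]
print(expanded)
# [10, 13, 16, 19, 20, 23, 26, 29, 32, 35]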
There are a bunch of good answers here already, but I would advise numpy and linear interpolation.
# This will give you the desired result with your first specification,
# in pure Python:
t = [10, 20, 30]
increment = 2
# value of the last number in the expanded array
last = int(round(t[-1] + ((t[-1] - t[-2]) / float(increment)) - 1))
# if you insist on the mathematically "incorrect" endpoint, do:
# last = ((t[-1] + (t[-1] - t[-2])) - ((t[-1] - t[-2]) / float(increment))) + 1
newt = range(t[0], last + 1, increment)
Note that this may skip entered values (e.g. with increment = 3). But if what you actually have is a sample rate, what you should do instead is linear interpolation. If you resample the original signal, you resample the time axis too, and you don't expand past the existing time: the time span doesn't change if you resample properly; you only get more or fewer samples at different time points, over the same length of time. Expanding the timestamps as you originally described actually shifts your datapoints in time, which is wrong.
import numpy
t = [10, 20, 30, 40, 50, 60]
oldfs = 4000  # 4 kHz sample rate
newfs = 8000  # 8 kHz sample rate (twice as many samples over the same time span)
ratio = max(oldfs*1.0, newfs*1.0) / min(newfs, oldfs)
newlen = int(round(len(t) * ratio))
newt = numpy.interp(
    numpy.linspace(0.0, 1.0, newlen),
    numpy.linspace(0.0, 1.0, len(t)),
    t)
This code can resample your original signal too (if you have one). If you just want to cram some more timepoints in between, you can also use interpolation. Again, don't go over the existing time; this code does, but only to stay compatible with the first snippet and to give you ideas about what you can do.
t = [10, 20, 30]
increment = 2
last = t[-1] + ((t[-1] - t[-2]) / float(increment)) - 1  # value of the last number in the array
t.append(last)
newlen = int((t[-1] - t[0]) / float(increment) + 1)  # how many samples we will get in the end
ratio = newlen / len(t)
newt = numpy.interp(
    numpy.linspace(0.0, 1.0, newlen),
    numpy.linspace(0.0, 1.0, len(t)),
    t)
This, though, results in an increment of 2.5 instead of 2, but that can be corrected. The point is that this approach works on floating-point time points as well as on integers, and it is fast. It will slow down if there are a lot of points, but until you reach some great number of them it will work pretty fast.
I have a series of numbers:
numbers = [100, 101, 99, 102, 99, 98, 100, 97.5, 98, 99, 95, 93, 90, 85, 80]
It's easy to see by eye that the numbers start to fall sharply at roughly index 10, but is there a simple way to identify that point (or close to it) on the x axis?
This is being done in retrospect, so you can use the entire list of numbers to select the x axis point where the dropoff accelerates.
Python solutions are preferred, but pseudo-code or a general methodology is fine too.
OK, this ended up fitting my needs. I calculate a running mean and standard deviation, plus a cdf from a t distribution, to tell me how unlikely each successive value is.
This only works for decreases, since I am only checking for cdf < 0.05, but it works very well.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
numbers = np.array([100, 101, 99, 102, 99, 98, 100, 97.5, 98, 99, 95, 93, 90, 85, 80])
# Calculate a running mean
cum_mean = numbers.cumsum() / (np.arange(len(numbers)) + 1)
# Calculate a running standard deviation (the first couple of entries
# are nan/inf placeholders and are discarded below)
cum_std = np.array([numbers[:i].std() for i in range(len(numbers))])
# Calculate a z value for each number against the preceding running stats
cum_z = (numbers[1:] - cum_mean[:-1]) / cum_std[:-1]
# Pad with zeros to account for records without enough sample size
z_vals = np.concatenate((np.zeros(1+2), cum_z[2:]), axis=0)
# Calculate cdf
cum_t = np.array([stats.t.cdf(z, i) for i, z in enumerate(z_vals)])
# Identify first number to fall below threshold
first_deviation = np.where(cum_t < 0.05)[0].min()
fig, ax = plt.subplots()
# plot the numbers and the point immediately prior to the decrease
ax.plot(numbers)
ax.axvline(first_deviation-1, color='red')