I have an array that represents a delta function (each value is either 0 or 1). I use it to generate a step-function array by applying a forward-fill algorithm. The step array is the one I need for a certain operation.
This plot displays the delta and step arrays:
However, I need to increase the resolution of this array to perform the operation, and I cannot directly apply something like numpy.interp, which distorts the original functions.
Hence my question: what is an efficient (and Pythonic) way to increase the resolution of a step function?
This is an example script:
import matplotlib.pyplot as plt
import numpy as np
def forward_filling(arr):
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    idx = np.maximum.accumulate(idx)
    return arr[idx]
fig, axis = plt.subplots(1, 1)
x_array = np.arange(0, 15)
y_delta = np.zeros(len(x_array))
y_delta[3], y_delta[7], y_delta[13] = 1, 2, 3
step_function = forward_filling(y_delta)
axis.plot(x_array, y_delta, label='delta function', marker='o')
axis.plot(x_array, step_function, label='step function')
x_high_resolution = np.linspace(0, 15, 30)
delta_interpolated = np.interp(x_high_resolution, x_array, y_delta)
step_interpolated = np.interp(x_high_resolution, x_array, step_function)
axis.plot(x_high_resolution, delta_interpolated, label='delta function high resolution', marker='o')
axis.plot(x_high_resolution, step_interpolated, label='step function high resolution')
axis.legend()
axis.set_xlabel('x')
axis.set_ylabel('y')
plt.show()
Assuming you want to keep each y value constant in the neighbourhood of its original sample, you could replace each y value with, say, 3 copies of itself using a list comprehension:
step_function_hi_res = np.array([np.repeat(step,3) for step in step_function]).flatten()
and then make the changes in your x-values as you already did:
x_high_resolution = np.linspace(0, len(step_function),len(step_function)*3)
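A possible variation on the same idea, offered as a sketch rather than the answer's own method: np.repeat accepts the whole array, so the list comprehension is not strictly needed. The helper names upsample_step and hold_at are hypothetical, and the first one assumes the x values are uniformly spaced. Note that np.linspace(0, len(step_function), len(step_function)*3) includes the endpoint, so its spacing is not exactly one third of the original step; the sketch builds the fine grid from the original spacing instead. For an arbitrary target grid, np.searchsorted can pick the last sample at or before each new x.

```python
import numpy as np

def upsample_step(x, y, factor):
    """Repeat each sample `factor` times (zero-order hold).

    Assumes x is uniformly spaced; `factor` is a positive integer.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    y_hi = np.repeat(y, factor)
    dx = x[1] - x[0]  # uniform spacing assumed
    x_hi = x[0] + dx * np.arange(len(x) * factor) / factor
    return x_hi, y_hi

def hold_at(x, y, x_new):
    """Evaluate the step function at arbitrary points x_new.

    Assumes x is sorted in increasing order. Each new point takes the
    value of the last original sample at or before it.
    """
    idx = np.searchsorted(x, x_new, side="right") - 1
    idx = np.clip(idx, 0, len(x) - 1)
    return np.asarray(y)[idx]

x_array = np.arange(0, 15)
step_function = np.zeros(15)
step_function[3:7], step_function[7:13], step_function[13:] = 1, 2, 3

x_hi, step_hi = upsample_step(x_array, step_function, 3)
held = hold_at(x_array, step_function, np.array([2.9, 3.0, 6.99, 7.0, 14.5]))
```

Either result contains only values that already occur in the original step array, so no intermediate ramp values are introduced the way linear interpolation would.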
Related
I want to digitize (= average out over cells) photon count data into pixels given by a grid that tells how they are aligned. The photon count data is stored in a 2D array. I want to split that data into cells, each of which would correspond to a pixel. The idea is basically the same as changing an HD image to a smaller resolution. I'd like to achieve this in Python.
The digitizing function I've written:
import numpy as np
def digitize(function_data, grid_shape):
    """
    function_data = 2D array of function values of some 3D shape,
    eg.: exp(-(x^2 + y^2)) -> want to digitize this
    grid_shape: an array of length 2 which contains the dimensions of the smaller resolution
    """
    l = len(function_data)
    pixel_len_x = int(l/grid_shape[0])
    pixel_len_y = int(l/grid_shape[1])
    digitized_data = np.empty((grid_shape[0], grid_shape[1]))
    for i in range(grid_shape[0]): #row-index of pixel in smaller-resolution grid
        for j in range(grid_shape[1]): #column-index of pixel in smaller-resolution grid
            hd_pixel = []
            for k in range(pixel_len_y):
                hd_pixel.append(z_data[k][j:j*pixel_len_x])
            hd_pixel = np.ravel(hd_pixel) #turns 2D array into 1D to be able to compute average
            pixel_avg = np.average(hd_pixel)
            digitized_data[i][j] = pixel_avg
    return digitized_data
In theory, this function should do what I want to achieve, but when tested it doesn't yield the expected results. Either a completed version of my function or any other method that achieves my goal would be extremely helpful.
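For what it is worth, the posted loop refers to z_data, which is never defined, and the slice j:j*pixel_len_x does not select the block belonging to pixel (i, j). One common way to average over cells, sketched here under assumptions, is to reshape the array so that each block gets its own pair of axes and then take the mean over those axes. The function name block_average is hypothetical, and trimming rows and columns that do not fit a whole number of blocks is a choice made for this sketch, not something the question specifies.

```python
import numpy as np

def block_average(data, grid_shape):
    """Average a 2D array over equal-sized cells.

    grid_shape = (rows, cols) of the lower-resolution result.
    Assumption: leftover rows/columns that do not fill a whole
    block are trimmed off before averaging.
    """
    data = np.asarray(data, dtype=float)
    rows, cols = grid_shape
    block_r = data.shape[0] // rows  # high-res rows per pixel
    block_c = data.shape[1] // cols  # high-res columns per pixel
    trimmed = data[:rows * block_r, :cols * block_c]
    # Axes become (pixel row, row in block, pixel col, col in block)
    blocks = trimmed.reshape(rows, block_r, cols, block_c)
    return blocks.mean(axis=(1, 3))

demo = np.arange(16, dtype=float).reshape(4, 4)
small = block_average(demo, (2, 2))
# small is [[2.5, 4.5], [10.5, 12.5]]
```

The reshape avoids the three nested Python loops, which matters once the photon count array gets large.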
You could also use an interpolation function, if you can use SciPy. Here we use one of the gridded-data interpolating functions, RectBivariateSpline, to upsample your function; you can find numerous examples on this and other sites.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RectBivariateSpline as rbs
# Sampling coordinates
x = np.linspace(-2,2,20)
y = np.linspace(-2,2,30)
# Your function
f = np.exp(-(x[:,None]**2 + y**2))
# Interpolator
interp = rbs(x, y, f)
# Higher resolution coordinates
x_hd = np.linspace(x.min(), x.max(), x.size * 5)
y_hd = np.linspace(y.min(), y.max(), y.size * 5)
# New higher res function
f_hd = interp(x_hd, y_hd, grid = True)
# Some plots
fig, ax = plt.subplots(ncols = 2)
ax[0].imshow(f)
ax[1].imshow(f_hd)
I am trying to create a rectangular pulse for a ten-second signal, with a value of 1 between 1 and 4 seconds and 0 for the rest. I looked at other questions, but they only seemed to cover repeating pulses, while I just want a single pulse. I already tried the code below, but since I am very new to programming I cannot get it to work. I also saw the question "rectangular pulse train in python", but since it only gives the absolute values it does not work for me.
y=np.zeros(10)
def rect(x):
    x = np.linspace(0, 10, 100)
    if 1<=x<=4:
        y=1
    else:
        y=0
    return rect(x)
f1=rect(y)
plt.plot(y,f1)
There are two ways to do this: a long way that keeps your functional approach, and a short, vectorized way. I present both.
Long way: Call the function within a for loop and append the values of y to a list.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100)
def rect(i):
    if 1<=i<=4:
        y=1
    else:
        y=0
    return y
f1 = []
for i in x:
    f1.append(rect(i))
plt.plot(x,f1)
plt.show()
Short way: Create a conditional mask and apply it to your y-array to fill it with 0 and 1 depending on the condition.
x = np.linspace(0, 10, 100)
mask = (x>=1) & (x <=4)
y = np.where(mask, 1, 0)
plt.plot(x,y)
plt.show()
I want to change the x-axis scale. For example, I am reading data from a txt file.
The data looks like a=[1,2,5,9,12,17], and I want to rescale it to the range [0,3]. That is, a has 6 numbers, and I need to scale them into [0,3] so that my axis only spans c=[0,3]. I have other data b=[1,2,3,4,5,6]. I currently plot the data the normal way with plot(a,b), but I want to scale it, like plot(c,b). I don't know which function to use for that.
Another question: I used plt.axhline(y=0.005), but it draws a continuous line and I want a dashed line style instead. How can I draw the maximum and minimum thresholds as dashed lines?
Answer to the second question:
import matplotlib.pyplot as plt
plt.axhline(y=0.5, color='b', linestyle='--',linewidth=1)
plt.axhline(y=-0.5, color='b', linestyle='--',linewidth=1)
plt.show()
I solved my second question like this.
If NumPy is available, you can use the interp function to generate your scaled values:
import numpy as np
scaled_a = np.interp(a, (min(a), max(a)), c)
The scaled_a variable is a NumPy array that can be passed to matplotlib's plot function in place of the original a variable.
If NumPy is not available you'll have to do a bit of arithmetic to calculate the new values:
def scaler(x, old_min, old_max, new_min, new_max):
    old_diff = old_max - old_min
    new_diff = new_max - new_min
    return ((x - old_min) * (new_diff / old_diff)) + new_min
old_min = min(a)
old_max = max(a)
scaled_a = [scaler(x, old_min, old_max, c[0], c[1]) for x in a]
The variable scaled_a is now a python list, but it can still be passed to the plot function.
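As a small worked example with the numbers from the question, assuming c holds the two end points of the target range, c=[0,3]: np.interp maps min(a) to c[0] and max(a) to c[1], and places everything in between linearly. The vectorized arithmetic below is the same formula as scaler, written out only to check that both routes agree.

```python
import numpy as np

a = [1, 2, 5, 9, 12, 17]
c = [0, 3]  # assumed to be the target range (new_min, new_max)

# Route 1: linear interpolation between the two end points
scaled_np = np.interp(a, (min(a), max(a)), c)

# Route 2: the same min-max formula written out by hand
a_arr = np.asarray(a, dtype=float)
scaled_manual = (a_arr - a_arr.min()) / (a_arr.max() - a_arr.min()) * (c[1] - c[0]) + c[0]

print(scaled_np)  # [0.     0.1875 0.75   1.5    2.0625 3.    ]
```

Each value is (x - 1) * 3 / 16 here, because the old range is 17 - 1 = 16 wide and the new range is 3 wide.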
I have an array with probability values stored in it. Some values are 0. I need to plot a histogram such that there is an equal number of elements in each bin. I tried using matplotlib's hist function, but that only lets me decide the number of bins. How do I go about plotting this? (A normal plot and hist work, but they are not what is needed.)
I have 10000 entries. Only 200 have values greater than 0, and these lie between 0.0005 and 0.2. The distribution isn't even: only one element has the value 0.2, whereas approximately 2000 have the value 0.0005. So plotting it was an issue, as the bins had to be of unequal width with an equal number of elements.
The task does not make much sense to me, but the following code does what I understood to be the thing to do.
I also think the last lines of the code are what you really wanted to do: using different bin widths to improve the visualization (but without targeting an equal number of samples within each bin). I used astroML's hist with bins='blocks' (astropy supports this too).
Code
# Python 3 -> beware the // operator!
import numpy as np
import matplotlib.pyplot as plt
from astroML import plotting as amlp
N_VALUES = 1000
N_BINS = 100
# Create fake data
prob_array = np.random.randn(N_VALUES)
prob_array /= np.max(np.abs(prob_array),axis=0) # scale a bit
# Sort array
prob_array = np.sort(prob_array)
# Calculate bin-borders,
bin_borders = [np.amin(prob_array)] + [prob_array[(N_VALUES // N_BINS) * i] for i in range(1, N_BINS)] + [np.amax(prob_array)]
print('SAMPLES: ', prob_array)
print('BIN-BORDERS: ', bin_borders)
# Plot hist
counts, x, y = plt.hist(prob_array, bins=bin_borders)
plt.xlim(bin_borders[0], bin_borders[-1] + 1e-2)
print('COUNTS: ', counts)
plt.show()
# And this is, what i think, what you really want
fig, (ax1, ax2) = plt.subplots(2)
left_blob = np.random.randn(N_VALUES // 10) + 3  # integer division: randn needs an int size
right_blob = np.random.randn(N_VALUES) + 110
both = np.hstack((left_blob, right_blob)) # data is hard to visualize with equal bin-widths
ax1.hist(both)
amlp.hist(both, bins='blocks', ax=ax2)
plt.show()
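A shorter route to equal-count bin borders, sketched here as an alternative to indexing into the sorted array: np.quantile evaluated at evenly spaced probabilities returns edges that split the data into groups of (nearly) equal size. The variable names and the exponential test data are assumptions for illustration only. With many tied values, such as a large number of zeros, several edges can coincide and the counts will no longer be equal, so this works best on mostly distinct values.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.exponential(scale=0.02, size=1000)  # stand-in data, not the asker's
n_bins = 10

# Edges at the 0%, 10%, ..., 100% quantiles of the data
edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))

# Count how many samples fall into each bin
counts, _ = np.histogram(values, bins=edges)
print(counts)
```

The same edges can be passed to matplotlib as plt.hist(values, bins=edges); the bars then have unequal widths but roughly equal heights in terms of counts.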
Pretty much exactly what the question states, but a little context:
I'm creating a program to plot a large number of points (~10,000, but it will be more later on). This is being done using matplotlib's plt.scatter. This command is part of a loop that saves the figure, so I can later animate it.
What I want to be able to do is randomly select a small portion of these particles (say, maybe 100?) and give them a different marker than the rest, even though they're part of the same data set. This is so I can use them as placeholders to see the motion of individual particles, as well as the bulk material.
Is there a way to use a different marker for a small subset of the same data?
For reference, the particles are uniformly distributed just using the numpy random sampler, but my code for that is:
for i in range(N): # N number of particles
    particle_position[i] = np.random.uniform(0, xmax) # Initialize in spatial domain
    particle_velocity[i] = np.random.normal(0, 5) # Initialize in velocity space
for i in range(maxtime):
    plt.scatter(particle_position, particle_velocity, s=1, c=norm_xvel, cmap=br_disc, lw=0)
The position and velocity change on each iteration of the main loop (there's quite a bit of code), but these are the main initialization and plotting routines.
I had an idea that perhaps I could randomly select a bunch of i values from range(N), and use an ax.scatter() command to plot them on the same axes?
Here is a possible solution to have a subset of your points identified with a different marker:
import matplotlib.pyplot as plt
import numpy as np
SIZE = 100
SAMPLE_SIZE = 10
def select_subset(seq, size):
    """selects a subset of the data using ...
    """
    return seq[:size]
points_x = np.random.uniform(-1, 1, size=SIZE)
points_y = np.random.uniform(-1, 1, size=SIZE)
plt.scatter(points_x, points_y, marker=".", color="blue")
plt.scatter(select_subset(points_x, SAMPLE_SIZE),
            select_subset(points_y, SAMPLE_SIZE),
            marker="o", color="red")
plt.show()
It uses plt.scatter twice; once on the full data set, the other on the sample points.
You will have to decide how you want to select the sample of points; that choice is isolated in the select_subset function.
You could also extract the sample points from the data set to prevent marking them twice, but numpy is rather inefficient at deleting or resizing.
Maybe a better method is to use a mask? A mask has the advantage of leaving your original data intact and in order.
Here is a way to proceed with masks:
import matplotlib.pyplot as plt
import numpy as np
import random
SIZE = 100
SAMPLE_SIZE = 10
def make_mask(data_size, sample_size):
    mask = np.array([True] * sample_size + [False] * (data_size - sample_size))
    np.random.shuffle(mask)
    return mask
points_x = np.random.uniform(-1, 1, size=SIZE)
points_y = np.random.uniform(-1, 1, size=SIZE)
mask = make_mask(SIZE, SAMPLE_SIZE)
not_mask = np.invert(mask)
plt.scatter(points_x[not_mask], points_y[not_mask], marker=".", color="blue")
plt.scatter(points_x[mask], points_y[mask], marker="o", color="red")
plt.show()
As you see, scatter is called once on a subset of the data points (the ones not selected in the sample), and a second time on the sampled subset, and draws each subset with its own marker. It is efficient & leaves the original data intact.
The code below does what you want. I have selected a random set v_sub_index of N_sub indices in the correct range (0 to N) and draw those (with _sub suffix) from the larger samples particle_position and particle_velocity. Please note that you don't have to loop to generate random samples. Numpy has great functionality for that without having to use for loops.
import numpy as np
import matplotlib.pyplot as pl
N = 100
xmax = 1.
v_sigma = 2.5 / 2. # 95% of the samples contained within 0, 5
v_mean = 2.5 # mean at 2.5
N_sub = 10
v_sub_index = np.random.randint(0, N, N_sub)
particle_position = np.random.rand (N) * xmax
particle_velocity = np.random.randn(N) * v_sigma + v_mean  # apply the mean and sigma defined above
particle_position_sub = np.array(particle_position[v_sub_index])
particle_velocity_sub = np.array(particle_velocity[v_sub_index])
particle_position_nosub = np.delete(particle_position, v_sub_index)
particle_velocity_nosub = np.delete(particle_velocity, v_sub_index)
pl.scatter(particle_position_nosub, particle_velocity_nosub, color='b', marker='o')
pl.scatter(particle_position_sub , particle_velocity_sub , color='r', marker='^')
pl.show()
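One caveat about np.random.randint: it samples with replacement, so the same index can come up more than once and the highlighted subset may end up smaller than N_sub. A sketch that avoids this, assuming NumPy's Generator API is available, draws the indices with choice(..., replace=False) and turns them into a boolean mask. The seed and the variable names are arbitrary choices for illustration. If the same particles should be tracked across frames of the animation, the indices would be drawn once, before the time loop.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily
N, N_sub = 100, 10

# Draw N_sub distinct indices, once, so the same particles stay marked
sub_index = rng.choice(N, size=N_sub, replace=False)
mask = np.zeros(N, dtype=bool)
mask[sub_index] = True

position = rng.uniform(0, 1, N)
velocity = rng.normal(0, 5, N)

# Boolean indexing splits the data without modifying the originals
pos_sub, vel_sub = position[mask], velocity[mask]
pos_rest, vel_rest = position[~mask], velocity[~mask]
```

The two pairs can then go to two scatter calls with different markers, as in the answers above.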