Generation of a nested list with different ranges - python

Generation of a list of many lists each with different ranges
Isc_act = [0.1, 0.2, 0.3]
I_cel = []
a = []
for i in range(0,len(Isc_act)):
    a = np.arange(0, Isc_act[i], 0.1*Isc_act[i])
    I_cel[i].append(a)
print(I_cel)
Output is:
IndexError: list index out of range
My code raises an error, but I want to get I_cel = [[0, 0.01, ..., 0.1], [0, 0.02, 0.04, ..., 0.2], [0, 0.03, 0.06, ..., 0.3]]. That is, the nested list I_cel should hold three lists of 10 values each.

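The simplest fix to your code, and probably what you intended: I_cel starts out empty, so I_cel[i] does not exist yet and raises the IndexError. Append to I_cel itself instead:
import numpy as np
Isc_act = [0.1, 0.2, 0.3]
I_cel = []
for i in range(len(Isc_act)):
    a = np.arange(0, Isc_act[i], 0.1*Isc_act[i])
    I_cel.append(a)
print(I_cel)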
Note that the endpoint will be one step short of what you wanted! For row zero, for example, you have to pick two of the following:
Steps of size 0.01
Start point 0.0 and end point 0.1
10 elements total
You cannot have all three.
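The trade-off can be checked directly. The sketch below uses row zero (end value 0.1); the variable names are illustrative and do not come from the question.

```python
import numpy as np

stop = 0.1

# Step 0.01 and 10 elements: the last value is 0.09, not 0.1.
ten_steps = np.arange(10) * (stop / 10)

# Step 0.01 and both endpoints: this needs 11 elements.
both_ends = np.linspace(0, stop, 11)

# Both endpoints and 10 elements: the step becomes 0.1/9, not 0.01.
ten_points = np.linspace(0, stop, 10)

print(len(ten_steps), ten_steps[-1])
print(len(both_ends), both_ends[-1])
print(len(ten_points), ten_points[1] - ten_points[0])
```

Building the first array from an integer arange avoids the float-step rounding that can change the element count of np.arange.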
More numpythonic approach:
>>> Isc_act = [0.1, 0.2, 0.3]
>>> (np.linspace(0, 1, 11).reshape(11,1) @ [Isc_act]).T
array([[0. , 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ],
[0. , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16, 0.18, 0.2 ],
[0. , 0.03, 0.06, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24, 0.27, 0.3 ]])

linspace gives better control of the end point when dealing with floats:
In [84]: [np.linspace(0,x,11) for x in [.1,.2,.3]]
Out[84]:
[array([0. , 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ]),
array([0. , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16, 0.18, 0.2 ]),
array([0. , 0.03, 0.06, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24, 0.27, 0.3 ])]
Or we could scale just one array (arange with integers is predictable):
In [86]: np.array([.1,.2,.3])[:,None]*np.arange(0,11)
Out[86]:
array([[0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ],
[0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. ],
[0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3. ]])
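Note that the integer steps 0..10 in the last example produce rows that run up to 1, 2 and 3. To land on the ranges asked for in the question, the steps can be turned into fractions first. A minimal sketch, assuming 10 intervals per row (n_steps is a name introduced here for illustration):

```python
import numpy as np

Isc_act = np.array([0.1, 0.2, 0.3])
n_steps = 10  # assumed number of intervals per row

# Column of end values times a row of fractions 0, 1/10, ..., 1.
I_cel = Isc_act[:, None] * (np.arange(n_steps + 1) / n_steps)

# Same values as calling linspace once per end value.
expected = np.array([np.linspace(0, x, n_steps + 1) for x in Isc_act])
print(np.allclose(I_cel, expected))
```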


How to reorder one set of data points to minimize error with another set of data points

I have the following set of 15 data points:
[0.287 , 0.0691, 0.856, 0.731, 0.895, 0.76, 0.496, 0.749, 0.77, 0.684, 0.667, 0.386, 0.4, 0.334, 0.346]
And I would like the order of these data points to be changed so to minimize the error with the following set of 15 data points:
[0.1, 0.3, 0.5, 0.7, 0.9, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.1]
I could just try all permutations of the first set of data points and see which one gives the smallest error but that would take forever...
I'm assuming that by error you mean the summed absolute difference. It is not difficult to check that this error is minimized when a and b have the same rank order. The best reordering of a can thus be obtained using argsort:
>>> import numpy as np
>>> a = np.array([0.287 , 0.0691, 0.856 , 0.731 , 0.895 , 0.76 , 0.496 , 0.749 , 0.77 , 0.684 , 0.667 , 0.386 , 0.4 , 0.334 , 0.346 ])
>>> b = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.1])
>>>
>>> best_shuffle = np.empty(a.size,int)
>>> best_shuffle[b.argsort(kind="stable")] = a.argsort(kind="stable")
>>>
>>> np.abs(b-a[best_shuffle]).sum()
1.3499000000000005
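The rank-order claim can be sanity-checked by brute force on a short prefix of the data, where trying every permutation is still cheap. This is a sketch in plain Python; rank_match and l1_error are hypothetical helper names, not part of NumPy.

```python
from itertools import permutations

def rank_match(a, b):
    """Reorder a so that its k-th smallest value sits where b's k-th smallest is."""
    a_order = sorted(range(len(a)), key=lambda i: a[i])
    b_order = sorted(range(len(b)), key=lambda i: b[i])
    out = [None] * len(a)
    for ai, bi in zip(a_order, b_order):
        out[bi] = a[ai]
    return out

def l1_error(u, v):
    """Summed absolute difference between two equal-length sequences."""
    return sum(abs(p - q) for p, q in zip(u, v))

# First six points of each set from the question (6! = 720 permutations).
a = [0.287, 0.0691, 0.856, 0.731, 0.895, 0.76]
b = [0.1, 0.3, 0.5, 0.7, 0.9, 0.9]

matched = rank_match(a, b)
brute_force = min(l1_error(p, b) for p in permutations(a))
print(l1_error(matched, b), brute_force)
```

The two printed errors should agree up to floating-point rounding, while the sort-based version costs O(n log n) instead of O(n!).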

Non-uniform axis in matplotlib histogram

I would like to plot a histogram with a non-uniform x-axis using Matplotlib.
For example, consider the following histogram:
import matplotlib.pyplot as plt
values = [0.68, 0.28, 0.31, 0.5, 0.25, 0.5, 0.002, 0.13, 0.002, 0.2, 0.3, 0.45,
0.56, 0.53, 0.001, 0.44, 0.008, 0.26, 0., 0.37, 0.03, 0.002, 0.19, 0.18,
0.04, 0.31, 0.006, 0.6, 0.19, 0.3, 0., 0.46, 0.2, 0.004, 0.06, 0.]
plt.hist(values)
plt.show()
The first bin has high density, so I would like to zoom in there.
Ideally, I would like to change the values in the x-axis to something like [0, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1], keeping the bin widths constant within the graph (but not numerically, of course). Is there an easy way to achieve this?
Any comments or suggestions are welcome.
Passing explicit bins will solve the problem. The bins argument lists the bin edges, and each value is counted in the interval it falls into; for example, 0.28 lands in the bin from 0.2 to 0.5. The code below provides an example of using bins:
import matplotlib.pyplot as plt
values = [0.68, 0.28, 0.31, 0.5, 0.25, 0.5, 0.002, 0.13, 0.002, 0.2, 0.3, 0.45,
0.56, 0.53, 0.001, 0.44, 0.008, 0.26, 0., 0.37, 0.03, 0.002, 0.19, 0.18,
0.04, 0.31, 0.006, 0.6, 0.19, 0.3, 0., 0.46, 0.2, 0.004, 0.06, 0.]
plt.hist(values, bins=[0, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1])
plt.show()
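To see which bin each value ends up in without drawing anything, the same edges can be passed to np.histogram, which plt.hist uses for the counting. A small sketch (the name edges is introduced here for illustration):

```python
import numpy as np

values = [0.68, 0.28, 0.31, 0.5, 0.25, 0.5, 0.002, 0.13, 0.002, 0.2, 0.3, 0.45,
          0.56, 0.53, 0.001, 0.44, 0.008, 0.26, 0., 0.37, 0.03, 0.002, 0.19, 0.18,
          0.04, 0.31, 0.006, 0.6, 0.19, 0.3, 0., 0.46, 0.2, 0.004, 0.06, 0.]
edges = [0, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1]

# counts[i] is the number of values in [edges[i], edges[i+1]);
# only the last bin also includes its right edge.
counts, _ = np.histogram(values, bins=edges)

# np.digitize returns i such that edges[i-1] <= x < edges[i].
i = int(np.digitize(0.28, edges))
print(counts)
print(edges[i - 1], edges[i])
```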
To plot it in a more suitable way, it can be handy to switch to a logarithmic scale. Note that passing log=True:
plt.hist(values, bins=[0, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1], log=True)
only makes the y axis logarithmic. Adding the following line to your code will make the x axis of your histogram logarithmic:
plt.xscale('log')
The solution from André is nice, but the bin widths are not constant. Working with a log2 x-axis suits what I was looking for. I use np.logspace to make the bin widths constant in the graph.
That's what I ended up doing:
import numpy as np
import matplotlib.pyplot as plt
values = [0.68, 0.28, 0.31, 0.5, 0.25, 0.5, 0.002, 0.13, 0.002, 0.2, 0.3, 0.45,
0.56, 0.53, 0.001, 0.44, 0.008, 0.26, 0., 0.37, 0.03, 0.002, 0.19, 0.18,
0.04, 0.31, 0.006, 0.6, 0.19, 0.3, 0., 0.46, 0.2, 0.004, 0.06, 0.]
bins = np.logspace(-10, 1, 20, base=2)
bins[0]=0
fig, ax = plt.subplots()
plt.hist(values, bins=bins)
ax.set_xscale('log', base=2)  # base=2 on Matplotlib 3.3+; older releases spelled it basex=2
ax.set_xlim(2**-10, 1)
plt.show()
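Why np.logspace gives constant bar widths on a log2 axis can be checked numerically: the base-2 logarithms of the edges are evenly spaced. A short sketch, evaluated on the edges before the first one is overwritten with 0:

```python
import numpy as np

bins = np.logspace(-10, 1, 20, base=2)  # 20 edges from 2**-10 to 2**1

# Equal spacing of log2(edges) means equal bar widths on a log2 axis.
log_widths = np.diff(np.log2(bins))
print(np.allclose(log_widths, log_widths[0]))
```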

how to add value within certain intervals only python

I have a dataframe in which the values of one column range from -1 to 1. I want to add 0.1 to all values between -1 and 0.6 only. Is it possible to do this?
Suppose a is my list:
a = ([-1. , -0.5, 0.1 , 0.2, 0.45, 0.7, 0.64, 1])
and I want this:
([-0.9, -0.4, 0.2, 0.3, 0.55, 0.7, 0.74, 1])
Yes, it's possible:
a = [-1. , -0.5, 0.1 , 0.2, 0.45, 0.7, 0.64, 1]
a = [x + 0.1 if -1 <= x <= 0.6 else x for x in a]
print([round(x, 2) for x in a])  # round away float noise such as 0.30000000000000004
Results:
[-0.9, -0.4, 0.2, 0.3, 0.55, 0.7, 0.64, 1]
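Since the question mentions a dataframe column, a vectorized variant may be more convenient than a list comprehension. The sketch below uses a NumPy array and a boolean mask; the same mask idea should carry over to a pandas column, though that is not shown here.

```python
import numpy as np

a = np.array([-1.0, -0.5, 0.1, 0.2, 0.45, 0.7, 0.64, 1.0])

# Boolean mask of the entries inside the closed interval [-1, 0.6].
mask = (a >= -1) & (a <= 0.6)

# Add 0.1 only where the mask is True; other entries stay unchanged.
shifted = np.where(mask, a + 0.1, a)
print(shifted.round(2))
```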

Numpy advanced indexing usage

Case 1 (solved): Array A has shape (say) (300,50). Array B is an index array with shape (300,5), such that B[i,j] gives, for row i, the index of another row to place next to row i. The end result is an array C with shape (300,5,50), such that C[i,j,:] = A[B[i,j],:]. This can be done by calling A[B,:].
Here is a small example script for case 1:
import numpy as np
## A is the data array
A = np.arange(20).reshape((5,4))
## B indicates for each row which rows to pull together
B = np.array([[0,2],[1,2],[2,0],[3,4],[4,1]])
A[B,:] #The desired result
Case 2 (unsolved): Same problem, only now A has shape (100,300,50). If B is the index array with shape (100,300,5), the end result should be an array C with shape (100,300,5,50) such that C[i,j,k,:] = A[i,B[i,j,k],:]. A[B,:] doesn't work anymore, because it results in an array of shape (100,300,5,300,50), due to broadcasting.
How should I approach this with indexing?
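One approach would be to reshape A to 2D, keeping the number of columns intact, then index into the first axis with the flattened B indices, and finally reshape back to the desired shape. For case 2, the indices in B are relative to each block A[i], so block i needs an offset of i*A.shape[1] before flattening.
Thus, the implementation would be -
idx = B + A.shape[1]*np.arange(A.shape[0])[:,None,None]
A.reshape(-1,A.shape[-1])[idx.ravel()].reshape(100,300,5,50)
The reshapes are merely views into the arrays, so they should be quite efficient; only the indexing step copies data.
With B.ravel() in place of idx.ravel() (no offset needed), the same expression handles case 1. Here's a sample run for case #1 -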
1) Inputs :
In [667]: A = np.random.rand(3,4)
...: B = np.random.randint(0,3,(3,5))
...:
2) Original method :
In [668]: A[B,:]
Out[668]:
array([[[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.1 , 0.91, 0.1 , 0.98]],
[[ 0.45, 0.16, 0.02, 0.02],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.45, 0.16, 0.02, 0.02]],
[[ 0.48, 0.6 , 0.96, 0.21],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.45, 0.16, 0.02, 0.02]]])
3) Proposed method :
In [669]: A.reshape(-1,A.shape[-1])[B.ravel()].reshape(3,5,4)
Out[669]:
array([[[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.1 , 0.91, 0.1 , 0.98]],
[[ 0.45, 0.16, 0.02, 0.02],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.1 , 0.91, 0.1 , 0.98],
[ 0.45, 0.16, 0.02, 0.02]],
[[ 0.48, 0.6 , 0.96, 0.21],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.48, 0.6 , 0.96, 0.21],
[ 0.45, 0.16, 0.02, 0.02],
[ 0.45, 0.16, 0.02, 0.02]]])
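For case 2, another way to express C[i,j,k,:] = A[i,B[i,j,k],:] is to pair B with a broadcastable array of block indices. The sketch below uses small stand-in sizes instead of (100,300,5,50) and checks the result against explicit loops; the names rows and ref are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, d = 4, 6, 3, 5          # small stand-ins for 100, 300, 5, 50
A = rng.random((n, m, d))
B = rng.integers(0, m, size=(n, m, k))

# rows has shape (n, 1, 1) and broadcasts against B's shape (n, m, k),
# so every B[i, j, l] is paired with its own block index i.
rows = np.arange(n)[:, None, None]
C = A[rows, B]                   # shape (n, m, k, d)

# Reference result from explicit loops.
ref = np.empty((n, m, k, d))
for i in range(n):
    for j in range(m):
        for l in range(k):
            ref[i, j, l] = A[i, B[i, j, l]]

print(C.shape, np.array_equal(C, ref))
```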

How to compare two arrays and find the optimal match in Python?

I have two arrays X and Y; X is the base array and Y is modified in a loop. As the loop runs, I want to compare the arrays to find the point where Y is closest to X. As an example, I have attached reproducible code:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
x = np.array([[0.12, 0.11, 0.1, 0.09, 0.08],
[0.13, 0.12, 0.11, 0.1, 0.09],
[0.15, 0.14, 0.12, 0.11, 0.1],
[0.17, 0.15, 0.14, 0.12, 0.11],
[0.19, 0.17, 0.16, 0.14, 0.12],
[0.22, 0.19, 0.17, 0.15, 0.13],
[0.24, 0.22, 0.19, 0.16, 0.14],
[0.27, 0.24, 0.21, 0.18, 0.15],
[0.29, 0.26, 0.22, 0.19, 0.16]])
y = np.array([[0.07, 0.06, 0.05, 0.04, 0.03],
[0.08, 0.07, 0.06, 0.05, 0.04],
[0.10, 0.09, 0.07, 0.06, 0.05],
[0.14, 0.12, 0.11, 0.09, 0.08],
[0.16, 0.14, 0.13, 0.11, 0.09],
[0.19, 0.16, 0.14, 0.12, 0.10],
[0.22, 0.20, 0.17, 0.14, 0.12],
[0.25, 0.22, 0.19, 0.16, 0.13],
[0.27, 0.24, 0.20, 0.17, 0.14]])
for i in range(100):
    y = y + (i / 10000)
I want to break the loop when the closest values have been found. By closest I mean, the values should be within ±10% of the original values or some other percentage. How can this be done in Python?
You can compute the Euclidean distance between the two matrices:
import numpy as np
import scipy.spatial.distance
import matplotlib.pyplot as plt
x = np.array([[0.12, 0.11, 0.1, 0.09, 0.08],
[0.13, 0.12, 0.11, 0.1, 0.09],
[0.15, 0.14, 0.12, 0.11, 0.1],
[0.17, 0.15, 0.14, 0.12, 0.11],
[0.19, 0.17, 0.16, 0.14, 0.12],
[0.22, 0.19, 0.17, 0.15, 0.13],
[0.24, 0.22, 0.19, 0.16, 0.14],
[0.27, 0.24, 0.21, 0.18, 0.15],
[0.29, 0.26, 0.22, 0.19, 0.16]])
y = np.array([[0.07, 0.06, 0.05, 0.04, 0.03],
[0.08, 0.07, 0.06, 0.05, 0.04],
[0.10, 0.09, 0.07, 0.06, 0.05],
[0.14, 0.12, 0.11, 0.09, 0.08],
[0.16, 0.14, 0.13, 0.11, 0.09],
[0.19, 0.16, 0.14, 0.12, 0.10],
[0.22, 0.20, 0.17, 0.14, 0.12],
[0.25, 0.22, 0.19, 0.16, 0.13],
[0.27, 0.24, 0.20, 0.17, 0.14]])
dists = []
for i in range(100):
    y = y + (i / 10000.)
    dists.append(scipy.spatial.distance.euclidean(x.flatten(), y.flatten()))
plt.plot(dists)
This plots the evolution of the Euclidean distance between your 2 matrices.
To break the loop at the minimum, you can use:
dist = np.inf
for i in range(100):
    y = y + (i / 10000.)
    d = scipy.spatial.distance.euclidean(x.flatten(), y.flatten())
    if d < dist:
        dist = d
    else:
        break  # d started growing again, so y is now one step past the minimum
print(dist)
# about 0.0838525 (the minimal distance)
print(y)
#[[ 0.1051 0.0951 0.0851 0.0751 0.0651]
#[ 0.1151 0.1051 0.0951 0.0851 0.0751]
#[ 0.1351 0.1251 0.1051 0.0951 0.0851]
#[ 0.1751 0.1551 0.1451 0.1251 0.1151]
#[ 0.1951 0.1751 0.1651 0.1451 0.1251]
#[ 0.2251 0.1951 0.1751 0.1551 0.1351]
#[ 0.2551 0.2351 0.2051 0.1751 0.1551]
#[ 0.2851 0.2551 0.2251 0.1951 0.1651]
#[ 0.3051 0.2751 0.2351 0.2051 0.1751]]
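The question also asks for a stop condition based on a percentage band rather than a distance minimum. One way to sketch that is an element-wise relative check. The helper first_match, its parameters and the small arrays below are made up for illustration; with the arrays from the question, a uniform shift may never bring every element within 10% at once.

```python
import numpy as np

def first_match(x, y, step, rel=0.10, max_steps=100):
    """Hypothetical helper: add `step` to y until every element of y
    lies within rel * |x| of x. Returns (number of steps, y), or
    (None, y) if no match is found within max_steps."""
    for n in range(max_steps + 1):
        if np.all(np.abs(y - x) <= rel * np.abs(x)):
            return n, y
        y = y + step
    return None, y

x = np.array([1.0, 2.0, 4.0])
y = x - 0.5                      # start half a unit below the base array
steps, y_matched = first_match(x, y, step=0.03)
print(steps, y_matched)
```

The tolerance is relative to each element of x, so the smallest entries set the tightest bound.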
