I have the following list of np.array:
dataset = [np.random.normal(r_mean/(p*t), r_vol/t/np.sqrt(p), n) \
for t in rule]
I want to transform it into a 2D np.array (i.e. a matrix). I could use np.asarray, but (I believe) it would be inefficient.
Also, each np.random.normal(r_mean/(p*t), r_vol/t/np.sqrt(p), n) is meant to be a column of the resulting matrix, not a row (i.e. I'd have to transpose np.asarray(dataset)).
What is the best way of achieving this result?
You can use broadcasting to create dataset with a single call to numpy.random.normal. Instead of using a list comprehension, make rule a numpy array and use it where you have t in your expression, and request a sample with size (n, len(rule)):
In [66]: r_mean = 1.0
In [67]: r_vol = 3.0
In [68]: p = 2.0
In [69]: rule = np.array([1.0, 100.0, 10000.0])
In [70]: n = 8
In [71]: dataset = np.random.normal(r_mean/(p*rule), r_vol/rule/np.sqrt(p), size=(n, len(rule)))
In [72]: dataset
Out[72]:
array([[ 7.44295301e-01, -1.57786106e-03, -1.85518458e-04],
[ -2.37293991e+00, -2.27875859e-02, 3.38182239e-04],
[ 2.01362974e+00, 5.93566418e-02, -3.00178175e-04],
[ 2.52533022e+00, 8.15380813e-03, 1.82511343e-04],
[ 7.32980563e-01, 2.67511372e-02, -1.95965258e-04],
[ 2.91958598e+00, -1.36314059e-02, 2.45200175e-04],
[ -4.43329724e+00, -5.85052629e-02, -1.75796458e-04],
[ -2.45005431e-01, -1.68543495e-02, 1.69715542e-04]])
If you are unsure that the columns correctly match the parameters, we can test a large sample:
In [73]: n = 100000
Create mu and std so we can see the requested means and standard deviations:
In [74]: mu = r_mean/(p*rule)
In [75]: std = r_vol/rule/np.sqrt(p)
Generate the data:
In [76]: dataset = np.random.normal(mu, std, size=(n, len(rule)))
Here's the mu that we requested:
In [77]: mu
Out[77]: array([ 5.00000000e-01, 5.00000000e-03, 5.00000000e-05])
And here's what we got in the sample:
In [78]: dataset.mean(axis=0)
Out[78]: array([ 4.95672937e-01, 5.08624034e-03, 5.02922664e-05])
Here are the desired standard deviations:
In [79]: std
Out[79]: array([ 2.12132034e+00, 2.12132034e-02, 2.12132034e-04])
And here's what we got:
In [80]: dataset.std(axis=0)
Out[80]: array([ 2.11258192e+00, 2.12437161e-02, 2.11784163e-04])
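The check above can also be written as a standalone script. This is only a sketch: the seeded np.random.default_rng generator and the standardised array z are assumptions added for reproducibility and are not part of the original session.

```python
import numpy as np

# Parameter values copied from the session above.
r_mean, r_vol, p = 1.0, 3.0, 2.0
rule = np.array([1.0, 100.0, 10000.0])
n = 100_000

# Assumption: a seeded Generator is used here only for reproducibility.
rng = np.random.default_rng(0)

mu = r_mean / (p * rule)          # requested means, shape (3,)
std = r_vol / rule / np.sqrt(p)   # requested standard deviations, shape (3,)

# mu and std broadcast against the trailing axis of size=(n, len(rule)),
# so column j is drawn from Normal(mu[j], std[j]).
dataset = rng.normal(mu, std, size=(n, len(rule)))

# Standardise each column; every column should then have mean near 0
# and standard deviation near 1.
z = (dataset - mu) / std
print(dataset.shape)
print(z.mean(axis=0), z.std(axis=0))
```

Standardising makes the three columns directly comparable even though their scales differ by four orders of magnitude.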
ds = np.empty((dataset[0].size, len(dataset)), dtype=dataset[0].dtype)
for i in range(ds.shape[1]):
    ds[:, i] = dataset[i]
but only do that if you must precompute the dataset list first.
Else use a generator:
ds = np.empty((n, len(rule)))
dataset = (np.random.normal(r_mean/(p*t), r_vol/t/np.sqrt(p), n) for t in rule)
for i, d in enumerate(dataset):
    ds[:, i] = d
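If the list of 1-D arrays already exists, np.stack with axis=1 is another way to place each array in its own column. This is a sketch using small parameter values chosen for illustration; it is an alternative technique, not the method of the answer above.

```python
import numpy as np

r_mean, r_vol, p, n = 1.0, 3.0, 2.0, 8
rule = [1.0, 100.0, 10000.0]

# The list of 1-D arrays from the question, one array per value of t.
dataset = [np.random.normal(r_mean / (p * t), r_vol / t / np.sqrt(p), n)
           for t in rule]

# Join the 1-D arrays along a new second axis: each one becomes a column.
ds = np.stack(dataset, axis=1)

print(ds.shape)  # (8, 3)
```

The result holds the same values as np.asarray(dataset).T, but np.stack returns a new contiguous array rather than a transposed view.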
If I have a MxN numpy array denoted arr, I wish to index over all elements and adjust the values like so
for m in range(arr.shape[0]):
    for n in range(arr.shape[1]):
        arr[m, n] += x**2 * np.cos(m) * np.sin(n)
Where x is a random float.
Is there a way to broadcast this over the entire array without needing to loop? Thus, speeding up the run time.
You are just adding zeros, because sin(2*pi*k) = 0 for integer k.
However, if you want to vectorize this, the function np.meshgrid could help you.
Check the following example, where I removed the 2 pi from the trigonometric functions so that the added values are nonzero.
x = 2
arr = np.arange(12, dtype=float).reshape(4, 3)
n, m = np.meshgrid(np.arange(arr.shape[1]), np.arange(arr.shape[0]), sparse=True)
arr += x**2 * np.cos(m) * np.sin(n)
arr
Edit: use the sparse argument to reduce memory consumption.
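To confirm that the vectorised form matches the double loop from the question, a small comparison can be run. This is a sketch; the names base, looped, vectorised, rows and cols are chosen here for illustration.

```python
import numpy as np

x = 2.0
base = np.arange(12, dtype=float).reshape(4, 3)

# Reference: the explicit double loop from the question.
looped = base.copy()
for m in range(looped.shape[0]):
    for n in range(looped.shape[1]):
        looped[m, n] += x**2 * np.cos(m) * np.sin(n)

# Vectorised: with sparse=True, cols has shape (1, 3) and rows has shape (4, 1),
# so their product broadcasts to (4, 3).
cols, rows = np.meshgrid(np.arange(base.shape[1]),
                         np.arange(base.shape[0]), sparse=True)
vectorised = base + x**2 * np.cos(rows) * np.sin(cols)

print(np.allclose(looped, vectorised))
```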
You can use a nested list comprehension to build the two-dimensional array as a list of lists:
import numpy as np
from random import random
x = random()
n, m = 10,20
arr = [[x**2 * np.cos(2*np.pi*j) * np.sin(2*np.pi*i) for j in range(m)] for i in range(n)]
In [156]: arr = np.ones((2, 3))
Replace the range with arange:
In [157]: m, n = np.arange(arr.shape[0]), np.arange(arr.shape[1])
And change the first array to (2,1) shape. A (2,1) array broadcasts with a (3,) to produce a (2,3) result.
In [158]: A = 0.23**2 * np.cos(m[:, None]) * np.sin(n)
In [159]: A
Out[159]:
array([[0. , 0.04451382, 0.04810183],
[0. , 0.02405092, 0.02598953]])
In [160]: arr + A
Out[160]:
array([[1. , 1.04451382, 1.04810183],
[1. , 1.02405092, 1.02598953]])
The meshgrid suggested in the accepted answer does the same thing:
In [161]: np.meshgrid(m, n, sparse=True, indexing="ij")
Out[161]:
[array([[0],
[1]]),
array([[0, 1, 2]])]
This broadcasting may be clearer with:
In [162]: m, n
Out[162]: (array([0, 1]), array([0, 1, 2]))
In [163]: m[:, None] * 10 + n
Out[163]:
array([[ 0, 1, 2],
[10, 11, 12]])
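The resulting shape can be checked without computing anything, and np.ogrid gives the same open grid in one step. This is a sketch; np.ogrid is not used in the answer above and the name table is illustrative.

```python
import numpy as np

m = np.arange(2)
n = np.arange(3)

# np.broadcast reports the shape the combined operands would produce.
print(np.broadcast(m[:, None], n).shape)   # (2, 3)

# np.ogrid builds an open grid: rows has shape (2, 1), cols has shape (1, 3).
rows, cols = np.ogrid[:2, :3]
print(rows.shape, cols.shape)              # (2, 1) (1, 3)

# Same table as m[:, None] * 10 + n.
table = rows * 10 + cols
print(table)
```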
I am trying to create permutations of size 4 from a group of real numbers. After that, I'd like to know the position of the first element in a permutation after I sort it. Here is what I have tried so far. What's the best way to do this?
import numpy as np
from itertools import chain, permutations
N_PLAYERS = 4
N_STATES = 60
np.random.seed(0)
state_space = np.linspace(0.0, 1.0, num=N_STATES, retstep=True)[0].tolist()
perms = permutations(state_space, N_PLAYERS)
perms_arr = np.fromiter(chain(*perms),dtype=np.float16)
def loc(row):
    return np.where(np.argsort(row) == 0)[0].tolist()[0]
locs = np.apply_along_axis(loc, 0, perms)
In [153]: N_PLAYERS = 4
...: N_STATES = 60
...: np.random.seed(0)
...: state_space = np.linspace(0.0, 1.0, num=N_STATES, retstep=True)[0].tolist()
...: perms = itertools.permutations(state_space, N_PLAYERS)
In [154]: alist = list(perms)
In [155]: len(alist)
Out[155]: 11703240
Simply making a list from the permutations produces a list of tuples, each of length N_PLAYERS.
Making an array from that with chain flattens it:
In [156]: perms = itertools.permutations(state_space, N_PLAYERS)
In [158]: perms_arr = np.fromiter(itertools.chain(*perms),dtype=np.float16)
In [159]: perms_arr.shape
Out[159]: (46812960,)
In [160]: alist[0]
The flat perms_arr could be reshaped to (11703240,4).
Using apply on that 1d array doesn't work (or make sense):
In [170]: perms_arr.shape
Out[170]: (46812960,)
In [171]: locs = np.apply_along_axis(loc, 0, perms_arr)
In [172]: locs.shape
Out[172]: ()
Reshape to 4 columns:
In [173]: locs = np.apply_along_axis(loc, 0, perms_arr.reshape(-1,4))
In [174]: locs.shape
Out[174]: (4,)
In [175]: locs
Out[175]: array([ 0, 195054, 578037, 769366])
This applies loc to each column, returning one value for each. But loc has a row variable. Is that supposed to be significant?
I could switch the axis; this takes much longer:
In [176]: locs = np.apply_along_axis(loc, 1, perms_arr.reshape(-1,4))
In [177]: locs.shape
Out[177]: (11703240,)
list comprehension
This iteration does the same thing as your apply_along_axis, and I expect is faster (though I haven't timed it - it's too slow).
In [188]: locs1 = np.array([loc(row) for row in perms_arr.reshape(-1,4)])
In [189]: np.allclose(locs, locs1)
Out[189]: True
whole array sort
But argsort takes an axis, so I can sort all rows at once (instead of iterating):
In [185]: np.nonzero(np.argsort(perms_arr.reshape(-1,4), axis=1)==0)
Out[185]:
(array([ 0, 1, 2, ..., 11703237, 11703238, 11703239]),
array([0, 0, 0, ..., 3, 3, 3]))
In [186]: np.allclose(_[1],locs)
Out[186]: True
Or going the other direction (compare with Out[175]):
In [187]: np.nonzero(np.argsort(perms_arr.reshape(-1,4), axis=0)==0)
Out[187]: (array([ 0, 195054, 578037, 769366]), array([0, 1, 2, 3]))
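Because the values within each permutation are distinct, the sorted position of the first element equals the number of entries in the row that are smaller than it. The sketch below compares that count with the whole-array argsort; it assumes a much smaller N_STATES than the question so it runs quickly, and the counting approach is an alternative, not the method shown above.

```python
import numpy as np
from itertools import permutations

N_PLAYERS = 4
N_STATES = 6  # assumption: far smaller than the question's 60, to keep this quick

state_space = np.linspace(0.0, 1.0, num=N_STATES)

# One row per permutation; the values in each row are distinct.
perms_arr = np.array(list(permutations(state_space, N_PLAYERS)))

# Sorted position of the first element via a whole-array argsort.
# Each row contains exactly one True, so the column indices line up with the rows.
locs_argsort = np.nonzero(np.argsort(perms_arr, axis=1) == 0)[1]

# Alternative: count how many entries in each row are smaller than the first.
locs_count = (perms_arr < perms_arr[:, :1]).sum(axis=1)

print(perms_arr.shape)
print(np.array_equal(locs_argsort, locs_count))
```

The counting version avoids sorting altogether, which may matter for the full 11703240-row array, though it has not been timed here.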
I have this list of solutions from Sympy solver:
In [49]: sol
Out[49]:
[-1.20258344291917 - 0.e-23*I,
-0.835217129314554 + 0.e-23*I,
0.497800572233726 - 0.e-21*I]
In [50]: type(sol)
Out[50]: list
In [51]: type(sol[0])
Out[51]: sympy.core.add.Add
How can I convert this list to a numpy array whose elements are ordinary complex values?
You can call the builtin function complex on each element, and then pass the result to np.array().
For example,
In [22]: z
Out[22]: [1 + 3.5*I, 2 + 3*I, 4 - 5*I]
In [23]: type(z)
Out[23]: list
In [24]: [type(item) for item in z]
Out[24]: [sympy.core.add.Add, sympy.core.add.Add, sympy.core.add.Add]
Use a list comprehension and the builtin function complex to create a list of python complex values:
In [25]: [complex(item) for item in z]
Out[25]: [(1+3.5j), (2+3j), (4-5j)]
Use that same expression as the argument to numpy.array to create a complex numpy array:
In [26]: import numpy as np
In [27]: np.array([complex(item) for item in z])
Out[27]: array([ 1.+3.5j, 2.+3.j , 4.-5.j ])
Alternatively, you can use numpy.fromiter:
In [29]: np.fromiter(z, dtype=complex)
Out[29]: array([ 1.+3.5j, 2.+3.j , 4.-5.j ])
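The solutions in the question carry negligible imaginary parts. Once they are in a complex numpy array, np.real_if_close can drop those parts. This is a sketch: the values below are hand-written stand-ins for [complex(item) for item in sol], and the tiny imaginary parts are illustrative rather than taken from the question.

```python
import numpy as np

# Stand-ins for [complex(item) for item in sol]; the imaginary parts here
# are illustrative values, not the ones from the question.
values = [complex(-1.20258344291917, -1e-23),
          complex(-0.835217129314554, 1e-23),
          complex(0.497800572233726, -1e-21)]

arr = np.array(values)
print(arr.dtype)        # complex128

# real_if_close returns a real array when every imaginary part is within
# tol machine epsilons of zero; otherwise it returns the input unchanged.
real_arr = np.real_if_close(arr, tol=100)
print(real_arr.dtype)   # float64
```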
I am trying to use numpy.where with csr_matrix, which does not work. Is there a built-in function equivalent to numpy.where for sparse matrices? Here is an example of what I would like to do without using a for loop or .todense():
import scipy.sparse as spa
import numpy as np
N = 100
A = np.zeros((N,N))
di = np.diag_indices((len(A[:,0])))
A[di] = 2.3
'''
adding some values to non-diagonal terms
for sake of example
'''
for k in range(0,len(A)-1):
    for j in range(-1,3,2):
        A[k,k+j] = 4.0
A[2,3] =0.1
A[3,3] = 0.1
A[0,4] = 0.2
A[0,2] = 3
'''
creating sparse matrix
'''
A = spa.csc_matrix(A)
B = A.copy()
'''
Here I get
TypeError: unsupported operand type(s) for &: 'csc_matrix' and 'csc_matrix'
'''
ind1 = np.where((A>0.0) & (A<=1.0))
B[ind1] = (3.0-B[ind1])**5-6.0*(2.0-B[ind1])**5
How about working with the underlying data arrays of A and B?
In [36]: ind2=np.where((A.data>0.0)&(A.data<=1.0))
In [37]: A.indices[ind2]
Out[37]: array([2, 3, 0])
In [38]: A.indptr[ind2]
Out[38]: array([28, 31, 37])
In [39]: A.data[ind2]
Out[39]: array([ 0.1, 0.1, 0.2])
In [41]: B.data[ind2]=(3.0-B.data[ind2])**5-6.0*(2.0-B.data[ind2])**5
In [42]: B.data[ind2]
Out[42]: array([ 56.54555, 56.54555, 58.7296 ])
To see what ind2 corresponds to in the dense version, convert the matrix to COO format:
In [53]: Ac=A.tocoo()
In [54]: (Ac.row[ind2], Ac.col[ind2])
Out[54]: (array([2, 3, 0]), array([3, 3, 4]))
where, for reference, the where on the dense array is:
In [57]: np.where((A.A>0.0) & (A.A<=1.0))
Out[57]: (array([0, 2, 3]), array([4, 3, 3]))
One important caution - working with A.data means you exclude all of the zero entries of the dense array.
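The steps above can be collected into one small script. This is a sketch on a 3x3 matrix chosen for illustration; the names dense, mask, rows and cols are not from the original session, and it relies on tocoo() keeping the stored entries in the same order as A.data, as the answer does.

```python
import numpy as np
import scipy.sparse as spa

# Small illustrative matrix; the zeros are not stored in the sparse format.
dense = np.array([[2.3, 4.0, 0.2],
                  [0.0, 0.1, 0.0],
                  [4.0, 0.0, 2.3]])
A = spa.csc_matrix(dense)
B = A.copy()

# Boolean mask over the stored (nonzero) values only.
mask = (A.data > 0.0) & (A.data <= 1.0)

# Update the matching stored values of B in place.
B.data[mask] = (3.0 - B.data[mask])**5 - 6.0 * (2.0 - B.data[mask])**5

# Recover (row, col) positions of the matching entries via COO format.
Ac = A.tocoo()
rows, cols = Ac.row[mask], Ac.col[mask]
print(list(zip(rows.tolist(), cols.tolist())))
```

As noted above, a condition that is true for zero (for example A >= 0.0) cannot be handled this way, because zero entries never appear in A.data.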
I would like to append to my numpy array in a loop. In the beginning my numpy array is empty.
x = np.array([])
I would like to append 3-element arrays to x in order to get an Mx3 matrix, but my array grows in one dimension only... What's wrong?
In [166]: x = np.array([])
In [167]: a
Out[167]: array([248, 249, 250])
In [168]: x = np.append(x,a, axis=0)
In [169]: x
Out[169]: array([ 248., 249., 250.])
In [170]: x = np.append(x,a, axis=0)
In [171]: x
Out[171]: array([ 248., 249., 250., 248., 249., 250.])
Use vstack:
In [51]: x = np.array([])
In [52]: a= np.array([248, 249, 250])
In [53]: x = np.append(x,a, axis=0)
In [54]: np.vstack((x,a))
Out[54]:
array([[ 248., 249., 250.],
[ 248., 249., 250.]])
Not sure what way you are using this but I doubt you need to use np.append(x,a, axis=0) at all. Just set x=a then vstack.
What's wrong is that your initial x is one-dimensional. See:
z = np.array([])
z.shape
# (0,)
np.ndim(z)
# 1
So if you np.append to x you will always end up with a one-dimensional array, i.e. a vector. Note that a one-dimensional NumPy array has a single axis, so it is neither a row vector nor a column vector.
To use np.append you could start with a 2D array like so. Also, the array you append must have the same number of dimensions as the array you append to.
z = np.array([]).reshape((0,3))
a = np.array([248, 249, 250])
a2d = a.reshape(1, 3)
# a2d = np.atleast_2d(a)
# a2d = a[None, :]
# a2d = a[np.newaxis, :]
z = np.append(z, a2d, axis=0)
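Since np.append copies the whole array on every call, a common alternative in a loop is to collect the rows in a Python list and stack them once at the end. This is a sketch; the loop that produces the 3-element arrays is hypothetical.

```python
import numpy as np

rows = []                       # plain Python list; appending to it is cheap
for start in (248, 251, 254):   # hypothetical loop producing 3-element arrays
    a = np.arange(start, start + 3)
    rows.append(a)

# Stack once at the end: each 1-D array becomes one row of the (M, 3) result.
x = np.vstack(rows)
print(x.shape)
```

This also keeps the integer dtype of the rows, whereas starting from np.array([]) promotes everything to float.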