How to efficiently generate matrix from vector in numpy? [duplicate] - python

This question already has answers here:
Most efficient way to map function over numpy array
(11 answers)
Numpy vectorize function with non-scalar output
(1 answer)
Closed 5 years ago.
I have a function f: [0, 1] → Rⁿ, for example:
>>> f(0.54)
array([0.2, 0.3, 4.0, 5.2, ... , 1.0])
How can I efficiently apply that to a vector, in order to generate a matrix?
Example:
>>> f([0.54, 0.32, 0.56, 0.21])
array([[0.2, 0.3, 4.0, 5.2, ... , 1.0],
       [0.6, 0.1, 0.0, 2.3, ... , 4.7],
       [0.1, 7.1, 0.2, 4.9, ... , 3.1],
       [1.3, 2.8, 1.2, 1.1, ... , 5.3]])
Note: numpy solutions are very welcome :)
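No answer is reproduced here for this question, so the following is only a minimal sketch of three common approaches. The function f below is a hypothetical stand-in (it maps a scalar to a length-3 vector); the real f is not shown in the question, and whether the broadcasting variant applies depends entirely on how the real f is written.

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the real f: scalar in, length-3 vector out.
    return np.array([x, x ** 2, x ** 3])

xs = np.array([0.54, 0.32, 0.56, 0.21])

# Option 1: call f once per element and stack the resulting rows.
m_stack = np.stack([f(x) for x in xs])

# Option 2: np.vectorize with a signature. Convenient, but it still loops
# in Python internally, so it is not a performance gain by itself.
f_vec = np.vectorize(f, signature="()->(n)")
m_vec = f_vec(xs)

# Option 3: if f can be expressed with broadcasting, no Python loop is needed.
m_bcast = xs[:, None] ** np.arange(1, 4)

print(m_stack.shape)
```

Options 1 and 2 work for any f that returns a fixed-length vector; option 3 is usually the fastest but requires rewriting f itself in array terms.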

How to get partial cumulative sums (of positive and negative numbers) in an array? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 months ago.
I have an array with positive and negative numbers and want to do a cumulative sum of numbers of the same sign until the next number carries an opposite sign. It starts again at 0. Maybe better explained with a sample.
Here is the original array:
np.array([0.2, 0.5, 1.3, 0.6, -0.3, -1.1, 0.2, -2.0, 0.7, 1.1, 0.0, -1.2])
And the output I expect without using a loop, of course:
np.array([0.0, 0.0, 0.0, 2.6, 0.0, -1.4, 0.2, -2.0, 0.0, 0.0, 1.8, -1.2])
Any efficient idea would help a lot...
One vectorized option:
a = np.array([0.2, 0.5, 1.3, 0.6, -0.3, -1.1, 0.2, -2.0, 0.7, 1.1, 0.0, -1.2])
cs = np.cumsum(a)
idx = np.nonzero(np.r_[np.diff(a>0), True])
out = np.zeros_like(a)
out[idx] = np.diff(np.r_[0, cs[idx]])
Output:
array([ 0. , 0. , 0. , 2.6, 0. , -1.4, 0.2, -2. , 0. , 1.8, 0. , -1.2])
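The same steps can be wrapped in a function with the intermediate values named. This is a sketch of the approach above, not a different algorithm; the function name signed_run_sums is made up for illustration. Note that the test a > 0 places 0.0 in the "not positive" group, which is why the 1.8 lands at index 9 here rather than at index 10 as in the question's expected output.

```python
import numpy as np

def signed_run_sums(a):
    # Hypothetical helper name; the logic follows the answer above.
    a = np.asarray(a, dtype=float)
    cs = np.cumsum(a)
    # Index of the last element of each run of same-sign values.
    # np.diff on a boolean array marks positions where the sign test changes;
    # the appended True closes the final run.
    ends = np.nonzero(np.r_[np.diff(a > 0), True])[0]
    out = np.zeros_like(a)
    # Sum of each run = difference of the cumulative sum at consecutive run ends.
    out[ends] = np.diff(np.r_[0, cs[ends]])
    return out

a = [0.2, 0.5, 1.3, 0.6, -0.3, -1.1, 0.2, -2.0, 0.7, 1.1, 0.0, -1.2]
result = signed_run_sums(a)
print(result)
```

If zeros should count as part of the preceding positive run, the sign test would need to change (for example to a >= 0), which in turn changes how zeros next to negative runs are grouped.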

How to pass argument of type char ** from Python to C API [duplicate]

As seen in "How do I convert a Python list into a C array by using ctypes?", this code takes a Python list and converts it to a C array:
import ctypes
arr = (ctypes.c_int * len(pyarr))(*pyarr)
What would be the way of doing the same with a list of lists, or a list of lists of lists?
For example, for the following variable
list3d = [[[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]], [[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]], [[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]]]
I have tried the following with no luck:
([[ctypes.c_double * 4] *2]*3)(*list3d)
# *** TypeError: 'list' object is not callable
(ctypes.c_double * 4 *2 *3)(*list3d)
# *** TypeError: expected c_double_Array_4_Array_2 instance, got list
Thank you!
EDIT: Just to clarify, I am trying to get one object that contains the whole multidimensional array, not a list of objects. This object's reference will be an input to a C DLL that expects a 3D array.
It works with tuples if you don't mind doing a bit of conversion first:
from ctypes import *
list3d = [
[[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]],
[[0.2, 1.2, 2.2, 3.2], [4.2, 5.2, 6.2, 7.2]],
[[0.4, 1.4, 2.4, 3.4], [4.4, 5.4, 6.4, 7.4]],
]
arr = (c_double * 4 * 2 * 3)(*(tuple(tuple(j) for j in i) for i in list3d))
Check that it's initialized correctly in row-major order:
>>> (c_double * 24).from_buffer(arr)[:]
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0,
0.2, 1.2, 2.2, 3.2, 4.2, 5.2, 6.2, 7.2,
0.4, 1.4, 2.4, 3.4, 4.4, 5.4, 6.4, 7.4]
Or you can create an empty array and initialize it in a loop. Enumerate over the rows and columns of the list and assign the data to a slice:
arr = (c_double * 4 * 2 * 3)()
for i, row in enumerate(list3d):
    for j, col in enumerate(row):
        arr[i][j][:] = col
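For nesting of arbitrary depth, the same idea can be generalized. The helper below is a sketch under assumptions: the name to_ctypes_array is made up, the input is assumed to be rectangular (every sub-list at a given depth has the same length), and the element type defaults to c_double. It builds the array type from the innermost dimension outward, then fills the buffer through a flat view, as in the from_buffer check above.

```python
import ctypes

def to_ctypes_array(nested, ctype=ctypes.c_double):
    # Hypothetical helper: convert a rectangular nested list into one
    # contiguous multidimensional ctypes array (row-major order).
    shape = []
    level = nested
    while isinstance(level, (list, tuple)):
        shape.append(len(level))
        level = level[0]

    # Build the array type innermost dimension first,
    # e.g. shape [3, 2, 4] -> c_double * 4 * 2 * 3.
    arr_type = ctype
    for dim in reversed(shape):
        arr_type = arr_type * dim

    def flatten(x):
        # Yield the leaf values in row-major order.
        if isinstance(x, (list, tuple)):
            for item in x:
                yield from flatten(item)
        else:
            yield x

    flat = list(flatten(nested))
    arr = arr_type()
    # A flat view shares memory with arr, so slice assignment fills arr.
    flat_view = (ctype * len(flat)).from_buffer(arr)
    flat_view[:] = flat
    return arr

list3d = [
    [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]],
    [[0.2, 1.2, 2.2, 3.2], [4.2, 5.2, 6.2, 7.2]],
    [[0.4, 1.4, 2.4, 3.4], [4.4, 5.4, 6.4, 7.4]],
]
arr = to_ctypes_array(list3d)
print(arr[1][0][2])
```

The returned object is a single array instance, so it can be passed to a C function by reference; whether the C side expects double or float elements has to be checked against the DLL's own declaration.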
I made the change accordingly:
import ctypes
a = [[[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]], [[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]], [[40.0, 1.2, 6.0, 0.3], [50.0, 4.2, 0, 0]]]
# c_float is used here; switch to ctypes.c_double if the C function expects doubles
arr = (((ctypes.c_float * len(a[0][0])) * len(a[0])) * len(a))
arr_instance = arr()
for i in range(len(a)):
    for j in range(len(a[0])):
        for k in range(len(a[0][0])):
            arr_instance[i][j][k] = a[i][j][k]
arr_instance is the object you want.

is there a parameter to set the precision for numpy.linspace?

I am trying to check if a numpy array contains a specific value:
>>> x = np.linspace(-5,5,101)
>>> x
array([-5. , -4.9, -4.8, -4.7, -4.6, -4.5, -4.4, -4.3, -4.2, -4.1, -4. ,
-3.9, -3.8, -3.7, -3.6, -3.5, -3.4, -3.3, -3.2, -3.1, -3. , -2.9,
-2.8, -2.7, -2.6, -2.5, -2.4, -2.3, -2.2, -2.1, -2. , -1.9, -1.8,
-1.7, -1.6, -1.5, -1.4, -1.3, -1.2, -1.1, -1. , -0.9, -0.8, -0.7,
-0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0. , 0.1, 0.2, 0.3, 0.4,
0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3, 1.4, 1.5,
1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6,
2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7,
3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8,
4.9, 5. ])
>>> -5. in x
True
>>> a = 0.2
>>> a
0.2
>>> a in x
False
I assigned a constant to variable a. It seems that the precision of a is not compatible with the elements in the numpy array generated by np.linspace().
I've searched the docs, but didn't find anything about this.
This is not a question of the precision of np.linspace, but rather of the type of the elements in the generated array.
np.linspace generates elements which, conceptually, equally divide the input range between them. However, these elements are then stored as floating point numbers with limited precision, which makes the generation process itself appear to lack precision.
By passing the dtype argument to np.linspace, you can specify the precision of the floating point type used to store its result, which can increase the apparent precision of the generation process.
Nevertheless, you should not use the equality operator to compare floating point numbers. Instead, use np.isclose in conjunction with np.ndarray.any, or some equivalent:
>>> floats_64 = np.linspace(-5, 5, 101, dtype='float64')
>>> floats_128 = np.linspace(-5, 5, 101, dtype='float128')
>>> print(0.2 in floats_64)
False
>>> print(floats_64[52])
0.20000000000000018
>>> print(np.isclose(0.2, floats_64).any()) # check if any element in floats_64 is close to 0.2
True
>>> print(0.2 in floats_128)
False
>>> print(floats_128[52])
0.20000000000000017764
>>> print(np.isclose(0.2, floats_128).any()) # check if any element in floats_128 is close to 0.2
True
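The tolerance-based membership test can be wrapped in a small helper. This is a minimal sketch; the name contains_close is made up, and the default tolerances simply mirror the defaults of np.isclose. Suitable tolerances depend on the scale of the data.

```python
import numpy as np

def contains_close(arr, value, rtol=1e-05, atol=1e-08):
    # Hypothetical helper: True if any element of arr is within
    # tolerance of value, instead of requiring exact equality.
    return bool(np.isclose(arr, value, rtol=rtol, atol=atol).any())

x = np.linspace(-5, 5, 101)
found = contains_close(x, 0.2)       # 0.2 is a grid point up to rounding error
missing = contains_close(x, 0.25)    # 0.25 lies between two grid points
print(found, missing)
```

With a grid spacing of 0.1, the default tolerances are far smaller than the distance between neighbouring points, so the check does not produce false matches here.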

Convert an array stored as a string, to a proper numpy array

For long and tedious reasons, I have lots of arrays that are stored as strings:
tmp = '[[1.0, 3.0, 0.4]\n [3.0, 4.0, -1.0]\n [3.0, 4.0, 0.1]\n [3.0, 4.0, 0.2]]'
Now I obviously do not want my arrays as long strings, I want them as proper numpy arrays so I can use them. Consequently, what is a good way to convert the above to:
tmp_np = np.array([[1.0, 3.0, 0.4],
                   [3.0, 4.0, -1.0],
                   [3.0, 4.0, 0.1],
                   [3.0, 4.0, 0.2]])
such that I can do simple things like tmp_np.shape = (4,3) or simple indexing tmp_np[0,:] = [1.0, 3.0, 0.4] etc.
Thanks
You can use ast.literal_eval if you first replace the \n characters with commas:
import ast
tmp_np = np.array(ast.literal_eval(tmp.replace('\n', ',')))
Returns:
>>> tmp_np
array([[ 1. , 3. , 0.4],
[ 3. , 4. , -1. ],
[ 3. , 4. , 0.1],
[ 3. , 4. , 0.2]])
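Since there are many such strings, the conversion can be put in a function. This is a sketch that assumes every string has the same layout as tmp above (comma-separated values inside each row, rows separated only by a newline); the name parse_array_string is made up for illustration.

```python
import ast
import numpy as np

def parse_array_string(s):
    # Hypothetical helper: turn the row-separating newlines into commas so
    # the text becomes a valid Python list literal, then parse it safely.
    return np.array(ast.literal_eval(s.replace("\n", ",")), dtype=float)

tmp = '[[1.0, 3.0, 0.4]\n [3.0, 4.0, -1.0]\n [3.0, 4.0, 0.1]\n [3.0, 4.0, 0.2]]'
tmp_np = parse_array_string(tmp)
print(tmp_np.shape)
```

ast.literal_eval only evaluates literals, so unlike eval it will not execute arbitrary code found in the stored strings.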

Convert pandas DataFrame into list of lists [duplicate]

This question already has answers here:
Pandas DataFrame to List of Lists
(14 answers)
Closed 3 years ago.
I have a pandas data frame like this:
admit gpa gre rank
0 3.61 380 3
1 3.67 660 3
1 3.19 640 4
0 2.93 520 4
Now I want to get a list of rows in pandas like:
[[0,3.61,380,3], [1,3.67,660,3], [1,3.19,640,4], [0,2.93,520,4]]
How can I do it?
There is a built-in way, which is likely also the fastest: call tolist on the .values NumPy array:
df.values.tolist()
[[0.0, 3.61, 380.0, 3.0],
[1.0, 3.67, 660.0, 3.0],
[1.0, 3.19, 640.0, 4.0],
[0.0, 2.93, 520.0, 4.0]]
You can also do it like this (in Python 3, map returns an iterator, so wrap it in list):
list(map(list, df.values))
EDIT: as_matrix is deprecated since version 0.23.0
You can use the built-in values attribute or the to_numpy method (the recommended option) on the DataFrame:
In [8]:
df.to_numpy()
Out[8]:
array([[ 0.9, 7. , 5.2, ..., 13.3, 13.5, 8.9],
[ 0.9, 7. , 5.2, ..., 13.3, 13.5, 8.9],
[ 0.8, 6.1, 5.4, ..., 15.9, 14.4, 8.6],
...,
[ 0.2, 1.3, 2.3, ..., 16.1, 16.1, 10.8],
[ 0.2, 1.3, 2.4, ..., 16.5, 15.9, 11.4],
[ 0.2, 1.3, 2.4, ..., 16.5, 15.9, 11.4]])
If you explicitly want lists and not a numpy array add .tolist():
df.to_numpy().tolist()
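A small end-to-end sketch using the data from the question. The DataFrame construction is an assumption about how the frame was built. It also shows why the integers come back as floats in the output above: a frame with mixed int and float columns is converted to one common float dtype. The per-column variant at the end is one possible way to keep each column's own type.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "admit": [0, 1, 1, 0],
    "gpa": [3.61, 3.67, 3.19, 2.93],
    "gre": [380, 660, 640, 520],
    "rank": [3, 3, 4, 4],
})

# Whole-frame conversion: int columns are upcast to float alongside gpa.
rows = df.to_numpy().tolist()

# Column-wise conversion: each column keeps its own Python type.
rows_typed = [list(row) for row in zip(*(df[c].tolist() for c in df.columns))]

print(rows[0])
print(rows_typed[0])
```

The first form is the simplest when all columns share a dtype; the second avoids the upcast when the distinction between int and float matters.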
