I created a script to generate a list:
import random
nota1 = range(5, 11)
nota2 = range(5, 11)
nota3 = range(5, 11)
nota4 = range(0, 2)
dados = []
for i in range(1000):
    dados_dado = []
    n1 = random.choice(nota1)
    n2 = random.choice(nota2)
    n3 = random.choice(nota3)
    n4 = random.choice(nota4)
    n1 = float(n1)
    n2 = float(n2)
    n3 = float(n3)
    n4 = float(n4)
    dados_dado.append(n1)
    dados_dado.append(n2)
    dados_dado.append(n3)
    dados_dado.append(n4)
    dados.append(dados_dado)
When I print type(dados), Python returns <type 'list'>: a huge list that looks like this:
[[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.0], [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0], [6.0, 9.0, 8.0, 1.0]]
I need to convert it to <type 'numpy.ndarray'>, so I did:
data = np.array(dados)
What I expected to get back was something like this:
[[ 6.8 3.2 5.9 2.3]
[ 6.7 3.3 5.7 2.5]
[ 6.7 3. 5.2 2.3]
[ 6.3 2.5 5. 1.9]
[ 6.5 3. 5.2 2. ]
[ 6.2 3.4 5.4 2.3]
[ 5.9 3. 5.1 1.8]]
But what I get instead is:
[[ 7. 10. 6. 1.]
[ 8. 6. 6. 1.]
[ 6. 9. 5. 0.]
...,
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
What am I doing wrong?
With your sample:
In [574]: dados=[[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.
...: 0], [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0],
...: [6.0, 9.0, 8.0, 1.0]]
In [575]: print(dados)
[[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.0], [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0], [6.0, 9.0, 8.0, 1.0]]
Convert it to an array and you see the whole thing. Your input has no fractional parts, so the numpy display omits the trailing zeros.
In [576]: print(np.array(dados))
[[ 5. 8. 10. 1.]
[ 8. 9. 9. 1.]
[ 7. 5. 6. 1.]
[ 5. 8. 7. 0.]
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
Replicate the list many times, and the print display shows this ..., rather than all 7,000 lines. That's nice, isn't it?
In [577]: print(np.array(dados*1000))
[[ 5. 8. 10. 1.]
[ 8. 9. 9. 1.]
[ 7. 5. 6. 1.]
...,
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
The full array is still there
In [578]: np.array(dados*1000).shape
Out[578]: (7000, 4)
By default numpy adds the ellipsis when the total number of entries exceeds 1000 (the threshold print option). Do you really need to see all those lines?
That print setting can be changed, but I question whether you need to do that.
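If the full output really is needed, the summarization can be lifted through the threshold print option. The following is a minimal sketch, assuming NumPy 1.15 or later for the np.printoptions context manager; full_text is a hypothetical helper name, not something from the question.

```python
import sys
import numpy as np

def full_text(arr):
    """Return the complete text of arr, with no '...' summarization."""
    # threshold is the element count above which numpy summarizes;
    # raising it to sys.maxsize disables the ellipsis inside this block only.
    with np.printoptions(threshold=sys.maxsize):
        return np.array2string(arr)

big = np.array([[5.0, 8.0, 10.0, 1.0]] * 7000)
print(big.shape)                       # the data is all there either way
print("..." in np.array2string(big))   # default display is summarized
print("..." in full_text(big))         # full display is not
```

Using the context manager keeps the change local, so the rest of the session still gets the short display.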
Your array is fine. NumPy just suppresses display of the whole array for large arrays by default.
(If you actually were expecting your array to be short enough not to trigger this behavior, or if you were actually expecting it to have non-integer entries, you'll have to explain why you expected that.)
numpy.set_printoptions(precision=20)
This controls the number of digits displayed for floating-point values; set precision as you desire.
How can I determine whether one graph lies within another?
My algorithm works on the following matrix:
import numpy as np
A = np.full((9, 9), -1.0)  # border stays at -1, matching the output below
for i in np.arange(1, 8):
    for j in np.arange(1, 8):
        A[i, j] = 1
for i in np.arange(2, 4):
    for j in np.arange(2, 4):
        A[i, j] = 2
print(A)
yields the matrix:
[[-1. -1. -1. -1. -1. -1. -1. -1. -1.]
[-1. 1. 1. 1. 1. 1. 1. 1. -1.]
[-1. 1. 2. 2. 1. 1. 1. 1. -1.]
[-1. 1. 2. 2. 1. 1. 1. 1. -1.]
[-1. 1. 1. 1. 1. 1. 1. 1. -1.]
[-1. 1. 1. 1. 1. 1. 1. 1. -1.]
[-1. 1. 1. 1. 1. 1. 1. 1. -1.]
[-1. 1. 1. 1. 1. 1. 1. 1. -1.]
[-1. -1. -1. -1. -1. -1. -1. -1. -1.]]
To create two graphs:
With vertices:
V1 = [[(2.0, 1.333333), (1.333333, 3.0), (1.333333, 2.0), (2.0, 3.666667), (3.0, 3.666667), (3.666667, 3.0), (3.666667, 2.0), (3.0, 1.333333)]]
V2 = [[(1.0, 0.5), (0.5, 2.0), (0.5, 1.0), (0.5, 3.0), (0.5, 4.0), (0.5, 5.0), (0.5, 6.0), (0.5, 7.0), (1.0, 7.5), (2.0, 7.5), (3.0, 7.5), (4.0, 7.5), (5.0, 7.5), (6.0, 7.5), (7.0, 7.5), (7.5, 7.0), (7.5, 6.0), (7.5, 5.0), (7.5, 4.0), (7.5, 3.0), (7.5, 2.0), (7.5, 1.0), (7.0, 0.5), (6.0, 0.5), (5.0, 0.5), (4.0, 0.5), (3.0, 0.5), (2.0, 0.5)]]
And edge lists:
e1 = [[[1.333333, 2.0], [2.0, 1.333333]], [[1.333333, 3.0], [1.333333, 2.0]], [[2.0, 3.666667], [1.333333, 3.0]], [[2.0, 1.333333], [3.0, 1.333333]], [[2.0, 3.666667], [3.0, 3.666667]], [[3.0, 1.333333], [3.666667, 2.0]], [[3.666667, 3.0], [3.666667, 2.0]], [[3.0, 3.666667], [3.666667, 3.0]]]
e2 = [[[0.5, 1.0], [1.0, 0.5]], [[0.5, 2.0], [0.5, 1.0]], [[0.5, 3.0], [0.5, 2.0]], [[0.5, 4.0], [0.5, 3.0]], [[0.5, 5.0], [0.5, 4.0]], [[0.5, 6.0], [0.5, 5.0]], [[0.5, 7.0], [0.5, 6.0]], [[1.0, 7.5], [0.5, 7.0]], [[1.0, 0.5], [2.0, 0.5]], [[1.0, 7.5], [2.0, 7.5]], [[2.0, 0.5], [3.0, 0.5]], [[2.0, 7.5], [3.0, 7.5]], [[3.0, 0.5], [4.0, 0.5]], [[3.0, 7.5], [4.0, 7.5]], [[4.0, 0.5], [5.0, 0.5]], [[4.0, 7.5], [5.0, 7.5]], [[5.0, 0.5], [6.0, 0.5]], [[5.0, 7.5], [6.0, 7.5]], [[6.0, 0.5], [7.0, 0.5]], [[6.0, 7.5], [7.0, 7.5]], [[7.0, 0.5], [7.5, 1.0]], [[7.5, 2.0], [7.5, 1.0]], [[7.5, 3.0], [7.5, 2.0]], [[7.5, 4.0], [7.5, 3.0]], [[7.5, 5.0], [7.5, 4.0]], [[7.5, 6.0], [7.5, 5.0]], [[7.5, 7.0], [7.5,
6.0]], [[7.0, 7.5], [7.5, 7.0]]]
As Prune suggests, the shapely package has what you need. While your line loops can be thought of as a graph, it's more useful to consider them as polygons embedded in the 2D plane.
By creating Polygon objects from your points and edge segments, you can use the contains method that all shapely objects have to test if one is inside the other.
You'll need to sort the edge segments into order. Clockwise or anti-clockwise probably doesn't matter as shapely likely detects inside and outside by constructing a point at infinity and ensuring that is 'outside'.
Here's a full example with the original pair of squares from your post:
from shapely.geometry import Polygon
p1 = Polygon([(0,0), (0,8), (8,8), (8,0)])
p2 = Polygon([(2,2), (2,4), (4,4), (4,2)])
print(p1.contains(p2))
Documentation for the Polygon object is at https://shapely.readthedocs.io/en/latest/manual.html#Polygon
and for the contains method at https://shapely.readthedocs.io/en/latest/manual.html#object.contains
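The edge-ordering step mentioned above can be sketched in plain Python. This is only an illustration, assuming each edge list describes a single simple closed loop in which every vertex touches exactly two segments; order_ring is a hypothetical helper name, not part of shapely.

```python
from collections import defaultdict

def order_ring(edges):
    """Chain unordered, undirected segments into one ordered ring of vertices."""
    # Map every vertex to the (two) vertices it shares a segment with.
    neighbours = defaultdict(list)
    for a, b in edges:
        a, b = tuple(a), tuple(b)
        neighbours[a].append(b)
        neighbours[b].append(a)

    start = tuple(edges[0][0])
    ring = [start]
    prev, cur = None, start
    while True:
        # Step to the neighbour we did not just come from.
        nxt = next(v for v in neighbours[cur] if v != prev)
        if nxt == start:      # loop closed
            break
        ring.append(nxt)
        prev, cur = cur, nxt
    return ring

# Segments of a unit square, deliberately out of order.
square = [[[0, 0], [0, 1]], [[1, 1], [1, 0]], [[0, 1], [1, 1]], [[1, 0], [0, 0]]]
print(order_ring(square))
```

The returned list of coordinate tuples is in the form the Polygon constructor accepts, so something like Polygon(order_ring(e2)).contains(Polygon(order_ring(e1))) would be the natural next step.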
For this problem, I have the 8 vertices of a box that I need to shrink, along with an integer size by which every side should be shrunk. For example, if the box is 8*8*8 and the shrinking size is 2, I need to return a list of all the vertices of the 4*4*4 boxes that fill the big box in a 3D coordinate system.
I thought about having a for loop that runs over the size of the box, but then I realized that if I eventually want to separate the box into many more, smaller boxes that fill the big box, I would have to write an unmanageable amount of code. How can I get this list of vertices without writing that much code?
I'm not sure if this is what you want, but here is a simple way to compute vertices in a grid with NumPy:
import numpy as np

def make_grid(x_size, y_size, z_size, shrink_factor):
    # A complex step tells np.mgrid how many points to generate, endpoints included.
    n = (shrink_factor + 1) * 1j
    xx, yy, zz = np.mgrid[:x_size:n, :y_size:n, :z_size:n]
    return np.stack([xx.ravel(), yy.ravel(), zz.ravel()], axis=1)

print(make_grid(8, 8, 8, 2))
Output:
[[0. 0. 0.]
[0. 0. 4.]
[0. 0. 8.]
[0. 4. 0.]
[0. 4. 4.]
[0. 4. 8.]
[0. 8. 0.]
[0. 8. 4.]
[0. 8. 8.]
[4. 0. 0.]
[4. 0. 4.]
[4. 0. 8.]
[4. 4. 0.]
[4. 4. 4.]
[4. 4. 8.]
[4. 8. 0.]
[4. 8. 4.]
[4. 8. 8.]
[8. 0. 0.]
[8. 0. 4.]
[8. 0. 8.]
[8. 4. 0.]
[8. 4. 4.]
[8. 4. 8.]
[8. 8. 0.]
[8. 8. 4.]
[8. 8. 8.]]
Otherwise with itertools:
from itertools import product

def make_grid(x_size, y_size, z_size, shrink_factor):
    return [(x * x_size, y * y_size, z * z_size)
            for x, y, z in product((i / shrink_factor
                                    for i in range(shrink_factor + 1)), repeat=3)]

print(*make_grid(8, 8, 8, 2), sep='\n')
Output:
(0.0, 0.0, 0.0)
(0.0, 0.0, 4.0)
(0.0, 0.0, 8.0)
(0.0, 4.0, 0.0)
(0.0, 4.0, 4.0)
(0.0, 4.0, 8.0)
(0.0, 8.0, 0.0)
(0.0, 8.0, 4.0)
(0.0, 8.0, 8.0)
(4.0, 0.0, 0.0)
(4.0, 0.0, 4.0)
(4.0, 0.0, 8.0)
(4.0, 4.0, 0.0)
(4.0, 4.0, 4.0)
(4.0, 4.0, 8.0)
(4.0, 8.0, 0.0)
(4.0, 8.0, 4.0)
(4.0, 8.0, 8.0)
(8.0, 0.0, 0.0)
(8.0, 0.0, 4.0)
(8.0, 0.0, 8.0)
(8.0, 4.0, 0.0)
(8.0, 4.0, 4.0)
(8.0, 4.0, 8.0)
(8.0, 8.0, 0.0)
(8.0, 8.0, 4.0)
(8.0, 8.0, 8.0)
A solution using numpy, which allows easy block manipulation.
First I choose to represent a cube with an origin and three vectors : the unit cube is represented with orig=np.array([0,0,0]) and vects=np.array([[1,0,0],[0,1,0],[0,0,1]]).
Now a numpy function to generate the eight vertices:
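import numpy as np

# The unit cube described above: an origin and three edge vectors.
orig = np.array([0, 0, 0])
vects = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

def cube(origin, edges):
    # Each edge vector doubles the set of vertices: 1 -> 2 -> 4 -> 8.
    for e in edges:
        origin = np.vstack((origin, origin + e))
    return origin

cube(orig, vects)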
array([[0, 0, 0],
[1, 0, 0],
[0, 1, 0],
[1, 1, 0],
[0, 0, 1],
[1, 0, 1],
[0, 1, 1],
[1, 1, 1]])
Then another to span the mini-cubes in 3D:
def split(origin, edges, k):
    minicube = cube(origin, edges / k)
    for e in edges / k:
        minicube = np.vstack([minicube + i * e for i in range(k)])
    return minicube.reshape(k**3, 8, 3)

split(orig, vects, 2)
array([[[ 0. , 0. , 0. ],
[ 0.5, 0. , 0. ],
[ 0. , 0.5, 0. ],
[ 0.5, 0.5, 0. ],
[ 0. , 0. , 0.5],
[ 0.5, 0. , 0.5],
[ 0. , 0.5, 0.5],
[ 0.5, 0.5, 0.5]],
...
[[ 0.5, 0.5, 0.5],
[ 1. , 0.5, 0.5],
[ 0.5, 1. , 0.5],
[ 1. , 1. , 0.5],
[ 0.5, 0.5, 1. ],
[ 1. , 0.5, 1. ],
[ 0.5, 1. , 1. ],
[ 1. , 1. , 1. ]]])
My example below will work on a generic box and assumes integer coordinates.
import numpy as np

def create_cube(start_x, start_y, start_z, size):
    return np.array([
        [x, y, z]
        for z in [start_z, start_z + size]
        for y in [start_y, start_y + size]
        for x in [start_x, start_x + size]
    ])

def subdivide(box, scale):
    start = np.min(box, axis=0)
    end = np.max(box, axis=0) - scale
    return np.array([
        create_cube(x, y, z, scale)
        for z in range(start[2], end[2] + 1)
        for y in range(start[1], end[1] + 1)
        for x in range(start[0], end[0] + 1)
    ])

cube = create_cube(1, 3, 2, 8)
Cube will look like:
array([[ 1, 3, 2],
[ 9, 3, 2],
[ 1, 11, 2],
[ 9, 11, 2],
[ 1, 3, 10],
[ 9, 3, 10],
[ 1, 11, 10],
[ 9, 11, 10]])
Running the following subdivide:
subcubes = subdivide(cube, 2)
The subdivide function creates a NumPy array with shape (343, 8, 3). That is the expected result: sliding the 2x2x2 cube one unit at a time through the 8x8x8 cube gives 7 positions per axis, hence 7**3 = 343 subcubes.
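If non-overlapping subcubes that tile the box are wanted instead, one possible variant steps the ranges by the subcube size rather than by 1. This is a sketch, not part of the answer above: subdivide_tiled is a hypothetical name, and it assumes integer coordinates and a side length divisible by the subcube size.

```python
import numpy as np

def create_cube(start_x, start_y, start_z, size):
    # Eight corners of an axis-aligned cube, x varying fastest.
    return np.array([
        [x, y, z]
        for z in (start_z, start_z + size)
        for y in (start_y, start_y + size)
        for x in (start_x, start_x + size)
    ])

def subdivide_tiled(box, size):
    # Assumption: the box side is a multiple of `size`, so the tiles fit exactly.
    start = np.min(box, axis=0)
    end = np.max(box, axis=0)
    return np.array([
        create_cube(x, y, z, size)
        for z in range(start[2], end[2], size)
        for y in range(start[1], end[1], size)
        for x in range(start[0], end[0], size)
    ])

tiles = subdivide_tiled(create_cube(1, 3, 2, 8), 2)
print(tiles.shape)
```

With an 8x8x8 box and size 2 this gives 4 positions per axis, so 64 subcubes instead of 343.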
For long and tedious reasons, I have lots of arrays that are stored as strings:
tmp = '[[1.0, 3.0, 0.4]\n [3.0, 4.0, -1.0]\n [3.0, 4.0, 0.1]\n [3.0, 4.0, 0.2]]'
Now I obviously do not want my arrays as long strings, I want them as proper numpy arrays so I can use them. Consequently, what is a good way to convert the above to:
tmp_np = np.array([[1.0, 3.0, 0.4],
                   [3.0, 4.0, -1.0],
                   [3.0, 4.0, 0.1],
                   [3.0, 4.0, 0.2]])
such that I can do simple things like tmp_np.shape = (4,3) or simple indexing tmp_np[0,:] = [1.0, 3.0, 0.4] etc.
Thanks
You can use ast.literal_eval, if you replace your \n characters with ,:
import ast
import numpy as np
tmp_np = np.array(ast.literal_eval(tmp.replace('\n', ',')))
Returns:
>>> tmp_np
array([[ 1. , 3. , 0.4],
[ 3. , 4. , -1. ],
[ 3. , 4. , 0.1],
[ 3. , 4. , 0.2]])
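Since there are many such strings, the same idea can be wrapped in a small function. This is a minimal sketch; parse_array_string is a hypothetical name, and it assumes the values inside each row are already comma-separated, as in the sample above (strings in numpy's own space-separated print format would need different handling).

```python
import ast
import numpy as np

def parse_array_string(text):
    """Turn '[[a, b]\\n [c, d]]' into a 2D numpy array."""
    # Rows are separated only by newlines, so insert the missing commas
    # to obtain a valid Python list literal, then parse it safely.
    return np.array(ast.literal_eval(text.replace('\n', ',')))

tmp = '[[1.0, 3.0, 0.4]\n [3.0, 4.0, -1.0]\n [3.0, 4.0, 0.1]\n [3.0, 4.0, 0.2]]'
tmp_np = parse_array_string(tmp)
print(tmp_np.shape)
print(tmp_np[0, :])
```

ast.literal_eval only accepts literals, so unlike eval it will not execute arbitrary code hidden in a string.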
Suppose I have the following code:
import numpy as np
import pandas as pd
x = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
s = pd.Series(x, index=[1, 2, 3, 4, 5])
This produces the following s:
1 1.0
2 1.1
3 1.2
4 1.3
5 1.4
Now what I want to create is a rolling window of size n, but I don't want to take the mean or standard deviation of each window, I just want the arrays. So, suppose n = 3. I want a transformation that outputs the following series given the input s:
1 array([1.0, nan, nan])
2 array([1.1, 1.0, nan])
3 array([1.2, 1.1, 1.0])
4 array([1.3, 1.2, 1.1])
5 array([1.4, 1.3, 1.2])
How do I do this?
Here's one way to do it
In [294]: arr = [s.shift(x).values[::-1][:3] for x in range(len(s))[::-1]]
In [295]: arr
Out[295]:
[array([ 1., nan, nan]),
array([ 1.1, 1. , nan]),
array([ 1.2, 1.1, 1. ]),
array([ 1.3, 1.2, 1.1]),
array([ 1.4, 1.3, 1.2])]
In [296]: pd.Series(arr, index=s.index)
Out[296]:
1 [1.0, nan, nan]
2 [1.1, 1.0, nan]
3 [1.2, 1.1, 1.0]
4 [1.3, 1.2, 1.1]
5 [1.4, 1.3, 1.2]
dtype: object
Here's a vectorized approach using NumPy broadcasting -
n = 3 # window length
idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
out = s.values[idx]  # Series.get_values() was removed in pandas 1.0
out[idx<0] = np.nan
This gets you the output as a 2D array.
To get a series with each element holding each window as a list -
In [40]: pd.Series(out.tolist())
Out[40]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
If you wish to have a list of separate arrays, one per window (each of shape (1, n)), you can use np.split on the output, like so -
out_split = np.split(out,out.shape[0],axis=0)
Sample run -
In [100]: s
Out[100]:
1 1.0
2 1.1
3 1.2
4 1.3
5 1.4
dtype: float64
In [101]: n = 3
In [102]: idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
...: out = s.values[idx]
...: out[idx<0] = np.nan
...:
In [103]: out
Out[103]:
array([[ 1. , nan, nan],
[ 1.1, 1. , nan],
[ 1.2, 1.1, 1. ],
[ 1.3, 1.2, 1.1],
[ 1.4, 1.3, 1.2]])
In [104]: np.split(out,out.shape[0],axis=0)
Out[104]:
[array([[ 1., nan, nan]]),
array([[ 1.1, 1. , nan]]),
array([[ 1.2, 1.1, 1. ]]),
array([[ 1.3, 1.2, 1.1]]),
array([[ 1.4, 1.3, 1.2]])]
Memory-efficiency with strides
For memory efficiency, we can use a strided helper, strided_axis0, similar to @B. M.'s solution but a bit more generic.
So, to get a 2D array of values with NaNs preceding the first element -
In [35]: strided_axis0(s.values, fillval=np.nan, L=3)
Out[35]:
array([[nan, nan, 1. ],
[nan, 1. , 1.1],
[1. , 1.1, 1.2],
[1.1, 1.2, 1.3],
[1.2, 1.3, 1.4]])
To get 2D array of values with NaNs as fillers coming after the original elements in each row and the order of elements being flipped, as stated in the problem -
In [36]: strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1]
Out[36]:
array([[1. , nan, nan],
[1.1, 1. , nan],
[1.2, 1.1, 1. ],
[1.3, 1.2, 1.1],
[1.4, 1.3, 1.2]])
To get a series with each element holding each window as a list, simply wrap the earlier methods with pd.Series(out.tolist()) with out being the 2D array outputs -
In [38]: pd.Series(strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1].tolist())
Out[38]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
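The strided_axis0 helper is not defined in this thread. The following is a guess at what such a helper might look like, written only to be consistent with the outputs shown above; it assumes a 1D input array and should not be taken as the original implementation.

```python
import numpy as np

def strided_axis0(a, fillval, L):
    """Hypothetical reconstruction: length-L sliding windows over 1D `a`,
    left-padded with `fillval` so there is one row per input element."""
    # Pad once at the front, then view the padded buffer with overlapping rows.
    a_ext = np.concatenate((np.full(L - 1, fillval), a))
    n = a_ext.strides[0]
    # writeable=False guards against accidental writes through the
    # overlapping view, which would touch several rows at once.
    return np.lib.stride_tricks.as_strided(
        a_ext, shape=(a.shape[0], L), strides=(n, n), writeable=False)

vals = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
windows = strided_axis0(vals, fillval=np.nan, L=3)
print(windows)
print(windows[:, ::-1])  # newest value first, as in the question
```

The rows are views into one padded buffer rather than copies, which is where the memory saving comes from.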
Your data look like a strided array :
data=np.lib.stride_tricks.as_strided(np.concatenate(([np.nan]*2,s))[2:],(5,3),(8,-8))
"""
array([[ 1. , nan, nan],
[ 1.1, 1. , nan],
[ 1.2, 1.1, 1. ],
[ 1.3, 1.2, 1.1],
[ 1.4, 1.3, 1.2]])
"""
Then transform it into a Series:
pd.Series(list(map(list,data)))
"""
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
"""
If you attach the missing NaNs at the beginning and the end of the series, you can use a simple sliding window:
def wndw(s, size=3):
    stretched = np.hstack([
        np.array([np.nan] * (size - 1)),
        s.values.T,
        np.array([np.nan] * size)
    ])
    for begin in range(len(stretched) - size):
        end = begin + size
        yield stretched[begin:end][::-1]

for arr in wndw(s, 3):
    print(arr)
The following numpy command:
c = np.matrix('1,0,0,0;0,1,0,0;0,0,1,0;-6.6,1.0,-2.8, 1.0')
creates a matrix. Output:
[[ 1. 0. 0. 0. ]
[ 0. 1. 0. 0. ]
[ 0. 0. 1. 0. ]
[-6.6 1. -2.8 1. ]]
However, my input is a flat, comma-separated list of floats:
[1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, -6.604560409595856, 1.0, -2.81542864114781, 1.0]
Is there a simple way to get those floats into a numpy matrix by specifying the shape up front as a 4 x 4 matrix?
np.array([1.0, 0.0,..., -2.81542864114781, 1.0]).reshape((4, 4))
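Written out in full with the list from the question, the reshape looks like the sketch below. The variable names values and c are only illustrative.

```python
import numpy as np

# The 16 floats from the question, laid out four per line for readability.
values = [1.0, 0.0, 0.0, 0.0,
          0.0, 1.0, 0.0, 0.0,
          0.0, 0.0, 1.0, 0.0,
          -6.604560409595856, 1.0, -2.81542864114781, 1.0]

# reshape fills row by row, so each group of four values becomes one row.
c = np.array(values).reshape(4, 4)
print(c)
```

This yields a regular 2D ndarray; if an actual np.matrix object is required, np.asmatrix(c) wraps the same data.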