Reshaping Sequence Numpy Pandas python

I have tried this:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_csv("test.csv")
>>> df
input1 input2 input3 input4 input5 input6 input7 input8 output
0 1 2 3 4 5 6 7 8 1
1 2 3 4 5 6 7 8 9 0
2 3 4 5 6 7 8 9 10 -1
3 4 5 6 7 8 9 10 11 -1
4 5 6 7 8 9 10 11 12 1
5 6 7 8 9 10 11 12 13 0
6 7 8 9 10 11 12 13 14 1
>>> seq_len=3
>>> data = []
>>> data_raw = df.values
>>> for index in range(len(data_raw) - seq_len + 1):
...     data.append(data_raw[index: index + seq_len])
...
>>> data
[array([[ 1, 2, 3, 4, 5, 6, 7, 8, 1],
[ 2, 3, 4, 5, 6, 7, 8, 9, 0],
[ 3, 4, 5, 6, 7, 8, 9, 10, -1]], dtype=int64), array([[ 2, 3, 4, 5, 6, 7, 8, 9, 0],
[ 3, 4, 5, 6, 7, 8, 9, 10, -1],
[ 4, 5, 6, 7, 8, 9, 10, 11, -1]], dtype=int64), array([[ 3, 4, 5, 6, 7, 8, 9, 10, -1],
[ 4, 5, 6, 7, 8, 9, 10, 11, -1],
[ 5, 6, 7, 8, 9, 10, 11, 12, 1]], dtype=int64), array([[ 4, 5, 6, 7, 8, 9, 10, 11, -1],
[ 5, 6, 7, 8, 9, 10, 11, 12, 1],
[ 6, 7, 8, 9, 10, 11, 12, 13, 0]], dtype=int64), array([[ 5, 6, 7, 8, 9, 10, 11, 12, 1],
[ 6, 7, 8, 9, 10, 11, 12, 13, 0],
[ 7, 8, 9, 10, 11, 12, 13, 14, 1]], dtype=int64)]
>>> data = np.asarray(data)
>>> data.shape
(5, 3, 9)
>>> data_reshape = data.reshape(5,9,3)
>>> data_reshape
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 1],
[ 2, 3, 4],
[ 5, 6, 7],
[ 8, 9, 0],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, -1]],
[[ 2, 3, 4],
[ 5, 6, 7],
[ 8, 9, 0],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, -1],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, -1]],
[[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, -1],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, -1],
[ 5, 6, 7],
[ 8, 9, 10],
[11, 12, 1]],
[[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, -1],
[ 5, 6, 7],
[ 8, 9, 10],
[11, 12, 1],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 0]],
[[ 5, 6, 7],
[ 8, 9, 10],
[11, 12, 1],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 0],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 1]]], dtype=int64)
What I wanted instead is the series in this form:
array([[[1,2,3],
[2,3,4],
[3,4,5],
[4,5,6],
[5,6,7],
[6,7,8],
[7,8,9],
[8,9,10],
[1,0,-1]],
[[2,3,4],
[3,4,5],
[4,5,6],
[5,6,7],
[6,7,8],
[7,8,9],
[8,9,10],
[9,10,11],
[0,-1,-1]],
[[3,4,5],
[4,5,6],
[5,6,7],
[6,7,8],
[7,8,9],
[8,9,10],
[9,10,11],
[10,11,12],
[-1,-1,1]],
[[4,5,6],
[5,6,7],
[6,7,8],
[7,8,9],
[8,9,10],
[9,10,11],
[10,11,12],
[11,12,13],
[-1,1,0]],
[[5,6,7],
[6,7,8],
[7,8,9],
[8,9,10],
[9,10,11],
[10,11,12],
[11,12,13],
[12,13,14],
[1,0,1]]], dtype=int64)
Please help me achieve this.

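I tried this with the data from your question and I think I understand what you want. reshape only reinterprets the elements in their existing order and never swaps axes, so transpose each window (.T) before appending it instead of reshaping afterwards: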
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_csv("test.csv")
>>> df
input1 input2 input3 input4 input5 input6 input7 input8 output
0 1 2 3 4 5 6 7 8 1
1 2 3 4 5 6 7 8 9 0
2 3 4 5 6 7 8 9 10 -1
3 4 5 6 7 8 9 10 11 -1
4 5 6 7 8 9 10 11 12 1
5 6 7 8 9 10 11 12 13 0
6 7 8 9 10 11 12 13 14 1
>>> seq_len=3
>>> data = []
>>> data_raw = df.values
>>> for index in range(len(data_raw) - seq_len + 1):
...     data.append(data_raw[index: index + seq_len].T)
...
>>> data
[array([[ 1, 2, 3],
[ 2, 3, 4],
[ 3, 4, 5],
[ 4, 5, 6],
[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10],
[ 1, 0, -1]], dtype=int64), array([[ 2, 3, 4],
[ 3, 4, 5],
[ 4, 5, 6],
[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10],
[ 9, 10, 11],
[ 0, -1, -1]], dtype=int64), array([[ 3, 4, 5],
[ 4, 5, 6],
[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10],
[ 9, 10, 11],
[10, 11, 12],
[-1, -1, 1]], dtype=int64), array([[ 4, 5, 6],
[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10],
[ 9, 10, 11],
[10, 11, 12],
[11, 12, 13],
[-1, 1, 0]], dtype=int64), array([[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9],
[ 8, 9, 10],
[ 9, 10, 11],
[10, 11, 12],
[11, 12, 13],
[12, 13, 14],
[ 1, 0, 1]], dtype=int64)]
>>> data = np.asarray(data)
>>> data.shape
(5, 9, 3)
Hope this is what you wanted to achieve. :)
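For longer inputs the Python loop can probably be avoided. The following is only a sketch and not part of the answer above: it assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available, and it rebuilds the sample table inline instead of reading test.csv. The names outputs, windows_loop and windows_view are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

seq_len = 3
outputs = [1, 0, -1, -1, 1, 0, 1]
# Rebuild the 7x9 sample table: inputs i+1..i+8, then the output column.
data_raw = np.array([list(range(i + 1, i + 9)) + [out]
                     for i, out in enumerate(outputs)])

# Loop version from the answer above: transpose each (seq_len, 9) window.
windows_loop = np.asarray([data_raw[i:i + seq_len].T
                           for i in range(len(data_raw) - seq_len + 1)])

# Vectorised version: the window axis is appended last, giving (5, 9, 3).
windows_view = sliding_window_view(data_raw, seq_len, axis=0)

print(windows_view.shape)
print(np.array_equal(windows_loop, windows_view))
```

Note that sliding_window_view returns a read-only view of the original data, so call .copy() on the result if you need to modify it.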

Related

How to make a 2d numpy array from an empty numpy array by adding 1d numpy arrays?

So I'm trying to start with an empty numpy array, a = np.array([]), but when I append other numpy arrays (like [1, 2, 3, 4, 5, 6, 7, 8] and [9, 10, 11, 12, 13, 14, 15, 16]) to this array, the result I get is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16].
But what I want as the result is: [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
IIUC you want to keep adding lists to your np.array. In that case, you can use something like np.vstack to "append" the new lists to the array.
>>> a = np.array([[1, 2, 3],[4, 5, 6]])
>>> np.vstack([a, [7, 8, 9]])
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
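You can also use np.c_[], especially if a and b are already 1D arrays (it also works with lists). Note that np.c_ stacks its inputs as columns, so the result below is 8x2, which is the transpose of the 2x8 layout asked for in the question:
>>> a = [1, 2, 3, 4, 5, 6, 7, 8]
>>> b = [9, 10, 11, 12, 13, 14, 15, 16]
>>> np.c_[a, b]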
array([[ 1, 9],
[ 2, 10],
[ 3, 11],
[ 4, 12],
[ 5, 13],
[ 6, 14],
[ 7, 15],
[ 8, 16]])
It also works "multiple times":
>>> np.c_[np.c_[a, b], a, b]
array([[ 1, 9, 1, 9],
[ 2, 10, 2, 10],
[ 3, 11, 3, 11],
[ 4, 12, 4, 12],
[ 5, 13, 5, 13],
[ 6, 14, 6, 14],
[ 7, 15, 7, 15],
[ 8, 16, 8, 16]])
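To tie this back to the original question, here is a small sketch (the variable names r1, r2, flat, a and b are just for illustration) showing why appending to np.array([]) flattens everything, and two ways that should give the 2x8 result: starting from an empty array that already has 8 columns, or collecting the rows in a list and converting once at the end. The second is usually the cheaper choice when many rows are added, since np.vstack copies the whole array on every call.

```python
import numpy as np

r1 = np.array([1, 2, 3, 4, 5, 6, 7, 8])
r2 = np.array([9, 10, 11, 12, 13, 14, 15, 16])

# np.append without an axis flattens its inputs, hence the 1-D result.
flat = np.append(np.append(np.array([]), r1), r2)

# Option 1: start from an empty array that already has 8 columns.
a = np.empty((0, 8), dtype=int)
for row in (r1, r2):
    a = np.vstack([a, row])

# Option 2: collect the rows in a list and convert once at the end.
rows = [r1, r2]
b = np.array(rows)

print(flat.shape, a.shape, b.shape)
```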

set values of 2D numpy array by parameters

I need to create a 2D NumPy array from a known shape and these parameters:
import numpy as np
rows = 9
cols = 8
xrows = 3
xcols = 2
The desired result:
([[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 3, 3, 6, 6, 9, 9, 12, 12],
[ 3, 3, 6, 6, 9, 9, 12, 12],
[ 3, 3, 6, 6, 9, 9, 12, 12]])
It can be assumed that rows % xrows == 0 and cols % xcols == 0.
rows is the number of rows of the array, cols is the number of columns of the array, xrows is the number of rows in each slice, and xcols is the number of columns in each slice.
The answer should be general, not specific to these parameters.
IIUC, you can use:
((np.arange(rows//xrows*cols//xcols, dtype=int)+1)
.reshape((rows//xrows,cols//xcols), order='F')
.repeat(xrows,0)
.repeat(xcols,1)
)
Output:
array([[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 1, 1, 4, 4, 7, 7, 10, 10],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 2, 2, 5, 5, 8, 8, 11, 11],
[ 3, 3, 6, 6, 9, 9, 12, 12],
[ 3, 3, 6, 6, 9, 9, 12, 12],
[ 3, 3, 6, 6, 9, 9, 12, 12]])
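If you need this for several parameter sets, the same idea can be wrapped in a function. The sketch below is an alternative formulation, not the answer's code: block_labels is a hypothetical helper name, and it uses np.kron with a block of ones in place of the two repeat calls, which should give the same array.

```python
import numpy as np

def block_labels(rows, cols, xrows, xcols):
    """Hypothetical helper: number xrows-by-xcols blocks column by column."""
    if rows % xrows or cols % xcols:
        raise ValueError("rows and cols must be multiples of xrows and xcols")
    nr, nc = rows // xrows, cols // xcols
    # Fortran order fills the small label grid column by column: 1, 2, 3, ...
    base = np.arange(1, nr * nc + 1).reshape((nr, nc), order="F")
    # Each label is expanded into a constant xrows-by-xcols block.
    return np.kron(base, np.ones((xrows, xcols), dtype=int))

out = block_labels(9, 8, 3, 2)
print(out)
```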

Get ndarray from pandas column when cell elements are list

I have a pandas data frame that looks like this:
>>> df = pd.DataFrame({'a': list(range(10))})
>>> df['a'] = df.a.apply(lambda x: x*np.array([1,2,3]))
>>> df.head()
a
0 [0, 0, 0]
1 [1, 2, 3]
2 [2, 4, 6]
3 [3, 6, 9]
4 [4, 8, 12]
I would like to get column a from the df as a ndarray. But when I do that I get an array of arrays
>>> df.a.values
array([array([0, 0, 0]), array([1, 2, 3]), array([2, 4, 6]),
array([3, 6, 9]), array([ 4, 8, 12]), array([ 5, 10, 15]),
array([ 6, 12, 18]), array([ 7, 14, 21]), array([ 8, 16, 24]),
array([ 9, 18, 27])], dtype=object)
How can I get the returned output to be
array([[ 0, 0, 0],
[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12],
# ...
])
Using pandas,
df.a.apply(pd.Series).values
Using numpy,
np.vstack(df.a.values)
You get
array([[ 0, 0, 0],
[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18],
[ 7, 14, 21],
[ 8, 16, 24],
[ 9, 18, 27]])
Check
np.array(df['a'].tolist())
array([[ 0, 0, 0],
[ 1, 2, 3],
[ 2, 4, 6],
[ 3, 6, 9],
[ 4, 8, 12],
[ 5, 10, 15],
[ 6, 12, 18],
[ 7, 14, 21],
[ 8, 16, 24],
[ 9, 18, 27]], dtype=int64)
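As a quick sanity check, the sketch below rebuilds the frame from the question and compares the approaches. It assumes every cell holds an array of the same length (otherwise stacking fails), and the names col, via_vstack, via_stack and via_list are illustrative only. np.stack is included as one more option that was not mentioned above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": list(range(10))})
df["a"] = df["a"].apply(lambda x: x * np.array([1, 2, 3]))

col = df["a"].to_numpy()               # 1-D object array of ten small arrays
via_vstack = np.vstack(col)            # stack the cells as rows
via_stack = np.stack(list(col))        # same result via np.stack
via_list = np.array(df["a"].tolist())  # let NumPy infer the 2-D shape

print(col.shape, col.dtype)
print(via_vstack.shape)
```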

Numpy: efficiently expand a matrix from submatrix

Given a square array D of dimension (4x4):
array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
how can you, for each sub-matrix d of size (2x2), create a set of matrices D*, such that
D* = [...,
[[ 1, 2, 1, 2],
[ 3, 4, 3, 4],
[ 1, 2, 1, 2],
[ 3, 4, 3, 4]]
]
I then wish to build another square array out of D*, D**, such that:
D** = [[ 1, 2, 1, 2, 5, 6, 5, 6],
[ 3, 4, 3, 4, 7, 8, 7, 8],
[ 1, 2, 1, 2, 5, 6, 5, 6],
[ 3, 4, 3, 4, 7, 8, 7, 8],
[ 9, 10, 9, 10, 13, 14, 13, 14],
[ 11, 12, 11, 12, 15, 16, 15, 16],
[ 9, 10, 9, 10, 13, 14, 13, 14],
[ 11, 12, 11, 12, 15, 16, 15, 16]]
My actual starting matrix D has dimension 184x184, and I found that for-loops were too slow to achieve this. Is this too computationally intensive for numpy? Or is there a way to achieve this elegantly and efficiently?
Here is an example of the for-loop pseudo-code:
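import numpy
D_star = []  # stands in for D*, which is not a valid Python name
segments = [(0,0), (2,2), (0,2), (2, 0)]  # top-left corner of each 2x2 block
for seg in segments:
    actual_seg = D[seg[0]:seg[0]+2, seg[1]:seg[1]+2]
    # kron with a 2x2 block of ones tiles the segment twice in each direction
    D_star.append(numpy.kron(numpy.ones((2, 2), dtype=int), actual_seg))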
Given D and looking to expand each such (2x2) submatrix, one approach using a combination of np.tile and np.repeat would be -
m,n = D.shape
out = np.repeat(np.tile(D.reshape(m//2,2,n//2,2),2),2,axis=0).reshape(2*m,2*n)
Sample run -
In [116]: D
Out[116]:
array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
In [117]: m,n = D.shape
In [118]: np.repeat(np.tile(D.reshape(m//2,2,n//2,2),2),2,axis=0).reshape(2*m,2*n)
Out[118]:
array([[ 1, 2, 1, 2, 5, 6, 5, 6],
[ 3, 4, 3, 4, 7, 8, 7, 8],
[ 1, 2, 1, 2, 5, 6, 5, 6],
[ 3, 4, 3, 4, 7, 8, 7, 8],
[ 9, 10, 9, 10, 13, 14, 13, 14],
[11, 12, 11, 12, 15, 16, 15, 16],
[ 9, 10, 9, 10, 13, 14, 13, 14],
[11, 12, 11, 12, 15, 16, 15, 16]])
from __future__ import print_function
import numpy as np
def fn(m_):
    rows, cols = m_.shape
    assert rows%2==0 and cols%2==0
    return np.reshape(
        np.transpose(
            np.tile(
                np.transpose(
                    np.reshape(
                        m_,(rows//2,2, cols//2,2)
                    ),(0,2,1,3)),(2,2)), (0,2,1,3)
        ),
        (rows*2,cols*2)
    )
m = np.array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
print(fn(m))
output:
[[ 1 2 1 2 5 6 5 6]
[ 3 4 3 4 7 8 7 8]
[ 1 2 1 2 5 6 5 6]
[ 3 4 3 4 7 8 7 8]
[ 9 10 9 10 13 14 13 14]
[11 12 11 12 15 16 15 16]
[ 9 10 9 10 13 14 13 14]
[11 12 11 12 15 16 15 16]]
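Both answers hard-code a block size of 2 and two repetitions. A possible generalisation is sketched below, under the assumption that both dimensions of D are multiples of the block size. expand_blocks and expand_blocks_loop are hypothetical names, and the second function is only a slow reference (np.kron per block, assembled with np.block) used to check the vectorised version.

```python
import numpy as np

def expand_blocks(D, b=2, reps=2):
    """Tile every b-by-b block of D reps times in both directions (sketch)."""
    m, n = D.shape
    if m % b or n % b:
        raise ValueError("both dimensions of D must be multiples of b")
    # Axes become (block row, block col, row in block, col in block).
    blocks = D.reshape(m // b, b, n // b, b).transpose(0, 2, 1, 3)
    # np.tile with a 2-tuple repeats only the last two axes.
    tiled = np.tile(blocks, (reps, reps))
    return tiled.transpose(0, 2, 1, 3).reshape(m * reps, n * reps)

def expand_blocks_loop(D, b=2, reps=2):
    """Slow reference: np.kron per block, assembled with np.block."""
    ones = np.ones((reps, reps), dtype=D.dtype)
    m, n = D.shape
    return np.block([[np.kron(ones, D[i:i + b, j:j + b])
                      for j in range(0, n, b)]
                     for i in range(0, m, b)])

D = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 10, 13, 14],
              [11, 12, 15, 16]])
fast = expand_blocks(D)
slow = expand_blocks_loop(D)
print(np.array_equal(fast, slow))
```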

Numpy index, get bands of width 2

I am wondering if there is a way to index/slice a numpy array so that one gets every other band of 2 elements. In other words, given:
test = np.array([[1,2,3,4,5,6,7,8],[9,10,11,12,13,14,15,16]])
I would like to get the array:
[[1, 2, 5, 6],
[9, 10, 13, 14]]
Thoughts on how this can be accomplished with slicing/indexing?
Not that difficult with a few smart reshapes :)
test.reshape((4, 4))[:, :2].reshape((2, 4))
Given:
>>> test
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16]])
You can do:
>>> test.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
Which works even for different shapes of initial arrays:
>>> test2
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
>>> test2.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
>>> test3
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
>>> test3.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
How it works:
1. Reshape into two columns by however many rows:
>>> test.reshape(-1,2)
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12],
[13, 14],
[15, 16]])
2. Stride the array by taking every second row:
>>> test.reshape(-1,2)[::2]
array([[ 1, 2],
[ 5, 6],
[ 9, 10],
[13, 14]])
3. Set the shape you want of 4 columns, however many rows:
>>> test.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
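If you would rather stay with plain indexing, a boolean column mask is another option. The sketch below assumes a 2-D input; every_other_band and its width parameter are hypothetical names, not something from the answers above. Unlike the reshape approach, it should also work when the number of columns is not a multiple of the band width.

```python
import numpy as np

def every_other_band(a, width=2):
    """Hypothetical helper: keep column bands 0, 2, 4, ... of a 2-D array."""
    a = np.asarray(a)
    # Column j belongs to band j // width; keep the even-numbered bands.
    keep = (np.arange(a.shape[1]) // width) % 2 == 0
    return a[:, keep]

test = np.array([[1, 2, 3, 4, 5, 6, 7, 8],
                 [9, 10, 11, 12, 13, 14, 15, 16]])
bands = every_other_band(test)
print(bands)
```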
