If I have a large 2D numpy array and two arrays holding the x and y indices I want to extract, it's easy enough:
h = np.arange(49).reshape(7,7)
# h = [[0, 1, 2, 3, 4, 5, 6],
# [7, 8, 9, 10, 11, 12, 13],
# [14, 15, 16, 17, 18, 19, 20],
# [21, 22, 23, 24, 25, 26, 27],
# [28, 29, 30, 31, 32, 33, 34],
# [35, 36, 37, 38, 39, 40, 41],
# [42, 43, 44, 45, 46, 47, 48]]
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
reduced_h = h[x_indices, y_indices]
#reduced_h = [ 9, 24, 33]
However, for each x,y pair I would like to cut out a square surrounding this 'coordinate' (its size given by 'a', the number of indices in each direction from the centre) and return an array of these little 2D arrays.
For example, for h, x,y_indices as above and a=1:
reduced_h = [[[1,2,3],[8,9,10],[15,16,17]], [[16,17,18],[23,24,25],[30,31,32]], [[25,26,27],[32,33,34],[39,40,41]]]
i.e. one 3x3 array for each x-y index pair, corresponding to the 3x3 square of elements centred on that index. In general, this should return a numpy array of shape (len(x_indices), 2a+1, 2a+1).
By analogy to reduced_h[0] = h[x_indices[0]-1:x_indices[0]+2 , y_indices[0]-1:y_indices[0]+2] = h[1-1:1+2 , 2-1:2+2] = h[0:3, 1:4], my first try was the following:
h[x_indices-a : x_indices+a+1, y_indices-a : y_indices+a+1]
However, perhaps unsurprisingly, slicing between the arrays fails.
So the obvious next thing to try is to create this slice manually. np.arange seems to struggle with this but linspace works:
a=1
xrange = np.linspace(x_indices-a, x_indices+a, 2*a+1, dtype=int)
# xrange = [ [0, 2, 3], [1, 3, 4], [2, 4, 5] ]
yrange = np.linspace(y_indices-a, y_indices+a, 2*a+1, dtype=int)
Now I can try h[xrange,yrange], but this unsurprisingly indexes element-wise, meaning I get only one (2a+1)x(2a+1) array (the same dimensions as xrange and yrange). Is there a way to take the right slices from these ranges for every index (without loops)? Or is there a way to make the broadcast work initially without having to set up linspace explicitly? Thanks
You can index np.lib.stride_tricks.sliding_window_view using your x and y indices:
import numpy as np
h = np.arange(49).reshape(7,7)
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
a = 1
window = (2*a+1, 2*a+1)
out = np.lib.stride_tricks.sliding_window_view(h, window)[x_indices-a, y_indices-a]
out:
array([[[ 1, 2, 3],
[ 8, 9, 10],
[15, 16, 17]],
[[16, 17, 18],
[23, 24, 25],
[30, 31, 32]],
[[25, 26, 27],
[32, 33, 34],
[39, 40, 41]]])
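As an alternative that answers the "make the broadcast work" part of the question, here is a minimal sketch using plain integer-array indexing. The names offsets, rows, cols and patches are only illustrative. It assumes every window lies fully inside h: negative indices would silently wrap around, and indices past the end raise an IndexError.

```python
import numpy as np

h = np.arange(49).reshape(7, 7)
x_indices = np.array([1, 3, 4])
y_indices = np.array([2, 3, 5])
a = 1

# Offsets relative to each centre: [-a, ..., 0, ..., a]
offsets = np.arange(-a, a + 1)

# rows has shape (n, 2a+1, 1), cols has shape (n, 1, 2a+1);
# indexing with both broadcasts them to (n, 2a+1, 2a+1).
rows = x_indices[:, None, None] + offsets[None, :, None]
cols = y_indices[:, None, None] + offsets[None, None, :]

patches = h[rows, cols]
print(patches.shape)  # (3, 3, 3)
```

Unlike the sliding-window view, this produces a copy of the selected elements rather than a view.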
Note that you may need to pad h first to handle windows around your coordinates that reach "outside" h.
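A minimal sketch of that padding step, assuming a fill value of -1 is acceptable for cells outside h (the fill value and the name padded are my choices, not part of the question). Padding by a on every side shifts all coordinates by a, so the window centred on (x, y) in h starts at (x, y) in the padded array and no "-a" is needed when indexing. sliding_window_view requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

h = np.arange(49).reshape(7, 7)
a = 1

# Corner coordinates whose windows would otherwise reach outside h
x_indices = np.array([0, 6])
y_indices = np.array([0, 6])

# Pad a cells on every side; -1 marks positions that are not part of h
padded = np.pad(h, a, mode="constant", constant_values=-1)

windows = sliding_window_view(padded, (2 * a + 1, 2 * a + 1))
out = windows[x_indices, y_indices]
print(out)
```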
I'm struggling with tables for matplotlib (blume). The table is for an automation project that will produce 22 different maps. The code below produces a table with 49 rows. Some figures will only have 6 rows. When the number of rows exceeds 25 I would like to use two columns.
import pandas as pd
import matplotlib.pyplot as plt
from blume.table import table
# Dataframe
df=pd.DataFrame({'nr': [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
'KnNamn': ['Härryda', 'Partille', 'Öckerö', 'Stenungsund', 'Tjörn', 'Orust',
'Sotenäs', 'Munkedal', 'Tanum', 'Dals-Ed', 'Färgelanda', 'Ale',
'Lerum', 'Vårgårda', 'Bollebygd', 'Grästorp', 'Essunga',
'Karlsborg', 'Gullspång', 'Tranemo', 'Bengtsfors', 'Mellerud',
'Lilla Edet', 'Mark', 'Svenljunga', 'Herrljunga', 'Vara', 'Götene',
'Tibro', 'Töreboda', 'Göteborg', 'Mölndal', 'Kungälv', 'Lysekil',
'Uddevalla', 'Strömstad', 'Vänersborg', 'Trollhättan', 'Alingsås',
'Borås', 'Ulricehamn', 'Åmål', 'Mariestad', 'Lidköping', 'Skara',
'Skövde', 'Hjo', 'Tidaholm', 'Falköping'],
'rel': [0.03650425, 0.05022105, 0.03009109, 0.03966735, 0.02793296,
0.03690838, 0.04757161, 0.05607283, 0.0546372 , 0.05452821,
0.06640368, 0.04252673, 0.03677577, 0.05385784, 0.0407173 ,
0.04024881, 0.05613226, 0.04476127, 0.08543165, 0.04070175,
0.09281077, 0.08711656, 0.06111578, 0.04564958, 0.05058988,
0.04618078, 0.04640402, 0.04826498, 0.08514253, 0.07799246,
0.07829886, 0.04249149, 0.03909206, 0.06835601, 0.08027622,
0.07087295, 0.09013876, 0.1040369 , 0.05004451, 0.06584845,
0.04338739, 0.10570863, 0.0553109 , 0.05024871, 0.06531729,
0.05565605, 0.05041816, 0.04885198, 0.07954831]})
# Table
fig,ax = plt.subplots(1, figsize=(10, 7))
val =[]
ax.axis('off')
for i, j, k in zip(df.nr, df.KnNamn, df.rel):
    k = k*100
    k = round(k,2)
    k = (str(k) + ' %')
    temp = str(i)+'. ' +str(j)+': ' + str(k)
    val.append(temp)
val=[[el] for el in val]
#val=val[0] + val[1]
tab=table(ax,cellText=val,
#rowLabels=row_lab,
colLabels=['Relativ arbetslöshet'], loc='left', colWidths=[0.3], cellLoc='left')
plt.show()
As I understand it, if I want a table with two columns, my val object should be structured in a different way. In the case above, val is a nested list with 49 lists inside. I figure I need to merge lists. I tried this pairwise for loop, but it didn't work with range.
I'm sure there is a simple solution to this problem I have. Help would be much appreciated.
for i, j in zip(range(len(val)), range(len(val))[1:] + range(len(val))[:1]):
    print(i, j)
I don't know if it is what you need, but you could use zip() or, better, itertools.zip_longest() with values[:25], values[25:], where values is the flat list of cell strings (not the nested val list from the question):
import itertools
two_columns = []
for col1, col2 in itertools.zip_longest(values[:25], values[25:], fillvalue=''):
    #print(f'{col1:25} | {col2}')
    two_columns.append([col1, col2])
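To see what zip_longest does here without any plotting, a small sketch with made-up values (the letters and the split point of 3 are only for illustration): the shorter second half is padded with the fillvalue so every table row still has two cells.

```python
from itertools import zip_longest

values = ['a', 'b', 'c', 'd', 'e']
split = 3  # number of rows in the first column

# Pair the first `split` items with the remainder; pad the short side with ''
two_columns = [list(pair)
               for pair in zip_longest(values[:split], values[split:], fillvalue='')]

print(two_columns)
# [['a', 'd'], ['b', 'e'], ['c', '']]
```

With plain zip() the last row would be dropped, because zip() stops at the shorter input.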
Full working example
import pandas as pd
import matplotlib.pyplot as plt
from blume.table import table
import itertools
df = pd.DataFrame({
'nr': [
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
],
'KnNamn': [
'Härryda', 'Partille', 'Öckerö', 'Stenungsund', 'Tjörn', 'Orust',
'Sotenäs', 'Munkedal', 'Tanum', 'Dals-Ed', 'Färgelanda', 'Ale',
'Lerum', 'Vårgårda', 'Bollebygd', 'Grästorp', 'Essunga',
'Karlsborg', 'Gullspång', 'Tranemo', 'Bengtsfors', 'Mellerud',
'Lilla Edet', 'Mark', 'Svenljunga', 'Herrljunga', 'Vara', 'Götene',
'Tibro', 'Töreboda', 'Göteborg', 'Mölndal', 'Kungälv', 'Lysekil',
'Uddevalla', 'Strömstad', 'Vänersborg', 'Trollhättan', 'Alingsås',
'Borås', 'Ulricehamn', 'Åmål', 'Mariestad', 'Lidköping', 'Skara',
'Skövde', 'Hjo', 'Tidaholm', 'Falköping'
],
'rel': [
0.03650425, 0.05022105, 0.03009109, 0.03966735, 0.02793296,
0.03690838, 0.04757161, 0.05607283, 0.0546372 , 0.05452821,
0.06640368, 0.04252673, 0.03677577, 0.05385784, 0.0407173 ,
0.04024881, 0.05613226, 0.04476127, 0.08543165, 0.04070175,
0.09281077, 0.08711656, 0.06111578, 0.04564958, 0.05058988,
0.04618078, 0.04640402, 0.04826498, 0.08514253, 0.07799246,
0.07829886, 0.04249149, 0.03909206, 0.06835601, 0.08027622,
0.07087295, 0.09013876, 0.1040369 , 0.05004451, 0.06584845,
0.04338739, 0.10570863, 0.0553109 , 0.05024871, 0.06531729,
0.05565605, 0.05041816, 0.04885198, 0.07954831
]
})
# df = df[:25] # test for 25 rows
# ---
fig, ax = plt.subplots(1, figsize=(10, 7))
ax.axis('off')
# --- values ---
#values = []
#for number, name, rel in zip(df.nr, df.KnNamn, df.rel):
#    text = f'{number}. {name}: {rel*100:.2f} %'
#    values.append(text)
# .2f gives two decimals (3.65 %); a bare .2 would mean two significant digits (3.7 %)
values = df.apply(lambda row: f'{row["nr"]}. {row["KnNamn"]}: {row["rel"]*100:.2f} %', axis=1).values
# --- columns ---
if len(values) > 25:
    two_columns = []
    for col1, col2 in itertools.zip_longest(values[:25], values[25:], fillvalue=''):
        #print(f'{col1:25} | {col2}')
        two_columns.append([col1, col2])
    tab = table(ax, cellText=two_columns,
                #rowLabels=row_lab,
                colLabels=['Col1', 'Col2'], colWidths=[0.3, 0.3], loc=-100, cellLoc='left')
else:
    one_column = [[item] for item in values]
    tab = table(ax, cellText=one_column,
                #rowLabels=row_lab,
                colLabels=['Col1'], colWidths=[0.3], loc=-100, cellLoc='left')
# --- plot ---
plt.show()
EDIT:
A more universal version, which can create any number of columns.
This example automatically creates 3 columns for ROWS = 20.
import pandas as pd
import matplotlib.pyplot as plt
from blume.table import table
import itertools
df = pd.DataFrame({
'nr': [
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
],
'KnNamn': [
'Härryda', 'Partille', 'Öckerö', 'Stenungsund', 'Tjörn', 'Orust',
'Sotenäs', 'Munkedal', 'Tanum', 'Dals-Ed', 'Färgelanda', 'Ale',
'Lerum', 'Vårgårda', 'Bollebygd', 'Grästorp', 'Essunga',
'Karlsborg', 'Gullspång', 'Tranemo', 'Bengtsfors', 'Mellerud',
'Lilla Edet', 'Mark', 'Svenljunga', 'Herrljunga', 'Vara', 'Götene',
'Tibro', 'Töreboda', 'Göteborg', 'Mölndal', 'Kungälv', 'Lysekil',
'Uddevalla', 'Strömstad', 'Vänersborg', 'Trollhättan', 'Alingsås',
'Borås', 'Ulricehamn', 'Åmål', 'Mariestad', 'Lidköping', 'Skara',
'Skövde', 'Hjo', 'Tidaholm', 'Falköping'
],
'rel': [
0.03650425, 0.05022105, 0.03009109, 0.03966735, 0.02793296,
0.03690838, 0.04757161, 0.05607283, 0.0546372 , 0.05452821,
0.06640368, 0.04252673, 0.03677577, 0.05385784, 0.0407173 ,
0.04024881, 0.05613226, 0.04476127, 0.08543165, 0.04070175,
0.09281077, 0.08711656, 0.06111578, 0.04564958, 0.05058988,
0.04618078, 0.04640402, 0.04826498, 0.08514253, 0.07799246,
0.07829886, 0.04249149, 0.03909206, 0.06835601, 0.08027622,
0.07087295, 0.09013876, 0.1040369 , 0.05004451, 0.06584845,
0.04338739, 0.10570863, 0.0553109 , 0.05024871, 0.06531729,
0.05565605, 0.05041816, 0.04885198, 0.07954831
]
})
#df = df[:25] # test for 25 rows
# ---
fig, ax = plt.subplots(1, figsize=(10, 7))
ax.axis('off')
# --- values ---
def convert(row):
    # .2f gives two decimals; a bare .2 would mean two significant digits
    return f'{row["nr"]}. {row["KnNamn"]}: {row["rel"]*100:.2f} %'
values = df.apply(convert, axis=1).values
# --- columns ---
ROWS = 20
#ROWS = 25
columns = []
for idx in range(0, len(values), ROWS):
    columns.append(values[idx:idx+ROWS])
columns_widths = [0.3] * len(columns)
columns_labels = [f'Col{i}' for i in range(1, len(columns)+1)]
rows = list(itertools.zip_longest(*columns, fillvalue=''))
# --- plot ---
tab = table(ax,
cellText=rows,
#rowLabels=row_lab,
colLabels=columns_labels,
colWidths=columns_widths,
loc=-100,
cellLoc='left')
plt.show()
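The column-splitting logic can also be pulled out into a small function that is easy to check without matplotlib or blume. This is only a sketch; split_into_columns is a hypothetical helper name, not something either library provides.

```python
from itertools import zip_longest

def split_into_columns(values, rows_per_column):
    """Return table rows so that `values` fills columns top to bottom."""
    # Cut the flat sequence into consecutive chunks, one chunk per column
    columns = [values[i:i + rows_per_column]
               for i in range(0, len(values), rows_per_column)]
    # Transpose the chunks into rows, padding the last column with ''
    return [list(row) for row in zip_longest(*columns, fillvalue='')]

items = [f'item{i}' for i in range(1, 8)]   # 7 made-up cell strings
rows = split_into_columns(items, 3)
for row in rows:
    print(row)
```

The number of columns is len(rows[0]), which can be used to build colLabels and colWidths in the same way as in the example above.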
I have written code that builds an array from my lists. The array must be vertical, like a column vector, but calling reshape does not seem to change anything.
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
index = []
for i in range(len(data)):
    index.append([data[i][0], data[i][1], data[i][2],
                  data[i][3], data[i][4], data[i][5]])
y = np.array([index[i]])
# y.reshape(6,1)
Is there any solution for these cases? Thank you.
I'm looking for something like this to remain:
If you want to view each row as a column, first convert data to an array (data = np.array(data)), then transpose it in any one of the following ways:
index = data.T
index = np.transpose(data)
index = data.transpose()
index = np.swapaxes(data, 0, 1)
index = np.moveaxis(data, 1, 0)
...
Each column of index will be a row of data. If you just want to access one column at a time, you can do that too. For example, to get row 3 (4th row) of the original array, any of the following would work:
y = data[3, :]
y = data[3]
y = index[:, 3]
You can get a column vector from the result by explicitly reshaping it to one:
y = y.reshape(-1, 1)
y = np.reshape(y, (-1, 1))
y = np.expand_dims(y, 1)
Remember that reshaping creates a new array object which, whenever possible, views the same data as the original (NumPy makes a copy only when a view cannot be constructed). The only way I know to reshape an array in-place is to assign to its shape attribute:
y.shape = (y.size, 1)
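Putting these pieces together on the data from the question, a short sketch (the variable name col is mine). It also shows why a bare y.reshape(6, 1) appears to do nothing: the result is returned as a new array and has to be assigned.

```python
import numpy as np

data = np.array([[28, 29, 30, 19, 20, 21],
                 [31, 32, 33, 22, 23, 24],
                 [ 1, 34, 35, 36, 25, 26],
                 [ 2, 19, 20, 21, 10, 11],
                 [ 3,  4,  5,  6,  7,  8]])

y = data[3]              # 4th row, shape (6,)
y.reshape(6, 1)          # result is discarded, y is unchanged
print(y.shape)           # (6,)

col = y.reshape(-1, 1)   # keep the result: a (6, 1) column vector
print(col.shape)         # (6, 1)

# col is a view here, so it shares memory with data
print(np.shares_memory(col, data))  # True
```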
You can use flatten() from numpy https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html
(if you want a copy of the original array without modifying the original)
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
data = np.array(data).flatten()
print(data.shape)
(30,)
You can also use ravel()
(if you don't want a copy)
data = np.array(data).ravel()
If your array is always 2-D, this also works:
data = data.reshape(-1)
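A quick way to check the copy/view difference between these options is np.shares_memory. This sketch assumes a freshly created, C-contiguous array, which is the case for np.array(data) here; for non-contiguous input, ravel() and reshape(-1) may have to copy as well.

```python
import numpy as np

data = np.array([[28, 29, 30, 19, 20, 21],
                 [31, 32, 33, 22, 23, 24],
                 [ 1, 34, 35, 36, 25, 26],
                 [ 2, 19, 20, 21, 10, 11],
                 [ 3,  4,  5,  6,  7,  8]])

flat_copy = data.flatten()    # always a copy
flat_view = data.ravel()      # a view when possible
flat_view2 = data.reshape(-1) # also a view when possible

print(np.shares_memory(flat_copy, data))   # False
print(np.shares_memory(flat_view, data))   # True
print(np.shares_memory(flat_view2, data))  # True

# All three are 1-D with shape (30,); use reshape(-1, 1) for a column vector
column = data.reshape(-1, 1)
print(column.shape)  # (30, 1)
```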
So I have an array v of 5 integers and another array u of 10 integers.
I have a 5 by 10 matrix P that I want to fill so that P[i, j] = v[i] + u[j]
I tried:
P = np.empty((len(asset_grid),len(asset_grid)))
for i in range(asset_grid):
    for j in range(asset_grid):
        P[i,j] = asset_grid[i] + asset_grid[j]
but it gives me an error
TypeError: only integer arrays with one element can be converted to an index
How can I do this in Python? I apologize if my approach is too naive; I am used to Matlab and am now slowly learning Python. Any help is appreciated.
Broadcasting is what you want to do. Although for small arrays such as yours, it doesn't make a difference, it makes a significant difference with larger arrays:
>>> arr1 = np.arange(5)
>>> arr2 = np.arange(10,20)
>>> arr1[:,None] + arr2
array([[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
[13, 14, 15, 16, 17, 18, 19, 20, 21, 22],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
Generally with numpy you want to avoid iteration over rows and columns and use vectorized/broadcasted operations. This is where speed improvements actually come from.
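For comparison, here is a sketch of the loop from the question with the error fixed, next to the broadcast version. The arrays v and u are made-up stand-ins for the asker's data. The TypeError came from range(asset_grid): range() needs an integer such as len(asset_grid), not the array itself.

```python
import numpy as np

v = np.arange(5)        # 5 integers (illustrative values)
u = np.arange(10, 20)   # 10 integers (illustrative values)

# Explicit loops: range() takes the lengths, not the arrays
P_loop = np.empty((len(v), len(u)), dtype=int)
for i in range(len(v)):
    for j in range(len(u)):
        P_loop[i, j] = v[i] + u[j]

# Broadcasting: (5, 1) + (10,) -> (5, 10)
P_broadcast = v[:, None] + u

print(np.array_equal(P_loop, P_broadcast))  # True
```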
So, elaborating based on your comment:
Say P_ij is ith element of x raised to the 4th power minus jth element of y raised to 2nd power
In general, NumPy supports most arithmetical operations you would want in a vectorized way, using the usual Python operators:
>>> arr1[:, None]**4 - arr2**2
array([[-100, -121, -144, -169, -196, -225, -256, -289, -324, -361],
[ -99, -120, -143, -168, -195, -224, -255, -288, -323, -360],
[ -84, -105, -128, -153, -180, -209, -240, -273, -308, -345],
[ -19, -40, -63, -88, -115, -144, -175, -208, -243, -280],
[ 156, 135, 112, 87, 60, 31, 0, -33, -68, -105]])
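If the [:, None] indexing feels opaque, the same table can be written with the outer method that NumPy ufuncs provide. A short sketch, offered as an equivalent spelling rather than a better one:

```python
import numpy as np

arr1 = np.arange(5)
arr2 = np.arange(10, 20)

# P[i, j] = arr1[i]**4 - arr2[j]**2, via broadcasting
P_broadcast = arr1[:, None]**4 - arr2**2

# Same result: apply subtraction to every pair (a, b) of the two inputs
P_outer = np.subtract.outer(arr1**4, arr2**2)

print(P_outer.shape)                          # (5, 10)
print(np.array_equal(P_broadcast, P_outer))   # True
```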