Assume I have a 2D array in Python and I add some padding. How can I iterate over the new padded area only?
For example
1 2 3
4 5 6
7 8 9
Becomes
x x x x x x x
x x x x x x x
x x 1 2 3 x x
x x 4 5 6 x x
x x 7 8 9 x x
x x x x x x x
x x x x x x x
How can I loop over only the x's?
Not sure if I understand what you are trying to do, but if you are using numpy, you can use masks:
import numpy as np
arr = np.array(np.arange(1,10)).reshape(3,3)
# mask full of True's
mask = np.ones((7,7),dtype=bool)
# setting the interior of the mask as False
mask[2:-2,2:-2] = False
# using zero padding as example
pad_arr = np.zeros((7,7))
pad_arr[2:-2,2:-2] = arr
print(pad_arr)
# loop for elements of the padding, where mask == True
for value in pad_arr[mask]:
    print(value)
Returns:
[[0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0.]
[0. 0. 1. 2. 3. 0. 0.]
[0. 0. 4. 5. 6. 0. 0.]
[0. 0. 7. 8. 9. 0. 0.]
[0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0.]]
followed by 0.0 printed 40 times (once for each padded value)
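If the positions of the padded cells are needed (for example, to write to them) and not just their values, the same mask idea can be combined with np.pad and np.argwhere. This is only a sketch: the padding width `pad` and the marker value -1 are assumptions chosen for illustration.

```python
import numpy as np

arr = np.arange(1, 10).reshape(3, 3)
pad = 2  # assumed padding width on every side

# Zero-pad the array on all four sides
padded = np.pad(arr, pad, mode="constant", constant_values=0)

# True on the padding, False on the original interior
mask = np.ones(padded.shape, dtype=bool)
mask[pad:-pad, pad:-pad] = False

# (row, col) coordinates of every padded cell
padding_coords = [(int(r), int(c)) for r, c in np.argwhere(mask)]

for r, c in padding_coords:
    padded[r, c] = -1  # example action: mark each padded cell

print(padded)
```

Indexing with the mask (padded[mask]) gives the values, while np.argwhere(mask) gives the coordinates, so the loop can modify the padded area in place.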
import numpy as np
x = np.ones((5,5))
print(x)
x[1:-1,1:-1] = 0
print(x)
I am getting the output as shown below:
[[1. 1. 1. 1. 1.]
[1. 0. 0. 0. 1.]
[1. 0. 0. 0. 1.]
[1. 0. 0. 0. 1.]
[1. 1. 1. 1. 1.]]
You can do it using astype, setting it to int:
print(x.astype(int))
Result:
[[1 1 1 1 1]
[1 0 0 0 1]
[1 0 0 0 1]
[1 0 0 0 1]
[1 1 1 1 1]]
I think you are referring to the "1." notation. A trailing dot means the number is a float.
If you don't want floats, cast your array to integers. Note that astype returns a new array and leaves x unchanged:
x.astype(int)
A few things to try in the Python console to see the difference:
print(type(1))
print(type(1.))
print(x.dtype)
print(x.astype(int).dtype)
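As an alternative to converting afterwards, the array can be created with an integer dtype in the first place. A minimal sketch, reusing the 5x5 example from the question (the names `y` and `y_int` are just for illustration):

```python
import numpy as np

# Create the array as integers from the start
x = np.ones((5, 5), dtype=int)
x[1:-1, 1:-1] = 0
print(x)

# astype returns a converted copy; the original keeps its float dtype
y = np.ones((5, 5))
y_int = y.astype(int)
print(y.dtype, y_int.dtype)
```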
I want to create two matrices, then change numbers in the second matrix depending on the numbers in the first. So I write an if statement that tests the first matrix and, if it is true, changes the second matrix. However, this changes both matrices.
My code works perfectly with single numbers. The problem only occurs when I apply it to matrices.
import numpy as np
n = 3
matr = np.zeros((n,n))
matr[0][0] = 1
matr2 = matr
print(matr)
[[1. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
print(matr2)
[[1. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
if matr[0][0] == 1:
    matr2[0][0] = 9
print(matr)
[[9. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
print(matr2)
[[9. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
Because "matr" is never assigned to in my if statement, it shouldn't be altered, right? With plain numbers, the same pattern behaves as I expect:
x = 1
y = x
if x == 1:
    y = 9
print(x)
1
print(y)
9
Those two variables are just two references to the same matrix, not two different matrices; matr2 = matr only creates a new name for the existing matrix.
The statement matr2[0][0] = 9 modifies the one and only matrix that exists in your example, so it is exactly the same as writing matr[0][0] = 9. In the integer example, y = 9 rebinds the name y to a new object instead of modifying the object it shares with x, which is why x stays 1.
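To get an independent second matrix, make an explicit copy. A small sketch of the difference, assuming the same 3x3 setup as in the question (the names `alias` and `independent` are made up for illustration):

```python
import numpy as np

matr = np.zeros((3, 3))
matr[0, 0] = 1

alias = matr               # second name for the same array
independent = matr.copy()  # separate array with its own data

if matr[0, 0] == 1:
    independent[0, 0] = 9  # only the copy changes

print(matr[0, 0])          # 1.0
print(independent[0, 0])   # 9.0
print(alias is matr)       # True
```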
Say I have two options for generating the adjacency matrix of a network: nx.adjacency_matrix() and my own code. I wanted to test the correctness of my code and came across some strange discrepancies.
Example: a 3x3 lattice network.
import networkx as nx
import matplotlib.pyplot as plt
N=3
G=nx.grid_2d_graph(N,N)
pos = dict( (n, n) for n in G.nodes() )
labels = dict( ((i,j), i + (N-1-j) * N ) for i, j in G.nodes() )
nx.relabel_nodes(G,labels,False)
# dict views have no .sort() method in Python 3, so use sorted()
inds=sorted(labels.keys())
vals=sorted(labels.values())
pos2=dict(zip(vals,inds))
plt.figure()
nx.draw_networkx(G, pos=pos2, with_labels=True, node_size = 200)
This is the visualization:
The adjacency matrix with nx.adjacency_matrix():
B=nx.adjacency_matrix(G)
B1=B.todense()
[[0 0 0 0 0 1 0 0 1]
[0 0 0 1 0 1 0 0 0]
[0 0 0 1 0 1 0 1 1]
[0 1 1 0 0 0 1 0 0]
[0 0 0 0 0 0 0 1 1]
[1 1 1 0 0 0 0 0 0]
[0 0 0 1 0 0 0 1 0]
[0 0 1 0 1 0 1 0 0]
[1 0 1 0 1 0 0 0 0]]
According to it, node 0 (entire 1st row and entire 1st column) is connected to nodes 5 and 8. But if you look at the image above this is wrong, as it connects to nodes 1 and 3.
Now my code (to be run in the same script as the above):
import numpy
import math
P=3
def nodes_connected(i, j):
    try:
        if i in G.neighbors(j):
            return 1
    except nx.NetworkXError:
        return False
A=numpy.zeros((P*P,P*P))
for i in range(0,P*P,1):
    for j in range(0,P*P,1):
        if i not in G.nodes():
            A[i][:]=0
            A[:][i]=0
        elif i in G.nodes():
            A[i][j]=nodes_connected(i,j)
            A[j][i]=A[i][j]
for i in range(0,P*P,1):
    for j in range(0,P*P,1):
        if math.isnan(A[i][j]):
            A[i][j]=0
print(A)
This yields:
[[ 0. 1. 0. 1. 0. 0. 0. 0. 0.]
[ 1. 0. 1. 0. 1. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 1. 0. 0. 0.]
[ 1. 0. 0. 0. 1. 0. 1. 0. 0.]
[ 0. 1. 0. 1. 0. 1. 0. 1. 0.]
[ 0. 0. 1. 0. 1. 0. 0. 0. 1.]
[ 0. 0. 0. 1. 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 1. 0. 1. 0. 1.]
[ 0. 0. 0. 0. 0. 1. 0. 1. 0.]]
which says that node 0 is connected to nodes 1 and 3. Why does this difference exist? What is going wrong here?
Networkx doesn't know what order you want the nodes to be in.
Here is how to call it: adjacency_matrix(G, nodelist=None, weight='weight').
If you want a specific order, set nodelist to be a list in that order.
So for example adjacency_matrix(G, nodelist=range(9)) should get what you want.
Why is this? Well, because a graph can have just about anything as its nodes (anything hashable). One of your nodes could have been "parrot" or (1,2). So it stores the nodes as keys in a dict, rather than assuming it's the non-negative integers starting at 0. Dict keys have an arbitrary order.
A more general solution, if your nodes have some logical ordering, as is the case when you generate a graph using G=nx.grid_2d_graph(3,3) (which returns tuples from (0,0) to (2,2)) or as in your example, would be to use:
adjacency_matrix(G,nodelist=sorted(G.nodes()))
This sorts the list of nodes of G and passes it as the nodelist.
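As a quick check of the nodelist suggestion, here is a sketch that rebuilds the 3x3 lattice from the question and reads off the neighbours of node 0. It assumes networkx and scipy are installed, and it relabels into a copy instead of in place; the variable name `neighbours_of_0` is only illustrative.

```python
import networkx as nx

N = 3
G = nx.grid_2d_graph(N, N)
labels = {(i, j): i + (N - 1 - j) * N for i, j in G.nodes()}
G = nx.relabel_nodes(G, labels)  # returns a relabelled copy

# Rows and columns follow the order given in nodelist
A = nx.adjacency_matrix(G, nodelist=sorted(G.nodes())).toarray()

# Column indices of the non-zero entries in row 0
neighbours_of_0 = [int(k) for k in A[0].nonzero()[0]]
print(neighbours_of_0)  # [1, 3]
```

With the explicit ordering, row 0 agrees with the hand-built matrix: node 0 is connected to nodes 1 and 3.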
When using scipy, I was able to transform my data in the following format:
(row, col) (weight)
(0, 0) 5
(0, 47) 5
(0, 144) 5
(0, 253) 4
(0, 513) 5
...
(6039, 3107) 5
(6039, 3115) 3
(6039, 3130) 4
(6039, 3132) 2
How can I transform this into an array or sparse matrix with zeros for missing weight values as such? (based on the data above, column 1 to 46 should be filled with zeros, and so on...)
0 1 2 3 ... 47 48 49 50
1 [0 0 0 0 ... 5 0 0 0 0
2 2 0 1 0 ... 4 0 5 0 0
3 3 1 0 5 ... 1 0 0 4 2
4 0 0 0 4 ... 5 0 1 3 0
5 5 1 5 4 ... 0 0 3 0 1]
I know it is better in terms of memory to keep the data in the format above, but I need it as a matrix for experimentation.
scipy.sparse does it for you.
import numpy as np
from scipy.sparse import dok_matrix
your_data = [((2, 7), 1)]
XDIM, YDIM = 10, 10 # Replace with your values
smat = dok_matrix((XDIM, YDIM))
# Assign entries directly; dok_matrix.update() may raise NotImplementedError in newer SciPy versions
for (row, col), weight in your_data:
    smat[row, col] = weight
dense = smat.toarray()
print(dense)
'''
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
'''
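If the rows, columns and weights are available as three separate sequences, scipy.sparse.coo_matrix can build the matrix in one call. This is only a sketch: the triplets and the shape below are made-up sample values standing in for the real data. If the object printed in the question is already a scipy sparse matrix, calling .toarray() on it may be all that is needed.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical sample triplets standing in for the real (row, col) weight data
rows = np.array([0, 0, 2])
cols = np.array([0, 3, 1])
weights = np.array([5, 4, 2])

# Entries not listed in the triplets are implicitly zero
sparse = coo_matrix((weights, (rows, cols)), shape=(3, 5))
dense = sparse.toarray()
print(dense)
```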
I prefer Python over R for my work. From time to time I need to use R
functions, so I have started trying Rpy2 for that purpose.
I tried but failed to find out how to replicate the following with Rpy2:
design <- model.matrix(~Subject+Treat)
I have gone as far as this:
import rpy2.robjects as robjects
fmla = robjects.Formula('~subject+treatment')
env = fmla.environment
env['subject'] = sbj_group
env['treatment'] = trt_group
from what I saw here.
But I could not find how to perform model.matrix. I tried a couple of different ways:
robjects.r.model_matrix(fmla)
robjects.r('model.matrix(%s)' %fmla.r_repr())
As you can see none of them is right.
I am new to Rpy2, and fairly inexperienced in R. Any help would be appreciated!
You could evaluate strings as R code:
import numpy as np
import rpy2.robjects as ro
import rpy2.robjects.numpy2ri
ro.numpy2ri.activate()
R = ro.r
subject = np.repeat([1,2,3], 4)
treatment = np.tile([1,2,3,4], 3)
R.assign('subject', subject)
R.assign('treatment', treatment)
R('subject <- as.factor(subject)')
R('treatment <- as.factor(treatment)')
R('design <- model.matrix(~subject+treatment)')
R('print(design)')
yields
(Intercept) subject2 subject3 treatment2 treatment3 treatment4
1 1 0 0 0 0 0
2 1 0 0 1 0 0
3 1 0 0 0 1 0
4 1 0 0 0 0 1
5 1 1 0 0 0 0
6 1 1 0 1 0 0
7 1 1 0 0 1 0
8 1 1 0 0 0 1
9 1 0 1 0 0 0
10 1 0 1 1 0 0
11 1 0 1 0 1 0
12 1 0 1 0 0 1
attr(,"assign")
[1] 0 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$subject
[1] "contr.treatment"
attr(,"contrasts")$treatment
[1] "contr.treatment"
R(...) returns objects which you can manipulate on the Python side.
For example,
design = R('model.matrix(~subject+treatment)')
assigns a rpy2.robjects.vectors.Matrix to design.
arr = np.array(design)
makes arr the NumPy array
[[ 1. 0. 0. 0. 0. 0.]
[ 1. 0. 0. 1. 0. 0.]
[ 1. 0. 0. 0. 1. 0.]
[ 1. 0. 0. 0. 0. 1.]
[ 1. 1. 0. 0. 0. 0.]
[ 1. 1. 0. 1. 0. 0.]
[ 1. 1. 0. 0. 1. 0.]
[ 1. 1. 0. 0. 0. 1.]
[ 1. 0. 1. 0. 0. 0.]
[ 1. 0. 1. 1. 0. 0.]
[ 1. 0. 1. 0. 1. 0.]
[ 1. 0. 1. 0. 0. 1.]]
The column names can be accessed with
np.array(design.colnames)
# array(['(Intercept)', 'subject2', 'subject3', 'treatment2', 'treatment3',
# 'treatment4'],
# dtype='|S11')