Fastest way to create a matrix full of the same number - python

I would like to create an n x n matrix, with n being a large number. What is the fastest way to do this in Sage?
For example, for n = 3 I would like a matrix like [(3,3,3),(3,3,3),(3,3,3)].
I currently do this with
ones_matrix(n) * somenumber
However, this takes a while for a big n. Is there a faster way to do this in Sage?
Thanks for helping.

You can use the numpy.full() function like so:
>>> import numpy as np
>>> arr = np.full((3, 3), 3)
>>> print(arr)
[[3 3 3]
[3 3 3]
[3 3 3]]
>>> arr2 = np.full((3, 3), 7)
>>> print(arr2)
[[7 7 7]
[7 7 7]
[7 7 7]]

A shortcut way would be:
n = int(input())
tup = (n,) * n
# gives (n, n, ..., n), n times
ar = list((tup,) * n)
# gives ar = [(n, ..., n), (n, ..., n), ..., (n, ..., n)]
Or, doing it in a single stroke:
ar = list(((n,) * n,) * n)

If you want to work with Sage matrices, you can do this:
sage: M = MatrixSpace(ZZ, 10000, 10000)
sage: a = M.matrix([3]*10000*10000)
On my computer, this takes about 6 seconds: the same time as ones_matrix(10000), and faster than 3 * ones_matrix(10000). It is not nearly as fast as the numpy solution, but then the result is a Sage matrix. Note that if you want to work with non-integer entries, you should change the ZZ to the appropriate ring.

Related

Get part of np array with parameters

I am using Python and NumPy with an n-dimensional array.
I want to select all elements with an index like
arr[a,b,:,c]
but I want to be able to pass the position of the slice as a parameter. For example:
#pos =2
arr[a,b,:,c]
#pos =1
arr[a,:,b,c]
I would move the axis of interest (at pos) to the front with numpy.moveaxis(array, pos, 0) and then simply slice with [:,a,b,c].
There is also numpy.take, but in your case you would still need to loop over each dimension a, b, c, so I think moveaxis is more convenient. Maybe there is an even more direct way to do this.
The idea of moving the slicing axis to one end is a good one. Various numpy functions use that idea.
In [171]: arr = np.ones((2,3,4,5),int)
In [172]: arr[0,0,:,0].shape
Out[172]: (4,)
In [173]: arr[0,:,0,0].shape
Out[173]: (3,)
Another idea is to build an indexing tuple:
In [176]: idx = (0,0,slice(None),0)
In [177]: arr[idx].shape
Out[177]: (4,)
In [178]: idx = (0,slice(None),0,0)
In [179]: arr[idx].shape
Out[179]: (3,)
To do this programmatically it may be easier to start with a list or array that can be modified, and then convert it to a tuple for indexing. Details will vary depending on how you prefer to specify the axis and variables.
If any of a,b,c are arrays (or lists), you may get some shape surprises, since it's a case of mixing advanced and basic indexing. But as long as they are scalars, that's not an issue.
You could np.transpose the array arr before you try to slice it, since this lets you move your axis of interest (i.e. the :) "to the back". This way you can rearrange arr such that you can always call arr[a,b,c].
Example with only a and b:
import numpy as np
a = 0
b = 2
target_axis = 1
# Generate some random data
arr = np.random.randint(10, size=[3, 3, 3], dtype=int)
print(arr)
#[[[0 8 2]
# [3 9 4]
# [0 3 6]]
#
# [[8 5 4]
# [9 8 5]
# [8 6 1]]
#
# [[2 2 5]
# [5 3 3]
# [9 1 8]]]
# Define transpose s.t. target_axis is the last axis
transposed_shape = np.arange(arr.ndim)
transposed_shape = np.delete(transposed_shape, target_axis)
transposed_shape = np.append(transposed_shape, target_axis)
print(transposed_shape)
#[0 2 1]
# Caution! These 0 and 2 above do not come from a or b.
# Instead they are the indices of the axes.
# Transpose arr
arr_T = np.transpose(arr, transposed_shape)
print(arr_T)
#[[[0 3 0]
# [8 9 3]
# [2 4 6]]
#
# [[8 9 8]
# [5 8 6]
# [4 5 1]]
#
# [[2 5 9]
# [2 3 1]
# [5 3 8]]]
print(arr_T[a,b])
#[2 4 6]

Is there any Python equivalent to sort([op; op+1])?

As the title says, in Matlab I have
op = [1 3 5]
[op; op+1]
opf = sort([op; op+1])
opf = [1 2 3 4 5 6]
Reading the docs, I've discovered that ; can signify the end of a row. However, I don't know if that's the case here, since it's inside square brackets, and in Matlab there are often far too many ways of doing the same thing. Not that that's necessarily bad, but it's at least a bit confusing for students like me.
To replicate this in Python I did
opf = np.sort([op, op+1])
but opf's shape is wrong. I in fact get [[1 3 5] [2 4 6]], which is (2,3). opf should instead be [1 2 3 4 5 6], with shape (6,).
This is possible in Python using regular lists:
import numpy as np
op = np.array([1, 3, 5])
opf = [[i, i+1] for i in op]       # pair each element with its successor
opf = [i for j in opf for i in j]  # flatten the list of pairs
print(np.asarray(opf))
Returning:
[1 2 3 4 5 6]
If your op array is unordered, you could do:
import numpy as np
op = np.array([1, 5, 3])
opf = [[i, i+1] for i in sorted(op)]
opf = [i for j in opf for i in j]
print(np.sort(np.asarray(opf)))
Again returning:
[1 2 3 4 5 6]
When working with numpy arrays, you cannot concatenate the way you did in Matlab.
The way to do it is via np.concatenate(). Also, np.concatenate() takes a tuple as its argument, so you must use double parentheses (one pair for the function call, the other for the tuple of arrays you want to concatenate).
Your complete example would look something like this:
import numpy as np
a = [1, 3, 5]
op = np.array(a)
result = np.sort(np.concatenate((op, op + 1)))
print(result)
Try:
opf.reshape(6)
If you are using NumPy you can reshape the (2,3) result yourself into a flat (6,) array.

What's the most efficient way to split up a Numpy ndarray using percentage?

Hi, I'm new to Python and NumPy, and I'd like to ask: what is the most efficient way to split an ndarray into 3 parts: 20%, 60% and 20%?
import numpy as np
row_indices = np.random.permutation(10)
Let's assume the ndarray has 10 items: [7 9 3 1 2 4 5 6 0 8]
The expected results are the ndarray separated into 3 parts like part1, part2 and part3.
part1: [7 9]
part2: [3 1 2 4 5 6]
part3: [0 8]
Here's one way -
# data array
In [85]: a = np.array([7, 9, 3, 1, 2, 4, 5, 6, 0, 8])
# percentages (ratios) array
In [86]: p = np.array([0.2,0.6,0.2]) # must sum to 1
In [87]: np.split(a,(len(a)*p[:-1].cumsum()).astype(int))
Out[87]: [array([7, 9]), array([3, 1, 2, 4, 5, 6]), array([0, 8])]
Alternative to np.split:
np.split can be slower when working with large data, so we could alternatively use a loop there -
split_idx = np.r_[0,(len(a)*p.cumsum()).astype(int)]
out = [a[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
I normally just go for the most obvious solution, although there are much fancier ways to do the same thing. It takes a second to implement and doesn't even require debugging (since it's extremely simple):
part1 = [a[i, ...] for i in range(int(a.shape[0] * 0.2))]
part2 = [a[i, ...] for i in range(int(a.shape[0] * 0.2), int(a.shape[0] * 0.8))]
part3 = [a[i, ...] for i in range(int(a.shape[0] * 0.8), a.shape[0])]
A few things to notice though:
The boundaries are rounded, so you could get something which is only roughly a 20-60-20 split.
You get back a list of elements, so you might have to re-numpyfy them with np.asarray().
You can use this method to index multiple objects (e.g. labels and inputs) by the same elements.
If you get the indices once before the splits (indices = list(range(a.shape[0]))) you can also shuffle them, thus taking care of data shuffling at the same time; a sketch follows.

Which numpy command could I use to subtract vectors with different dimensions many times?

I have to write this function: exp(-||x - c||),
in which x is a vector with dimensions [150,2] and c is [N,2] (let's suppose N=20). From each component x_i (i=1,2) I have to subtract the components of c in this way: [x11-c11, x12-c12] ... [x11-cN1, x12-cN2], for all 150 samples.
I've transformed them so that they have the same dimensions and I can subtract them, but the result of the function should be a vector. How can I write this in numpy?
Thank you
OK, let's suppose x=(5,2) and c=(3,2).
This is what I have obtained by transforming the dimensions of the two arrays. The problem is that I have to do this with a for loop, because the exp function should give me a vector as a result, so I have to obtain a sort of matrix divided into N blocks.
From what I understand of the issue, the problem seems to be in the way you are calculating the vector norm, not in the subtraction. Using your example, but calculating exp(-||x-c||), try:
import numpy as np
x = np.linspace(8,17,10).reshape((5,2))
c = np.linspace(1,6,6).reshape((3,2))
sub = np.linalg.norm(x[:,None] - c, axis=-1)
np.exp(-sub)
array([[ 5.02000299e-05,  8.49325705e-04,  1.43695961e-02],
       [ 2.96711024e-06,  5.02000299e-05,  8.49325705e-04],
       [ 1.75373266e-07,  2.96711024e-06,  5.02000299e-05],
       [ 1.03655678e-08,  1.75373266e-07,  2.96711024e-06],
       [ 6.12664624e-10,  1.03655678e-08,  1.75373266e-07]])
np.exp(-sub).shape
(5, 3)
numpy.linalg.norm will try to return some kind of matrix norm across all the dimensions of its input unless you tell it explicitly which axis represents the vector components.
If I understand correctly, try whether this gives the expected result; there is still the problem that the result has the same shape as x:
import numpy as np
x = np.arange(10).reshape(5,2)
c = np.arange(6).reshape(3,2)
c_col_sum = np.sum(c, axis=0)
for (h, k), value in np.ndenumerate(x):
    x[h, k] = c.shape[0] * x[h, k] - c_col_sum[k]
Initially x is:
[[0 1]
[2 3]
[4 5]
[6 7]
[8 9]]
And c is:
[[0 1]
[2 3]
[4 5]]
After the function x becomes:
[[-6 -6]
[ 0 0]
[ 6 6]
[12 12]
[18 18]]

Iterate over columns of a NumPy array and elements of another one?

I am trying to replicate the behaviour of zip(a, b) in order to be able to loop simultaneously along two NumPy arrays. In particular, I have two arrays a and b:
a.shape=(n,m)
b.shape=(m,)
On every iteration of the loop I would like to get a column of a and an element of b.
So far, I have tried the following:
for a_column, b_element in np.nditer([a, b]):
    print(a_column)
However, this prints the single element a[0,0] rather than a whole column of a, which is what I want.
How can I solve this?
You can still use zip on numpy arrays, because they are iterables.
In your case, you'd need to transpose a first, to make it an array of shape (m,n), i.e. an iterable of length m:
for a_column, b_element in zip(a.T, b):
    ...
Adapting my answer in shallow iteration with nditer,
nditer and ndindex can be used to iterate over rows or columns by generating indexes.
In [19]: n,m=3,4
In [20]: a=np.arange(n*m).reshape(n,m)
In [21]: b=np.arange(m)
In [22]: it=np.nditer(b)
In [23]: for i in it: print(a[:,i], b[i])
[0 4 8] 0
[1 5 9] 1
[ 2 6 10] 2
[ 3 7 11] 3
In [24]: for i in np.ndindex(m): print(a[:,i], b[i])
[[0]
[4]
[8]] 0
[[1]
[5]
[9]] 1
[[ 2]
[ 6]
[10]] 2
[[ 3]
[ 7]
[11]] 3
ndindex uses an iterator like: it = np.nditer(b, flags=['multi_index']).
For iteration over a single dimension like this, for i in range(m): works just as well.
Also from the other thread, here's a trick using order to iterate without the indexes:
In [28]: for i, j in np.nditer([a,b], order='F', flags=['external_loop']):
    ...:     print(i, j)
[0 4 8] [0 0 0]
[1 5 9] [1 1 1]
[ 2 6 10] [2 2 2]
[ 3 7 11] [3 3 3]
Usually, because of NumPy's ability to broadcast arrays, it is not necessary to iterate over the columns of an array one-by-one. For example, if a has shape (n,m) and b has shape (m,) then you can add a+b and b will broadcast itself to shape (n, m) automatically.
Moreover, your calculation will complete much faster if it can be expressed through operations on the whole array, a, rather than through operations on pieces of a (such as on columns) using a Python for-loop.
Having said that, the easiest way to loop through the columns of a is to iterate over the index:
for i in np.arange(b.shape[0]):
    a_column, b_element = a[:, i], b[i]
    print(a_column)
