How does numpy reshape work with a negative value as the second parameter? - python

I am trying to play with a negative value as the second parameter:
a = np.array([[1,2,3], [4,5,6]])
print(np.reshape(a, (3,-1)) )
print("___________________________________")
print(np.reshape(a, (3,-2)) )
print("___________________________________")
print(np.reshape(a, (3,-3)) )
print("___________________________________")
print(np.reshape(a, (3,2)) )
All four of the reshape calls above give basically the same output:
[[1 2]
[3 4]
[5 6]]
___________________________________
[[1 2]
[3 4]
[5 6]]
___________________________________
[[1 2]
[3 4]
[5 6]]
___________________________________
[[1 2]
[3 4]
[5 6]]
I am just trying to understand the difference between the above. Can -1 and 2 be used interchangeably?

The shape passed to reshape can contain one unknown dimension, represented by a negative number; its value is inferred from the length of the array and the remaining dimensions.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html#numpy.reshape
For example:
a = np.array([[1,2,3, 4], [5,6,7,8]])
print(np.reshape(a, (-2)) )
print("___________________________________")
print(np.reshape(a, (2, 2,-2)) )
print("___________________________________")
print(np.reshape(a, (2, -1,-2)) )
Output
[1 2 3 4 5 6 7 8]
___________________________________
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
___________________________________
...
ValueError: can only specify one unknown dimension

Reshaping with a negative number is no magic. As stated in the answer above, the actual magnitude of the negative number does not matter.
Here is a function demonstrating how the unknown dimension is worked out. Note that this is purely demonstrative, not an actual implementation taken from the source code or anything like that.
def computeNegativeDim(arr, givenDims):
    givenDims = list(givenDims)
    # Multiply together the dimensions that are already known (the positive ones).
    knownDims = [d for d in givenDims if d > 0]
    val = 1
    for k in knownDims:
        val *= k
    # The single unknown (negative) dimension is whatever is left over.
    dimOfUnknown = arr.size // val
    newDims = [dimOfUnknown if d < 0 else d for d in givenDims]
    return arr.reshape(newDims)
Or something along those lines.
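For instance, calling it with the array from the question (a quick check of the sketch above):
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(computeNegativeDim(a, (3, -1)))   # same result as np.reshape(a, (3, -1))
print(computeNegativeDim(a, (3, -7)))   # the magnitude of the negative number is ignored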

Related

Is it possible to define a function that creates an N-dimensional list of lists easily?

I have searched and found these questions: "How to create a multi-dimensional list" and "N dimensional array in python", which hint toward what I am looking for, but they seem to only make 2D arrays and not ND arrays.
Problem:
My problem is that I am able to create an n-dimensional list of lists for a known n, but I am not sure how it can be generalized to work with all values of n.
Consider this example:
def makeList(n):
    return [[[n for _ in range(n)]
             for _ in range(n)]
            for _ in range(n)]
print(makeList(3))
Output:
[[[3, 3, 3],
[3, 3, 3],
[3, 3, 3]],
[[3, 3, 3],
[3, 3, 3],
[3, 3, 3]],
[[3, 3, 3],
[3, 3, 3],
[3, 3, 3]]]
This will create a list of lists that is a 3x3x3 array of 3's which is the intended result, but if we use a different n:
print(makeList(2))
Output:
[[[2, 2],
[2, 2]],
[[2, 2],
[2, 2]]]
This will create a list of lists that is a 2x2x2 array of 2's, which is not the intended result. Instead, the result should be a list of lists that is a 2x2 array of 2's.
Likewise if we set n = 4:
print(makeList(4))
This will give a list of lists that is 4x4x4 when it should give a 4x4x4x4 list of lists.
The main issue is that the number of for loops must change depending on the input, but obviously the code can't "come to life" and recode itself magically, hence my issue.
I am looking for a way to get this result that is simple. I am sure I could continue developing ideas for solutions, but I have not been able to think of anything that is concise.
What I have tried:
The first idea I thought of was to use recursion, and this my simple approach:
def makeList(n, output = []):
    if not output:
        output = [n for _ in range(n)]
    else:
        output = [output for _ in range(n)]
    if len(output) == n:
        return output
    else:
        return makeList(n, output)
This obviously will not work because the check len(output) == n is already true on the first call, since the length of the inner lists equals the length of the outermost list. However, even if there was a condition that properly terminated the function, this solution would still not be ideal because I could run into maximum recursion errors with large values of n. Furthermore, even if these issues were resolved, this code is still quite long as well as time consuming.
The other potential perspective I considered (and the solution that I found that works) is using a dict. My solution saves the intermediate lists of lists in a dict so it can be used for the next iteration:
def makeList(n):
    d = {str(i): None for i in range(n)}
    for i in range(n):
        if i == 0:
            d[str(i)] = [n for _ in range(n)]
        else:
            d[str(i)] = [d[str(i-1)] for _ in range(n)]
    return d[str(n-1)]
But again, this is very long and doesn't seem very pythonic.
Solution requirements:
The ideal solution would be much more concise than mine with an efficient time complexity that only uses built-in functions.
Other options that are close to these requirements would also be helpful as long as the spirit of the answer is trying to best meet the requirements.
Being concise is the most important aspect, however, which is why I am not fully happy with my current attempts.
Using tuple(n for _ in range(n)) to build the shape that you want, and numpy to generate a multidimensional array:
import numpy as np

def make_array(n):
    return np.full(tuple(n for _ in range(n)), n)

for n in range(1, 5):
    print(make_array(n))
Output:
[1]
[[2 2]
[2 2]]
[[[3 3 3]
[3 3 3]
[3 3 3]]
[[3 3 3]
[3 3 3]
[3 3 3]]
[[3 3 3]
[3 3 3]
[3 3 3]]]
[[[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]]
[[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]]
[[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]]
[[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]
[[4 4 4 4]
[4 4 4 4]
[4 4 4 4]
[4 4 4 4]]]]
Your BEST plan is to use numpy for this. You can create an arbitrary array of all zeros with np.zeros( (4,4,4) ), or an array of 1s with np.ones( (4,4,4) ). If you're going to be working with arrays very much at all, you will certainly want to use numpy.
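For instance (a quick sketch; the shapes here are arbitrary):
import numpy as np

zeros = np.zeros((4, 4, 4))              # 4x4x4 array of 0.0
ones = np.ones((2, 2), dtype=int)        # 2x2 array of 1
nested = np.full((3, 3, 3), 3).tolist()  # back to a plain list of lists if you need one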
I've sketched out a recursive function that might suit:
def nd_list(n, x=None):
    if x is None:
        x = n
    if x == 1:
        return [n] * n
    return [nd_list(n, x-1)] * n
In two lines using no external libraries!
def nd_list(x, z):
    return [z] * z if x == 1 else [nd_list(x-1, z)] * z
In two lines (just), and without the surprising shared-sublist behaviour:
def nd_list(x, z):
    return [0 for _ in range(z)] if x == 1 else [nd_list(x-1, z) for _ in range(z)]
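To see why the comprehension version matters, here is a quick side-by-side (the names nd_list_star and nd_list_comp are just placeholders for the two definitions above):
def nd_list_star(x, z):
    return [z] * z if x == 1 else [nd_list_star(x - 1, z)] * z

def nd_list_comp(x, z):
    return [0 for _ in range(z)] if x == 1 else [nd_list_comp(x - 1, z) for _ in range(z)]

a = nd_list_star(2, 2)
a[0][0] = 99
print(a)   # [[99, 2], [99, 2]] -- both rows are the same list object

b = nd_list_comp(2, 2)
b[0][0] = 99
print(b)   # [[99, 0], [0, 0]] -- rows are independent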
I like numpy and all, but if I want to do general dynamic programming I don't want to have to pip install a massive library. It would also be a poor fit for technical interviews requiring complex DP.
Here's a simple, readable function that takes advantage of the built-in itertools module in Python (it uses repeat) and the copy module for making deep copies of lists (otherwise, you get that "surprising" list behavior where modifying one entry modifies many). It's also pretty sane in terms of the API: you can call it with a tuple of dimensions (like the shape attribute of a numpy array):
from itertools import repeat
from copy import deepcopy

def ndarray(shape, val=None):
    base = val
    for dim in reversed(shape):
        base = list(map(deepcopy, repeat(base, times=dim)))
    return base
This is what is made:
>>> import pprint
>>> pprint.pprint(ndarray((2, 3, 4), val=1))
[[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]]
For your original question, you could easily wrap makeList(n) around this:
def makeList(n, val=None):
    return ndarray([n]*n, val=val)
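For example, a quick check of the wrapper:
>>> makeList(2, val=2)
[[2, 2], [2, 2]]
>>> makeList(3, val=3)[0][0]
[3, 3, 3]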
Using Python's support for structural pattern matching (Python 3.10+):
def nlist(shape: tuple[int, ...], fill_with=None):
    match shape:
        case (n,):
            return [fill_with] * n
        case (n, *rest):
            return [nlist(rest, fill_with) for _ in range(n)]
        case _:
            raise ValueError(f'Invalid value shape={shape}')

def makeList(n):
    return nlist((n,)*n, n)
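Called with a tuple shape or via the wrapper (a quick check):
>>> nlist((2, 3), fill_with=0)
[[0, 0, 0], [0, 0, 0]]
>>> makeList(2)
[[2, 2], [2, 2]]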

How to insert tensor array into tensor matrix after every second position

I have a tensor array a and a tensor matrix m. Now I want to insert a into m at every second position, starting at index 0 and ending at index len(m)-2. Here is an equivalent example using numpy and plain Python:
# define m
m = np.array([[3,7,6],[4,3,1],[8,4,2],[2,8,7]])
print(m)
#[[3 7 6]
# [4 3 1]
# [8 4 2]
# [2 8 7]]
# define a
a = np.array([1,2,3])
#[1 2 3]
# insert a into m
result = []
for i in range(len(m)):
    result.append(a)
    result.append(m[i])
print(np.array(result))
#[[1 2 3]
# [3 7 6]
# [1 2 3]
# [4 3 1]
# [1 2 3]
# [8 4 2]
# [1 2 3]
# [2 8 7]]
I am looking for a solution in tensorflow. I am convinced that there is a solution that doesn't need a loop but I am not able to find one. I hope someone can help me out with this!
You can concatenate your target vector at the beginning of each row of your matrix, and then reshape it.
import tensorflow as tf
initial_array = tf.constant([
[3, 7, 6],
[4, 3, 1],
[8, 4, 2],
[2, 8, 7],
])
vector_to_add = [1, 2, 3]
concat = tf.concat([[vector_to_add] * initial_array.shape[0], initial_array], axis=1) # Concatenate vector_to_add to each vector of initial_array
output = tf.reshape(concat, (2 * initial_array.shape[0], initial_array.shape[1])) # Reshape
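Running this (assuming TensorFlow 2.x eager execution; under TF 1.x you would evaluate it in a session instead) prints the interleaved matrix from the question:
print(output.numpy())
# [[1 2 3]
#  [3 7 6]
#  [1 2 3]
#  [4 3 1]
#  [1 2 3]
#  [8 4 2]
#  [1 2 3]
#  [2 8 7]]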
This should work (note the tiled copies of a go first, so each a lands before the matching row of m):
np.ravel(np.column_stack((np.tile(a, (4, 1)), m))).reshape(8, 3)
For the idea, please refer to "Interweaving two numpy arrays". Apply any solution described there, and then reshape.
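A quick check with the arrays from the question (the 4 and 8 are hard-coded here; in general they would be len(m) and 2*len(m)):
import numpy as np

m = np.array([[3, 7, 6], [4, 3, 1], [8, 4, 2], [2, 8, 7]])
a = np.array([1, 2, 3])
print(np.ravel(np.column_stack((np.tile(a, (4, 1)), m))).reshape(8, 3))
# [[1 2 3]
#  [3 7 6]
#  [1 2 3]
#  [4 3 1]
#  [1 2 3]
#  [8 4 2]
#  [1 2 3]
#  [2 8 7]]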

Tensorflow: stack all row pairs from a tensor

Given a tensor t=[[1,2], [3,4]], I need to produce ts=[[1,2,1,2], [1,2,3,4], [3,4,1,2], [3,4,3,4]]. That is, I need to stack together all row pairs.
Important: the tensor has shape [None, 2], i.e. the first dimension is variable.
I have tried:
Using a tf.while_loop to generate a list of indices idx=[[0, 0], [0, 1], [1, 0], [1, 1]], then tf.gather(ts, idx). This works but is messy and I don't know what to do about gradients.
Two for loops iterating over tf.unstack(t), adding stacked rows to a buffer, then tf.stack(buffer). This does not work if the first dimension is variable.
Looking for inspiration in broadcasting. For instance, given x=tf.expand_dims(t, 0), y=tf.expand_dims(t, 1), s=tf.reshape(tf.add(x, y), [-1, 2]), s will be [[2, 4], [4, 6], [4, 6], [6, 8]], i.e. the sum of every row combination. But how can I do stacking instead of sum? I've been failing for 2 days :)
Solution with tf.meshgrid() and some reshaping:
import tensorflow as tf
import numpy as np
t = tf.placeholder(tf.int32, [None, 2])
num_rows, size_row = tf.shape(t)[0], tf.shape(t)[1] # actual dynamic dimensions
# Getting pair indices using tf.meshgrid:
idx_range = tf.range(num_rows)
pair_indices = tf.stack(tf.meshgrid(*[idx_range, idx_range]))
pair_indices = tf.transpose(pair_indices, perm=[1, 2, 0])
# Finally gathering the rows accordingly:
res = tf.reshape(tf.gather(t, pair_indices), (-1, size_row * 2))
with tf.Session() as sess:
    print(sess.run(res, feed_dict={t: np.array([[1,2], [3,4], [5,6]])}))
# [[1 2 1 2]
# [3 4 1 2]
# [5 6 1 2]
# [1 2 3 4]
# [3 4 3 4]
# [5 6 3 4]
# [1 2 5 6]
# [3 4 5 6]
# [5 6 5 6]]
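Note that the pairs come out with the first row of each pair varying fastest. If you want exactly the ordering shown in the question (first row varying slowest), one option, keeping the rest of the snippet unchanged, is to ask tf.meshgrid for 'ij' (matrix) indexing:
pair_indices = tf.stack(tf.meshgrid(idx_range, idx_range, indexing='ij'))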
Solution using cartesian product:
import tensorflow as tf
import numpy as np
t = tf.placeholder(tf.int32, [None, 2])
num_rows, size_row = tf.shape(t)[0], tf.shape(t)[1] # actual dynamic dimensions
# Getting pair indices by computing the indices cartesian product:
row_idx = tf.range(num_rows)
row_idx_a = tf.expand_dims(tf.tile(tf.expand_dims(row_idx, 1), [1, num_rows]), 2)
row_idx_b = tf.expand_dims(tf.tile(tf.expand_dims(row_idx, 0), [num_rows, 1]), 2)
pair_indices = tf.concat([row_idx_a, row_idx_b], axis=2)
# Finally gathering the rows accordingly:
res = tf.reshape(tf.gather(t, pair_indices), (-1, size_row * 2))
with tf.Session() as sess:
    print(sess.run(res, feed_dict={t: np.array([[1,2], [3,4], [5,6]])}))
# [[1 2 1 2]
# [1 2 3 4]
# [1 2 5 6]
# [3 4 1 2]
# [3 4 3 4]
# [3 4 5 6]
# [5 6 1 2]
# [5 6 3 4]
# [5 6 5 6]]
Can be achieved by:
tf.concat([tf.tile(tf.expand_dims(t,1), [1, tf.shape(t)[0], 1]), tf.tile(tf.expand_dims(t,0), [tf.shape(t)[0], 1, 1])], axis=2)
Detailed steps:
t = tf.placeholder(tf.int32, shape=[None, 2])
#repeat each row of t
d = tf.tile(tf.expand_dims(t,1), [1, tf.shape(t)[0], 1])
#Output:
#[[[1 2] [1 2]]
# [[3 4] [3 4]]]
#repeat the entire input t
e = tf.tile(tf.expand_dims(t,0), [tf.shape(t)[0], 1, 1])
#Output:
#[[[1 2] [3 4]]
# [[1 2] [3 4]]]
#concat
f = tf.concat([d, e], axis=2)
with tf.Session() as sess:
    print(sess.run(f, {t: np.asarray([[1,2],[3,4]])}))
#Output
#[[[1 2 1 2]
#  [1 2 3 4]]
# [[3 4 1 2]
#  [3 4 3 4]]]
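The result above is 3-D with shape (n, n, 4); one extra reshape gives the exact 2-D layout asked for in the question (a small addition, reusing f from the steps above and the fixed row size of 2):
g = tf.reshape(f, (-1, 4))
#[[1 2 1 2]
# [1 2 3 4]
# [3 4 1 2]
# [3 4 3 4]]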

Difference between `tf.reshape(a, [m, n])` and `tf.transpose(tf.reshape(a, [n, m]))`?

I'm doing the homework "Art Generation with Neural Style Transfer" from deeplearning.ai on Coursera. In the function compute_layer_style_cost(a_S, a_G):
a_S = tf.reshape(a_S, [n_H*n_W, n_C])
a_G = tf.reshape(a_G, [n_H*n_W, n_C])
GS = gram_matrix(tf.transpose(a_S))
GG = gram_matrix(tf.transpose(a_G))
Why does this code give the right answer, while the following doesn't?
a_S = tf.reshape(a_S, [n_C, n_H*n_W])
a_G = tf.reshape(a_G, [n_C, n_H*n_W])
GS = gram_matrix(a_S)
GG = gram_matrix(a_G)
Here's a trivial example that shows the difference between these two expressions:
import tensorflow as tf
tf.InteractiveSession()
x = tf.range(0, 6)
a = tf.reshape(x, [3, 2])
b = tf.transpose(tf.reshape(x, [2, 3]))
print(x.eval())
print(a.eval())
print(b.eval())
The result:
[0 1 2 3 4 5]
[[0 1]
[2 3]
[4 5]]
[[0 3]
[1 4]
[2 5]]
As you can see, a and b are different even though they have the same shape. That's because the first reshape "splits" x into [0 1], [2 3] and [4 5], while the second splits it into [0 1 2] and [3 4 5] before the transpose swaps the axes.
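The same thing happens in the style-transfer code: a_S (after dropping the batch dimension) is laid out as (n_H, n_W, n_C), so reshaping straight to [n_C, n_H*n_W] fills each row with values from different channels, while reshaping to [n_H*n_W, n_C] and then transposing keeps each channel's activations together in one row. A small numpy sketch with made-up sizes (n_H = n_W = 2, n_C = 3):
import numpy as np

a = np.arange(12).reshape(2, 2, 3)   # stands in for a (n_H, n_W, n_C) activation volume
print(a.reshape(3, 4))               # rows mix values from different channels
print(a.reshape(4, 3).T)             # each row now holds one channel's activations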

numpy count elements across axis 0 matching values from another array

Given a 3D array such as:
array = np.random.randint(1, 6, (3, 3, 3))
and an array of maximum values across axis 0:
max_array = array.max(axis=0)
Is there a vectorised way to count the number of elements in axis 0 of array which are equal to the value of the matching index in max_array? For example, if array contains [1, 3, 3] in one axis 0 position, the output is 2, and so on for the other 8 positions, returning an array with the counts.
To count the number of values in x (your array) which equal the corresponding value in xmax (your max_array), you could use:
(x == xmax).sum(axis=0)
Note that since x has shape (3,3,3) and xmax has shape (3,3), the expression x == xmax causes NumPy to broadcast xmax up to shape (3,3,3) where the new axis is added on the left.
For example,
import numpy as np
np.random.seed(2015)
x = np.random.randint(1, 6, (3,3,3))
print(x)
# [[[3 5 5]
# [3 2 1]
# [3 4 1]]
# [[1 5 4]
# [1 4 1]
# [2 3 4]]
# [[2 3 3]
# [2 1 1]
# [5 1 2]]]
xmax = x.max(axis=0)
print(xmax)
# [[3 5 5]
# [3 4 1]
# [5 4 4]]
count = (x == xmax).sum(axis=0)
print(count)
# [[1 2 1]
# [1 1 3]
# [1 1 1]]
