4 x 4 Floats to numpy Matrix - python

The following numpy command:
c = np.matrix('1,0,0,0;0,1,0,0;0,0,1,0;-6.6,1.0,-2.8,1.0')
creates a matrix. Output:
[[ 1.   0.   0.   0. ]
 [ 0.   1.   0.   0. ]
 [ 0.   0.   1.   0. ]
 [-6.6  1.  -2.8  1. ]]
However, my input is a flat, comma-separated list of floats:
[1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, -6.604560409595856, 1.0, -2.81542864114781, 1.0]
Is there a simple way to get those floats into a numpy matrix by defining the shape beforehand as 4 x 4?

np.array([1.0, 0.0,..., -2.81542864114781, 1.0]).reshape((4, 4))
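Spelled out as a runnable sketch (np.matrix still works, but plain ndarrays are the generally recommended representation):

import numpy as np

values = [1.0, 0.0, 0.0, 0.0,
          0.0, 1.0, 0.0, 0.0,
          0.0, 0.0, 1.0, 0.0,
          -6.604560409595856, 1.0, -2.81542864114781, 1.0]

m = np.array(values).reshape((4, 4))  # declare the 4 x 4 shape up front
print(m)
# [[ 1.          0.          0.          0.        ]
#  [ 0.          1.          0.          0.        ]
#  [ 0.          0.          1.          0.        ]
#  [-6.60456041  1.         -2.81542864  1.        ]]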


Whitespaces after addition to numpy array

Why do I get these odd extra spaces in the output when I execute the code below?
import numpy as np
s = 'a a b c a a d a g a'  # renamed from str to avoid shadowing the built-in
string_array = np.array(s.split(" "))
char_indices = np.where(string_array == 'a')
array = char_indices[0]
print(array)
array += 2
print(array)
output:
[0 1 4 5 7 9]
[ 2  3  6  7  9 11]
That's just numpy's way of displaying data so that it appears aligned and more readable.
The alignment between your two lists changes
[0 1 4 5 7 9]
[ 2  3  6  7  9 11]
because the second list contains two-digit elements, so every element is padded to two characters.
With 1-D vectors it is harder to appreciate, but it is very useful when we have more dimensions:
>>> a = np.random.uniform(0,1,(5,5))
>>> a[a>0.5] = 0
>>> print(a)
[[0.         0.         0.00460074 0.22880318 0.46584641]
 [0.0455245  0.         0.         0.         0.        ]
 [0.         0.07891556 0.21795357 0.14944522 0.20732431]
 [0.         0.         0.         0.3381172  0.08182367]
 [0.         0.         0.10734559 0.         0.31228533]]
>>> print(a.tolist())
[[0.0, 0.0, 0.0046007414146133074, 0.22880318354923768, 0.4658464110307319], [0.04552450444387102, 0.0, 0.0, 0.0, 0.0], [0.0, 0.07891556038021574, 0.21795356574892966, 0.1494452184954096, 0.2073243102108967], [0.0, 0.0, 0.0, 0.33811719550156627, 0.08182367499758836], [0.0, 0.0, 0.10734558995972832, 0.0, 0.31228532775003903]]
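If the alignment padding ever gets in the way (for example, when parsing printed output), one option is to bypass numpy's formatter and format the values yourself. A small sketch:

import numpy as np

arr = np.array([2, 3, 6, 7, 9, 11])
print(arr)                      # [ 2  3  6  7  9 11] -- numpy pads for alignment
print(arr.tolist())             # [2, 3, 6, 7, 9, 11] -- plain Python formatting
print(' '.join(map(str, arr)))  # 2 3 6 7 9 11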

I have 8 vertices of a box on a 3D terrain, and I need to separate the box into a few smaller ones

For this problem, I have the 8 vertices of a box that I need to shrink by a given integer factor applied to every side. For example, if the box to shrink is 8*8*8 and the shrink factor is 2, I need to return a list of all the vertices of the 4*4*4 boxes that fill the big box in a 3D coordinate system.
I thought about a for loop that runs over the size of the box, but then I realized that if I eventually want to separate the box into many more, much smaller boxes that fill the big one, I would have to write more code than I could manage. How can I get this list of vertices without writing that much code?
I'm not sure if this is what you want, but here is a simple way to compute vertices in a grid with NumPy:
import numpy as np

def make_grid(x_size, y_size, z_size, shrink_factor):
    n = (shrink_factor + 1) * 1j  # imaginary step: number of points, endpoints included
    xx, yy, zz = np.mgrid[:x_size:n, :y_size:n, :z_size:n]
    return np.stack([xx.ravel(), yy.ravel(), zz.ravel()], axis=1)

print(make_grid(8, 8, 8, 2))
Output:
[[0. 0. 0.]
[0. 0. 4.]
[0. 0. 8.]
[0. 4. 0.]
[0. 4. 4.]
[0. 4. 8.]
[0. 8. 0.]
[0. 8. 4.]
[0. 8. 8.]
[4. 0. 0.]
[4. 0. 4.]
[4. 0. 8.]
[4. 4. 0.]
[4. 4. 4.]
[4. 4. 8.]
[4. 8. 0.]
[4. 8. 4.]
[4. 8. 8.]
[8. 0. 0.]
[8. 0. 4.]
[8. 0. 8.]
[8. 4. 0.]
[8. 4. 4.]
[8. 4. 8.]
[8. 8. 0.]
[8. 8. 4.]
[8. 8. 8.]]
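The complex step is the easy part to misread here: in np.mgrid, an imaginary step m*1j means "exactly m evenly spaced points, endpoints included", while a real step behaves like range and excludes the stop. A tiny illustration:

import numpy as np

print(np.mgrid[0:8:3j])  # [0. 4. 8.] -- 3 points from 0 to 8, inclusive
print(np.mgrid[0:8:4])   # [0 4]      -- real step of 4, stop excluded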
Alternatively, with itertools:
from itertools import product

def make_grid(x_size, y_size, z_size, shrink_factor):
    return [(x * x_size, y * y_size, z * z_size)
            for x, y, z in product((i / shrink_factor
                                    for i in range(shrink_factor + 1)), repeat=3)]

print(*make_grid(8, 8, 8, 2), sep='\n')
Output:
(0.0, 0.0, 0.0)
(0.0, 0.0, 4.0)
(0.0, 0.0, 8.0)
(0.0, 4.0, 0.0)
(0.0, 4.0, 4.0)
(0.0, 4.0, 8.0)
(0.0, 8.0, 0.0)
(0.0, 8.0, 4.0)
(0.0, 8.0, 8.0)
(4.0, 0.0, 0.0)
(4.0, 0.0, 4.0)
(4.0, 0.0, 8.0)
(4.0, 4.0, 0.0)
(4.0, 4.0, 4.0)
(4.0, 4.0, 8.0)
(4.0, 8.0, 0.0)
(4.0, 8.0, 4.0)
(4.0, 8.0, 8.0)
(8.0, 0.0, 0.0)
(8.0, 0.0, 4.0)
(8.0, 0.0, 8.0)
(8.0, 4.0, 0.0)
(8.0, 4.0, 4.0)
(8.0, 4.0, 8.0)
(8.0, 8.0, 0.0)
(8.0, 8.0, 4.0)
(8.0, 8.0, 8.0)
A solution using numpy, which allows easy block manipulation.
First, I choose to represent a cube with an origin and three vectors: the unit cube is represented with orig = np.array([0, 0, 0]) and vects = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]).
Now, a numpy function to generate the eight vertices:
import numpy as np

orig = np.array([0, 0, 0])
vects = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

def cube(origin, edges):
    # each edge vector doubles the vertex set: 1 -> 2 -> 4 -> 8 vertices
    for e in edges:
        origin = np.vstack((origin, origin + e))
    return origin

cube(orig, vects)
array([[0, 0, 0],
[1, 0, 0],
[0, 1, 0],
[1, 1, 0],
[0, 0, 1],
[1, 0, 1],
[0, 1, 1],
[1, 1, 1]])
Then another to span the minicubes in 3D:
def split(origin, edges, k):
    minicube = cube(origin, edges / k)
    for e in edges / k:
        minicube = np.vstack([minicube + i * e for i in range(k)])
    return minicube.reshape(k**3, 8, 3)

split(orig, vects, 2)
array([[[ 0. , 0. , 0. ],
[ 0.5, 0. , 0. ],
[ 0. , 0.5, 0. ],
[ 0.5, 0.5, 0. ],
[ 0. , 0. , 0.5],
[ 0.5, 0. , 0.5],
[ 0. , 0.5, 0.5],
[ 0.5, 0.5, 0.5]],
...
[[ 0.5, 0.5, 0.5],
[ 1. , 0.5, 0.5],
[ 0.5, 1. , 0.5],
[ 1. , 1. , 0.5],
[ 0.5, 0.5, 1. ],
[ 1. , 0.5, 1. ],
[ 0.5, 1. , 1. ],
[ 1. , 1. , 1. ]]])
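As a quick check (still assuming the orig and vects defined above), split returns k**3 minicubes of 8 vertices each:

print(split(orig, vects, 2).shape)  # (8, 8, 3): 2**3 minicubes, 8 vertices, 3 coordinates
print(split(orig, vects, 3).shape)  # (27, 8, 3)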
My example below will work on a generic box and assumes integer coordinates.
import numpy as np
def create_cube(start_x, start_y, start_z, size):
return np.array([
[x,y,z]
for z in [start_z, start_z+size]
for y in [start_y, start_y+size]
for x in [start_x, start_x+size]
])
def subdivide(box, scale):
start = np.min(box, axis=0)
end = np.max(box, axis=0) - scale
return np.array([
create_cube(x, y, z, scale)
for z in range(start[2], end[2]+1)
for y in range(start[1], end[1]+1)
for x in range(start[0], end[0]+1)
])
cube = create_cube(1, 3, 2, 8)
The cube will look like this:
array([[ 1, 3, 2],
[ 9, 3, 2],
[ 1, 11, 2],
[ 9, 11, 2],
[ 1, 3, 10],
[ 9, 3, 10],
[ 1, 11, 10],
[ 9, 11, 10]])
Running the following call to subdivide:
subcubes = subdivide(cube, 2)
The subdivide function creates an ndarray with shape (343, 8, 3). You would expect 343 subcubes from moving the 2x2x2 cube evenly over the 8x8x8 cube.
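The count follows from 8 - 2 + 1 = 7 integer start positions per axis, so 7**3 = 343. A one-line sanity check:

assert subcubes.shape == (343, 8, 3)  # 7 start positions per axis, cubed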

How to input 2d numpy array into Tensorflow? (also on how to get matrix input and output working with TF)

I'm new to Tensorflow and I'm trying to understand how it processes data. Currently, this is what I want to have as my input. My full code is up on GitHub should you want to download it.
print (y_train[0])
>>> [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
# list of 80 elements
print (np.array(y_train))
>>> [[0. 0. 1. ... 1. 0. 0.]
[0. 1. 0. ... 0. 0. 0.]
[1. 0. 1. ... 0. 0. 0.]
...
[0. 0. 1. ... 1. 1. 0.]
[1. 0. 0. ... 0. 0. 1.]
[0. 0. 0. ... 1. 0. 1.]]
print (np.array(y_train).shape)
>>> (11645, 80)
print (x_train[0])
>>> [1.0, 4.0, 5.0, 2.0, 5.0, 3.0, 5.0, 3.0, 4.0, 5.0, 3.0, 5.0, 4.0, 3.0,
3.0, 4.0, 5.0, 4.0, 4.0, 5.0, 4.0, 3.0, 3.0, 4.0, 4.0, 5.0]
print (np.array(x_train)/5)
>>> [[0.2 0.8 1. ... 0.8 0.8 1. ]
[0.6 0.8 1. ... 1. 1. 0.8]
[0.8 0.4 1. ... 1. 0.6 1. ]
...
[1. 0.6 0.8 ... 0.4 0.8 0.6]
[1. 0.8 0.8 ... 0.4 0.6 1. ]
[0.6 0.8 0.8 ... 1. 0.8 0.6]]
print (np.array(x_train).shape)
>>> (11645, 26)
So basically I have 11645 pieces of data in my dataset. For the input, I wish to have 26 inputs normalized from 0 to 1. For the output, I wish to have 80 binary outputs. I don't think TF can give binary outputs directly, so I will probably use a sigmoid activation function.
How do I get Tensorflow to understand that I have 11645 pieces of data to process, and that the input shape should be 26x1 and the output 80x1? There are some pieces of Tensorflow and Keras where I don't understand how they fit together. For instance, if I want Tensorflow to understand that my input should be 1x26 and not some other shape, should I use x_train = tf.reshape(x_train, [-1,1*26]) and y_train = tf.reshape(y_train, [-1,1*80])? From the documentation, it seems this will shape x_train into a tensor of 1 row and 26 columns, and I will have 11645 of those. But does that tell Tensorflow that the input should only be 1x26, so that it won't go off grabbing some other shape (e.g. 26x2)? Or do I have to do something more explicit, where I specify the input shape in the model, like model.add(tf.keras.layers.Dense(26, activation=keras.activations.relu, input_shape=(26,)))?
Again, for my output, I want a 1x80 tensor that I can reshape and manipulate. Do I have to specify this to Tensorflow explicitly? Or will something like model.add(tf.keras.layers.Dense(80, activation=keras.activations.sigmoid)) be enough to tell Tensorflow that I want a 1x80 matrix and (e.g., using the sigmoid function) that it should compare every piece of data in that predicted 1x80 with the 1x80 matrix I have in y_train when calculating the loss?
Basically, I am confused as to how Tensorflow 'knows' what data to accept as an individual input and output. Is there a way to specify it or is it a step one can omit?
EDIT: Based on the answers, I have used the code:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(26, input_dim=26, activation='relu'))
model.add(tf.keras.layers.Dense(80, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=32)
I'm getting the following matrix:
[0.38176608 0.34900635 0.36545524 0.36806932 0.36692804 0.37398493
0.36821148 0.35577637 0.38441166 0.3676901 0.41162464 0.40428266
0.41464344 0.4040607 0.39316037 0.428753 0.3547327 0.35693064
0.3422352 0.36919317 0.36431065 0.3515264 0.3889933 0.33974153
0.37329385 0.35898593 0.3891792 0.42334762 0.40694237 0.41910493
0.39983115 0.47813386 0.37625512 0.35567597 0.36811477 0.38242644
0.36549032 0.35696995 0.37058106 0.3556903 0.37096408 0.34965912
0.4247738 0.41512045 0.41622216 0.38645518 0.40850884 0.43454456
0.3655926 0.34644917 0.36782715 0.34224963 0.35035127 0.3502
0.3607877 0.38218996 0.37265536 0.3653391 0.41620222 0.41124558
0.3916335 0.41291553 0.39959764 0.4649614 0.34603494 0.36731967
0.34146535 0.34573284 0.33941117 0.35885242 0.3493014 0.35866526
0.37188208 0.34971312 0.38165745 0.3962399 0.38913697 0.4078925
0.38799426 0.4709055 ]
This is a far cry from the 0-and-1 matrix I want. What should I do to get closer to that? I've tried Googling my problem, but to no avail. Should I simply apply a threshold (e.g. 0.4?) and convert the result to a binary matrix that way?
Usually in tensorflow we specify placeholders when we create the graph. These specify the datatype, shape, and sometimes name of the input data. A basic example that matches your code:
x = tf.placeholder(tf.float32, [None, 26])
y = tf.placeholder(tf.float32, [None, 80])
W = tf.get_variable('W', shape=[26, 80], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.01))
output = tf.matmul(x, W)
cost = tf.losses.sigmoid_cross_entropy(y, output, reduction=tf.losses.Reduction.MEAN)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize W before evaluating
    loss = sess.run(cost, feed_dict={x: your_input_here, y: your_output_here})
So, tensorflow knows how big your input is because you specified it, and it uses this to calculate the output shape of each subsequent layer. The batch dimension (the first dimension) doesn't matter because it can vary: if your input has size [50, 26], your output will have size [50, 80]. The number of data samples is irrelevant because you can feed them into the model as you please.
But in keras, it's a bit simpler:
model = Sequential()
model.add(Dense(32, input_dim=26, activation='relu'))
model.add(Dense(80, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, epochs=10, batch_size=32)
You can see that we have to specify the input dimensions in the first layer, and again, batch size does not need to be specified. The output layer can then be specified to be the same shape as your expected number of outputs.
Also, as a side note, I would recommend that you split your data into batches (anywhere between 10 and 200 samples, depending on memory and performance), rather than putting in the entire 11k samples at once!
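On the edit: a sigmoid output is a per-label probability, so values hovering around 0.3-0.5 simply mean the model is still uncertain (more training usually sharpens them). The standard way to get the 0/1 matrix is exactly the thresholding you suggest, conventionally at 0.5. A minimal sketch, assuming the trained model and x_train from the question:

probs = model.predict(x_train)          # shape (11645, 80), values in (0, 1)
binary = (probs > 0.5).astype('int32')  # threshold each of the 80 labels independently
print(binary[0])                        # an 80-element vector of 0s and 1s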

Python - Break numpy array into positive and negative components

I have numpy arrays of shape (600,600,3), where the values lie in [-1.0, 1.0]. I would like to expand the array to (600,600,6), where the original values are split into the amounts above and below 0. Here are some example (1,1,3) arrays, where the function foo() does the trick:
>>> a = [-0.5, 0.2, 0.9]
>>> foo(a)
[0.0, 0.5, 0.2, 0.0, 0.9, 0.0] # [positive component, negative component, ...]
>>> b = [1.0, 0.0, -0.3] # notice the behavior of 0.0
>>> foo(b)
[1.0, 0.0, 0.0, 0.0, 0.0, 0.3]
Use slicing to assign the positive and negative parts to alternating positions in the output array:
In [33]: a = np.around(np.random.random((2,2,3))-0.5, 1)
In [34]: a
Out[34]:
array([[[-0.1, 0.3, 0.3],
[ 0.3, -0.2, -0.1]],
[[-0. , -0.2, 0.3],
[-0.1, -0. , 0.1]]])
In [35]: out = np.zeros((2,2,6))
In [36]: out[:,:,::2] = np.maximum(a, 0)
In [37]: out[:,:,1::2] = np.maximum(-a, 0)
In [38]: out
Out[38]:
array([[[ 0. , 0.1, 0.3, 0. , 0.3, 0. ],
[ 0.3, 0. , 0. , 0.2, 0. , 0.1]],
[[-0. , 0. , 0. , 0.2, 0.3, 0. ],
[ 0. , 0.1, -0. , 0. , 0.1, 0. ]]])
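For the full (600,600,3) case, the same slicing idea packages neatly into the foo() from the question. A sketch (foo and its interleaved layout are the question's convention, not a library function):

import numpy as np

def foo(a):
    # split (..., C) values into interleaved positive/negative parts, giving (..., 2*C)
    a = np.asarray(a, dtype=float)
    out = np.zeros(a.shape[:-1] + (2 * a.shape[-1],))
    out[..., ::2] = np.maximum(a, 0)    # amount above 0
    out[..., 1::2] = np.maximum(-a, 0)  # amount below 0, as a positive number
    return out

print(foo([-0.5, 0.2, 0.9]))  # [0.  0.5 0.2 0.  0.9 0. ]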

np.array returning numpy.ndarray with "..."

I created a script to generate a list:
import random

nota1 = range(5, 11)
nota2 = range(5, 11)
nota3 = range(5, 11)
nota4 = range(0, 2)
dados = []
for i in range(1000):
    dados_dado = []
    n1 = random.choice(nota1)
    n2 = random.choice(nota2)
    n3 = random.choice(nota3)
    n4 = random.choice(nota4)
    n1 = float(n1)
    n2 = float(n2)
    n3 = float(n3)
    n4 = float(n4)
    dados_dado.append(n1)
    dados_dado.append(n2)
    dados_dado.append(n3)
    dados_dado.append(n4)
    dados.append(dados_dado)
When I print type(dados), Python returns <type 'list'>, a huge list that looks like this:
[[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.0], [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0], [6.0, 9.0, 8.0, 1.0]]
I need to transform it into <type 'numpy.ndarray'>, so I did:
data = np.array(dados)
What I expected it to return was something like this:
[[ 6.8 3.2 5.9 2.3]
[ 6.7 3.3 5.7 2.5]
[ 6.7 3. 5.2 2.3]
[ 6.3 2.5 5. 1.9]
[ 6.5 3. 5.2 2. ]
[ 6.2 3.4 5.4 2.3]
[ 5.9 3. 5.1 1.8]]
But what I get instead is:
[[ 7. 10. 6. 1.]
[ 8. 6. 6. 1.]
[ 6. 9. 5. 0.]
...,
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
What am I doing wrong?
With your sample:
In [574]: dados = [[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.0],
     ...:          [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0],
     ...:          [6.0, 9.0, 8.0, 1.0]]
In [575]: print(dados)
[[5.0, 8.0, 10.0, 1.0], [8.0, 9.0, 9.0, 1.0], [7.0, 5.0, 6.0, 1.0], [5.0, 8.0, 7.0, 0.0], [9.0, 7.0, 10.0, 0.0], [6.0, 7.0, 9.0, 1.0], [6.0, 9.0, 8.0, 1.0]]
Convert it to an array and see the whole thing. Your input values didn't have decimals, so the numpy display omits them:
In [576]: print(np.array(dados))
[[ 5. 8. 10. 1.]
[ 8. 9. 9. 1.]
[ 7. 5. 6. 1.]
[ 5. 8. 7. 0.]
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
Replicate the list many times, and the printed display has this ..., rather than showing thousands of lines. That's nice, isn't it?
In [577]: print(np.array(dados*1000))
[[ 5. 8. 10. 1.]
[ 8. 9. 9. 1.]
[ 7. 5. 6. 1.]
...,
[ 9. 7. 10. 0.]
[ 6. 7. 9. 1.]
[ 6. 9. 8. 1.]]
The full array is still there:
In [578]: np.array(dados*1000).shape
Out[578]: (7000, 4)
By default, numpy summarizes with the ellipsis when the total number of entries exceeds 1000. Do you really need to see all those lines?
That print threshold can be changed, but I question whether you need to do that.
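If you really do want the full printout, the option to change is the threshold (a sketch; newer NumPy versions also accept np.inf, and sys.maxsize works as a large integer):

import sys
import numpy as np

np.set_printoptions(threshold=sys.maxsize)  # never summarize with ...
print(np.array(dados * 1000))               # prints all 7000 rows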
Your array is fine. NumPy just suppresses display of the whole array for large arrays by default.
(If you actually were expecting your array to be short enough not to trigger this behavior, or if you were actually expecting it to have non-integer entries, you'll have to explain why you expected that.)
numpy.set_printoptions(precision=20)
will give you more displayed digits; set the precision as you desire.
