Transforming a DataFrame into a Keras dataset

I have a Pandas DataFrame containing about 10k 100x100 grayscale images (each stored as a (100,100,1) numpy array), along with a label for each of them (a string, which is someone's name). I want to turn it into a Keras dataset.
I read that I can simply use dataset = tensorflow.data.Dataset.from_tensor_slices(dict(dataframe)), but it doesn't work, maybe because each image is a 3D numpy array? Should I reshape each 100x100 image into a 10,000-element array?
I'm trying to construct my dataset so that it is similar to CIFAR10.

If you can separate the labels from the images and then pass the images and the labels together as a tuple to the from_tensor_slices function, it should work:
dataset = tf.data.Dataset.from_tensor_slices((images,labels))
I tried to recreate your issue by generating random values of shape (10000,100,100,1) as the image dataset, and it worked fine on my end.
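One likely reason dict(dataframe) fails is that a column holding one array per row is an object column, which cannot be converted to a tensor directly. The sketch below shows one way to prepare the two inputs for the tuple form above, using NumPy only. The lists stand in for hypothetical DataFrame columns such as df["image"] and df["label"] (those column names are assumptions), and frame_columns_to_arrays is an illustrative helper, not part of any library.

```python
import numpy as np

def frame_columns_to_arrays(image_column, label_column):
    """Stack per-row images into one array and encode string labels as integers."""
    # np.stack turns N arrays of shape (100, 100, 1) into one (N, 100, 100, 1) array
    images = np.stack(list(image_column)).astype("float32")
    # np.unique returns the sorted class names and, for each row, its class index
    class_names, label_ids = np.unique(np.asarray(label_column), return_inverse=True)
    return images, label_ids, class_names

# Stand-ins for df["image"] and df["label"] (hypothetical column names)
rng = np.random.default_rng(0)
image_column = [rng.random((100, 100, 1)) for _ in range(6)]
label_column = ["alice", "bob", "alice", "carol", "bob", "alice"]

images, label_ids, class_names = frame_columns_to_arrays(image_column, label_column)
print(images.shape)   # (6, 100, 100, 1)
print(label_ids)      # integer class index per image
print(class_names)    # sorted unique names
```

The resulting (images, label_ids) pair should then be usable as the tuple passed to tf.data.Dataset.from_tensor_slices, which slices along the first axis in the same way CIFAR10-style arrays are laid out.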

Related

Numpy Extract Data from Compressed Sparse Column Format

I have a mat file with sparse data for around 7000 images with 512x512 dimensions stored in a flattened format (so rows of 262144), and I'm using scipy's loadmat method to load this sparse data in Compressed Sparse Column format. Each image contains a smaller image, usually around 25x25 pixels, somewhere inside the 512x512 region, though the actual size of the smaller image is not consistent and changes from image to image. I want to take the sparse data and turn it into a numpy array containing only the smaller image; so if I have a 512x512 image with a circle in a 20x20 area in the center, I want just the 20x20 area with the circle and not the rest of the 512x512 image. I know that I can use .A to convert the image to a dense 512x512 numpy array, but this option isn't ideal for my RAM.
Is there a way to extract the smaller images stored in a sparse format without turning the sparse data into dense data?
I tried turning the sparse data into dense data, reshaping it into a 512x512 image, and then writing a program to find the top, bottom, left, and right edges of the smaller image by checking for the first occurrence of data from each side, but this whole process seemed horribly inefficient.

Sorry about how little information I provided; I ended up figuring it out. When scipy's loadmat function reads sparse data from a mat file, it returns a csc_matrix (scipy's compressed sparse column format). That format has a .nonzero() method that returns the indices of every non-zero element in the matrix. I reshaped the sparse matrix for one image into 512x512 and called .nonzero() to get the non-zero positions in 2D, then used those indices to work out the height and width of the smaller image I was interested in. Finally, I created a numpy array of zeros the size of that smaller image and filled it with the pixels I wanted by indexing into the sparse matrix (after calling .tocsr() on it).
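A minimal sketch of the same idea, written with NumPy only: it assumes the flat positions and values of one image's non-zero pixels have already been pulled out of the sparse matrix (for a CSR matrix m with one image per row, m.indices[m.indptr[i]:m.indptr[i+1]] and the matching slice of m.data would hold them). The function name and the synthetic data are hypothetical, and this variant works from flat indices rather than reshaping the sparse matrix as described above.

```python
import numpy as np

def crop_from_flat_nonzeros(flat_indices, values, width=512):
    """Rebuild only the bounding box of the non-zero pixels of one flattened image.

    Assumes row-major flattening and at least one non-zero pixel.
    """
    values = np.asarray(values)
    # Convert flat positions back to (row, column) without building the full image
    rows, cols = np.divmod(np.asarray(flat_indices), width)
    top, left = rows.min(), cols.min()
    box_height = rows.max() - top + 1
    box_width = cols.max() - left + 1
    # Allocate only the small patch and write the stored values into it
    patch = np.zeros((box_height, box_width), dtype=values.dtype)
    patch[rows - top, cols - left] = values
    return patch, (int(top), int(left))

# Synthetic example: four non-zero pixels near row 250, column 300
flat_indices = np.array([250 * 512 + 300, 250 * 512 + 301,
                         251 * 512 + 300, 252 * 512 + 301])
values = np.array([5, 6, 7, 8], dtype=np.uint8)

patch, origin = crop_from_flat_nonzeros(flat_indices, values)
print(patch.shape)  # (3, 2)
print(origin)       # (250, 300)
```

Memory use here scales with the number of non-zero pixels and the size of the patch, not with the 512x512 frame.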

Resize images in a numpy array of 4 dimensions

I am working on GANs, and I have a numpy array of images with shape (26600,256,256,3). Is there a method that can resize the images inside that numpy array so that the output is a numpy array of shape (26600,64,64,3)?
PS. Don't take the actual sizes I provided into consideration; I just want to know a method that avoids resizing the images manually. Thank you.

Resize CSV data using Python and Keras

I have CSV files that I need to feed to a Deep-Learning network. Currently my CSV files are of size 360*480, but the network restricts them to be of size 224*224. I am using Python and Keras for the deep-learning part. So how can I resize the matrices?
I was thinking that, since the aspect ratio is 3:4, I could resize them to 224:(224*4/3), which is roughly 224:299, and then crop the width of the matrix to 224. But I cannot find a suitable function to do that. Please suggest one.

I think you're looking for OpenCV's resize() (cv2.resize()) if your data are images.
If not, try numpy.ndarray.resize().

Image processing
If you want to do nontrivial alterations to the data as images (i.e. interpolating between pixel values, assuming that they represent photographs), then you might want to use a proper image processing library for that. You'd need to treat them not as raw matrices (CSV of numbers) but convert them to images, do the transformations you want, and convert them back to a numpy matrix.
OpenCV (https://docs.opencv.org/3.4/da/d6e/tutorial_py_geometric_transformations.html)
or Pillow (https://pillow.readthedocs.io/en/3.1.x/reference/Image.html) might be useful to do that.

I found a short and simple way to solve this. It uses the Python Imaging Library (Pillow).
import csv
import numpy as np
from PIL import Image
matrix = np.array(list(csv.reader(open('./path/mat.csv', "r"), delimiter=","))).astype("uint8") #read csv
imgObj = Image.fromarray(matrix) #convert matrix to Image object
resized_imgObj = imgObj.resize((224,224)) #resize Image object
imgObj.show()
resized_imgObj.show()
resized_matrix = np.asarray(resized_imgObj) #convert resized Image object back to matrix
The numpy module also has a resize function, but it is not as useful as the approach above.
When I tried it, the resized matrix had lost all the detail of the original matrix. This is probably because numpy.ndarray.resize doesn't interpolate, and missing entries are filled with zeros.
So, for this case, Image.resize() is more useful.
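A small NumPy experiment illustrates why numpy's resize functions are a poor fit for image scaling: they only reinterpret the flat sequence of values, truncating, zero-filling, or repeating it. The array contents here are made up for illustration.

```python
import numpy as np

original = np.arange(12).reshape(3, 4)

# ndarray.resize works in place on the flat data: shrinking keeps the first values
shrunk = original.copy()
shrunk.resize((2, 2), refcheck=False)
print(shrunk)   # [[0 1] [2 3]], i.e. the start of row 0, not a scaled-down image

# Growing with ndarray.resize pads the missing entries with zeros
grown = original.copy()
grown.resize((4, 4), refcheck=False)
print(grown[3])  # [0 0 0 0]

# The np.resize function instead repeats the flat data to fill the new shape
tiled = np.resize(original, (4, 4))
print(tiled[3])  # [0 1 2 3]
```

In none of the three cases are neighbouring pixel values combined, which is what an image library's resize does.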

You could also convert the csv file to a list, truncate the list, and then convert the list to a numpy array and then use np.reshape.
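The resize-then-crop plan from the question (360x480 to 224x299, then crop the width to 224) can be sketched in plain NumPy. Note the swap: this uses nearest-neighbour sampling via index arrays, not the interpolation that Pillow or OpenCV would perform, and both helper names are hypothetical.

```python
import numpy as np

def nearest_resize(matrix, new_h, new_w):
    """Resize a 2D matrix by picking the nearest source row and column (no interpolation)."""
    h, w = matrix.shape
    row_idx = np.arange(new_h) * h // new_h
    col_idx = np.arange(new_w) * w // new_w
    # Broadcasting the row indices against the column indices selects a new_h x new_w grid
    return matrix[row_idx[:, None], col_idx]

def center_crop_width(matrix, new_w):
    """Keep the central new_w columns of a 2D matrix."""
    left = (matrix.shape[1] - new_w) // 2
    return matrix[:, left:left + new_w]

# Stand-in for a 360x480 matrix loaded from a CSV file
matrix = np.arange(360 * 480, dtype=np.float32).reshape(360, 480)

resized = nearest_resize(matrix, 224, 299)   # keeps the 3:4 aspect ratio
cropped = center_crop_width(resized, 224)    # drops columns evenly from both sides
print(resized.shape)  # (224, 299)
print(cropped.shape)  # (224, 224)
```

Cropping discards roughly a quarter of the columns, so whether this is acceptable depends on where the useful signal sits in the matrix.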

Loading 3D Model but getting 2D Array in Python

I've downloaded a sample .stl file from here: [https://www.thingiverse.com/thing:156207]
Then I've used this code to get a numpy array for further image processing with matplotlib:
import sys
import numpy as np
from stl import mesh
np.set_printoptions(threshold=sys.maxsize)  # print full arrays; threshold must be a number, not np.nan
# Using an existing stl file:
your_mesh = mesh.Mesh.from_file('300_polygon_sphere_100mm.stl')
data = np.array(your_mesh)
print(data.shape)
Unfortunately, this is an array with only two dimensions. I've checked the .stl file with my editor and there are three dimensions.
Can someone help me? My goal is to write code with which I can slice 3D models to get access to the sliced 2D images.
Thanks.
EDIT: I've tried to reshape it:
data_reshaped = np.reshape(data, (550, 3, 3))
But I guess this is totally wrong, and I don't know if the pattern is (Z, X, Y).
I want to do some slicing operations on the 3D array to get XY images, as is done very easily in this video: https://www.youtube.com/watch?v=5jQVQE6yfio&list=PLT66ZlnovHPYzny9TYM1mx02k5Xnw_kjw&t=215s&index=3

You won't be able to just load the .stl file into a numpy array and perform slicing as shown in the video you linked. In the video, they load a model that is stored as a 3D numpy array.
However, the model you are trying to load consists of a polygonal mesh. This means you only have the coordinate values of the vertices. You can open the .stl file in a text editor to see its contents. (By converting the loaded mesh into a numpy array you just extract those coordinate values. You can compare the values in the numpy array and the text file; they are the same.) The resulting numpy array has shape (550, 9). The first dimension is the number of faces in the model (in this case, the model has 550 faces). Each face has three vertices with three coordinate values each, so you have 9 numbers per face. So the third dimension is not lost. It's just stored in a different manner.
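To make the (550, 9) layout concrete, here is a small sketch with random numbers standing in for np.array(your_mesh). It assumes each row is ordered as vertex 0 (x, y, z), vertex 1 (x, y, z), vertex 2 (x, y, z), which matches the description above but should be checked against the actual file.

```python
import numpy as np

n_faces = 550
rng = np.random.default_rng(0)
# Stand-in for np.array(your_mesh): one row of 9 coordinates per triangular face
flat_faces = rng.random((n_faces, 9)).astype(np.float32)

# View the same numbers as (face, vertex, coordinate)
triangles = flat_faces.reshape(n_faces, 3, 3)
print(triangles.shape)  # (550, 3, 3)

# Vertex 1 of face 0 is simply columns 3..5 of row 0
print(np.array_equal(triangles[0, 1], flat_faces[0, 3:6]))  # True

# All vertices as an (N, 3) point list, and the model's axis-aligned bounding box
points = triangles.reshape(-1, 3)
bbox_min, bbox_max = points.min(axis=0), points.max(axis=0)
print(points.shape)  # (1650, 3)
```

The axes of the reshaped array are therefore (face, vertex, xyz), not (Z, X, Y): it is still a list of triangles, not a volume of voxels.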
Simply reshaping the array won't give you a model that you can take slices of, as shown in the video. To achieve this, you have to convert the meshed model into a rasterized one. You could do this by initializing an empty 3D array large enough to contain the whole model and then determining, for each voxel (3D pixel), whether it intersects the geometry of the mesh you loaded.

Numpy Concatenate Images into Array

I have a bunch of images that I want to store into an array.
The problem is that all my images are different sizes, and I don't necessarily want to change their size, because some are square and some aren't.
I tried using np.concatenate, but someone online said it was better to construct a zero matrix and fill it.
However, when I read an image with
image = misc.imread(filename)
from the scipy library, the image is returned as a 3-dimensional array. How should I construct my numpy ndarray if I want to store all the images in it?

If I'm understanding the question correctly, you are trying to store a bunch of images of different sizes that are each stored as separate numpy arrays. If your images are grayscale (meaning 2D, as opposed to RGB images, which are 3D with a channel each for R, G and B), you could stack the images along a third dimension, filling in the absent pixels with 0s. But the best way would be to just use a Python list (or maybe a tuple) that holds your numpy array images. That way they can be different sizes, e.g.: img_list = [img1, img2, img3].

Storing them in a list may be easier. The list will hold them as array objects, and their size won't matter; when you do operations on them, just reference the list elements.
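Both options from the answers can be sketched briefly for grayscale images. The helper pad_and_stack is hypothetical, the image sizes are made up, and the sketch stacks along the first axis rather than the third.

```python
import numpy as np

def pad_and_stack(images):
    """Zero-pad 2D grayscale images to a common size and stack them into one 3D array."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    stacked = np.zeros((len(images), max_h, max_w), dtype=images[0].dtype)
    for i, img in enumerate(images):
        h, w = img.shape
        stacked[i, :h, :w] = img  # image sits in the top-left corner, zeros elsewhere
    return stacked

# Option 1: a plain list keeps every image at its own size
img_list = [np.ones((2, 3), dtype=np.uint8),
            np.ones((4, 2), dtype=np.uint8),
            np.ones((3, 3), dtype=np.uint8)]
print([img.shape for img in img_list])  # [(2, 3), (4, 2), (3, 3)]

# Option 2: one zero-padded array, at the cost of extra memory for the padding
stacked = pad_and_stack(img_list)
print(stacked.shape)  # (3, 4, 3)
```

The list keeps the original sizes recoverable for free, whereas the padded array would need the shapes stored separately to tell padding from genuinely black pixels.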
