Save Image dataset into CSV - python

I have an Image dataset consisting of 90k images of size [64,64,3].
I have done some preprocessing to the images, which takes a lot of time if I have to do it from scratch.
Now, how do I store these images, as a NumPy array of shape [90000,64,64,3], in a CSV file as integers, along with their labels?
Is there any other way (other file type) to store this data?
P.S.: I tried np.savetxt, but when I read the data back I get strings with dots, and a lot of the values are lost.
Thank you.

Found it!
We can use
np.save()
to save the array in .npy format and load the file using
np.load()
Also, multiple NumPy arrays can be saved into a single file using
np.savez()
or
np.savez_compressed()
which write an uncompressed .npz file and a compressed .npz file, respectively.
Cool!
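A minimal sketch of this approach, assuming the images are held as a uint8 array and the labels as a 1-D integer array. The random data, the file name dataset.npz, and the key names images and labels are placeholders, and 100 samples stand in for the 90k images to keep it quick.

```python
import os
import tempfile

import numpy as np

# Placeholder data: 100 images instead of 90,000.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 64, 64, 3), dtype=np.uint8)
labels = rng.integers(0, 10, size=100, dtype=np.int64)

# Write both arrays to one compressed .npz archive.
path = os.path.join(tempfile.mkdtemp(), "dataset.npz")
np.savez_compressed(path, images=images, labels=labels)

# Load them back by the keyword names used when saving.
with np.load(path) as data:
    loaded_images = data["images"]
    loaded_labels = data["labels"]

print(loaded_images.shape, loaded_images.dtype)
```

Shape and dtype survive the round trip, which is the part that goes wrong with text output. If a CSV file is still required, note that np.savetxt only accepts 1-D or 2-D arrays and writes values with the float format '%.18e' by default, so the array would need to be reshaped to 2-D and saved with fmt='%d'.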

Related

How to save feature values extracted from images in csv format using python?

I have extracted some features from images and now I want to save these values in a CSV file. I have looked at some related questions, such as (When using OpenCV's 2D feature detection on an image, how do I export the data as a CSV for external use?) and (how to save feature matrix as csv file), but I have not found a solution to my problem.
How can I write these features to a CSV file in Python?
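One possible sketch, assuming the features have already been collected into a 2-D NumPy array with one row per image. The feature values, the column names, and the file name features.csv are made up for illustration; the question does not say how the features were extracted.

```python
import os
import tempfile

import numpy as np

# Hypothetical feature matrix: 5 images, 3 features per image.
rng = np.random.default_rng(0)
features = rng.random((5, 3))

# Write one row per image, with a header line naming the columns.
csv_path = os.path.join(tempfile.mkdtemp(), "features.csv")
np.savetxt(
    csv_path,
    features,
    delimiter=",",
    fmt="%.6f",
    header="feat_0,feat_1,feat_2",
    comments="",  # keep the header free of the default "# " prefix
)

# Read the values back, skipping the header line.
restored = np.loadtxt(csv_path, delimiter=",", skiprows=1)
print(restored.shape)
```

The fmt argument controls how many digits are written, so the values read back match the originals only to that precision.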

How to extract individual JPEG images from a HDF5 file

I have a big HDF5 file with images and their corresponding ground-truth density maps.
I want to put them into the network CRSNet and it requires the images in separate files.
How can I achieve that? Thank you very much.
-- Basic info I have a HDF5 file with two keys "images" and "density_maps". Their shapes are (300, 380, 676, 1).
300 stands for the number of images, 380 and 676 refer to the height and width respectively.
-- What I need to put into the CRSNet network are the images (jpg) with their corresponding HDF5 files. The shape of them would be (572, 945).
Thanks a lot for any comment and discussion!
For starters, a quick clarification on h5py and HDF5. h5py is a Python package to read HDF5 files. You can also read HDF5 files with the PyTables package (and with other languages: C, C++, FORTRAN).
I'm not entirely sure what you mean by "the images (jpg) with their corresponding h5py (HDF5) files". As I understand it, all of your data is in one HDF5 file. Also, I don't understand what you mean by "The shape of them would be (572, 945)." This is different from the image data, right? Please update your post to clarify these items.
It's relatively easy to extract data from a dataset. This is how you can get the "images" as NumPy arrays and use cv2 to write them as individual jpg files. See code below:
import cv2
import h5py

with h5py.File('yourfile.h5', 'r') as h5f:
    for i in range(h5f['images'].shape[0]):
        img_arr = h5f['images'][i, :]  # slice notation gets [i,:,:,:]
        cv2.imwrite(f'test_img_{i:03}.jpg', img_arr)
Before you start coding, are you sure you need the images as individual image files, or individual image data (usually NumPy arrays)? I ask because the first step in most CNN processes is reading the images and converting them to arrays for downstream processing. You already have the arrays in the HDF5 file. All you may need to do is read each array and save to the appropriate data structure for CRSNet to process them. For example, here is the code to create a list of arrays (used by TensorFlow and Keras):
import h5py

image_list = []
with h5py.File('yourfile.h5', 'r') as h5f:
    for i in range(h5f['images'].shape[0]):
        image_list.append(h5f['images'][i, :])  # gets slice [i,:,:,:]

Convert folder with .npz file to one .npz

I have a folder containing multiple subfolders, each containing .npz files that each correspond to a song track. I want to combine these into one single .npz or .npy file with the shape (n_of_songs, x, y, z), but I can't seem to find a way. Am I looking at this the wrong way? Should I just iterate over the whole folder and concatenate the .npz files (a kind of brute-force approach), or is there a way to convert the directory itself to a .npz? Thanks
With savez_compressed you are able to save multiple numpy arrays into a single file. See the documentation.
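A rough sketch of the iterate-and-stack route, under the assumptions that every .npz file stores a single array under the same key and that all arrays share one shape (np.stack needs that). The folder layout, the key name features, and the output name all_songs.npz are placeholders, and the script first builds a small throwaway folder tree so it can run on its own.

```python
import glob
import os
import tempfile

import numpy as np

# Build a throwaway folder tree with one .npz per "song" (placeholder data).
root = tempfile.mkdtemp()
for album in ("album_a", "album_b"):
    os.makedirs(os.path.join(root, album))
    for track in range(3):
        arr = np.random.rand(4, 5, 6)
        np.savez(os.path.join(root, album, f"track_{track}.npz"), features=arr)

# Collect every .npz under the root folder, in a stable order.
paths = sorted(glob.glob(os.path.join(root, "**", "*.npz"), recursive=True))

# Load the assumed "features" array from each file.
tracks = []
for p in paths:
    with np.load(p) as data:
        tracks.append(data["features"])

# Stack along a new leading axis: (n_of_songs, x, y, z).
combined = np.stack(tracks)

# Save the combined array to a single compressed file.
out_path = os.path.join(root, "all_songs.npz")
np.savez_compressed(out_path, songs=combined)
print(combined.shape)  # (6, 4, 5, 6)
```

If the tracks differ in length, they would have to be padded or cropped to a common shape first, or saved as separate named arrays within the one .npz file.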

How do you open .NPY files?

How do I open .NPY files in python so that I can read them?
I've been trying to run some code I found, but it writes its output to .NPY files, so I can't tell if it's working.
*.npy files are binary files to store numpy arrays. They are created with
import numpy as np
data = np.random.normal(0, 1, 100)
np.save('data.npy', data)
And read in like
import numpy as np
data = np.load('data.npy')
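To check what a loaded file actually contains, it may help to print a few basic properties of the array. A small sketch, where the file is written to a temporary folder only so the example runs on its own:

```python
import os
import tempfile

import numpy as np

# Create a sample .npy file (stand-in for the file your code produced).
path = os.path.join(tempfile.mkdtemp(), "data.npy")
np.save(path, np.random.normal(0, 1, 100))

# Load it and inspect the result.
data = np.load(path)
print(type(data))              # the loaded object is a NumPy array
print(data.shape, data.dtype)  # dimensions and element type
print(data[:5])                # first few values
```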
Late reply but I think NPYViewer is a tool that can help you, as it allows you to quickly visualize the contents of .npy files without having to write code. It also has options to visualize 2D .npy arrays as grayscale images as well as 3D point clouds.
Reference: https://github.com/csmailis/NPYViewer

Loading and saving raw images

I'm looking to be able to read in pixel values as captured in a raw NEF image, process the data for noise removal, and then save the new values back into the raw image format maintaining all the metadata for later use. I've seen dcraw can read in raw format and output the Bayer pattern data as a tiff or other image but I can't save it back to my NEF. I've also been attempting to read in and save the image with simple python file open or numpy memmap but have no clue how to handle the binary data. Any help would be appreciated. Thanks!
