How do I open .npy files in Python so that I can read them?
I've been trying to run some code I found, but it outputs .npy files, so I can't tell whether it's working.
*.npy files are binary files used to store NumPy arrays. They are created with
import numpy as np
data = np.random.normal(0, 1, 100)
np.save('data.npy', data)
And read in like
import numpy as np
data = np.load('data.npy')
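Once loaded, the result is an ordinary NumPy array, so you can inspect it directly; for very large files you may prefer memory-mapping. A minimal sketch, reusing the data.npy from above:
import numpy as np

data = np.load('data.npy')
print(data.shape, data.dtype)  # quick sanity check of the contents

# For very large files, memory-map instead of reading everything into RAM
data_mm = np.load('data.npy', mmap_mode='r')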
Late reply but I think NPYViewer is a tool that can help you, as it allows you to quickly visualize the contents of .npy files without having to write code. It also has options to visualize 2D .npy arrays as grayscale images as well as 3D point clouds.
Reference: https://github.com/csmailis/NPYViewer
I have a big HDF5 file with images and their corresponding ground-truth density maps.
I want to feed them into the CRSNet network, which requires the images as separate files.
How can I achieve that? Thank you very much.
-- Basic info: I have an HDF5 file with two keys, "images" and "density_maps". Their shapes are (300, 380, 676, 1).
300 stands for the number of images, 380 and 676 refer to the height and width respectively.
-- What I need to put into the CRSNet network are the images (jpg) together with their corresponding HDF5 files. Their shape would be (572, 945).
Thanks a lot for any comment and discussion!
For starters, a quick clarification on h5py and HDF5. h5py is a Python package to read HDF5 files. You can also read HDF5 files with the PyTables package (and with other languages: C, C++, FORTRAN).
I'm not entirely sure what you mean by "the images (jpg) with their corresponding h5py (HDF5) files", since as I understand it all of your data is in 1 HDF5 file. Also, I don't understand what you mean by "The shape of them would be (572, 945)." That is different from the image data, right? Please update your post to clarify these items.
It's relatively easy to extract data from a dataset. This is how you can get the "images" as NumPy arrays and use cv2 to write them as individual jpg files. See the code below:
import h5py
import cv2

with h5py.File('yourfile.h5', 'r') as h5f:
    for i in range(h5f['images'].shape[0]):
        img_arr = h5f['images'][i, :]  # slice notation gets [i,:,:,:]
        cv2.imwrite(f'test_img_{i:03}.jpg', img_arr)
Before you start coding, are you sure you need the images as individual image files, or individual image data (usually NumPy arrays)? I ask because the first step in most CNN processes is reading the images and converting them to arrays for downstream processing. You already have the arrays in the HDF5 file. All you may need to do is read each array and save to the appropriate data structure for CRSNet to process them. For example, here is the code to create a list of arrays (used by TensorFlow and Keras):
import h5py

image_list = []
with h5py.File('yourfile.h5', 'r') as h5f:
    for i in range(h5f['images'].shape[0]):
        image_list.append(h5f['images'][i, :])  # gets slice [i,:,:,:]
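If CRSNet (or Keras/TensorFlow) expects a single 4-D array rather than a list, you can stack the list afterwards. A minimal sketch, assuming the image_list built above:
import numpy as np

# Stack the (380, 676, 1) arrays into one (N, 380, 676, 1) array
image_array = np.stack(image_list, axis=0)
print(image_array.shape)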
I have an image dataset consisting of 90k images of size [64, 64, 3].
I have done some preprocessing on the images, which takes a lot of time if I have to redo it from scratch.
Now, how do I store these images, as a NumPy array of shape [90000, 64, 64, 3], into a CSV file as integers, along with their labels?
Is there any other way (other file type) to store this data?
P.S.: I tried np.savetxt, but when I read the data back I get strings with dots, and a lot of the values are lost.
Thank you.
Found it!!
We can use np.save() to save an array in the .npy format and load the file back with np.load(). Multiple NumPy arrays can also be saved with np.savez() or np.savez_compressed(), which store them in .npz and compressed .npz format respectively.
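For example, here is a minimal sketch of saving the images together with their labels in one compressed .npz file; the array names images and labels (and the file name dataset.npz) are just illustrative:
import numpy as np

# Hypothetical preprocessed data: 90k RGB images plus integer labels
images = np.zeros((90000, 64, 64, 3), dtype=np.uint8)
labels = np.zeros(90000, dtype=np.int64)

# Save both arrays under named keys in a single compressed file
np.savez_compressed('dataset.npz', images=images, labels=labels)

# Load them back by key
data = np.load('dataset.npz')
images, labels = data['images'], data['labels']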
I am trying to implement the package:
https://pyradiomics.readthedocs.io/en/latest/usage.html
It looks super simple, but they expect .nrrd files.
My files are .nii.gz. How do I solve this?
Also, has anyone tried to apply PyRadiomics to TCIA data? If so, can I see your GitHub or Jupyter Notebook?
Thanks a lot.
You could first turn the NII file into a NumPy array and then write it out as NRRD, using the nrrd and nibabel packages:
import numpy as np
import nibabel as nib
import nrrd

# Load the NIfTI image
example_filename = "image.nii.gz"
image = nib.load(example_filename)

# Turn it into a numpy array
array = np.array(image.dataobj)

# Save it as NRRD
nrrd_path_to = "image.nrrd"
nrrd.write(nrrd_path_to, array)
Although the examples are in .nrrd, PyRadiomics uses SimpleITK for image operations. This allows PyRadiomics to support a whole range of image formats, including .nii.gz, so you don't have to convert them.
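For example, a minimal sketch of extracting features directly from a .nii.gz image and mask with the default extractor (the file names here are placeholders):
from radiomics import featureextractor

# Default feature extractor; image and mask paths below are placeholders
extractor = featureextractor.RadiomicsFeatureExtractor()
features = extractor.execute("image.nii.gz", "mask.nii.gz")

for name, value in features.items():
    print(name, value)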
The DWIConverter converts diffusion-weighted MR images in DICOM series into NRRD format for analysis in Slicer. It parses the DICOM header to extract the necessary information about the measurement frame, diffusion-weighting directions, b-values, etc., and writes out an NRRD image. For non-diffusion-weighted DICOM images, it loads an entire DICOM series and writes out a single DICOM volume in a .nhdr/.raw pair.
So converting your .nii.gz data to the NRRD format is a possibility using this tool. You can also look at SlicerDMRI, which is a similar module.
We are using TensorFlow and python to create a custom CNN that will classify images into one of several categories. We have created our CNN based on this tutorial: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/layers/cnn_mnist.py
Instead of reading in a pre-existing dataset like the MNIST dataset used in the tutorial, we would like to read in all images from multiple folders. The name of each folder is the label associated with all the images in that folder. Unfortunately we're very new to Python and TensorFlow; could someone point us in the right direction, either with a tutorial or some basic code?
Thank you so much!
Consider using the glob package. It allows you to import multiple files in subdirectories easily using patterns: https://docs.python.org/2/library/glob.html
import glob
import matplotlib.pyplot as plt
import numpy as np
images = glob.glob(<file pattern>)
img_list = [plt.imread(image) for image in images]
img_array = np.stack(tuple(img_list))
I haven't tested this, so there might be errors, but it should make a 3-D NumPy array of images (each image a 2-D array). Is this the format you were looking for? For deriving the labels from the folder names, see the sketch below.
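Since the folder name is the label, one possible sketch is to glob one level deeper and take each file's parent directory name as its label; the data root directory and the *.jpg extension here are assumptions:
import glob
import os

import matplotlib.pyplot as plt
import numpy as np

# Assumed layout: data/<label_name>/*.jpg
image_paths = glob.glob(os.path.join("data", "*", "*.jpg"))

img_array = np.stack([plt.imread(p) for p in image_paths])
# The parent folder name of each file is its label
labels = [os.path.basename(os.path.dirname(p)) for p in image_paths]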
I am working on a project where I need to extract the Mel-Cepstral Frequency Coefficients (MFCC) from audio signals. The first step for this process is to read the audio file into Python.
The audio files I have are stored in a .sph format. I am unable to find a method to read these files directly into Python. I would like to have the sampling rate, and a NumPy array with the data, similar to how wav read works.
Since the audio files I will be dealing with are large in size, I would prefer not to convert to .wav format for reading. Could you please suggest a possible method to do so?
I was against converting to a .wav file as I assumed it would take a lot of time. That is not the case. So, converting using SoX suited my needs.
The following script, when run in a Windows folder, converts all the .sph files in that folder to .wav files.
cd %~dp0
for %%a in (*.sph) do sox "%%~a" "%%~na.wav"
pause
After this, the following command can be used to read the file.
import scipy.io.wavfile as wav
(rate,sig) = wav.read("file.wav")
Based on Ben's answer, I was able to read a .sph file with librosa, as it can read everything that audioread and ffmpeg can read.
import os
import librosa
import librosa.display  # you need this in librosa to be able to plot
import matplotlib.pyplot as plt

clip_dir = os.path.join("..", "babel", "LDC2016S10.sph")
audio,sr = librosa.load(clip_dir,sr=16000) # audio is a numpy array
fig, ax = plt.subplots(figsize=(15,8))
librosa.display.waveplot(audio, sr=sr, ax=ax)
ax.set(title="LDC2016S10.sph waveform")
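Since the end goal is MFCCs, once the audio is loaded as a NumPy array you can compute them with librosa as well. A minimal sketch, assuming the audio and sr from the load above (n_mfcc=13 is just an illustrative choice):
# audio and sr come from librosa.load above
mfccs = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
print(mfccs.shape)  # (n_mfcc, number of frames)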
You can read .sph files via audioread with ffmpeg codecs.
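A minimal sketch of what that could look like; the file name is a placeholder, and note that audioread yields raw 16-bit PCM buffers that you have to convert to a NumPy array yourself:
import audioread
import numpy as np

with audioread.audio_open("file.sph") as f:
    rate = f.samplerate
    # Each buffer is raw 16-bit little-endian PCM bytes (channels interleaved)
    sig = np.concatenate([np.frombuffer(buf, dtype=np.int16) for buf in f])

print(rate, sig.shape)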