I am working on EEG signal data stored in a ".gdf" file for my university project. My aim is to open the file using Python. So far, I have been able to open it with the MNE package. The code is:
import os
import numpy as np
import mne
raw=mne.io.read_raw_gdf('1.gdf')
print(raw.info)
This gives the following output:
Extracting EDF parameters from C:\Users\Gamer\Desktop\1.gdf...
GDF file detected
Setting channel info structure...
Creating raw.info structure...
<Info | 7 non-empty values
bads: []
ch_names: AFz, F3, F1, Fz, F2, F4, FFC5h, FFC3h, FFC1h, FFC2h, FFC4h, ...
chs: 64 EEG
custom_ref_applied: False
highpass: 0.0 Hz
lowpass: 128.0 Hz
meas_date: 2017-04-04 12:50:01 UTC
nchan: 64
projs: []
sfreq: 256.0 Hz
>
Now, my questions are:
How can I get the values in tabular form?
How do I know the dimension of the dataset using Python?
Is there any way to convert the .gdf file into .csv file or any other format (like pandas dataframe)?
The data set description is available at http://bnci-horizon-2020.eu/database/data-sets/001-2019/dataset_description_v1-1.pdf
Welcome to Stack Overflow and MNE-Python! :)
How do I know the dimension of the dataset using Python?
If you print the raw object itself (instead of just its info attribute), you should see the dimensions. MNE-Python always stores raw data channels-first, so the data array has shape channels x samples.
How can I get the values in tabular form?
If you are fine with an array, you can get it with the raw.get_data() method. If you prefer a pandas DataFrame, you can get one with raw.to_data_frame(). Both are described in the MNE documentation.
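As a rough illustration of the channels x samples layout, here is a minimal sketch that uses a synthetic NumPy array in place of the array returned by raw.get_data(). The array contents, the shortened channel list and the output file name are made up for the example. It shows how to read off the dimensions and how to turn the array into a samples-by-channels table saved as CSV.

```python
import csv
import os
import tempfile

import numpy as np

# Stand-in for `data = raw.get_data()`, which has shape (n_channels, n_samples).
ch_names = ["AFz", "F3", "F1", "Fz"]  # shortened, illustrative channel list
sfreq = 256.0                          # sampling rate as reported in raw.info
rng = np.random.default_rng(0)
data = rng.standard_normal((len(ch_names), 512))  # synthetic values, not real EEG

# Dimensions of the dataset.
n_channels, n_samples = data.shape
print(f"{n_channels} channels x {n_samples} samples")

# One row per sample, one column per channel, plus a time column in seconds.
times = np.arange(n_samples) / sfreq
table = np.column_stack([times, data.T])

# Hypothetical output location, used only for this example.
csv_path = os.path.join(tempfile.mkdtemp(), "eeg_table.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time"] + ch_names)
    writer.writerows(table.tolist())
```

With a real recording, raw.to_data_frame() followed by the DataFrame's to_csv() method should give a similar table without the manual steps.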
But before getting the data array/table you might want to perform filtering (for example raw.filter(1, None)), annotate bad data segments, interpolate bad channels and perform ICA (the MNE documentation has a tutorial for each). In general, it is likely that the analysis you are trying to conduct is either implemented in MNE or easier to perform using MNE objects.
Make sure to see the many rich tutorials and examples in mne docs.
If you run into further problems, the MNE community now uses Discourse: https://mne.discourse.group/
I'm trying to write a script to simplify my everyday life in the lab. I operate one ThermoFisher / FEI scanning electron microscope and I save all my pictures in the TIFF format.
The microscope software is adding an extensive custom TiffTag (code 34682) containing all the microscope / image parameters.
In my script, I would like to open an image, perform some manipulations and then save the data in a new file, including the original FEI metadata. To do so, I would like to use a python script using the tifffile module.
I can open the image file and perform the needed manipulations without problems. Retrieving the FEI metadata from the input file is also working fine.
My plan was to save the output file with the imwrite function and use its optional extratags argument to transfer the original FEI metadata to the output file.
This is an extract of the tifffile documentation about the extratags:
extratags : sequence of tuples
Additional tags as [(code, dtype, count, value, writeonce)].
code : int
The TIFF tag Id.
dtype : int or str
Data type of items in 'value'. One of TIFF.DATATYPES.
count : int
Number of data values. Not used for string or bytes values.
value : sequence
'Count' values compatible with 'dtype'.
Bytes must contain count values of dtype packed as binary data.
writeonce : bool
If True, the tag is written to the first page of a series only.
Here is a snippet of my code.
my_extratags = [(input_tags['FEI_HELIOS'].code,
input_tags['FEI_HELIOS'].dtype,
input_tags['FEI_HELIOS'].count,
input_tags['FEI_HELIOS'].value, True)]
tifffile.imwrite('output.tif', data, extratags = my_extratags)
This code fails with a complaint that the value of the extra tag should be 7-bit ASCII encoded. This already looks very strange to me, because I haven't touched the metadata and am just copying it to the output file.
If I convert the metadata tag value in a string as below:
my_extratags = [(input_tags['FEI_HELIOS'].code,
input_tags['FEI_HELIOS'].dtype,
input_tags['FEI_HELIOS'].count,
str(input_tags['FEI_HELIOS'].value), True)]
tifffile.imwrite('output.tif', data, extratags = my_extratags)
the code runs and the image is saved. The 'FEI_HELIOS' tag is created in the output file, but it is empty!
Can you help me find what I am doing wrong?
I don't need to use tifffile, but I would prefer to use python rather than ImageJ because I have already several other python scripts and I would like to integrate this new one with the others.
Thanks a lot in advance!
toto
ps. I'm a frequent user of stackoverflow, but this is actually my first question!
In principle the approach is correct. However, tifffile parses the raw values of certain tags, including FEI_HELIOS, to dictionaries or other Python types. To get the raw tag value for rewriting, it needs to be read from file again. In these cases, use the internal TiffTag._astuple function to get an extratag compatible tuple of the tag, e.g.:
import tifffile

with tifffile.TiffFile('FEI_SEM.tif') as tif:
    assert tif.is_fei
    page = tif.pages[0]
    image = page.asarray()
    ...  # process image
    with tifffile.TiffWriter('copy1.tif') as out:
        out.write(
            image,
            photometric=page.photometric,
            compression=page.compression,
            planarconfig=page.planarconfig,
            rowsperstrip=page.rowsperstrip,
            resolution=(
                page.tags['XResolution'].value,
                page.tags['YResolution'].value,
                page.tags['ResolutionUnit'].value,
            ),
            extratags=[page.tags['FEI_HELIOS']._astuple()],
        )
This approach does not preserve Exif metadata, which tifffile cannot write.
Another approach, since FEI files seem to be written uncompressed, is to directly memory map the image data in the file to a numpy array and manipulate that array:
import shutil
import tifffile
shutil.copyfile('FEI_SEM.tif', 'copy2.tif')
image = tifffile.memmap('copy2.tif')
... # process image
image.flush()
Finally, consider tifftools for rewriting TIFF files where tifffile is currently failing, e.g. Exif metadata.
I am working with AVIRIS Classic data, which has an interleave of BIP (Band Interleaved by Pixel). I want to convert the interleave to BIL (Band Interleaved by Line). In the image processing language IDL you can do this with the function CONVERT_DOIT, but that relies on proprietary software. Are there any Python libraries that have a function to carry out this task?
I am completely unfamiliar with AVIRIS data and its processing, so there may be much simpler or better methods of accessing it of which I am unaware. However, I found and downloaded a smallish sample from the linked website as follows:
Reading the .hdr file (which is ASCII fortunately), I was able to work out that the data are signed 16-bit integers, band-interleaved-by-pixel of 224 bands, and 735 samples/line and 2017 lines. So, I can then load the image and process it with Numpy as follows:
import numpy as np
from PIL import Image
# Define datafile parameters
channels, samples, lines = 224, 735, 2017
# Load file and reshape
im = np.fromfile('f090710t01p00r11rdn_b_ort_img', dtype=np.int16).reshape(lines,samples,channels)
The data are signed integers in the range -32768...+32767, so if we add 32768 the data will be in the range 0..65535, and if we then multiply by 255/65535 we should get a viewable, but not radiometrically correct, image that proves the file was read correctly:
# That's kind of all - just do crude scaling to show we have it correctly
a = (im.astype(np.float64)+32768.0)*255.0/65535.0  # np.float was removed from recent NumPy releases
Now select band 0, and save (using PIL, but we could use OpenCV or tifffile):
Image.fromarray(a[:,:,0].astype(np.uint8)).save('result.png')
Presumably you can now arrange the data however you like with Numpy as we have read it successfully. So, say you want line 7 of band 4, you could use:
a[7,:,4]
Or line 0, all bands:
a[0,:,:]
If you wanted to make a false colour composite from 3 of the 224 bands, you can use np.dstack() like this - my bands were chosen at random:
FalseColour = np.dstack((a[...,37], a[...,164], a[...,200])).astype(np.uint8)
Image.fromarray(FalseColour).save('result.png')
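For the BIP-to-BIL conversion itself, a minimal sketch with NumPy could look like the following. It assumes the array is laid out as (line, sample, band) as in the reshape above, uses a small synthetic cube instead of the real scene, and the output file name in the comment is hypothetical.

```python
import numpy as np

# Small synthetic cube standing in for the AVIRIS scene
# (the real scene above is 2017 lines x 735 samples x 224 bands).
lines, samples, bands = 4, 5, 3
bip = np.arange(lines * samples * bands, dtype=np.int16).reshape(lines, samples, bands)

# BIP is ordered (line, sample, band); BIL is ordered (line, band, sample).
# Swapping the last two axes and forcing a contiguous copy puts the bytes
# in BIL order in memory.
bil = np.ascontiguousarray(bip.transpose(0, 2, 1))

# Writing the contiguous array would give a BIL file on disk:
# bil.tofile('output_bil.img')  # hypothetical file name
```

The same pixel is then reachable as bip[line, sample, band] or bil[line, band, sample]. If the converted file is written out, the interleave entry in the accompanying .hdr file would presumably need to be changed to match.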
Keywords: Python, AVIRIS, hyperspectral image processing, hyper-spectral, band-interleaved by pixel, by line, planar, interlace.
I am trying to convert multiple NetCDF4 files to GeoTIFF rasters while downsampling the data from daily frequency to monthly frequency (ignoring NaNs). Unfortunately, I get errors when I get to the step when I want to write the data into geotiffs.
Opening and downsampling the files with xarray was easy (its functionality is great compared to R, which is what I am used to working with): I kept just the one variable I wanted and calculated monthly means from the daily values. However, I got stuck trying to export/convert my results into multiple *.tif files.
Even tried converting it to multiple NetCDF3 files, since I can convert those to geotiff easily, but also failed.
Input data are 4018 ESA soil moisture ".nc4" files in daily frequency, covering the whole globe but incomplete at each step (only swaths for each day are filled with data, the rest is empty/NA), so I want to ignore NaNs for calculating the monthly means.
In total, these 4018 days tally up to 132 months (11 years) and my xarray dataset produced from the code below, seems to show that, as expected.
import xarray as xr
# opening the files into an array
mfdataDIR = 'C:/full_path_here/*.nc'
DS = xr.open_mfdataset(mfdataDIR)
# downsampling it from daily to monthly means while keeping attributes and ignoring NAs
monthly_data = DS.sm.resample(time="1M").mean(skipna= True, keep_attrs=True)
I got the following warning here: "Default reduction dimension will be changed to the grouped dimension after xarray 0.12. To silence this warning, pass dim=xarray.ALL_DIMS explicitly. skipna=skipna, allow_lazy=True, **kwargs)"
Not sure if it matters but results seem ok when I print the monthly_data xarray.
# Now, trying to convert to GeoTIFFs using Robin Wilson's rasterio_to_xarray (and vice-versa) script (full script can be found on link below).
import sys
sys.path.insert(0, 'C:/path_to_py_script/')
import xarray_to_rasterio as xarrast
xarrast.xarray_to_rasterio_by_band(monthly_data, 'C:/path_here/%s.tif', dim= 'time')
Now I got this error: "IndexError: tuple index out of range"
I think the mistake is in how I am trying to use it and not on the script although I can't see where.
The script by Robin Wilson, which would be ideal for my objectives, is available here: https://github.com/robintw/XArrayAndRasterio/blob/master/rasterio_to_xarray.py
I see errors traced to line 84:
82 if len(xa.shape) == 2:
83 count = 1
84 height = xa.shape[0]
85 width = xa.shape[1]
86 band_indicies = 1
87 else:
and line 122:
122 xarray_to_rasterio(data, filename)
So it seems that my time dimension, which I hoped would be treated as bands, is not what the script expects, and so it fails. I am apparently also getting the "filename" parameter wrong.
Don't know how... Can I use this script or modify it to do what I want?
(to save 132 ".tif" files corresponding to 132 levels of the time dim)
# Second option, not ideal but could still help me, if it hadn't also failed:
mDS = monthly_data.to_dataset()
paths = ['C:/path_here/%s.nc' ]
xr.save_mfdataset(mDS, paths, format='NETCDF3_CLASSIC')
got this error: "TypeError: save_mfdataset only supports writing Dataset objects, received type <class 'str'>"
I think my lack of python knowledge is hampering me as I keep getting errors in both procedures (xarray to GeoTIFF or xarrayDataset to NetCDF3 classic).
When I inspect the monthly_data I see what I expected: only the "sm" variable, all three dimensions and 132 time "values".
Can anyone help please?
I work in a lab where we acquire electrophysiological recordings (across 4 recording channels) using custom LabVIEW VIs, which save the acquired data as a .DAT (binary) file. The analysis of these files can then be continued in more LabVIEW VIs, but I would like to analyse all my recordings in Python. First, I need to walk through all of my files and convert them out of binary!
I have tried numpy.fromfile (filename), but the numbers I get out make no sense to me:
array([ 3.44316221e-282, 1.58456331e+029, 1.73060724e-077, ...,
4.15038967e+262, -1.56447362e-090, 1.80454329e+070])
To try and get further I looked up the .DAT header format to understand how to grab the bytes and translate them - how many bytes the data is saved in etc:
http://zone.ni.com/reference/en-XX/help/370859J-01/header/header/headerallghd_allgemein/
But I can't work out what to do. When I type "head filename" into terminal, below is what I see.
e.g. >> head 2014_04_10c1slice2_rest.DAT
DTL?
0???? ##????
empty array
PF?c ƀ????l?N?"1'.+?K13:13:27;0.00010000-08??t???DY
??N?t?x???D?
?uv?tgD?~I??
??N?t?x>?n??
????t?x>?n??
????t???D?
????t?x???D?
????t?x?~I??
????tgD>?n??
??N?tgD???D?
??N?t?x???D?
????t????DY
??N?t?x>?n??
??N?t????DY
?Kn$?t?????DY
??N?t??>?n??
??N?tgD>?n??
????t?x?~I??
????tgD>?n??
??N?tgD>?n??
??N?tgD???DY
????t?x???D?
????t???~I??
??N?tgD???DY
??N?tgD???D?
??N?t?~I??
??N?t?x???DY
??N?tF>?n??
??N?t?x??%Y
Any help or suggestions on what to do next would be really appreciated.
Thanks.
P.s. There is an old (broken) matlab file that seems to have been intended to convert these files. I think this could probably be helpful, but having spent a couple of days trying to understand it I am still stuck. http://www.mathworks.co.uk/matlabcentral/fileexchange/27195-load-labview-binary-data
Based on this link it looks like the following should do the trick:
import struct
with open('Measurement_4.bin', mode='rb') as binaryFile:
    (offset,) = struct.unpack('>d', binaryFile.read(8))  # first value as a big-endian double
Note that mode is set to 'rb' for binary.
With numpy you can directly do this as
import numpy
data = numpy.fromfile('Measurement_4.bin', dtype='>d')
Please note that if you are just using Python as an intermediate and want to go back to LabVIEW with the data, you should instead use the function Read from Binary file.vi to read the binary file using native LabVIEW.
DAT is a pretty generic suffix, not necessarily something pointing to a specific format. If I'm understanding correctly, that help section is for DIAdem, which may be completely unrelated to how your data is saved from LV.
What you want is this help section, which tells you how LV flattens data to be stored on disk - http://zone.ni.com/reference/en-XX/help/371361J-01/lvconcepts/flattened_data/
You will need to look at the LV code to see exactly what kind of data you're saving and how the write file function is configured (byte order, size prepending, etc.) and then use that document to translate it to the actual representation.
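As a sketch of what that translation can look like, the example below assumes one particular layout: a 1-D array of doubles written big-endian with a 4-byte signed integer length prepended. Whether your files actually use this layout depends on how the write function is configured in your VI, so treat the layout, the helper name read_flattened_doubles and the file name as assumptions. The example writes a tiny synthetic file first so it can be run end to end.

```python
import os
import struct
import tempfile


def read_flattened_doubles(path):
    """Read an int32 length (big-endian) followed by that many float64 values."""
    with open(path, "rb") as f:
        (count,) = struct.unpack(">i", f.read(4))   # prepended array size
        payload = f.read(8 * count)                  # 8 bytes per double
    return list(struct.unpack(f">{count}d", payload))


# Write a tiny synthetic file in the assumed layout (not real recording data).
values = [0.5, -1.25, 3.0]
path = os.path.join(tempfile.mkdtemp(), "example.dat")  # hypothetical file name
with open(path, "wb") as f:
    f.write(struct.pack(">i", len(values)))
    f.write(struct.pack(f">{len(values)}d", *values))

decoded = read_flattened_doubles(path)
print(decoded)
```

If the decoded numbers look like the nonsense values in the question, the assumed data type, byte order or header size is probably wrong for that file.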
I'm interested in using Spectral Python (SPy) to visualize and classify multiband raster GeoTIFF files (not hyperspectral data). Currently it appears that only the .lan and .gis file formats are readable.
I've tried to convert files to .lan with gdal_translate, but the image format is not supported (IOError: Unable to determine file type or type not supported).
Any idea how to use this library for a non-hyperspectral dataset?
Convert the GeoTIFF file to a compatible format (e.g. LAN). This can be done in one of two ways. From a system shell, use gdal_translate:
gdal_translate -of LAN file.tif file.lan
Or similar within Python:
from osgeo import gdal
src_fname = 'file.tif'
dst_fname = 'file.lan'
driver = gdal.GetDriverByName('LAN')
sds = gdal.Open(src_fname)
dst = driver.CreateCopy(dst_fname, sds)
dst = None # close dataset; the file can now be used by other processes
Note that the first method is actually better, as it also transfers other metadata, such as the spatial reference system and possibly other data. To correctly do the same in Python would require adding more lines of code.