Error reading an HDF file using the h5py package for Python

I want to extract data from HDF files that I downloaded from the MODIS website. A sample file is provided in the link. I am reading the HDF file with the following lines of code:
>>> import h5py
>>> f = h5py.File( 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf', 'r' )
The error I am getting:
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
f = h5py.File( 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf', 'r' )
File "C:\Python27\lib\site-packages\h5py\_hl\files.py", line 165, in __init__
fid = make_fid(name, mode, userblock_size, fapl)
File "C:\Python27\lib\site-packages\h5py\_hl\files.py", line 57, in make_fid
fid = h5f.open(name, h5f.ACC_RDONLY, fapl=fapl)
File "h5f.pyx", line 70, in h5py.h5f.open (h5py\h5f.c:1640)
IOError: unable to open file (File accessability: Unable to open file)
I have tried several other hdf files from different sources but I am getting the same error. What seems to be the fault here?

I think there could be two possible problems:
1) As the file extension is ".hdf", this may be an HDF4 file. HDF5 files normally have the ".hdf5" or ".h5" extension. h5py reads only HDF5, so it cannot open HDF4 files.
2) Perhaps you have to change the permissions of the file itself. If you are on a Linux machine, try: chmod +r file.hdf
You can also try to open your file with HDFView. This software is available on several platforms, and it lets you check the properties of the file very easily.
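As a rough way to tell the two formats apart without extra tools, the first bytes of the file can be compared with the known format signatures. This is only a sketch: the function name guess_hdf_version is made up for illustration, and it checks offset 0 only, whereas an HDF5 file with a user block can carry its signature at a later offset.

```python
# Signatures found at the start of HDF5 and HDF4 files.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"


def guess_hdf_version(path):
    """Return 'HDF5', 'HDF4', or 'unknown' based on the first bytes of the file.

    Hypothetical helper: it looks at offset 0 only, so an HDF5 file
    that starts with a user block is reported as 'unknown'.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head == HDF5_SIGNATURE:
        return "HDF5"
    if head[:4] == HDF4_SIGNATURE:
        return "HDF4"
    return "unknown"
```

If guess_hdf_version reports 'HDF4' for the MODIS file, h5py is the wrong tool for it and an HDF4 reader is needed instead.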

This sounds like a file permission error, or the file may not exist at that path. Maybe add some checks such as
import os
import h5py
hdf_file = 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf'
# Report a missing or unreadable file before h5py tries to open it
if not os.path.isfile(hdf_file):
    print('file %s not found' % hdf_file)
if not os.access(hdf_file, os.R_OK):
    print('file %s not readable' % hdf_file)
f = h5py.File(hdf_file, 'r')

I had the same issue and later found that my file was set to read-only, which for some reason stopped h5py from reading it. After changing the permissions to allow writing, I was able to read it. I am not sure why it was set up like this.

Related

Overwriting DICOMDIR file with pydicom

I am editing the meta data of multiple dicom images, divided between dicomdirs. I have successfully loaded the dicomdir, traversed it to find the images, edited their meta data and overwritten the original dicom files.
I then successfully overwrite the dicomdir file itself but when I try to open it (for example with Aeskulap) it gives an error message which says "No study or bad DICOMDIR".
When I try to rerun my code I get the following error messages:
Traceback (most recent call last):
File "dicom_run.py", line 28, in <module>
dicom_dir = read_dicomdir(list_files[0])
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom/filereader.py", line 883, in read_dicomdir
ds = dcmread(filename)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom/filereader.py", line 850, in dcmread
force=force, specific_tags=specific_tags)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom/filereader.py", line 741, in read_partial
is_implicit_VR, is_little_endian)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom/dicomdir.py", line 57, in __init__
self.parse_records()
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom/dicomdir.py", line 95, in parse_records
child = map_offset_to_record[child_offset]
KeyError: 504
When I access the individual dicom files within the directory they load just fine so the problem is how I'm overwriting the dicomdir.
I do so using the following code
import pydicom
from pydicom.filereader import read_dicomdir
# Load dicomdir
dicom_dir = read_dicomdir(<path_to_dicomdir>)
# Here I just traverse the dicom_dir object
# as is outlined here:
# https://pydicom.github.io/pydicom/stable/auto_examples/input_output/plot_read_dicom_directory.html
# Then I (successfully) overwrite the dicomdir with
dicom_dir.save_as(<path_to_dicomdir>)
I have also tried to use the write_file and write_dataset functions as detailed here:
https://pydicom.github.io/pydicom/stable/api_ref.html#module-pydicom.filewriter
again unsuccessfully. I have a backup of the original dicomdir file and when I replace that everything works fine again (and the meta data of each image has been edited). I'm completely lost here.
Edit:
https://github.com/pydicom/pydicom/issues/918
I stumbled upon this. Guess I'll have to do it another way.
I used dcmmkdir and wrote a small bash script that entered all the folders and ran dcmmkdir +r which successfully overwrote the dicomdir files.

OSError: file not found

I'm trying to write a script that needs to rename (in the script itself, not in the folder) some .txt files to be able to use them in a loop, enumerating them.
I decided to use a dictionary, something like this:
import os
import fnmatch
dsc = {}
for filename in os.listdir('./texto'):
    if fnmatch.fnmatch(filename, 'dsc_hydra*.txt'):
        dsc[filename[:6]] = filename
print(dsc)
print(dsc['dsc_hydra1'])
The print(...) calls are just there to check that everything is going well.
I need to rename them because I'm using them in future functions and I don't want to address them using all that path stuff, something like:
IFOV = gi.IFOV_generic(gmatOUTsat1, matrixINPUTsat1, dsc['dsc_hydra1'], 'ifovfileMST.json', k_lim, height, width)
Using dsc['dsc_hydra1'], I get this error:
Traceback (most recent call last):
File "mainSMART_MST.py", line 429, in <module>
IFOV1= gi.IFOV_generic(gmatOUTsat1,matrixINPUTsat1,dsc['dsc_hydra1'],'ifovfileMST.jso',k_lim, height, width)
File "/home/alumno/Escritorio/HDD_Nuevo/HO(PY)/src/generateIFOV.py", line 49, in IFOV_generic
DCM11,DCM12,DCM13,DCM21,DCM22,DCM23,DCM31,DCM32,DCM33 = np.loadtxt(gmatDCM,unpack=True,skiprows = 2,dtype = float)
File "/home/alumno/.local/lib/python3.5/site-packages/numpy/lib/npyio.py", line 962, in loadtxt
fh = np.lib._datasource.open(fname, 'rt', encoding=encoding)
File "/home/alumno/.local/lib/python3.5/site-packages/numpy/lib/_datasource.py", line 266, in open
return ds.open(path, mode, encoding=encoding, newline=newline)
File "/home/alumno/.local/lib/python3.5/site-packages/numpy/lib/_datasource.py", line 624, in open
raise IOError("%s not found." % path)
OSError: dsc_hydra1.txt not found.
I've already checked the folder and the file is there, why do I keep getting this error?
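I had this same issue. Python cannot locate the .txt file because you're running the script from the wrong directory. The dictionary stores only the bare file name, so it is looked up relative to the current working directory rather than inside ./texto. Either run the script from the directory that contains the file, or pass the full path. Hope this helps.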
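One way to avoid depending on the working directory is to store the joined path in the dictionary instead of the bare file name. The sketch below assumes the files sit directly inside the folder being listed; the helper name index_text_files and the choice of the extension-less file name as the key are illustrative, not part of the original script.

```python
import fnmatch
import os


def index_text_files(folder, pattern="dsc_hydra*.txt"):
    """Map each matching file's name (without extension) to its path inside folder.

    Hypothetical helper: the key is the file name minus '.txt',
    and the value is folder joined with the file name.
    """
    paths = {}
    for filename in sorted(os.listdir(folder)):
        if fnmatch.fnmatch(filename, pattern):
            key = os.path.splitext(filename)[0]
            paths[key] = os.path.join(folder, filename)
    return paths
```

With dsc = index_text_files('./texto'), the value of dsc['dsc_hydra1'] includes the folder, so a function such as np.loadtxt can open it as long as the script is run from the directory that contains texto.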
I had the same problem. In my case, some lines inside the .txt file had stray spaces at the start or end. Watch out for those spaces! For example, inside file.txt (each space shown as -):
-365-
string1-
string2
-string3
If you remove all the spaces (-), it should work!

gensim file not found error

I am executing the following line:
id2word = gensim.corpora.Dictionary.load_from_text('wiki_en_wordids.txt')
This code is available at "https://radimrehurek.com/gensim/wiki.html". I downloaded the wikipedia corpus and generated the required files and wiki_en_wordids.txt is one of those files. This file is available in the following location:
~/gensim/results/wiki_en
So when I execute the code mentioned above, I get the following error:
Traceback (most recent call last):
File "~\Python\Python36-32\temp.py", line 5, in <module>
id2word = gensim.corpora.Dictionary.load_from_text('wiki_en_wordids.txt')
File "~\Python\Python36-32\lib\site-packages\gensim\corpora\dictionary.py", line 344, in load_from_text
with utils.smart_open(fname) as f:
File "~\Python\Python36-32\lib\site-packages\smart_open\smart_open_lib.py", line 129, in smart_open
return file_smart_open(parsed_uri.uri_path, mode)
File "~\Python\Python36-32\lib\site-packages\smart_open\smart_open_lib.py", line 613, in file_smart_open
return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'wiki_en_wordids.txt'
Even though the file is available in the required location I get that error. Should I place the file in any other location? How do I determine what the right location is?
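A relative path such as 'wiki_en_wordids.txt' is resolved against the current working directory of the Python process, not against the folder where the file was generated. It only works when the script is run from the directory that contains the file; here the file is in ~/gensim/results/wiki_en, so pass the full path instead.
One way to handle this is to build the absolute path -
import os
import gensim
# The folder below is the location given in the question
fname = os.path.join(os.path.expanduser('~'), 'gensim', 'results', 'wiki_en', 'wiki_en_wordids.txt')
id2word = gensim.corpora.Dictionary.load_from_text(fname)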

How do I turn my .tar.gz file into a file-like object for shutil.copyfileobj?

My goal is to extract a file out of a .tar.gz file without also extracting the sub directories that precede the desired file. I am trying to model my method on this question. I already asked a question of my own, but the answer I thought would work didn't work fully.
In short, shutil.copyfileobj isn't copying the contents of my file.
My code is now:
import os
import shutil
import tarfile
import gzip
with tarfile.open('RTLog_20150425T152948.gz', 'r:*') as tar:
    for member in tar.getmembers():
        filename = os.path.basename(member.name)
        if not filename:
            continue
        source = tar.fileobj
        target = open('out', "wb")
        shutil.copyfileobj(source, target)
Upon running this code, the file out was successfully created; however, the file was empty. I know that the file I wanted to extract does, in fact, have lots of information (approximately 450 kb). A print(member.size) returns 1564197.
My attempts to solve this were unsuccessful. A print(type(tar.fileobj)) told me that tar.fileobj is a <gzip _io.BufferedReader name='RTLog_20150425T152948.gz' 0x3669710>.
Therefore I tried changing source to: source = gzip.open(tar.fileobj) but this raised the following error:
Traceback (most recent call last):
File "C:\Users\dzhao\Desktop\123456\444444\blah.py", line 15, in <module>
shutil.copyfileobj(source, target)
File "C:\Python34\lib\shutil.py", line 67, in copyfileobj
buf = fsrc.read(length)
File "C:\Python34\lib\gzip.py", line 365, in read
if not self._read(readsize):
File "C:\Python34\lib\gzip.py", line 433, in _read
if not self._read_gzip_header():
File "C:\Python34\lib\gzip.py", line 297, in _read_gzip_header
raise OSError('Not a gzipped file')
OSError: Not a gzipped file
Why isn't shutil.copyfileobj actually copying the contents of the file in the .tar.gz?
fileobj isn't a documented property of TarFile. It's probably an internal object used to represent the whole tar file, not something specific to the current file.
Use TarFile.extractfile() to get a file-like object for a specific member:
…
source = tar.extractfile(member)
target = open("out", "wb")
shutil.copyfileobj(source, target)
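A fuller sketch of the same idea, written as a function so it can be tried on any archive. The name extract_flat and the out_dir parameter are made up for illustration; the sketch assumes Python 3 and that writing every regular file into one folder under its base name is acceptable (files with the same base name would overwrite each other).

```python
import os
import shutil
import tarfile


def extract_flat(archive_path, out_dir):
    """Copy every regular file in the archive into out_dir, dropping sub directories.

    Hypothetical helper: returns the list of paths it wrote.
    """
    written = []
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Skip directories, links and other non-file members.
            if not member.isfile():
                continue
            filename = os.path.basename(member.name)
            if not filename:
                continue
            # extractfile() returns a file-like object for this member only.
            source = tar.extractfile(member)
            target_path = os.path.join(out_dir, filename)
            with source, open(target_path, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(target_path)
    return written
```

Opening the target in a with block also makes sure the output is flushed and closed before the next member is processed.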

Python BZ2 IOError: invalid data stream

Traceback (most recent call last):
File "TTRC_main.py", line 309, in <module>
updater.start()
File "TTRC_main.py", line 36, in start
newFileData = bz2.BZ2File("C:/Program Files (x86)/Toontown Rewritten/temp/phase_7.mf.bz2"," rb").read()
IOError: invalid data stream
The code that retrieves the file giving me this error is:
newFileComp = urllib.URLopener()
newFileComp.retrieve("http://kcmo-1.download.toontownrewritten.com/content/phase_7.mf.bz2", "C:/Program Files (x86)/Toontown Rewritten/temp/phase_7.mf.bz2")
What do I do to fix this error? It's not really descriptive (to me).
Could the issue be occurring because of the extra space in the file mode? -
newFileData = bz2.BZ2File("C:/Program Files (x86)/Toontown Rewritten/temp/phase_7.mf.bz2"," rb").read()
Try this -
newFileData = bz2.BZ2File("C:/Program Files (x86)/Toontown Rewritten/temp/phase_7.mf.bz2","rb").read()
For me the issue was that the files were not in .bz2 format.
Make sure the file is in bz2 format.
Make sure the read and write modes match: "r" with "w", or "rb" with "wb".
Like Anand said, no space in "rb".
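To check the first point before decompressing, the start of the file can be compared with the bz2 signature, which is the three bytes "BZh". This is a minimal sketch for Python 3 (the question itself uses Python 2); the helper name read_bz2 is made up for illustration.

```python
import bz2


def read_bz2(path):
    """Return the decompressed bytes of path, or raise ValueError if it is not bz2 data.

    Hypothetical helper: it only checks the 3-byte signature up front,
    so a truncated or corrupted download can still fail during read().
    """
    with open(path, "rb") as raw:
        magic = raw.read(3)
    if magic != b"BZh":
        raise ValueError("%s does not start with the bz2 signature" % path)
    with bz2.open(path, "rb") as compressed:
        return compressed.read()
```

A downloaded file that fails this check is often an error page or an incomplete transfer rather than a real archive.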
