I am editing the metadata of multiple DICOM images, split across several DICOMDIRs. I have successfully loaded a DICOMDIR, traversed it to find the images, edited their metadata and overwritten the original DICOM files.
I then overwrite the DICOMDIR file itself without any apparent error, but when I try to open it (for example with Aeskulap) it reports "No study or bad DICOMDIR".
When I rerun my code I get the following traceback:
Traceback (most recent call last):
File "dicom_run.py", line 28, in <module>
dicom_dir = read_dicomdir(list_files[0])
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom
/filereader.py", line 883, in read_dicomdir
ds = dcmread(filename)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom
/filereader.py", line 850, in dcmread
force=force, specific_tags=specific_tags)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom
/filereader.py", line 741, in read_partial
is_implicit_VR, is_little_endian)
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom
/dicomdir.py", line 57, in __init__
self.parse_records()
File "/home/user/anaconda3/lib/python3.7/site-packages/pydicom
/dicomdir.py", line 95, in parse_records
child = map_offset_to_record[child_offset]
KeyError: 504
When I open the individual DICOM files within the directory they load just fine, so the problem must be in how I'm overwriting the DICOMDIR.
I do so with the following code:
import pydicom
from pydicom.filereader import read_dicomdir
# Load dicomdir
dicom_dir = read_dicomdir(<path_to_dicomdir>)
# Here I just traverse the dicom_dir object
# as is outlined here:
# https://pydicom.github.io/pydicom/stable/auto_examples/input_output/plot_read_dicom_directory.html
# Then I (successfully) overwrite the dicomdir with
dicom_dir.save_as(<path_to_dicomdir>)
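For completeness, the traversal and per-image edits mentioned in the comment above roughly look like the following sketch, modeled on the linked pydicom example (the path and the PatientName change are placeholders, not my actual edits):
import os
import pydicom
from pydicom.filereader import read_dicomdir

dicomdir_path = "path/to/DICOMDIR"  # placeholder
dicom_dir = read_dicomdir(dicomdir_path)
base_dir = os.path.dirname(dicomdir_path)

# Walk patient -> study -> series -> image records, as in the pydicom example
for patient_record in dicom_dir.patient_records:
    for study_record in patient_record.children:
        for series_record in study_record.children:
            for image_record in series_record.children:
                # ReferencedFileID holds the path components of the image file
                image_path = os.path.join(base_dir, *image_record.ReferencedFileID)
                ds = pydicom.dcmread(image_path)
                ds.PatientName = "ANONYMOUS"  # placeholder metadata edit
                ds.save_as(image_path)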
I have also tried to use the write_file and write_dataset functions as detailed here:
https://pydicom.github.io/pydicom/stable/api_ref.html#module-pydicom.filewriter
again without success. I have a backup of the original DICOMDIR file, and when I restore it everything works fine again (and the metadata of each image remains edited). I'm completely lost here.
Edit:
https://github.com/pydicom/pydicom/issues/918
I stumbled upon this. Guess I'll have to do it another way.
In the end I used DCMTK's dcmmkdir: I wrote a small bash script that entered each folder and ran dcmmkdir +r, which successfully regenerated the DICOMDIR files.
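The script essentially did the equivalent of this Python sketch (assumptions: DCMTK's dcmmkdir is on the PATH, and each subfolder of root_dir, a placeholder, contains one DICOM tree):
import os
import subprocess

root_dir = "path/to/patient/folders"  # placeholder for the directory holding the DICOM folders

for entry in sorted(os.listdir(root_dir)):
    folder = os.path.join(root_dir, entry)
    if os.path.isdir(folder):
        # +r makes dcmmkdir recurse into subdirectories when rebuilding the DICOMDIR
        subprocess.run(["dcmmkdir", "+r"], cwd=folder, check=True)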
I am trying to build an API around an MLflow model.
The odd thing is that it works from one location on my PC but not from another; the only reason for moving it was that I wanted to restructure my repo.
So this simple code
from mlflow.pyfunc import load_model
MODEL_ARTIFACT_PATH = "./model/model_name/"
MODEL = load_model(MODEL_ARTIFACT_PATH)
now fails with
ERROR: Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 540, in lifespan
async for item in self.lifespan_context(app):
File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 481, in default_lifespan
await self.startup()
File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 516, in startup
await handler()
File "/code/./app/main.py", line 32, in startup_load_model
MODEL = load_model(MODEL_ARTIFACT_PATH)
File "/usr/local/lib/python3.8/dist-packages/mlflow/pyfunc/__init__.py", line 733, in load_model
model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
File "/usr/local/lib/python3.8/dist-packages/mlflow/spark.py", line 737, in _load_pyfunc
return _PyFuncModelWrapper(spark, _load_model(model_uri=path))
File "/usr/local/lib/python3.8/dist-packages/mlflow/spark.py", line 656, in _load_model
return PipelineModel.load(model_uri)
File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/util.py", line 332, in load
return cls.read().load(path)
File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/pipeline.py", line 258, in load
return JavaMLReader(self.cls).load(path)
File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/util.py", line 282, in load
java_obj = self._jread.load(path)
File "/usr/local/lib/python3.8/dist-packages/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/usr/local/lib/python3.8/dist-packages/pyspark/sql/utils.py", line 117, in deco
raise converted from None
pyspark.sql.utils.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.
The model artifacts are already downloaded to the /model folder, whose structure is shown in the screenshot.
The load_model call is in the main.py file.
As I mentioned, it works from another directory, but there is no reference to any absolute paths. Also, I have made sure that my package references are identical, e.g. I have pinned them all down:
# Model
mlflow==1.25.1
protobuf==3.20.1
pyspark==3.2.1
scipy==1.6.2
six==1.15.0
Also, the same Dockerfile is used in both places, which, among other things, ensures that the final folder structure is the same:
# ... other steps
COPY ./app /code/app
COPY ./model /code/model
What can explain it throwing this exception in one location, while in the other location on my PC it works with the same model artifacts?
Since it uses the load_model function, it should be able to read the Parquet files, shouldn't it?
Ask me anything and I can explain further.
EDIT1: I have debugged this a little more inside the docker container, and it seems the Parquet files in the itemFactors folder (listed in my screenshot above) are not getting copied into my image, even though I have a COPY command that copies everything under the model folder. It copies the _started, _committed and _SUCCESS files, just not the Parquet files. Does anyone know why that would be? I do NOT have a .dockerignore file, so why are those files skipped during the copy?
I found the problem. As I wrote in EDIT1 of my post, further observation showed that the Parquet files were missing in the docker container. That was strange because I was copying the entire folder in my Dockerfile.
I then realized that I was hitting the problem mentioned here: file paths exceeding 260 characters silently fail and do not get copied into the docker container. This was really frustrating because nothing failed during the build, and at run time it gave me that cryptic "Unable to infer schema for Parquet" error, essentially because the Parquet files had never been copied over during docker build.
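For anyone running into the same thing, a quick check that would have caught this before the build is to scan the build context for over-long paths. A rough sketch (260 is the Windows limit, and ./model is the folder from my case):
import os

MAX_PATH = 260  # the Windows path-length limit that caused the silent skips

# Walk the build-context folder and flag any file whose absolute path is too long
for dirpath, _, filenames in os.walk("./model"):
    for name in filenames:
        full_path = os.path.abspath(os.path.join(dirpath, name))
        if len(full_path) > MAX_PATH:
            print(f"{len(full_path)} chars: {full_path}")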
I'm trying to add keywords to the IPTC data in a JPG file and failing miserably. I'm able to read the keywords using the iptcinfo3 library and, seemingly, append a keyword to the list of current keywords, but I fail when trying to write those keywords back to the JPG file, if not sooner. The error message is a bit unclear to me and may actually refer to the appending of the new keyword (although a print statement seems to indicate that it took).
I've tried three different metadata libraries (there doesn't seem to be one standard) and this is the furthest I've gotten with any of them (failing to even install one and not being able to get a second one to run). This seems so basic but I can't figure it out and haven't been able to adapt the few other code examples I've seen online to work, including iptcinfo3's example code fragment.
The current error message is:
| => pipenv run python editMetadata.py
WARNING: problems with charset recognition (b'\x1b')
[b'Gus']
[b'Gus', b'frog']
Traceback (most recent call last):
File "editMetadata.py", line 22, in <module>
info.save_as('Gus2.jpg')
File "/Users/Scott/.local/share/virtualenvs/editPhotoMetadata-tx0JAOmI/lib/python3.7/site-packages/iptcinfo3.py", line 635, in save_as
jpeg_parts = jpeg_collect_file_parts(fh)
File "/Users/Scott/.local/share/virtualenvs/editPhotoMetadata-tx0JAOmI/lib/python3.7/site-packages/iptcinfo3.py", line 324, in jpeg_collect_file_parts
adobeParts = collect_adobe_parts(partdata)
File "/Users/Scott/.local/share/virtualenvs/editPhotoMetadata-tx0JAOmI/lib/python3.7/site-packages/iptcinfo3.py", line 433, in collect_adobe_parts
out = [''.join(out)]
TypeError: sequence item 0: expected str instance, bytes found
Code:
from iptcinfo3 import IPTCInfo
import os
# Create new info object
info = IPTCInfo('Gus.jpg')
# Print list of keywords
print(info['keywords'])
# Append the keyword I want to add
info['keywords'].append(b'frog')
# Print to test keyword has been added
print(info['keywords'])
# Save new info to file
info.save()
info.save_as('Gus2.jpg')
Instead of appending, use assignment with "=":
from iptcinfo3 import IPTCInfo
info = IPTCInfo('Gus.jpg')
print(info['keywords'])
# add keyword
info['keywords'] = ['new keyword']
info.save()
info.save_as('Gus_2.jpg')
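Note that assigning replaces the whole keyword list. If you want to keep the existing keywords and just add one, you can build the full list first and then assign it; a small, untested sketch along the same lines:
from iptcinfo3 import IPTCInfo

info = IPTCInfo('Gus.jpg')
keywords = list(info['keywords'])  # copy the existing keywords
keywords.append('frog')            # add the new keyword
info['keywords'] = keywords        # assign the complete list back
info.save_as('Gus_2.jpg')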
I have the same error. It seems to be an issue with the save that depends on the file.
from iptcinfo3 import IPTCInfo
info = IPTCInfo('image.jpg', force=True)
info.save()
Which gives me the same error.
WARNING: problems with charset recognition (b'\x1b')
WARNING: problems with charset recognition (b'\x1b')
Traceback (most recent call last):
File "./searchimages.py", line 123, in <module>
main(sys.argv[1:])
File "./searchimages.py", line 119, in main
find_photos(str(sys.argv[1]))
File "./searchimages.py", line 46, in find_photos
write_keywords(image, current_keywords, new_keywords)
File "./searchimages.py", line 109, in write_keywords
info.save_as('out.jpg')
File "/usr/local/lib/python3.7/site-packages/iptcinfo3.py", line 635, in save_as
jpeg_parts = jpeg_collect_file_parts(fh)
File "/usr/local/lib/python3.7/site-packages/iptcinfo3.py", line 324, in jpeg_collect_file_parts
adobeParts = collect_adobe_parts(partdata)
File "/usr/local/lib/python3.7/site-packages/iptcinfo3.py", line 433, in collect_adobe_parts
out = [''.join(out)]
TypeError: sequence item 0: expected str instance, bytes found
I am trying to create an animated GIF from a series of heat maps with HoloViews.
I need to do this in a Python script, i.e. specifically not in a Jupyter notebook.
When saving the image, Python throws an error because it cannot create a temporary file in the temp folder of the current user (this is under Windows). This happens regardless of the user, even when I run Python as admin.
When I stop in the debugger and change the temp-file path to some other place, e.g. the Desktop, that works, but the resulting holo.gif in the working directory is empty (0 bytes). The temporary GIF, though, is correctly animated, so I guess the code is basically OK.
[Edit: Not so sure anymore. I ran this overnight on 26,531 heat maps, each of which consisted of a 5x5 grid. The process did not finish (i.e. did not hit the breakpoint at Image.py line 1966). Is there a way to do what I want that is less painfully slow?]
Answers to similar problems on Stack Overflow point to permission problems (but what kind of permission problem could that be if it doesn't even work for an admin?) and suggest saving to another location, which is impossible here as I have no control over where matplotlib will try to create its temporary files.
The problem is specific to GIFs; I can create *.png or *.html output without error. (AFAIK, the difference is that GIF creation uses ImageMagick.)
Here's the code (construction of underlying heat map data left out):
import holoviews as hv
hv.extension('matplotlib')
renderer = hv.renderer('matplotlib')
renderer.fps = 3
heatMapDict = {
    k: hv.HeatMap(measurements[k].sensors) for k in range(len(measurements))
}
holo = hv.HoloMap(heatMapDict, kdims='index')
renderer.save(holo, 'holo', fmt='gif')
And the traceback:
INFO:matplotlib.animation:Animation.save using <class 'matplotlib.animation.PillowWriter'>
Traceback (most recent call last):
File "cm3.py", line 69, in <module>
renderer.save(holo, 'holo', fmt='gif')
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\holoviews\plotting\renderer.py", line 554, in save
rendered = self_or_cls(plot, fmt)
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\holoviews\plotting\mpl\renderer.py", line 108, in __call__
data = self._figure_data(plot, fmt, **({'dpi':self.dpi} if self.dpi else {}))
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\holoviews\plotting\mpl\renderer.py", line 196, in _figure_data
data = self._anim_data(anim, fmt)
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\holoviews\plotting\mpl\renderer.py", line 246, in _anim_data
anim.save(f.name, writer=writer, **anim_kwargs)
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\matplotlib\animation.py", line 1174, in save
writer.grab_frame(**savefig_kwargs)
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\contextlib.py", line 119, in __exit__
next(self.gen)
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\matplotlib\animation.py", line 232, in saving
self.finish()
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\matplotlib\animation.py", line 583, in finish
duration=int(1000 / self.fps))
File "C:\Users\y2046\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\Image.py", line 1966, in save
fp = builtins.open(filename, "w+b")
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\y2046\\AppData\\Local\\Temp\\tmp4im5ozo8.gif'
Addendum:
I'm coming to think that this is not a permission problem after all. Perhaps it has to do with reentrancy and file locking under Windows? The Python process can in fact create files in the temp directory, as shown by inserting the following test code before the call to renderer.save():
import os
import builtins
filename = 'C:\\Users\\y2046\\AppData\\Local\\Temp\\test.txt'
fp = builtins.open(filename, "w+b")
try:
    fp.write("first".encode('utf-8'))
finally:
    fp.close()
os.remove(filename)
I should test this under Linux. If it works there, there must be a bug in the Pillow writer.
It looks like there is something broken with HoloViews. I have opened issue #3151 with them.
I am trying to read a Minecraft world from the filesystem with Python, i.e. the .mca region/anvil files, using the NBT 1.4.1 module (Named Binary Tag reader/writer), which is supposed to read the NBT format used by Minecraft. It works fine for files such as level.dat, but throws an error for region files such as r.0.0.mca.
Edit: I am referring to the auto-generated world files that Minecraft stores in the .minecraft/saves/"MyWorld"/ folder, such as level.dat (which works), and the .mca files stored in the .minecraft/saves/"MyWorld"/region/ folder, such as r.0.0.mca, which don't work. I uploaded two sample files from one of my worlds.
Code:
from nbt import nbt
level_file = nbt.NBTFile("level.dat", "rb") # works
region_file = nbt.NBTFile("r.0.0.mca", "rb")# does not work
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 508, in __init__
self.parse_file()
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 532, in parse_file
type = TAG_Byte(buffer=self.file)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 85, in __init__
self._parse_buffer(buffer)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 90, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Any suggestions on how to get this working?
r.0.0.mca is most definitely not compressed. About 80% of the bytes are zeros.
It turns out that the NBT library only supports .mcr region files, which were replaced by .mca files about six years ago. However, mcedit is written in Python and supports those files. Due to the changes in the Minecraft save format the interpretation of the content needs to be adjusted, but the files can be read successfully.
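For anyone who wants to peek into an .mca file without mcedit: the anvil region format is just a container, so you can locate a chunk via the 8 KiB header and hand the decompressed payload to the NBT parser yourself. A rough sketch, based on my reading of the anvil format description and not thoroughly tested:
import io
import gzip
import zlib
from nbt import nbt

with open("r.0.0.mca", "rb") as f:
    data = f.read()

# The first 4096 bytes hold 1024 location entries:
# a 3-byte sector offset and a 1-byte sector count per chunk.
for i in range(1024):
    entry = data[i * 4:(i + 1) * 4]
    offset_sectors = int.from_bytes(entry[:3], "big")
    if offset_sectors == 0:
        continue  # this chunk has not been generated yet
    start = offset_sectors * 4096
    length = int.from_bytes(data[start:start + 4], "big")
    compression = data[start + 4]  # 2 = zlib (usual), 1 = gzip
    payload = data[start + 5:start + 4 + length]
    if compression == 2:
        raw = zlib.decompress(payload)
    elif compression == 1:
        raw = gzip.decompress(payload)
    else:
        raise ValueError("unexpected compression type %d" % compression)
    # Parse the already-decompressed chunk data as NBT
    chunk = nbt.NBTFile(buffer=io.BytesIO(raw))
    print(chunk.pretty_tree())
    break  # just show the first stored chunk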
I want to extract data from HDF files that I downloaded from the MODIS website. A sample file is provided in the link. I am reading the HDF file with the following lines of code:
>>> import h5py
>>> f = h5py.File( 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf', 'r' )
The error I am getting:
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
f = h5py.File( 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf', 'r' )
File "C:\Python27\lib\site-packages\h5py\_hl\files.py", line 165, in __init__
fid = make_fid(name, mode, userblock_size, fapl)
File "C:\Python27\lib\site-packages\h5py\_hl\files.py", line 57, in make_fid
fid = h5f.open(name, h5f.ACC_RDONLY, fapl=fapl)
File "h5f.pyx", line 70, in h5py.h5f.open (h5py\h5f.c:1640)
IOError: unable to open file (File accessability: Unable to open file)
I have tried several other HDF files from different sources but I get the same error. What could be the fault here?
I think there could be two possible problems:
1) As the file extension is "hdf", this may be an HDF4 file. HDF5 files normally have an ".hdf5" or ".h5" extension. I am not sure whether h5py is able to read HDF4 files.
2) Perhaps you have to change the permissions on the file itself. If you are on a Linux machine, try: chmod +r file.hdf
You can also try to open your file with HDFView. This software is available on several platforms, and you can check the properties of the files very easily with it.
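If it does turn out to be HDF4 (MODIS products such as MYD08_M3 typically are), one possibility is to read it with the pyhdf package instead of h5py. A small sketch, assuming pyhdf is installed; the dataset name is only an example, pick a real one from the printed list:
from pyhdf.SD import SD, SDC

hdf_file = 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf'
f = SD(hdf_file, SDC.READ)

print(f.datasets().keys())             # list the scientific datasets in the file
sds = f.select('Cloud_Fraction_Mean')  # example dataset name, replace with one from the list
data = sds[:]                          # read the dataset into a numpy array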
This sounds like a file permission error, or possibly the file not existing at all. Maybe add some checks, such as:
import os
import h5py

hdf_file = 'MYD08_M3.A2002182.051.2008334061251.psgscs_000500751197.hdf'
if not os.path.isfile(hdf_file):
    print 'file %s not found' % hdf_file
if not os.access(hdf_file, os.R_OK):
    print 'file %s not readable' % hdf_file
f = h5py.File(hdf_file, 'r')
I had the same issue and later identified that my file was set to "read-only", which for some reason stopped h5py from reading it. After changing the permission to allow writing, I was able to read it. Not sure why it was set up like this.