Reading .dcm files from a nested directory - python

This is the form of my nested directory:
/data/patient_level/study_level/series_level/
For each patient_level folder, I need to read the ".dcm" files from the corresponding "series_level" folder.
How can I access the ".dcm" file in the "series_level" folders?
I need 3 features from a DICOM object.
This is my source code:
import dicom
record = dicom.read_file("/data/patient_level/study_level/series_level/000001.dcm")
doc = {"PatientID": record.PatientID, "Manufacturer": record.Manufacturer, "SeriesTime": record.SeriesTime}
Then I will insert this doc into MongoDB.
Any suggestions are appreciated.
Thanks in advance.

It is not quite clear what problem you are trying to solve, but if all you want is a list of all the .dcm files under the data directory, you can use pathlib.Path():
from pathlib import Path
data = Path('/data')
list(data.rglob('*.dcm'))
To split a file path into its components, you can use something like this:
dcm_file = '/patient_level/study_level/series_level/000001.dcm'
_, patient_level, study_level, series_level, filename = dcm_file.split('/')
Which would give you patient_level, study_level, series_level, and filename from dcm_file.
But it would be better to stick with Path() and its methods, like this:
dcm_file = Path('/patient_level/study_level/series_level/000001.dcm')
dcm_file.parts
Which would give you something like this:
('/', 'patient_level', 'study_level', 'series_level', '000001.dcm')
Those are just starting points anyway.
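Putting the two ideas together, here is a minimal sketch that walks the tree and labels each .dcm file with its patient, study, and series folder. The names iter_series_files and build_doc are hypothetical helpers, not part of any library. It also assumes the current pydicom package (the question's import dicom is the pre-1.0 name of the same project) and that every file sits exactly three folders below the root.

```python
from pathlib import Path


def iter_series_files(root):
    """Yield (patient, study, series, path) for each .dcm file under root.

    Assumes the layout root/patient_level/study_level/series_level/file.dcm;
    files at any other depth are skipped.
    """
    root = Path(root)
    for path in sorted(root.rglob("*.dcm")):
        parts = path.relative_to(root).parts
        if len(parts) == 4:  # patient / study / series / filename
            patient, study, series, _ = parts
            yield patient, study, series, path


def build_doc(path):
    """Read one DICOM file and return the three fields from the question."""
    # Third-party package, assumed installed; imported here so that
    # iter_series_files works without it.
    import pydicom

    # stop_before_pixels skips the image data, which is not needed here.
    record = pydicom.dcmread(str(path), stop_before_pixels=True)
    return {
        "PatientID": record.PatientID,
        "Manufacturer": record.Manufacturer,
        "SeriesTime": record.SeriesTime,
    }


# Example use (not run here):
# for patient, study, series, path in iter_series_files("/data"):
#     doc = build_doc(path)  # then insert doc into MongoDB
```

If a file lacks one of the three attributes, record.PatientID and friends raise AttributeError, so real data may need a fallback value.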

Related

Python - Read txt files from Network shared drive

I am trying to read in data from text files that are moved into a network share drive over a VPN. The overall intent is to be able to loop through the files with yesterday's date (either in the file name, or by the Modified Date) and extract the pipe delimited data separated by "|" and concat it into a pandas df. The issue I am having is actually being able to read files from the network drive. So far I have only been able to figure out how to use os.listdir to identify the file names, but not actually read them. Anyone have any ideas?
This is what I've tried so far that has actually started to pan out, with os.listdir correctly seeing the network folder and the files inside. But how would I call the actual files inside (filtered by date or not) to get it to work in the loop?
import pandas as pd
#folder = os.listdir(r'\\fileshare.com\PATH\TO\FTP\FILES')
folder = (r'\\fileshare.com\PATH\TO\FTP\FILES')
main_dataframe = pd.DataFrame(pd.read_csv(folder[0]))
for i in range (1, len(folder)):
    data = pd.read_csv(folder[i])
    df = pd.DataFrame(data)
    main_dataframe = pd.concat([main_dataframe, df], axis=1)
print(main_dataframe)
I'm pretty new at Python and doing things like this, so I apologize if I refer to anything wrong. Any advice would be greatly appreciated!
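One possible approach, sketched under a few assumptions: in the code above, folder is a string, so folder[0] is a single character rather than a file, and os.listdir returns bare names that still need to be joined to the folder path. The helper names files_modified_on and load_pipe_delimited are hypothetical. The sketch filters by modified date, assumes every file has the same columns, and stacks rows (axis=1 would place the files side by side instead).

```python
import datetime as dt
from pathlib import Path


def files_modified_on(folder, day):
    """Return the files in folder whose modified date equals day."""
    matches = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            modified = dt.date.fromtimestamp(path.stat().st_mtime)
            if modified == day:
                matches.append(path)
    return matches


def load_pipe_delimited(paths):
    """Read each pipe-delimited file and stack the rows into one DataFrame."""
    # Third-party package, assumed installed; imported here so that
    # files_modified_on needs only the standard library.
    import pandas as pd

    frames = [pd.read_csv(path, sep="|") for path in paths]
    if not frames:
        return pd.DataFrame()  # pd.concat raises on an empty list
    return pd.concat(frames, ignore_index=True)


# Example use (not run here):
# yesterday = dt.date.today() - dt.timedelta(days=1)
# share = r'\\fileshare.com\PATH\TO\FTP\FILES'
# main_dataframe = load_pipe_delimited(files_modified_on(share, yesterday))
```

Full paths are passed to pd.read_csv, so the working directory does not matter as long as the share itself is reachable.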

Unable to access sub-folders in flask

I am using below code to access sub-folders inside a folder named 'dataset' in VSCode, however, I am getting an empty list for dataset in the output and hence, unable to get json and image files stored inside that folder. Same code is working in Google Colab.
Code:
if __name__ == "__main__":
    """Dataset path"""
    dataset_path = "vehicle images/dataset"
    dataset = glob(os.path.join(dataset_path, "*"))
    for data in dataset:
        image_path = glob(os.path.join(data, "*.jpg"))
        json_file = glob(os.path.join(data, "*.json"))
File Structure in VSCode:
File Structure in Google Colab:
Any suggestions would be helpful.
It looks like you used a space in the folder name. There are two ways to address that: either rename the vehicle images folder to vehicle_images, or use a raw string like
dataset_path = r"vehicle images/dataset"
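Another possibility worth ruling out, since the same code works in Colab: a relative path is resolved against the current working directory, which may differ when the script is launched from VSCode. (glob itself copes with spaces, and the r prefix only changes how backslashes are read.) The sketch below uses a hypothetical helper, list_samples, that fails loudly when the folder is not found instead of silently returning an empty list.

```python
from pathlib import Path


def list_samples(dataset_dir):
    """Map each sub-folder name to its .jpg and .json files."""
    dataset_dir = Path(dataset_dir)
    if not dataset_dir.is_dir():
        # Show where Python actually looked, which exposes a wrong
        # working directory immediately.
        raise FileNotFoundError(
            f"{dataset_dir.resolve()} is not a directory "
            f"(current working directory: {Path.cwd()})"
        )
    samples = {}
    for sub in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        samples[sub.name] = {
            "images": sorted(sub.glob("*.jpg")),
            "json": sorted(sub.glob("*.json")),
        }
    return samples


# Example use (not run here), assuming the folder sits next to the script:
# samples = list_samples(Path(__file__).parent / "vehicle images" / "dataset")
```

Anchoring the path to the script's own location makes the result independent of where the process was started.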

Copy images from one folder to another using their names on a pandas dataframe

I have a pandas dataframe that consists of 10000s of image names and these images are in a folder locally.
I want to filter that dataframe to pick certain images (in the 1000s) and copy those images from the aforementioned local folder to another local folder.
Is there a way that it can be done in python?
I have tried to do that using glob but couldn't make much sense out of it.
Here is a sample example. I have the following df:
img_name
2014.png
2015.png
2016.png
2021.png
2022.png
2023.png
I have a folder for ex. "my_images" and I wish to move "2015.png" and "2022.png" to another folder called "proc_images".
Thanks
import os
import shutil
path_to_your_files = '../my_images'
copy_to_path = '../proc_images'
# Names to copy; these could come from the filtered dataframe column
file_names = ["2015.png", "2022.png"]
for curr_file in file_names:
    # Copy each file, keeping its name, into the destination folder
    shutil.copyfile(os.path.join(path_to_your_files, curr_file),
                    os.path.join(copy_to_path, curr_file))
Something like this?
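To drive the copy from the dataframe rather than a hand-written list, something along these lines might work. The helper copy_named_files is a hypothetical name. It accepts any iterable of file names, so a filtered column can be passed directly, and it reports names that are not present in the source folder instead of raising on the first one.

```python
import shutil
from pathlib import Path


def copy_named_files(names, src_dir, dst_dir):
    """Copy each named file from src_dir to dst_dir.

    Returns the list of names that were not found in src_dir.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the target if needed
    missing = []
    for name in names:
        source = src_dir / name
        if source.is_file():
            shutil.copy2(source, dst_dir / name)  # copy2 also keeps timestamps
        else:
            missing.append(name)
    return missing


# Example use (not run here), assuming a dataframe df with an img_name column:
# wanted = df.loc[df["img_name"].isin(["2015.png", "2022.png"]), "img_name"]
# missing = copy_named_files(wanted, "../my_images", "../proc_images")
```

Any boolean filter on df can replace the isin call; the helper only needs the resulting names.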

Iterate over folder with images and put names into a .csv

I am trying out the Google Vision API, and for that I want to do some preparation. I have collected some images online that I want to work with.
There are around 100 images and now I want to set up a .csv file where the first column has the names of the images inside, so that I can later go over them.
Example:
Name
Picture1.jpg
Picture2.jpg
etc.
Does someone know a Python way to achieve this? So that I can run the code and it puts those names into a .csv?
Thanks already and have a good one!
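You can use glob in Python to iterate over all the images in a directory and write the image names to a CSV file.
Example:
import csv
import glob
import os
# newline='' lets the csv module control the line endings
with open('images.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Name'])  # header column from the question
    # '*.jpg' matches the file names in the question; adjust as needed
    for file in sorted(glob.glob('*.jpg')):
        writer.writerow([os.path.basename(file)])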

Parsing a json file using python

I have this JSON file which has the image names (medical images, around 7500) and their corresponding medical reports.
It is of the format:
{"IMAGE_NAME1.png": "MEDICAL_REPORT1_IN_TEXT", "IMAGE_NAME2.png": "MEDICAL_REPORT2_IN_TEXT", ...and so on for all images ...}
What I want is all the image names in the JSON file so that I can take all those images (from a database of images which is a superset of the image names in the JSON file) and put them in their own folder. Can someone help me with this?
I think you're looking for the json library. I haven't tested this, but I am pretty confident it will work.
import json
with open("path/to/json.json", 'r') as file:
    json_data = json.load(file)
Then, to get the image names from the data as you described it:
image_names = list(json_data.keys())
