I have Train and Test folders, and inside each there are many subfolders with images in them. A .csv file holds the label for each folder and class.
Here is the CSV file:
https://i.imgur.com/qMLGOpC.png
and the folders:
https://i.imgur.com/RrYBZxG.png
How do I load these folders with their labels in Keras?
I tried to make a dataframe with folders and labels like below:
import os
import pandas as pd

pth = 'C:/Users/Documents/train/'
folders = os.listdir(pth)
filepath = 'C:/Users/Documents/keras/labels.csv'
metadf = pd.read_csv(filepath)
metadf.index = metadf.Class
videos = pd.DataFrame([])
for folder in folders:
    pth_upd = pth + folder + '/'
    allfiles = os.listdir(pth_upd)  # list the image files in this folder
    for file in allfiles:
        videos = pd.DataFrame(metadf.values, index=folders)
The output is:
https://i.imgur.com/CsXAE8f.png
Is that the correct way of doing it? How can I load each folder of images with its corresponding labels?
You can use the built-in flow_from_directory() method; check out the Keras docs for details.
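A minimal sketch of what that could look like, assuming TensorFlow 2.x and that each class has its own subfolder under train/ (the path and sizes here are placeholders):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# rescale pixel values to [0, 1]
datagen = ImageDataGenerator(rescale=1.0 / 255)

# infers one class per subfolder of the train directory
train_gen = datagen.flow_from_directory(
    'C:/Users/Documents/train/',
    target_size=(224, 224),  # resize every image to a fixed shape
    batch_size=32,
    class_mode='categorical',
)

print(train_gen.class_indices)  # folder name -> integer label mapping

Since your labels live in a CSV rather than in the folder names themselves, ImageDataGenerator.flow_from_dataframe may be the better fit; it takes a dataframe with a filename column and a label column.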
I have 10,000 CSV datasets for my project. I need to preprocess all of them, but Pandas only processes five per run. Here is my code for opening all the files.
import os
import pandas as pd

path = "./for_processing"
files = [file for file in os.listdir(path) if not file.startswith('.')]
index_cleaned_news = 0
for file in files:
    NEWS_DATA = pd.read_csv(path + "/kompas/clean_news" + str(index_cleaned_news) + ".csv")
    # Process here
    NEWS_DATA.to_csv(path + "/process_kompas/preprocess_news" + str(index_cleaned_news) + ".csv")
    index_cleaned_news = index_cleaned_news + 1
Here is a screenshot from when I try to run the code.
Does anybody know how to fix this, or is it a limitation of Pandas? Thank you
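One thing worth checking (an assumption on my part, since the screenshot is not included): the loop runs once per entry of ./for_processing, not once per CSV under kompas, so if that directory holds only five entries the loop stops after five files. A sketch that enumerates the clean_news CSVs directly:

import glob
import os

import pandas as pd

path = "./for_processing"

# iterate over the actual clean_news files rather than a separate listing
for src in sorted(glob.glob(os.path.join(path, "kompas", "clean_news*.csv"))):
    news_data = pd.read_csv(src)
    # process here
    dst = os.path.join(path, "process_kompas",
                       os.path.basename(src).replace("clean_news", "preprocess_news"))
    news_data.to_csv(dst)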
I am using the code below to access sub-folders inside a folder named 'dataset' in VSCode. However, I am getting an empty list for dataset in the output, so I cannot reach the JSON and image files stored inside that folder. The same code works in Google Colab.
Code:
import os
from glob import glob

if __name__ == "__main__":
    """Dataset path"""
    dataset_path = "vehicle images/dataset"
    dataset = glob(os.path.join(dataset_path, "*"))
    for data in dataset:
        image_path = glob(os.path.join(data, "*.jpg"))
        json_file = glob(os.path.join(data, "*.json"))
File Structure in VSCode:
File Structure in Google Colab:
Any suggestions would be helpful.
It looks like you used a space in the folder name. It can be solved in two ways: either rename the vehicle images folder to vehicle_images, or use a raw string like
dataset_path = r"vehicle images/dataset"
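A quick way to verify where the relative path actually resolves (an assumption on my part: VSCode often runs scripts from a different working directory than Colab, which would also produce an empty list):

import os
from glob import glob

print(os.getcwd())  # the directory the relative path is resolved against
print(os.path.abspath("vehicle images/dataset"))
print(glob(os.path.join("vehicle images/dataset", "*")))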
I have a pandas dataframe that consists of tens of thousands of image names, and these images are stored in a local folder.
I want to filter that dataframe to pick out certain images (thousands of them) and copy those images from the aforementioned local folder to another local folder.
Is there a way to do this in Python?
I tried to do it with glob but couldn't make much sense of it.
I will create a sample example here. I have the following df:
img_name
2014.png
2015.png
2016.png
2021.png
2022.png
2023.png
I have a folder, for example "my_images", and I wish to move "2015.png" and "2022.png" to another folder called "proc_images".
Thanks
import os
import shutil

path_to_your_files = '../my_images'
copy_to_path = '../proc_images'

file_names = ["2015.png", "2022.png"]
for curr_file in file_names:
    # copyfile leaves the original in place; use shutil.move to move instead
    shutil.copyfile(os.path.join(path_to_your_files, curr_file),
                    os.path.join(copy_to_path, curr_file))
Something like this?
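Since the file names come from a dataframe in the original question, the list can also be built from the filtered frame instead of being hard-coded. A sketch, assuming the column is named img_name and that wanted stands in for whatever filter you apply:

import os
import shutil

import pandas as pd

df = pd.DataFrame({"img_name": ["2014.png", "2015.png", "2016.png",
                                "2021.png", "2022.png", "2023.png"]})

wanted = df["img_name"].isin(["2015.png", "2022.png"])  # placeholder filter

for curr_file in df.loc[wanted, "img_name"]:
    shutil.copyfile(os.path.join('../my_images', curr_file),
                    os.path.join('../proc_images', curr_file))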
I have around 500 .txt files on my local system and would like to merge them into a dataframe in Google Colab. I have already uploaded them via the Upload option: I uploaded a zipped folder containing the .txt files and later unzipped it in Google Colab. Each .txt file holds one row of data, e.g. 0 12 34.3 423
I tried the following code to upload directly from my local system, but it did not work.
Colab cannot access your local files through the typical built-ins as far as I know. You have to use Colab-specific modules. The guide is here.
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
This will prompt you to select the files to upload.
EDIT: As you need the file names, you can just use the loop above and then concatenate as you mentioned correctly.
import pandas as pd

# create a list of file names
# (use a name other than `files`, which would shadow the google.colab module)
fnames = []
for fn in uploaded.keys():
    fnames.append(fn)

# create a list of dataframes
# (whitespace-delimited rows may need sep='\s+' and header=None)
frames = []
for fname in fnames:
    frames.append(pd.read_csv(fname))

# concat all of your frames at once
df = pd.concat(frames)
Alternatively, depending on the size of your files, you could also join the two loops and concat each file directly into the existing frame, so that memory holds less data at once.
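A sketch of that joined-loop variant, under the same assumptions as above (it trades some speed for a smaller memory footprint, since concat copies the running frame on each pass):

import pandas as pd

df = pd.DataFrame()
for fn in uploaded.keys():
    # read one file and fold it into the running frame immediately
    new = pd.read_csv(fn)
    df = pd.concat([df, new], ignore_index=True)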
This is the form of my nested directory:
/data/patient_level/study_level/series_level/
For each patient_level folder, I need to read the ".dcm" files from the corresponding "series_level" folder.
How can I access the ".dcm" files in the "series_level" folders?
I need 3 features from a DICOM object.
This is my source code:
import dicom
record = dicom.read_file("/data/patient_level/study_level/series_level/000001.dcm")
doc = {"PatientID": record.PatientID, "Manufacturer": record.Manufacturer, "SeriesTime": record.SeriesTime}
Then, I will insert this doc to a Mongo DB.
Any suggestions are appreciated.
Thanks in advance.
It is not quite clear what problem you are trying to solve, but if all you want is a list of all .dcm files under the data directory, you can use pathlib.Path():
from pathlib import Path
data = Path('/data')
list(data.rglob('*.dcm'))
To split a file path into its components, you can use something like this:
dcm_file = '/patient_level/study_level/series_level/000001.dcm'
# the leading '/' yields an empty first element, which the underscore discards
_, patient_level, study_level, series_level, filename = dcm_file.split('/')
Which would give you patient_level, study_level, series_level, and filename from dcm_file.
But it would be better to stick with Path() and its methods, like this:
dcm_file = Path('/patient_level/study_level/series_level/000001.dcm')
dcm_file.parts
Which would give you something like this:
('/', 'patient_level', 'study_level', 'series_level', '000001.dcm')
Those are just starting points anyway.
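Putting the two together for the original goal, here is a hedged sketch; it assumes the pydicom package (the current name of the dicom module used in the question) and a local MongoDB reachable through pymongo, with hypothetical database and collection names:

from pathlib import Path

import pydicom
from pymongo import MongoClient

client = MongoClient()  # assumes MongoDB on localhost:27017
collection = client['dicom_db']['series']  # hypothetical names

data = Path('/data')
for dcm_path in data.rglob('*.dcm'):
    record = pydicom.dcmread(dcm_path)
    doc = {
        "PatientID": record.PatientID,
        "Manufacturer": record.Manufacturer,
        "SeriesTime": record.SeriesTime,
        # keep the patient/study/series location alongside the features
        "path_parts": list(dcm_path.parts),
    }
    collection.insert_one(doc)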