Iterate over folder with images and put names into a .csv - python

I am trying out the Google Vision API and want to do some preparation for it. I have collected some images online that I want to work with.
There are around 100 images, and now I want to set up a .csv file whose first column contains the image names, so that I can later go over them.
Example:
Name
Picture1.jpg
Picture2.jpg
etc.
Does someone know a Python way to achieve this, so that I can run the code and have it put those names into a .csv?
Thanks already and have a good one!

You can use glob in Python to iterate over all the images in a directory and write the image names to a .csv file.
Example:
import glob
import os

f = open('images.csv', 'w')
for file in glob.glob('*.png'):             # adjust the pattern to your image extension, e.g. '*.jpg'
    f.write(os.path.basename(file) + '\n')  # one name per line
f.close()
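If you also want the "Name" header row from your example, the csv module takes care of the line endings and quoting for you. A minimal sketch, assuming the images sit in a folder called images next to the script and use the .jpg extension (adjust both to your setup):

import csv
import glob
import os

# Collect the base names of all .jpg files in the assumed 'images' folder.
names = [os.path.basename(p) for p in glob.glob(os.path.join('images', '*.jpg'))]

with open('images.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Name'])        # header row, matching the example in the question
    for name in sorted(names):
        writer.writerow([name])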

Related

Python - Read txt files from Network shared drive

I am trying to read in data from text files that are moved into a network share drive over a VPN. The overall intent is to loop through the files with yesterday's date (either in the file name or by the modified date), extract the pipe-delimited ("|") data and concatenate it into a pandas DataFrame. The issue I am having is actually being able to read the files from the network drive. So far I have only been able to figure out how to use os.listdir to identify the file names, but not actually read them. Anyone have any ideas?
This is what I've tried so far that has actually started to pan out, with os.listdir correctly seeing the network folder and the files inside. But how would I call the actual files (filtered by date or not) to get them to work in the loop?
import pandas as pd

#folder = os.listdir(r'\\fileshare.com\PATH\TO\FTP\FILES')
folder = (r'\\fileshare.com\PATH\TO\FTP\FILES')

main_dataframe = pd.DataFrame(pd.read_csv(folder[0]))
for i in range(1, len(folder)):
    data = pd.read_csv(folder[i])
    df = pd.DataFrame(data)
    main_dataframe = pd.concat([main_dataframe, df], axis=1)
print(main_dataframe)
I'm pretty new at Python and doing things like this, so I apologize if I refer to anything wrong. Any advice would be greatly appreciated!
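One way to get the loop working, assuming the share path is reachable and the files are plain pipe-delimited text: os.listdir only returns the file names, so build the full path with os.path.join and hand that to pd.read_csv with sep='|'. A sketch with an assumed modified-date filter (adapt the path, the filter and the concat axis as needed):

import os
from datetime import date, timedelta

import pandas as pd

folder = r'\\fileshare.com\PATH\TO\FTP\FILES'    # assumed share path from the question
yesterday = date.today() - timedelta(days=1)

frames = []
for name in os.listdir(folder):
    full_path = os.path.join(folder, name)       # os.listdir gives names only, so build the full path
    modified = date.fromtimestamp(os.path.getmtime(full_path))
    if modified == yesterday:                    # filter on modified date; swap for a filename check if preferred
        frames.append(pd.read_csv(full_path, sep='|'))

main_dataframe = pd.concat(frames, ignore_index=True)   # stack rows; use axis=1 to put files side by side
print(main_dataframe)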

How to plot and analyse CAN data directly on python?

I need to analyse a lot of CAN data and want to use Python for that. I recently came across the python-can library and saw that it's possible to convert .blf to .asc files.
The post "How do I convert .blf data of CAN to .asc using python" helped a lot. Can @Tranbi (https://stackoverflow.com/users/13525512/tranbi) or anyone else help me with some example code?
This is the part I have done till now:
import can
import os

fileList = os.listdir(".\inputFiles")
for i in range(len(fileList)):
    with open(os.path.join(".\inputFiles", fileList[i]), 'rb') as f_in:
        log_in = can.io.BLFReader(f_in)
        with open(os.path.join(".\outputFiles", os.path.splitext(fileList[i])[0] + '.asc'), 'w') as f_out:
            log_out = can.io.ASCWriter(f_out)
            for msg in log_in:
                log_out.on_message_received(msg)
            log_out.stop()
I need to either read data from the .blf files directly and sequentially, or convert them to .asc, correct the timestamps using the file names, combine the files, convert them to .csv and then analyse them in Python. It would really help if there is a shorter route.
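If the goal is the analysis rather than the .asc files themselves, a possibly shorter route is to skip the conversion: can.io.BLFReader iterates over the messages of a .blf file directly, and each message exposes a timestamp, arbitration ID and data bytes that can go straight into a pandas DataFrame. A rough sketch, assuming the same ./inputFiles folder as above; the per-file timestamp correction would go where the comment indicates:

import glob

import can
import pandas as pd

rows = []
for path in glob.glob('./inputFiles/*.blf'):
    with open(path, 'rb') as f_in:
        for msg in can.io.BLFReader(f_in):
            rows.append({
                'file': path,
                'timestamp': msg.timestamp,          # seconds; shift per file here if the file name carries an offset
                'arbitration_id': msg.arbitration_id,
                'dlc': msg.dlc,
                'data': msg.data.hex(),
            })

df = pd.DataFrame(rows)
df.to_csv('can_messages.csv', index=False)           # optional combined .csv for later analysis
print(df.head())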

Copy images from one folder to another using their names on a pandas dataframe

I have a pandas dataframe that consists of 10000s of image names and these images are in a folder locally.
I want to filter that dataframe to pick certain images (in the 1000s) and copy those images from the aforementioned local folder to another local folder.
Is there a way that it can be done in python?
I have tried to do that using glob but couldn't make much sense out of it.
I will create a small example here: I have the following df:
img_name
2014.png
2015.png
2016.png
2021.png
2022.png
2023.png
I have a folder, for example "my_images", and I wish to move "2015.png" and "2022.png" to another folder called "proc_images".
Thanks
import os
import shutil

path_to_your_files = '../my_images'
copy_to_path = '../proc_images'

files_list = sorted(os.listdir(path_to_your_files))
file_names = ["2015.png", "2022.png"]
for curr_file in file_names:
    shutil.copyfile(os.path.join(path_to_your_files, curr_file),
                    os.path.join(copy_to_path, curr_file))
Something like this?
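Since the names already live in the dataframe, the copy loop can also be driven by the filtered dataframe instead of a hard-coded list. A sketch along those lines; df, the wanted list and the folder names below are stand-ins for your real data and filter:

import os
import shutil

import pandas as pd

src_dir = 'my_images'       # assumed source folder
dst_dir = 'proc_images'     # assumed destination folder
os.makedirs(dst_dir, exist_ok=True)

df = pd.DataFrame({'img_name': ['2014.png', '2015.png', '2016.png',
                                '2021.png', '2022.png', '2023.png']})
wanted = ['2015.png', '2022.png']                    # replace with the output of your real filter
selected = df[df['img_name'].isin(wanted)]

for name in selected['img_name']:
    shutil.copyfile(os.path.join(src_dir, name),
                    os.path.join(dst_dir, name))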

Reading .dcm files from a nested directory

This is the form of my nested directory:
/data/patient_level/study_level/series_level/
For each patient_level folder, I need to read the ".dcm" files from the corresponding "series_level" folder.
How can I access the ".dcm" file in the "series_level" folders?
I need 3 features from a DICOM object.
This is my source code:
import dicom
record = dicom.read_file("/data/patient_level/study_level/series_level/000001.dcm")
doc = {"PatientID": record.PatientID, "Manufacturer": record.Manufacturer, "SeriesTime": record.SeriesTime}
Then, I will insert this doc to a Mongo DB.
Any suggestions are appreciated.
Thanks in advance.
It is not quite clear what problem you are trying to solve, but if all you want is a list of all the .dcm files under the data directory, you can use pathlib.Path():
from pathlib import Path
data = Path('/data')
list(data.rglob('*.dcm'))
To split a file in its components, you can use something like this:
dcm_file = '/patient_level/study_level/series_level/000001.dcm'
_, patient_level, study_level, series_level, filename = dcm_file.split('/')
Which would give you patient_level, study_level, series_level, and filename from dcm_file.
But it would be better to stick with Path() and its methods, like this:
dcm_file = Path('/patient_level/study_level/series_level/000001.dcm')
dcm_file.parts
Which would give you something like this:
('/', 'patient_level', 'study_level', 'series_level', '000001.dcm')
Those are just starting points anyway.
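To tie this back to the original goal (three attributes per file, ready to insert into MongoDB), here is a hedged sketch combining rglob with pydicom, the current name of the old dicom package; it assumes every file actually carries the three attributes and that the folder layout is the one shown above:

from pathlib import Path

import pydicom    # current name of the old 'dicom' package

docs = []
for dcm_path in Path('/data').rglob('*.dcm'):
    record = pydicom.dcmread(dcm_path)
    docs.append({
        'PatientID': record.PatientID,
        'Manufacturer': record.Manufacturer,
        'SeriesTime': record.SeriesTime,
        'patient_level': dcm_path.parts[2],   # parts are ('/', 'data', 'patient_level', ...)
    })

# docs can then go to MongoDB, e.g. collection.insert_many(docs) with pymongo.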

Python: unziping special files into memory and getting them into a DataFrame

I'm quite stuck with some code I'm writing in Python. I'm a beginner and maybe it's really easy, but I just can't see it. Any help would be appreciated, so thank you in advance :)
Here is the problem: I have to read some special data files with a special extension, .fen, into a pandas DataFrame. These .fen files are inside a zipped .fenx file that also contains a .cfg configuration file.
In the code I've written I use the zipfile library to unzip the files and then get them into the DataFrame. The code is the following:
import zipfile
import numpy as np
import pandas as pd

def readfenxfile(Directory, File):
    fenxzip = zipfile.ZipFile(Directory + '\\' + File, 'r')
    fenxzip.extractall()
    fenxzip.close()
    cfgGeneral, cfgDevice, cfgChannels, cfgDtypes = readCfgFile(Directory, File[:-5] + '.CFG')
    # readCfgFile reads the .cfg file and returns some important data.
    # Here only cfgDtypes matters: it holds the type of the data inside the .fen
    # and becomes the column index of the final DataFrame.
    if cfgChannels != None:
        dtDtype = eval('np.dtype([' + cfgDtypes + '])')
        dt = np.fromfile(Directory + '\\' + File[:-5] + '.fen', dtype=dtDtype)
        dt = pd.DataFrame(dt)
    else:
        dt = []
    return dt, cfgChannels, cfgDtypes
Now, the extractall() method saves the unzipped files to the hard drive. The .fenx files can be quite big, so having to store them (and delete them afterwards) is really slow. I would like to do the same as I do now, but get the .fen and .cfg files into memory, not onto the hard drive.
I have tried things like fenxzip.read('whateverthenameofthefileis.fen') and other methods like .open() from the zipfile library, but I couldn't get what .read() returns into a numpy array in any way I tried.
I know it can be a difficult question to answer, because you don't have the files to try things out with, but if someone has any ideas I would be glad to read them. :) Thank you very much!
Here is the solution I finally found, in case it is helpful for anyone. It uses the tempfile library to create a temporary file object that stays in memory.
import zipfile
import tempfile
import numpy as np
import pandas as pd

def readfenxfile(Directory, File, ExtractDirectory):
    fenxzip = zipfile.ZipFile(Directory + r'\\' + File, 'r')
    fenfile = tempfile.SpooledTemporaryFile(max_size=10000000000, mode='w+b')
    fenfile.write(fenxzip.read(File[:-5] + '.fen'))
    cfgGeneral, cfgDevice, cfgChannels, cfgDtypes = readCfgFile(fenxzip, File[:-5] + '.CFG')
    if cfgChannels != None:
        dtDtype = eval('np.dtype([' + cfgDtypes + '])')
        fenfile.seek(0)
        dt = np.fromfile(fenfile, dtype=dtDtype)
        dt = pd.DataFrame(dt)
    else:
        dt = []
    fenfile.close()
    fenxzip.close()
    return dt, cfgChannels, cfgDtypes
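An alternative sketch, not part of the solution above: ZipFile.read() already returns the raw bytes of a member, and np.frombuffer can build the structured array straight from those bytes without any temporary file. The path, member name and dtype below are placeholders to adapt:

import zipfile

import numpy as np
import pandas as pd

fenx_path = r'C:\data\example.fenx'     # placeholder path to a .fenx archive
fen_name = 'example.fen'                # name of the .fen member inside the archive

with zipfile.ZipFile(fenx_path, 'r') as fenxzip:
    raw = fenxzip.read(fen_name)        # raw bytes of the .fen member, entirely in memory

# Placeholder dtype; in the real code it would be built from cfgDtypes as above.
dtDtype = np.dtype([('ch1', '<f4'), ('ch2', '<f4')])
dt = pd.DataFrame(np.frombuffer(raw, dtype=dtDtype))
print(dt.head())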
