I have a folder of 400 individual data files saved on my Mac at the path Users/path/to/file/data. I am trying to write code that iterates through each data file and plots the data inside it, but I am having trouble actually importing the folder of data into Python. Is there a way for me to import the entire folder so I can just iterate through each file by writing
for file in folder:
    read data file
    plot data file
Any help is greatly appreciated. Thank you!
EDIT: I am using Spyder for this also
You can use the os module to list all files in a directory given its path. os provides a function, os.listdir(), which, when passed a path, lists all items located at that path, like this: ['file1.py', 'file2.py']. If no argument is passed, it defaults to the current working directory.
import os
path_to_files = "Users/path/to/file/data"
file_paths = os.listdir(path_to_files)
for file_path in file_paths:
    # os.listdir() returns bare file names, so join them with the directory
    full_path = os.path.join(path_to_files, file_path)
    # "r" means that you open the file for *reading*
    with open(full_path, "r") as file:
        lines = file.readlines()
    # plot data....
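If you prefer pathlib, here is a minimal sketch of the same loop; the plotting part assumes matplotlib is available and that each file holds one numeric value per line, which you would replace with your actual parsing:
from pathlib import Path
import matplotlib.pyplot as plt  # assumed plotting library

data_dir = Path("Users/path/to/file/data")

# Path.iterdir() yields full paths, so no manual joining is needed
for data_file in sorted(data_dir.iterdir()):
    if not data_file.is_file():
        continue
    # hypothetical parsing: one numeric value per line
    values = [float(line) for line in data_file.read_text().splitlines() if line.strip()]
    plt.plot(values, label=data_file.name)

plt.legend()
plt.show()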
I have an issue within Jupyter that I cannot find online anywhere and was hoping I could get some help.
Essentially, I want to open .JSON files from multiple folders with different names. For example.
data/weather/date=2022-11-20/data.JSON
data/weather/date=2022-11-21/data.JSON
data/weather/date=2022-11-22/data.JSON
data/weather/date=2022-11-23/data.JSON
I want to be able to output the info inside each data.JSON in my Jupyter notebook, but how do I do that when the folder names are all different?
Thank you in advance.
What I tried so far
for path, dirs, files in os.walk('data/weather'):
    for file in files:
        if fnmatch.fnmatch(file, '*.json'):
            data = os.path.join(path, file)
            print(data)
OUTPUT:
data/weather/date=2022-11-20/data.JSON
data/weather/date=2022-11-21/data.JSON
data/weather/date=2022-11-22/data.JSON
data/weather/date=2022-11-23/data.JSON
But I don't want it to output the path; I want to actually open the .JSON file and display its contents.
This solution uses the os library to walk through the different directories:
import os
import json
for root, dirs, files in os.walk('data/weather'):
    for file in files:
        if file.endswith('.JSON'):
            with open(os.path.join(root, file), 'r') as f:
                data = json.load(f)
                print(data)
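A recursive glob is a compact alternative; this is just a sketch assuming the files always end in .JSON and that you want the parsed objects collected into a list:
import glob
import json

all_data = []
# ** matches the date=... folders at any depth when recursive=True
for json_path in sorted(glob.glob('data/weather/**/*.JSON', recursive=True)):
    with open(json_path, 'r') as f:
        all_data.append(json.load(f))

print(all_data)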
Problem Summary:
In one of my folders I have a .tar.gz file and I need to extract all the images (.jpg & .png) from it. But I have to locate the archive by its .tar.gz extension (using the path to the directory) rather than the usual way of giving the input file_name to extract it. I need this for one part of a GUI (Tkinter) for an image classification project.
Code I'm trying:
import os
import tarfile
def extractfile():
    os.chdir('GUI_Tkinter/PMC_downloads')
    with tarfile.open(os.path.join(os.environ['GUI_Tkinter/PMC_downloads'], f'Backup_{self.batch_id}.tar.gz'), "r:gz") as so:
        so.extractall(path=os.environ['GUI_Tkinter/PMC_downloads'])
The code is not giving any error but it's not working. Please suggest how I can do the same another way, by specifying the .tar.gz file extension to extract it.
I think you can use this code.
import tarfile
import os
t = tarfile.open('example.tar.gz', 'r')
for member in t.getmembers():
    if ".jpg" in member.name:
        t.extract(member, "outdir")
print(os.listdir('outdir'))
Hope to be helpful for you. Thanks.
A generic/dynamic way to extract one or more .tar.gz or zip files present in a folder without specifying the file name: it works from the extension and the path (location) of the file. You can extract any type of file (.pdf, .nxml, .xml, .gif, etc.) from the .tar.gz/zip/compressed file just by mentioning the extension of the required files as the member-name check in this code. I needed all the images from that .tar.gz file extracted into one folder, so in the code below I have specified the extensions .jpg and .png and extracted all the images into the same directory under a folder named "Extracted_Images". If you want, you can also change the directory the files are extracted to by providing the path parameter.
For example "C:/Users/dell/project/histo_images" instead of "Extracted_Images".
import tarfile
import os
import glob
path = glob.glob("*.tar.gz")  # every .tar.gz archive in the current directory
for file in path:
    t = tarfile.open(file, 'r')
    for member in t.getmembers():
        if ".jpg" in member.name:
            t.extract(member, "Extracted_Images")
        elif ".png" in member.name:
            t.extract(member, "Extracted_Images")
    t.close()
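The answer above also mentions zip archives; a minimal sketch for that case, using the standard zipfile module and the same "Extracted_Images" output folder:
import glob
import zipfile

for zip_path in glob.glob("*.zip"):
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            # extract only the image members, mirroring the tar.gz example
            if name.endswith((".jpg", ".png")):
                zf.extract(name, "Extracted_Images")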
I need to read a CSV file from a folder which is generated by another module. If that module fails, it won't generate the folder that contains the CSV file.
Example:
path = 'c/files' --- fixed path
When the module runs successfully, it will create a folder called output with a file in it.
path =
'c/files/output/somename.csv'
But here is the catch: every time it generates an output folder, the CSV file has a different name.
First I need to check whether that output folder and a CSV file are there or not.
After that I need to read that CSV.
The following will check for the existence of the output folder as well as the CSV file and read the CSV file:
import os
import pandas as pd
if 'output' in os.listdir('c/files'):
    if len(os.listdir('c/files/output')) > 0:
        x = [i for i in os.listdir('c/files/output') if i[-3:] == 'csv'][0]
        new_file = pd.read_csv(os.path.join('c/files/output', x))
glob.glob can help.
glob.glob('c/files/output/*.csv') returns either an empty list or a list with (hopefully) the path to a single file
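A short sketch of that check, assuming pandas is used for reading as in the answer above:
import glob
import pandas as pd

matches = glob.glob('c/files/output/*.csv')
if matches:
    # glob returns relative paths, so they can be passed straight to pandas
    df = pd.read_csv(matches[0])
else:
    print("output folder or CSV file not found")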
You may also try to get the latest file based on creation time, after you have checked for the existence of the directory (from the above post). Something like below:
import glob
import os

list_of_files = glob.glob("c/files/output/*.csv")
latest_file = max(list_of_files, key=os.path.getctime)
latest_file is the latest created file in your directory.
I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print(fname)
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If you're not, switch the / for \\.
Is the directory you're giving the full path, from the root folder? You might need to specify a FULL path (drive included).
If that still fails, silly but check that there actually are files in there, as your code otherwise seems fine.
The code below worked for me and listed CSV files appropriately (see the C:\\ part, which could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
The following code gets a list of files in a folder and, if ".csv" appears in a file's name, prints that file name.
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print(file)
Note the indentation in my code. You need to be careful not to mix tabs and spaces as these are not the same to Python.
Alternatively, you could use os:
import os
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)

print(out_list)
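A pathlib-based sketch that does the same filtering, in case you prefer Path objects to strings (assuming the same folder as above):
from pathlib import Path

path = Path(r"C:\temp")

# Path.glob handles the extension filtering and returns Path objects
out_list = [p.name for p in path.glob("*.csv")]
print(out_list)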
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
Currently, I have created code that makes graphs from data in .csv files. However, I can only run the code if the script is present in the folder with the CSV files. How can I set up the script file so that it doesn't have to be in the same directory as the .csv files?
Assuming you mean to include a fixed CSV file with your code, store an absolute path based on the script path:
import os

HERE = os.path.dirname(os.path.abspath(__file__))
csv_filename = open(os.path.join(HERE, 'somefile.csv'))
__file__ is the filename of the current module or script, os.path.dirname(__file__) is the directory the module resides in. For scripts, __file__ can be a relative pathname, so we use os.path.abspath() to turn that into an absolute path.
This means you can run your script from anywhere.
If you meant to make your script work with arbitrary CSV input files, use command line options:
import argparse
if __name__ == '__main__':
    parser = argparse.ArgumentParser('CSV importer')
    # 'r' opens the CSV for reading; nargs='?' makes the positional optional
    # so the default file name is used when none is given
    parser.add_argument('csvfile', type=argparse.FileType('r'), nargs='?',
                        default='somedefaultfilename.csv')
    options = parser.parse_args()
    import_function(options.csvfile)
where csvfile will be an open file object, so your import_function() can just do:
import csv

def import_function(csvfile):
    with csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            ...  # etc.
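With this in place you can run the script from any directory, for example python yourscript.py data.csv (the script and file names here are just placeholders), or run it with no argument to fall back to the default file name.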
If you don't plan on moving around the csv files too much, the best answer is to hard code the absolute path to the csv folder into the script.
import os
csvdir = "/path/to/csv/dir"
csvfpath = os.path.join(csvdir, "myfile.csv")
csvfile = open(csvfpath)
You can also use a command line parser like argparse to let the user easily change the path to the csv files.
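A minimal sketch of that idea, with a hypothetical --csvdir option whose default mirrors the hard-coded folder above:
import argparse
import os

parser = argparse.ArgumentParser('CSV graphing script')
# --csvdir is a hypothetical option name; the default is the hard-coded path above
parser.add_argument('--csvdir', default="/path/to/csv/dir")
args = parser.parse_args()

csvfile = open(os.path.join(args.csvdir, "myfile.csv"))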
Using Martijn Pieters's solution will work only if you are going to be moving the folder containing both the script and csv files around. However in that case, you may as well just use relative paths to the csv files.