I have an issue within Jupyter that I cannot find online anywhere and was hoping I could get some help.
Essentially, I want to open .JSON files from multiple folders with different names. For example:
data/weather/date=2022-11-20/data.JSON
data/weather/date=2022-11-21/data.JSON
data/weather/date=2022-11-22/data.JSON
data/weather/date=2022-11-23/data.JSON
I want to be able to output the contents of each data.JSON in my Jupyter Notebook, but how do I do that when the folder names are all different?
Thank you in advance.
What I tried so far
for path, dirs, files in os.walk('data/weather'):
    for file in files:
        if fnmatch.fnmatch(file, '*.json'):
            data = os.path.join(path, file)
            print(data)
OUTPUT:
data/weather/date=2022-11-20/data.JSON
data/weather/date=2022-11-21/data.JSON
data/weather/date=2022-11-22/data.JSON
data/weather/date=2022-11-23/data.JSON
But I don't want it to output the path; I want to actually open each .JSON file and display its content.
This solution uses the os library to walk through the different directories and the json library to parse each file:
import os
import json
for root, dirs, files in os.walk('data/weather'):
    for file in files:
        if file.endswith('.JSON'):
            with open(os.path.join(root, file), 'r') as f:
                data = json.load(f)
            print(data)
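A variation on the loop above, as a minimal sketch rather than a definitive version: it keeps the parsed content in a dictionary keyed by the date taken from the folder name, and it compares the extension in lower case so that both data.JSON and data.json match. The function name load_weather is hypothetical, and the code assumes every folder is named like date=2022-11-20 and holds a single JSON file.

```python
import json
import os


def load_weather(base_dir):
    """Return {date_string: parsed_json} for every JSON file under base_dir."""
    results = {}
    for root, dirs, files in os.walk(base_dir):
        for name in files:
            # Compare in lower case so 'data.JSON' and 'data.json' both match.
            if not name.lower().endswith('.json'):
                continue
            # Assumption: the parent folder is named like 'date=2022-11-20'.
            folder = os.path.basename(root)
            date = folder.split('=', 1)[-1]
            with open(os.path.join(root, name), 'r') as f:
                results[date] = json.load(f)
    return results


# Display each file's content in date order (prints nothing if the
# folder does not exist, because os.walk then yields no entries).
for date, content in sorted(load_weather('data/weather').items()):
    print(date, content)
```

Keeping the results in a dictionary means a later notebook cell can look up one day's data without reading the files again.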
Related
I have a folder full of csv files that contain results for different participants in an experiment. I'm trying to create one large csv file containing all the csv files in the directory. I'm using listdir() to create a list of all the files but I haven't been able to open the individual files in this list. I think I need to loop over each file but I haven't been able to figure out how.
This is the code I've come up with so far. I get the following error: FileNotFoundError: [Errno 2] No such file or directory: 'results_262.csv' because it appears that the loop only reads one file in the files variable even though there should be many.
from os import listdir
path = "results"
files = listdir(path)
print(files)
results = open("results.csv", "w")
data = []
for file in files:
    participant = open(f"{file}", "r")
    data.append(participant.readlines())
Would anyone be able to help?
Your issue is that listdir() returns the filenames without their path information, so you need to join each filename onto the path.
import os
...
for file in files:
    participant = open(os.path.join(path, file), "r")
    ...
Other details:
f"{file}" is the same thing as just file. I.e., open(file, "r") is equivalent to open(f"{file}", "r"), but more efficient: there is no need to use interpolation when you already have the value you want in a variable.
And, you're not closing your files. You should add participant.close() in your loop, or, even better, use a context manager:
for file in files:
    with open(os.path.join(path, file), "r") as participant:
        ...
The with puts your file handle in a context manager, and the file gets closed automatically when you leave the scope of the context manager.
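Putting both points together, here is a rough sketch of how the combined file from the question could be written. The function name combine_csv is made up for illustration, and the sketch assumes the output file lives outside the results folder and that every input file ends with a newline. Header rows, if the files have them, are copied once per file.

```python
import os


def combine_csv(folder, out_path):
    """Copy the lines of every .csv file in folder into out_path.

    Returns the number of files that were copied.
    """
    count = 0
    with open(out_path, "w") as results:
        for name in sorted(os.listdir(folder)):
            if not name.endswith(".csv"):
                continue  # skip anything that is not a CSV file
            # listdir() gives bare names, so rebuild the full path.
            with open(os.path.join(folder, name), "r") as participant:
                results.writelines(participant.readlines())
            count += 1
    return count
```

With the names from the question, the call would be combine_csv("results", "results.csv").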
I have a folder of 400 individual data files saved on my Mac with pathway Users/path/to/file/data. I am trying to write code that will iterate through each data file and plot the data inside of it, however I am having trouble actually importing this folder of all the data into python. Does anyone have a way for me to import this entire folder so I can just iterate through each file by writing
for file in folder:
    read data file
    plot data file
Any help is greatly appreciated. Thank you!
EDIT: I am using Spyder for this also
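You can use the os module to list all files in a directory by path. os provides a function, os.listdir(), that lists the names of all items located in the path you pass it, like this: ['file1.py', 'file2.py']. If no argument is passed, it defaults to the current working directory. Note that it returns bare names, not full paths, so each name has to be joined with the folder path before the file can be opened.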
import os
path_to_files = "Users/path/to/file/data"
file_paths = os.listdir(path_to_files)
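for file_path in file_paths:
    # os.listdir() returns bare names, so join each one with the folder path
    full_path = os.path.join(path_to_files, file_path)
    # "r" means that you open the file for *reading*
    with open(full_path, "r") as file:
        lines = file.readlines()
        # plot data....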
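The same loop can also be sketched with pathlib, whose Path.iterdir() yields full paths, so no separate join step is needed. This is only one possible approach; the function name read_all is hypothetical, and the plotting step is left out because the question does not say which plotting library or file layout is used. Hidden files are skipped because macOS Finder tends to leave a binary .DS_Store file in folders.

```python
from pathlib import Path


def read_all(folder):
    """Yield (path, lines) for each visible regular file directly in folder."""
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-folders and hidden files such as .DS_Store.
        if not path.is_file() or path.name.startswith("."):
            continue
        with path.open("r") as f:
            yield path, f.readlines()
```

A loop such as "for path, lines in read_all(path_to_files):" then gives one file per iteration, ready to be parsed and plotted.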
I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print fname
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If you're not, switch the / for \\.
Is the directory you're giving the full path, from the root folder? You might need to specify a FULL path (drive included).
If that still fails, it sounds silly, but check that there actually are files in there, as your code otherwise seems fine.
This code below worked for me, and listed csv files appropriately (see the C:\\ part, could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
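One thing that makes this hard to debug is that glob.glob() returns an empty list instead of raising an error when the directory in the pattern does not exist, which would match the "no error, nothing prints" symptom. A small sketch that fails loudly in that case (find_csv is a made-up helper name, not part of glob):

```python
import glob
import os


def find_csv(folder):
    """Return the sorted CSV paths in folder, or raise if folder is missing."""
    # Expand '~' and turn a relative path into an absolute one.
    folder = os.path.abspath(os.path.expanduser(folder))
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"No such directory: {folder}")
    # glob.escape() protects characters like '[' in the folder name.
    return sorted(glob.glob(os.path.join(glob.escape(folder), "*.csv")))
```

If the call raises, the message shows the absolute path that was actually searched, which makes a missing leading slash or drive letter easy to spot.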
The following code gets a list of files in a folder and if they have csv in them it will print the file name.
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print(file)
Note the indentation in my code. You need to be careful not to mix tabs and spaces, as these are not the same to Python.
Alternatively you could use os
import os
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)
print(out_list)
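Both checks above can be tightened a little. A substring test like ".csv" in file also matches names such as data.csv.bak, and the slice comparison misses upper-case extensions. A possible alternative, sketched with os.path.splitext (the helper name is_csv is just for illustration):

```python
import os


def is_csv(filename):
    """Return True if filename has a .csv extension, ignoring case."""
    return os.path.splitext(filename)[1].lower() == ".csv"


names = ["a.csv", "b.CSV", "data.csv.bak", "notes.txt"]
csv_names = [name for name in names if is_csv(name)]
```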
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
I have multiple zip folders named zip_folder_1, zip_folder_2, ..., zip_folder_n.
All these zip folders are located in the same directory. Each one of these zip folders contains a csv file named "selected_file.csv".
I need to read the "selected_file.csv" inside each of the zip folders and concatenate them all into a single file.
Could someone give me a hint on the required python code to solve this problem? I appreciate your help!
This should produce concatenated_data.csv in your working directory, and assumes that all files in my_data_dir are zip files with data in them.
import os
import zipfile
import numpy as np
def add_data_to_file(new_data, file_name):
    # Append if the output file already exists, otherwise create it
    if os.path.isfile(file_name):
        mode = 'ab'
    else:
        mode = 'wb'
    with open(file_name, mode) as f:
        # atleast_2d lets savetxt handle a single row as well as a full table
        np.savetxt(f, np.atleast_2d(new_data), delimiter=',')
my_data_dir = 'C:/my/zip/data/dir/'
data_files = os.listdir(my_data_dir)
for data_file in data_files:
    full_path = os.path.join(my_data_dir, data_file)
    with zipfile.ZipFile(full_path, 'r', zipfile.ZIP_DEFLATED) as zip_file:
        with zip_file.open('selected_file.csv', 'r') as selected_file:
            data = np.loadtxt(selected_file, delimiter=",")
    add_data_to_file(data, 'concatenated_data.csv')
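If the CSV files contain text columns or header rows, np.loadtxt may not be able to parse them. As an alternative sketch that uses only the standard library, the raw bytes of each member can be copied straight into the output file. The function name concat_from_zips is hypothetical, and the code assumes every archive contains the requested member (zipfile raises KeyError otherwise) and that each file ends with a newline.

```python
import os
import zipfile


def concat_from_zips(zip_dir, member, out_path):
    """Append `member` from every zip archive in zip_dir to out_path.

    Returns the number of archives that were read.
    """
    count = 0
    with open(out_path, "wb") as out:
        for name in sorted(os.listdir(zip_dir)):
            full_path = os.path.join(zip_dir, name)
            if not zipfile.is_zipfile(full_path):
                continue  # ignore anything that is not a zip archive
            with zipfile.ZipFile(full_path) as archive:
                with archive.open(member) as src:
                    out.write(src.read())
            count += 1
    return count
```

With the names from the question, the call would look like concat_from_zips(my_data_dir, "selected_file.csv", "concatenated_data.csv").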
First of all thanks for reading this. I am a little stuck with sub directory walking (then saving) in Python. My code below is able to walk through each sub directory in turn and process a file to search for certain strings, I then generate an xlsx file (using xlsxwriter) and post my search data to an Excel.
I have two problems...
The first problem I have is that I want to process a text file in each directory, but the text file name varies per sub directory, so rather than specifying 'Textfile.txt' I'd like to do something like *.txt (would I use glob here?)
The second problem is that when I open/create an Excel I would like to save the file to the same sub directory where the .txt file has been found and processed. Currently my Excel is saving to the python script directory, and consequently gets overwritten each time a new sub directory is opened and processed. Would it be wiser to save the Excel at the end to the sub directory or can it be created with the current sub directory path from the start?
Here's my partially working code...
for root, subFolders, files in os.walk(dir_path):
    if 'Textfile.txt' in files:
        with open(os.path.join(root, 'Textfile.txt'), 'r') as f:
            #f = open(file, "r")
            searchlines = f.readlines()
            searchstringsFilter1 = ['Filter Used :']
            searchstringsFilter0 = ['Filter Used : 0']
            timestampline = None
            timestamp = None
            f.close()
            # Create a workbook and add a worksheet.
            workbook = xlsxwriter.Workbook('Excel.xlsx', {'strings_to_numbers': True})
            worksheetFilter = workbook.add_worksheet("Filter")
Thanks again for looking at this problem.
MikG
I will not solve your code completely, but here are hints:
the text file name varies per sub directory, so rather than specifying 'Textfile.txt' I'd like to do something like *.txt
You can loop over all the files in the directory and check each file's extension:
for filename in files:
    if filename.endswith('.txt'):
        # do stuff
Also, when creating the workbook, can you pass a path? You have root, right? Why not use it?
You don't want glob because you already have a list of files in the files variable. So, filter it to find all the text files:
import fnmatch
txt_files = filter(lambda fn: fnmatch.fnmatch(fn, '*.txt'), files)
To save the file in the same subdirectory:
outfile = os.path.join(root, 'someoutfile.txt')
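The two hints can be combined into one walk, roughly as follows. This is a sketch under assumptions: the helper name plan_outputs is made up, the output name Excel.xlsx is taken from the question, and the search and spreadsheet-writing steps are left out.

```python
import fnmatch
import os


def plan_outputs(dir_path, out_name="Excel.xlsx"):
    """Pair every .txt file under dir_path with an output path that sits
    in the same sub-directory as the text file."""
    pairs = []
    for root, sub_folders, files in os.walk(dir_path):
        # fnmatch.filter() keeps only the names matching the pattern.
        for name in fnmatch.filter(files, "*.txt"):
            txt_path = os.path.join(root, name)
            out_path = os.path.join(root, out_name)
            pairs.append((txt_path, out_path))
    return pairs
```

Each out_path can then be passed to xlsxwriter.Workbook() in place of the bare 'Excel.xlsx', so the workbook is created in the sub-directory from the start and is no longer overwritten on the next iteration.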