Reading all json files in a directory

I have multiple (400) json files containing a dict in a directory that I want to read and append to a list. I've tried looping over all the files in the directory like this:
path_to_jsonfiles = 'TripAdvisorHotels'
alldicts = []
for file in os.listdir(path_to_jsonfiles):
    with open(file,'r') as fi:
        dict = json.load(fi)
alldicts.append(dict)
I keep getting the following error:
FileNotFoundError: [Errno 2] No such file or directory
However, when I look at the files in the directory, it gives me all the right files.
for file in os.listdir(path_to_jsonfiles):
    print(file)
Just opening one of them with the file name works as well.
with open('AWEO-q_GiWls5-O-PzbM.json','r') as fi:
    data = json.load(fi)
Where in the loop is it going wrong?

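Your code has two errors:
1. file is only the file name. os.listdir() returns bare names, so you have to open the full file path (including its folder).
2. You have to call append inside the loop, otherwise only the last dict is added to the list.
To sum up, this should work:
import json
import os

path_to_jsonfiles = 'TripAdvisorHotels'
alldicts = []
for file in os.listdir(path_to_jsonfiles):
    # os.listdir() returns bare names, so prepend the directory
    full_filename = os.path.join(path_to_jsonfiles, file)
    with open(full_filename, 'r') as fi:
        data = json.load(fi)
    # append inside the loop so every file's dict is kept
    alldicts.append(data)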
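
If the directory might also hold non-JSON files or subfolders, the same loop can be limited to names ending in .json. The following is only a minimal sketch of that idea, not the asker's code; the helper name load_json_dir is made up for illustration.

```python
import json
from pathlib import Path


def load_json_dir(directory):
    """Return a list with the parsed content of every *.json file in directory.

    load_json_dir is a hypothetical helper name, not part of any library.
    """
    loaded = []
    # Path.glob() yields paths that already include the directory,
    # so no manual joining is needed; sorted() only makes the order predictable.
    for json_path in sorted(Path(directory).glob('*.json')):
        with json_path.open('r') as fi:
            loaded.append(json.load(fi))
    return loaded
```

With the directory from the question, this would be called as alldicts = load_json_dir('TripAdvisorHotels').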

Related

Filter Directory using Regex and output filtered files to another directory

I am trying to write a Python 3 program that runs through all .sql files in a specific directory, applies a regex that adds ; after a certain pattern, and writes the modified content to a separate directory, keeping the original file names.
So, if I had file1.sql and file2.sql in the "/home/files" directory, running the program should write those two files to "/home/new_files" without changing the content of the original files.
Here is my code:
import glob
import re
folder_path = "/home/files/d_d"
file_pattern = "/*sql"
folder_contents = glob.glob(folder_path + file_pattern)
for file in folder_contents:
    print("Checking", file)
for file in folder_contents:
    read_file = open(file, 'rt',encoding='latin-1').read()
    #words=read_file.split()
    with open(read_file,"w") as output:
        output.write(re.sub(r'(TBLPROPERTIES \(.*?\))', r'\1;', f, flags=re.DOTALL))
I receive a "File name too long" error that starts with "CREATE EXTERNAL TABLe", and I am also not sure where to put my output path (/home/files/new_dd) in my code.
Any ideas or suggestions?

With read_file = open(file, 'rt',encoding='latin-1').read(), read_file holds the whole content of the file, and that content was then passed to open() as if it were a file name. The code below iterates over the file names found with the glob.glob pattern, opens each file to read, processes the data, and opens a file of the same name in the output folder to write (assuming that the folder newfile_sqls already exists; if it does not, a FileNotFoundError: [Errno 2] No such file or directory is raised).
import glob
import os
import re
folder_path = "original_sqls"
# original_sqls\file1.sql, original_sqls\file2.sql, original_sqls\file3.sql
file_pattern = "*sql"
# new/modified files folder
output_path = "newfile_sqls"
folder_contents = glob.glob(os.path.join(folder_path,file_pattern))
# iterate over file names
for file_ in [os.path.basename(f) for f in folder_contents]:
    # open to read
    with open(os.path.join(folder_path,file_), "r") as inputf:
        read_file = inputf.read()
    # use variable 'read_file' here
    tmp = re.sub(r'(TBLPROPERTIES \(.*?\))', r'\1;', read_file, flags=re.DOTALL)
    # open to write to (previously created) new folder
    with open(os.path.join(output_path,file_), "w") as output:
        output.write(tmp)
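
The FileNotFoundError for a missing output folder can be avoided by creating the folder first. The sketch below is one possible way to do that, under a few assumptions: the function name add_semicolons and its two parameters are invented for illustration, and the latin-1 encoding is taken from the question rather than from this answer.

```python
import os
import re


def add_semicolons(input_dir, output_dir):
    """Copy every *.sql file from input_dir to output_dir, adding ';'
    after each TBLPROPERTIES (...) clause.

    Hypothetical helper for illustration; the originals are not modified.
    """
    # exist_ok=True makes this a no-op when the folder is already there
    os.makedirs(output_dir, exist_ok=True)
    for name in os.listdir(input_dir):
        if not name.endswith('.sql'):
            continue
        with open(os.path.join(input_dir, name), 'r', encoding='latin-1') as src:
            text = src.read()
        fixed = re.sub(r'(TBLPROPERTIES \(.*?\))', r'\1;', text, flags=re.DOTALL)
        # same file name, different folder
        with open(os.path.join(output_dir, name), 'w', encoding='latin-1') as dst:
            dst.write(fixed)
```

With the paths from the question, the call would look like add_semicolons('/home/files/d_d', '/home/files/new_dd').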

opening and reading all the files in a directory in python - python beginner

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document).
So far I have this code:
import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/'+ i), 'r')
allLines = file.readlines()
print(allLines)
I don't get any errors, but it only prints the contents of the last file in my folder, as a list of strings. I want to make sure it's reading every file so I can then access the data I want from each one. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop iterates over all my files and reads all of them?
I also get the same result when I use
file = open(os.path.join('results/',i), 'r')
in the 5th line.
Please help, I'm so lost.
Thanks!!

Separate the different parts of the task into their own functions.
Use generators wherever possible, especially if there are a lot of files or the files are large.
Imports
from pathlib import Path
import sys
Deciding which files to process:
source_dir = Path('results/')
files = source_dir.iterdir()
[Optional] Filter files
For example, if you only need files with extension .ext
files = source_dir.glob('*.ext')
Process files
def process_files(files):
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                # do your thing
                yield line
Save the lines you want to keep
def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
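
The pieces above can be chained into one lazy pipeline. The sketch below is an illustration of that idea rather than the answerer's exact code: read_lines and keep_matching are made-up names, and the keyword filter stands in for whatever "do your thing" means in practice.

```python
import sys
from pathlib import Path


def read_lines(files):
    """Yield every line of every regular file in files, one at a time."""
    for file in files:
        if file.is_file():
            with file.open('r') as file_handle:
                yield from file_handle


def keep_matching(lines, keyword):
    """Pass through only the lines that contain keyword (hypothetical filter)."""
    return (line for line in lines if keyword in line)


def save_lines(lines, output_file=sys.stdout):
    """Write the lines to output_file, which defaults to standard output."""
    for line in lines:
        output_file.write(line)
```

A call such as save_lines(keep_matching(read_lines(Path('results/').iterdir()), 'ERROR')) would then print the matching lines, where 'ERROR' is just a placeholder keyword. Because every stage is a generator, only one line is held in memory at a time.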

You forgot to indent this line: allLines = file.readlines()
You could try this instead:
import os
allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # open each file in turn and close it again when done
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)

You forgot to indent the line that reads the file, so it ended up outside the loop.
Because it was outside the loop, it ran only once, after the for loop had finished, and so it only saw the last value of the file variable. Also, you should not use readlines() in this way; just use read() instead:
import os
allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # the read happens inside the loop, once per file
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)

This creates a single file containing the contents of all the files you wanted to print.
import os

rootdir = 'C:\\Users\\you\\folder\\'  # replace with your folder
with open('final_file.txt', 'a') as f:
    for root, dirs, files in os.walk(rootdir):
        for filename in files:
            # os.walk() yields bare file names, so build the full path
            full_name = os.path.join(root, filename)
            with open(full_name) as src:
                f.write(src.read() + "\n")
This is a similar case, with more features: Copying selected lines from files in different directories to another file

How to put output data into directory specified in path1?

I want to write my output files into the directory stored in path1, but when I try to do that an error occurs.
import os
import sys
path = 'ODOGS/LAB'
path1= 'ODOGS/lab_res_voiced'
for file in os.listdir(path):
    current = os.path.join(path, file)
    name_output=file #name of the current file must be the same for the output file
    index_list_B = []
    line_list_B = []
    line_list_linije = []
    with open(current) as f1:
        for index1, line1 in enumerate(f1):
            strings_B = ("p","P","t","T") #find lines which contain this str
            line_list_B.append(line1) #put the line in line_list_B
            if any(s in line1 for s in strings_B):
                line_list_linije.append(line1)
                print(line_list_linije)
                index_list_B.append(index1) #positions of lines
    with open(path1.join(name_output).format(index1), 'w') as output1:
        for index1 in index_list_B:
            print(line_list_B[index1])
            output1.writelines([line_list_B[index11],
                                line_list_B[index1],','])
This code goes through the text files in the 'ODOGS/LAB' directory and searches for lines that contain certain strings. After it finds all the lines that match the condition, it needs to write them to a new file with the same name as the input file. That part of the code works just fine.
I want to put all of the output files into another directory, path1, but the with statement doesn't work.
I get an error:
FileNotFoundError: [Errno 2] No such file or directory: 'sODOGS/lab_res_voicedzODOGS...
It works when the with statement is:
with open(name_output.format(index1), 'w') as output1:
but then all the files end up in the root folder, which I don't want.
My question is: how can I put my output files into the directory in path1?

There's an error in forming the output path:
Instead of
with open(path1.join(name_output).format(index1), 'w') as output1:
you want
with open(os.path.join(path1, name_output), 'w') as output1:
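
The odd file name in the error message comes from str.join(), which uses path1 as a separator between the individual characters of name_output. A small demonstration of the difference follows; the file name 'ab.txt' is made up for the example.

```python
import os

path1 = 'ODOGS/lab_res_voiced'
name_output = 'ab.txt'  # made-up file name for the demonstration

# str.join() inserts path1 between each character of name_output
wrong = path1.join(name_output)

# os.path.join() puts a single path separator between the two parts
right = os.path.join(path1, name_output)

print(wrong)
print(right)
```

The first printed value starts with the first character of the file name followed by the whole of path1, which is the same shape as the path in the FileNotFoundError above.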

Renaming files in folder from a text file

I want to know if it's possible to rename files in a folder based on a text file.
Let me explain:
I have a text file in which each line contains a new name, a path and a checksum.
I would like to rename EVERY photo file (given by its path) to the name on the same line.
Extract from text file:
...
15554615_05_hd.jpg /photos/FRYW-1555-16752.jpg de9da252fa1e36dc0f96a6213c0c73a3
15554615_06_hd.jpg /photos/FRYW-1555-16753.jpg 04de10fa29b2e6210d4f8159b8c3c2a8
...
Example, in my /photos folder:
Rename the file FRYW-1555-16752.jpg to 15554615_05_hd.jpg
My script (just a beginning):
for line in open("myfile.txt"):
    print(line.rstrip('\n'))  # .rstrip('\n') removes the line breaks

Something like this ought to work. Replace txt with the contents read from your file, and for the file names use something like os.walk.
import os
import shutil
txt = """
15554615_05_hd.jpg /photos/FRYW-1555-16752.jpg de9da252fa1e36dc0f96a6213c0c73a3
15554615_06_hd.jpg /photos/FRYW-1555-16753.jpg 04de10fa29b2e6210d4f8159b8c3c2a8
"""
filenames = 'FRYW-1555-16752', 'FRYW-1555-16753.jpg'
new_names = []
old_names = []
hashes = []
for line in txt.splitlines():
    if not line:
        continue
    new_name, old_name, hsh = line.split()
    new_names.append(new_name)
    old_names.append(old_name)
    hashes.append(hsh)
dump_folder = os.path.expanduser('~/Desktop/dump') # or some other folder ...
if not os.path.exists(dump_folder):
    os.makedirs(dump_folder)
for old_name, new_name in zip(old_names, new_names):
    if os.path.exists(old_name):
        # copy the file into the dump folder under its new name
        dst = os.path.join(dump_folder, new_name)
        shutil.copyfile(old_name, dst)

import os
with open('file.txt') as f:
    for line in f:
        newname, file, checksum = line.split()
        if os.path.exists(file):
            try:
                # keep the file in its folder, only change its name
                os.rename(file, os.path.join(os.path.dirname(file), newname))
            except OSError:
                print("Got a problem with file {}. Failed to rename it to {}.".format(file, newname))

The problem can be solved by:
1. Looping through all files using os.listdir(). listdir returns all the file names in a directory; for the current directory, use os.listdir(".").
2. Using os.rename() to rename each file: os.rename(old_name, new_name)
Sample code, assuming you're dealing with *.jpg:
import os

added = "NEW"
for image in os.listdir("."):
    if image.endswith(".jpg"):
        # insert the suffix before the 4-character extension
        new_image = image[:-4] + added + image[-4:]
        os.rename(image, new_image)

Yes, it can be done.
You can divide your problem into sub-problems:
1. Open the text file.
2. Use each line from the text file to identify the image you want to rename and the new name you want to give it.
3. Open the image, copy its content, write it to a new file with the new name, and save the new file.
4. Delete the old file.
I am sure there is a faster/better/more efficient way of doing this, but it all comes down to dividing and conquering your problem and its sub-problems.
It can be done in Python using a loop, files opened in read/write modes, and the os module to access the file system.
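
The four sub-problems could be sketched roughly as follows. This is only an illustration under some assumptions: rename_from_listing is an invented name, each useful line is assumed to hold exactly three whitespace-separated fields (new name, current path, checksum), and the renamed file is assumed to stay in its original folder.

```python
import os
import shutil


def rename_from_listing(listing_path):
    """Rename files according to lines of the form 'new_name old_path checksum'.

    Hypothetical helper; assumes no field contains whitespace.
    """
    with open(listing_path) as listing:            # 1. open the text file
        for line in listing:
            parts = line.split()
            if len(parts) != 3:
                continue                           # skip blank or '...' lines
            new_name, old_path, _checksum = parts  # 2. current file and new name
            if not os.path.isfile(old_path):
                continue
            new_path = os.path.join(os.path.dirname(old_path), new_name)
            shutil.copyfile(old_path, new_path)    # 3. copy the content
            os.remove(old_path)                    # 4. delete the old file
```

Steps 3 and 4 together amount to what a single os.rename(old_path, new_path) call does, so the copy-then-delete form is mostly useful for seeing the individual steps.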

How do I retrieve the contents of all the files in a directory in a list each one?

I would like to read all the files in a directory, so I'm doing the following:
import glob
path = '/thepath/of/the/files/*'
files = glob.glob(path)
for file in files:
    print(file)
The problem is that when I print the files I don't get anything. Any idea how to return the content of all the files, in one list per file?
EDIT: I appended an asterisk to the path; this should give you all the files and directories in that path.

Like in the comment I posted some time ago, this should work:
contents=[open(ii).read() for ii in glob.glob(path)]
or this, if you want a dictionary instead:
contents={ii : open(ii).read() for ii in glob.glob(path)}

I would do something like the following to only get files.
import os
import glob
path = '/thepath/of/the/files/*'
files=glob.glob(path)
for file in files:
    # skip directories, keep regular files only
    if os.path.isfile(file):
        print(file)

Your question is kind of unclear, but as I understand it, you'd like to get the contents of all the files in the directory. Try this:
# ...
contents = {}
for file in files:
    with open(file) as f:
        contents[file] = f.readlines()
print(contents)
This creates a dict where the key is the file name and the value is the contents of the file, as a list of lines.
