In the code below I'm trying to open a series of text files and copy their contents into a single file. I'm getting an error on the "os.write(out_file, line)" in which it asks me for an integer. I haven't defined what "line" is, so is that the problem? Do I need to specify somehow that "line" is a text string from the in_file? Also, I open the out_file through each iteration of the for-loop. Is that bad? Should I open it once at the beginning? Thanks!
import os
import os.path
import shutil
# This is supposed to read through all the text files in a folder and
# copy the text inside to a master file.
# This defines the master file and gets the source directory
# for reading/writing the files in that directory to the master file.
src_dir = r'D:\Term Search'
out_file = r'D:\master.txt'
files = [(path, f) for path,_,file_list in os.walk(src_dir) for f in file_list]
# This for-loop should open each of the files in the source directory, write
# their content to the master file, and finally close the in_file.
for path, f_name in files:
    open(out_file, 'a+')
    in_file = open('%s/%s' % (path, f_name), 'r')
    for line in in_file:
        os.write(out_file, line)
    close(file_name)
    close(out_file)
print 'Finished'
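The error comes from os.write, which is a low-level function that expects an integer file descriptor as its first argument; you are passing it the filename string. "line" itself is fine: iterating over a file object yields each line as a string.
You also wrote:
open(out_file, 'a+')
but that doesn't save the file object in a variable, so you have no way to access the file you just opened. What you need to do: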
out_file_handle = open(out_file, 'a+')
...
out_file_handle.write(line)
...
out_file_handle.close()
Or, more pythonically, opening the output file once before the loop:
out_filename = r"D:\master.txt"
...
with open(out_filename, 'a+') as outfile:
    for filepath in files:
        with open(os.path.join(*filepath)) as infile:
            outfile.write(infile.read())
print("finished")
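If the input files are large, reading each one completely with infile.read() can use a lot of memory. Below is a minimal sketch of the same idea using shutil.copyfileobj, which copies in chunks. The function name concatenate_files and its parameters are made up for illustration and are not part of the original code:

```python
import os
import shutil


def concatenate_files(src_dir, out_path):
    """Append the contents of every file under src_dir to out_path.

    Hypothetical helper for illustration; returns the number of files copied.
    """
    out_abs = os.path.abspath(out_path)
    count = 0
    # Open the output once, outside the loop, in binary append mode.
    with open(out_path, "ab") as outfile:
        for path, _, file_list in os.walk(src_dir):
            for f_name in sorted(file_list):
                in_path = os.path.join(path, f_name)
                # Skip the output file itself if it lives inside src_dir.
                if os.path.abspath(in_path) == out_abs:
                    continue
                with open(in_path, "rb") as infile:
                    # Copies in chunks instead of loading the whole file.
                    shutil.copyfileobj(infile, outfile)
                count += 1
    return count
```

With the paths from the question, the call would look like concatenate_files(r'D:\Term Search', r'D:\master.txt').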
I am trying to merge some files in a server folder into a new file saved in the same folder.
With the script below I keep getting an "unexpected indent" error. I would appreciate some expert guidance.
import pandas as pd # import need to be in lower case
import numpy as np
import openpyxl
from openpyxl.workbook import workbook #save to excel doc
#>>> 1.1 Define common file path and filename
path= '\hbap.adroot.abb\HK\Finance\00210602\AMH_A2R\1KY\Drv Reengine\Python\'
#>>> 1.2 Define list of files
filenames = [path+'100_6.xlsx', path+'101_6.xlsx']
# Open file3 in write mode
with Open(r path+'file3.xlsx','w') as outfile:
# Iterate through list
for names in filenames:
# Open each file in read mode
with open(names) as infile:
# read the data from file1 and
# file2 and write it in file3
outfile.write(infile.read())
# Add '\n' to enter data of file2
# from next line
outfile.write("\n")
Please take a look at the proposed solution below.
Here I collect all the files under the sql directory, read them one by one, and append each file's contents to the output file.
import os
files_list = []
output = r"E:\Downloads\output.txt"
for dirpath, dirnames, filenames in os.walk(r'E:\Downloads\sql'):
    files_list += [os.path.join(dirpath, file) for file in filenames]
for file in files_list:
    # read the whole input file
    with open(file, "rt") as fin:
        data = fin.read()
    # append it to the output file, followed by a separator line
    with open(output, "a+") as fout:
        fout.write(data)
        fout.write("\n ---------- \n")
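Also note that you are dealing with .xlsx files, which is a somewhat different topic: an .xlsx workbook is a zipped collection of XML files rather than plain text, so concatenating the raw contents will not produce a valid workbook. Merging Excel files should be handled another way, for example by loading them with pandas or openpyxl.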
Correct indentation matters in Python because the interpreter uses it to determine which statements belong to which block.
Here is the same code with consistent indentation (and with open spelled in lower case and the stray r removed):
# Open file3 in write mode
with open(path + 'file3.xlsx', 'w') as outfile:
    # Iterate through list
    for names in filenames:
        # Open each file in read mode
        with open(names) as infile:
            # read the data from file1 and
            # file2 and write it in file3
            outfile.write(infile.read())
        # Add '\n' to enter data of file2
        # from next line
        outfile.write("\n")
Hello, I have a lot of text files and I want to replace the contents of every one of them with the same phrase.
I tried the code below, but it didn't work.
import os
text_files = []
os.chdir(os.path.join("data"))
for filename in os.listdir(os.getcwd()):
if filename.endswith(".txt"):
text_files.append("data/" + filename)
with open(filename , "w") as outfile:
for text in text_files:
outfile.write("new content")
outfile.close()
os.chdir("..")
Please fix your indentation. As far as I have understood this is probably what you are trying to do:
import os
text_files = []
# get all file names that end with .txt
for filename in os.listdir("data"):
    if filename.endswith(".txt"):
        text_files.append(os.path.join("data", filename))
# open each file and overwrite its contents with "new content"
for file in text_files:
    with open(file, "w") as outfile:
        outfile.write("new content")
There are a few unnecessary lines in your code. For example:
os.chdir(os.path.join("data"))
isn't necessary: os.path.join("data") simply returns "data", so os.chdir("data") would have sufficed.
You can also call os.listdir("data") directly to list the files and folders in "data" without changing directory at all.
In fact, the code above can be simplified even more:
import os
for filename in os.listdir("data"):
    if filename.endswith(".txt"):
        with open(os.path.join("data", filename), "w") as outfile:
            outfile.write("new content")
import os
os.chdir("data")
for filename in os.listdir(os.getcwd()):
    if filename.endswith(".txt"):
        with open(filename, "w") as outfile:
            outfile.write("new content")
There's no need to prefix the path with 'data/' once you have changed into that directory.
You also don't need two for loops, and your indentation is off.
The version above works.
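The same overwrite can also be sketched with pathlib, which avoids both os.chdir and manual path joining. The function name overwrite_text_files is an assumption made for illustration; the "data" directory and the "new content" string come from the question:

```python
from pathlib import Path


def overwrite_text_files(directory, content):
    """Replace the contents of every .txt file directly inside directory.

    Hypothetical helper; returns the names of the rewritten files, sorted.
    """
    rewritten = []
    # glob("*.txt") is not recursive: only files directly in directory match.
    for txt_path in sorted(Path(directory).glob("*.txt")):
        if txt_path.is_file():
            # write_text opens the file in "w" mode, discarding old contents.
            txt_path.write_text(content)
            rewritten.append(txt_path.name)
    return rewritten
```

For the question's layout the call would be overwrite_text_files("data", "new content").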
It's clear to me how to open a single file; it's straightforward with the open() function:
with open('number.txt', 'rb') as myfile:
    data = myfile.read()
But what should I do if I want to open five .txt files and read each of them into a string in Python? Should I use os.listdir() somehow?
Here is a flexible, reusable approach for doing exactly what you need:
def read_files(files):
    for filename in files:
        # text mode, so read() returns str and can be joined with a str separator
        with open(filename, 'r') as file:
            yield file.read()

def read_files_as_string(files, separator='\n'):
    files_content = list(read_files(files=files))
    return separator.join(files_content)

# build your files list as you need
files = ['f1.txt', 'f2.txt', 'f3.txt']
files_content_str = read_files_as_string(files)
print(files_content_str)
It looks like you need something like this:
import os
path = "your_path"
for filename in os.listdir(path):
    if filename.endswith(".txt"):
        with open(os.path.join(path, filename), 'rb') as myfile:
            # data holds the contents of the current file only
            data = myfile.read()
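If you want to keep each file's text instead of overwriting data on every iteration, one option is a dictionary keyed by filename. This is only a sketch: read_text_files is a hypothetical name, and the UTF-8 default is an assumption about how your files are encoded:

```python
import os


def read_text_files(path, encoding="utf-8"):
    """Return {filename: contents} for every .txt file directly in path.

    Hypothetical helper; the UTF-8 default is an assumption about the files.
    """
    contents = {}
    for filename in sorted(os.listdir(path)):
        if filename.endswith(".txt"):
            full_path = os.path.join(path, filename)
            # Text mode with an explicit encoding, so read() returns str.
            with open(full_path, "r", encoding=encoding) as myfile:
                contents[filename] = myfile.read()
    return contents
```

To get one combined string afterwards, something like "\n".join(read_text_files("your_path").values()) would do it.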
My intention is to copy the text of all my c# (and later aspx) files to one final text file, but it doesn't work.
For some reason, the "yo.txt" file is not created.
I know that the iteration over the files works, but I can't write the data into the .txt file.
The variable 'data' does end up containing all the text from the files.
Could it be connected to the fact that there are some non-ASCII characters in the text of the C# files?
Here is my code:
import os
import sys
src_path = sys.argv[1]
os.chdir(src_path)
data = ""
for file in os.listdir('.'):
    if os.path.isfile(file):
        if file.split('.')[-1]=="cs" and (len(file.split('.'))==2 or len(file.split('.'))==3):
            print "Copying", file
            with open(file, "r") as f:
                data += f.read()
print data
with open("yo.txt", "w") as f:
    f.write(data)
If someone has an idea, it will be great :)
Thanks
You have to make sure the directory where the file is created is writable by your user. If it is not, run
chmod -R 777 .
to make the directory writable (note that this grants read, write and execute permission to every user, recursively).
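Before changing permissions, it may help to check where the file would actually be written and whether that directory is writable. Since the script in the question calls os.chdir(src_path), a relative name like "yo.txt" resolves inside src_path, not in the directory the script was started from. Here is a small diagnostic sketch using os.access; describe_output_target is a hypothetical helper name:

```python
import os


def describe_output_target(filename):
    """Return (absolute_path, directory_is_writable) for a filename.

    Hypothetical diagnostic helper; it does not create or modify anything.
    """
    # abspath resolves a relative name against the current working directory.
    absolute_path = os.path.abspath(filename)
    directory = os.path.dirname(absolute_path)
    # W_OK tests write permission, X_OK the ability to enter the directory.
    writable = os.access(directory, os.W_OK | os.X_OK)
    return absolute_path, writable
```

Calling print(describe_output_target("yo.txt")) right before the final open() shows the full path the script is about to write to.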
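import os
# inputdir and outputfile are placeholders: set them to your source
# directory and destination file before running.
filelist = []
for r, d, f in os.walk(inputdir):
    for file in f:
        filelist.append(os.path.join(r, file))
with open(outputfile, 'w') as outfile:
    for f in filelist:
        with open(f) as infile:
            for line in infile:
                outfile.write(line)
        # separator between the contents of consecutive files
        outfile.write('\n \n')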
I am trying to extract all the zip files in a folder, giving each extraction folder the same name as its zip file.
Then I want to loop through all the files in that folder and through the lines of each file, writing them to a separate text file.
This is my code so far:
#!/usr/bin/env python3
import glob
import os
import zipfile
zip_files = glob.glob('*.zip')
for zip_filename in zip_files:
    dir_name = os.path.splitext(zip_filename)[0]
    os.mkdir(dir_name)
    zip_handler = zipfile.ZipFile(zip_filename, "r")
    zip_handler.extractall(dir_name)
path = dir_name
fOut = open("Output.txt", "w")
for filename in os.listdir(path):
    for line in filename.read().splitlines():
        print(line)
        fOut.write(line + "\n")
fOut.close()
This is the error that I encounter:
for line in filename.read().splitlines():
AttributeError: 'str' object has no attribute 'read'
os.listdir returns filenames as strings, so you need to open each file, and you need to join the directory path to the filename. Calling splitlines and then adding a newline back to each line is also redundant:
path = dir_name
with open("Output.txt", "w") as fOut:
    for filename in os.listdir(path):
        # join filename to path to avoid file not being found
        with open(os.path.join(path, filename)) as f:
            for line in f:
                fOut.write(line)
You should always use with to open your files, as it closes them automatically. If the files are not large you can simply call fOut.write(f.read()) and remove the inner loop.
You also set path = dir_name after the first loop, which means path will be whatever the last value of dir_name was; that may or may not be what you want. You can also use iglob to avoid building a full list: zip_files = glob.iglob('*.zip').
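Putting these points together, here is a sketch that processes each extracted directory inside the same loop, so no archive is skipped. The function name extract_and_collect and its parameters are assumptions made for illustration, and it assumes the archives contain UTF-8 text files at their top level:

```python
import glob
import os
import zipfile


def extract_and_collect(zip_dir, out_path, encoding="utf-8"):
    """Extract every .zip in zip_dir into a folder named after the archive,
    then write the lines of the extracted top-level files to out_path.

    Hypothetical helper for illustration; returns the number of lines written.
    """
    lines_written = 0
    with open(out_path, "w", encoding=encoding) as f_out:
        pattern = os.path.join(zip_dir, "*.zip")
        for zip_filename in sorted(glob.iglob(pattern)):
            dir_name = os.path.splitext(zip_filename)[0]
            os.makedirs(dir_name, exist_ok=True)
            with zipfile.ZipFile(zip_filename, "r") as zip_handler:
                zip_handler.extractall(dir_name)
            # Read this archive's files now, while dir_name is current.
            for filename in sorted(os.listdir(dir_name)):
                full_path = os.path.join(dir_name, filename)
                if not os.path.isfile(full_path):
                    continue  # skip sub-directories
                with open(full_path, "r", encoding=encoding) as f_in:
                    for line in f_in:
                        f_out.write(line)
                        lines_written += 1
    return lines_written
```

For the layout in the question the call would be extract_and_collect(".", "Output.txt").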