replacing strings in files from list in python - python

So I need to find all files with a certain extension, in this case .txt. Then I must open all of these files and replace a certain string with another string... and here I'm stuck.
here is my code:
import os, os.path

find_files = []
for root, dirs, files in os.walk("C:\\Users\\Kevin\\Desktop\\python programi"):
    for f in files:
        fullpath = os.path.join(root, f)
        if os.path.splitext(fullpath)[1] == '.txt':
            find_files.append(fullpath)

for i in find_files:
    file = open(i, 'r+')
    contents = file.write()
    replaced_contents = contents.replace('xy', 'djla')
    print i
ERROR message:
line 12, in <module>
    contents = file.write()
TypeError: function takes exactly 1 argument (0 given)
I know that there is a missing argument, but which argument should I use?
I think it would be better if I change the code from for i in find_files: downwards.
Any advice?

I think you mean to use file.read() rather than file.write()

Not sure if you're just trying to print out the changes or if you want to actually rewrite them into the file, in which case you could just do this:
for i in find_files:
    replaced_contents = ""
    contents = ""
    with open(i, "r") as file:
        contents = file.read()
    replaced_contents = contents.replace('xy', 'djla')
    with open(i, "w") as file:
        file.write(replaced_contents)
    print i
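For completeness, here is a minimal end-to-end sketch (not from the original answer) that combines the os.walk discovery from the question with the read/replace/rewrite loop above; the root folder and the 'xy'/'djla' strings are just the question's examples:

import os

# walk the tree, pick .txt files, and rewrite each one in place
root_dir = "C:\\Users\\Kevin\\Desktop\\python programi"
old, new = 'xy', 'djla'

for root, dirs, files in os.walk(root_dir):
    for name in files:
        if os.path.splitext(name)[1] == '.txt':
            path = os.path.join(root, name)
            with open(path, "r") as fh:
                contents = fh.read()
            with open(path, "w") as fh:
                fh.write(contents.replace(old, new))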

Related

How to open and read text files in a folder python

I have a folder which has text files in it. I want to be able to put in a path to this folder and have Python go through it, open each file and append its content to a list.
import os

folderpath = "/Users/myname/Downloads/files/"
inputlst = [os.listdir(folderpath)]
filenamelist = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        filenamelist.append(filename)
print(filenamelist)
So far this outputs:
['test1.txt', 'test2.txt', 'test3.txt', 'test4.txt', 'test5.txt', 'test6.txt', 'test7.txt', 'test8.txt', 'test9.txt', 'test10.txt']
I want to have the code take each of these files, open them and put all of its content into a single huge list not just print the file name. Is there any way to do this?
You should use open() for this.
Read the documentation about its advanced options here.
Anyway, here is one way you can do it:
import os

folderpath = r"yourfolderpath"
inputlst = [os.listdir(folderpath)]
filenamecontent = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        f = open(os.path.join(folderpath, filename), 'r')
        filenamecontent.append(f.read())
print(filenamecontent)
If you are using Python 3, you can use:
for filename in filename_list:
    with open(filename, "r") as file_handler:
        data = file_handler.read()
Please do mind that you will need the full (either relative or absolute) path to your file in filename
This way, your file handler will be automatically closed when you get out of the with scope.
More information around here : https://docs.python.org/fr/3/library/functions.html#open
On a side note, in order to list files, you might want to have a look at glob and use:
filename_list = glob.glob("/path/to/files/*.txt")
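Putting the two together, a small sketch (the folder path is just a placeholder) could look like this:

import glob

# assumes the .txt files live directly in /path/to/files/
filename_list = glob.glob("/path/to/files/*.txt")

contents = []
for filename in filename_list:
    with open(filename, "r") as file_handler:
        contents.append(file_handler.read())
print(contents)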
You can use fileinput
Code:
import os
import fileinput

folderpath = "your_path_to_directory_where_files_are_stored"
# This will return all the files which are in .txt format
file_list = [os.path.join(folderpath, a) for a in os.listdir(folderpath) if a.endswith(".txt")]
# fileinput chains the listed files into one continuous stream of lines
get_all_files = fileinput.input(file_list)
with open("alldata.txt", 'a+') as writefile:
    for line in get_all_files:
        # each line already ends with its newline
        writefile.write(line)
The above code will read all the data from the .txt files in the specified directory (folderpath) and store it in alldata.txt. You wanted to have that long list; that list is now stored in a .txt file. If you only want it in memory, you can remove the write step.
Links:
https://docs.python.org/3/library/fileinput.html
https://docs.python.org/3/library/functions.html#open
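If you then want the combined contents back as a Python list, a quick sketch (assuming alldata.txt was written as above) is:

# read the combined file back in as a list of lines
with open("alldata.txt") as f:
    all_lines = f.read().splitlines()
print(all_lines)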

opening and reading all the files in a directory in python - python beginner

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document)
So far I have this code
import os

path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/' + i), 'r')
allLines = file.readlines()
print(allLines)
At the end I don't get any errors, but it only prints the contents of the last file in my folder as a series of strings, and I want to make sure it's reading every file so I can then access the data I want from each file. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?
Also, I get the same result when I use
file = open(os.path.join('results/',i), 'r')
in the 5th line
Please help I'm so lost
Thanks!!
Separate the different functions of the thing you want to do.
Use generators wherever possible, especially if there are a lot of files or large files.
Imports
from pathlib import Path
import sys
Deciding which files to process:
source_dir = Path('results/')
files = source_dir.iterdir()
[Optional] Filter files
For example, if you only need files with extension .ext
files = source_dir.glob('*.ext')
Process files
def process_files(files):
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                # do your thing
                yield line
Save the lines you want to keep
def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
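Wired together, a usage sketch might look like this (writing to a hypothetical results.txt instead of stdout is optional):

# glue code combining the pieces above
files = source_dir.glob('*.ext')
lines = process_files(files)

with open('results.txt', 'w') as out:
    save_lines(lines, output_file=out)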
You forgot indentation at this line: allLines = file.readlines()
Maybe you can try this:
import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for filename in fileList:
    file = open(os.path.join(path, filename), 'r')
    allLines.append(file.read())
print(allLines)
You forgot to indent this line: allLines.append(file.read()).
Because it was outside the loop, it only appended the file variable's contents to the list after the for loop was finished, so it only captured the last value of the file variable that remained after the loop. Also, you should not use readlines() in this way; just use read() instead:
import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for filename in fileList:
    file = open(os.path.join(path, filename), 'r')
    allLines.append(file.read())
print(allLines)
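As in the earlier answers, a with block avoids leaving file handles open; a small variant of the same loop (just a sketch) would be:

import os

allLines = []
path = 'results/'
for filename in os.listdir(path):
    # the with statement closes each file automatically
    with open(os.path.join(path, filename), 'r') as f:
        allLines.append(f.read())
print(allLines)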
This also creates a file containing all the files you wanted to print.
rootdir = your folder, like 'C:\\Users\\you\\folder\\'
import os

f = open('final_file.txt', 'a')
for root, dirs, files in os.walk(rootdir):
    for filename in files:
        full_name = os.path.join(root, filename)
        data = open(full_name).read()
        f.write(data + "\n")
f.close()
This is a similar case, with more features: Copying selected lines from files in different directories to another file

How to read all files in a directory using python

I have a directory with a lot of text files. I want to read each text file in the directory and perform some kind of search operation. I take the directory name as a command line argument. The error I'm getting is IsADirectoryError. Is there any way we can make this work without any other module?
This is my code:
a = sys.argv
files = a[1:-1]
for i in files:
print(i)
f = open(i,'rb')
for line in f:
try:
for word in line.split():
'''Rest of code here'''
Try this code:
import os

def read_files_from_dir(dirname):
    for _file in os.listdir(dirname):
        with open(os.path.join(dirname, _file), "r") as fp:
            print(fp.read())
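Hooked up to the command-line argument from the question, usage might look like this (assuming the directory path is the first argument):

import sys

# pass the directory name given on the command line to the helper above
if __name__ == "__main__":
    read_files_from_dir(sys.argv[1])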

Search keyword from different files and return the filename

I'm super duper new to using Python and I've tried every keyword and search attempt to help me figure out my problem. I want to create a program that helps me search for a keyword in different .txt files in a folder and return the filename (in Python).
This has been the fruit of my searches:
import glob
import os

path = r'<filepath of the folder>'
keyword = "internet"  # ex is internet
for filename in glob.glob(os.path.join(path, '*.txt')):
    f = open(filename)
    if keyword in f:
        print("filename")
I tried running it and (surprisingly) it ran properly, but nothing was printed, even though I'm quite sure that there's a file with the word internet inside. And since it didn't print any error at all, I'm not sure if I'm even in the right direction.
You need to read the file using read():
for filename in glob.glob(os.path.join(path, '*.txt')):
    with open(filename) as f:
        if keyword in f.read():
            print(filename)
Or read each line and print the filename if the keyword is found:
for filename in glob.glob(os.path.join(path, '*.txt')):
    with open(filename) as f:
        for line in f:
            if keyword in line:
                print(filename)
                break
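If you want the matches back as a value rather than just printed, a small variation (matching_files is just an illustrative name, not from the original answer) could collect them in a list:

import glob
import os

path = r'<filepath of the folder>'   # as in the question
keyword = "internet"

matching_files = []
for filename in glob.glob(os.path.join(path, '*.txt')):
    with open(filename) as f:
        if keyword in f.read():
            matching_files.append(filename)
print(matching_files)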

Concatenate the last 100 files into only one

Beginner in Python needs a bit of help. I am using Python 2.7.
I want to make a program that concatenates the last 100 files I have in a folder. In that folder I have lots of files, but I only want the concatenation of the last 100. I am able to concatenate all of them (if I don't specify a number and change the for loop), but I am not able to select the last 100 files. These files are saved in binary by the software. They are saved in the folder specified below. I would like to remove those 100 files once they are concatenated into the new one. The program I have written is the following:
#!/usr/bin/python
import os
import glob

os.chdir("C:\AFM_test\jpk_files")
rout = ""
filename = glob.glob("*-*-*.*.*-*.*.*.jpk-force")
filename.sort(key=os.path.getmtime)
for filename in range(0,99):
    filename = open(filename, "rb")
    tout = filename.read() + "\r\n"
    rout = rout + tout
    os.remove(filename)
    filename.close()
fout = open("output.jpk-force", "wb+")
fout.write(rout)
fout.close()
It doesn't do anything and the error is the following:
Traceback (most recent call last):
  File "C:\AFM_test\jpk_files\AFM_test.py", line 12, in <module>
    filename = open(filename,"rb")
TypeError: coercing to Unicode: need string or buffer, int found
[Finished in 0.1s]
I guess the problem is the loop and its structure, "range(0,99)", because when I concatenated all the files contained in the folder:
#!/usr/bin/python
import os
import glob

os.chdir("C:\AFM_test\jpk_files")
rout = ""
files = glob.glob("*-*-*.*.*-*.*.*.jpk-force")
for filename in files:
    filename = open(filename, "rb")
    tout = filename.read() + "\r\n"
    rout = rout + tout
    os.remove(filename)
    filename.close()
fout = open("output.jpk-force", "wb+")
fout.write(rout)
fout.close()
It worked okay except for the remove call, which showed this error:
Traceback (most recent call last):
  File "C:\try\AFM_test_2.py", line 17, in <module>
    os.remove(filename)
must be string, not file
Any ideas how I can achieve my goal?
I hope I have explained myself properly. Maybe I have missed something important, sorry, I am just a beginner in this field.
Thank you.
TypeError: coercing to Unicode: need string or buffer, int found
That is because filename is an integer here (the loop for filename in range(0,99) makes it a number), and you are passing it to open(), which expects a string path.
os.remove(filename)
must be string, not file
That is because you are re-assigning the variable filename (which was a string path) to a file handle/object. os.remove(..) expects the variable from the for-loop, not the result of open(..). It's generally good practice to give meaningful names to variables: filepath, infile, etc.
A better approach would be:
import os
import glob

def processFile(filepath):
    with open(filepath) as f:
        content = f.read()
    os.remove(filepath)
    return content

def main():
    paths = glob.glob("..*..*..")
    last100paths = paths[-100:]
    with open(outFilePath, "w") as f:
        f.write("\r\n".join(processFile(path) for path in last100paths))
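Note that glob does not guarantee any particular order, so to make "last 100" mean the 100 most recently modified files, the paths would need to be sorted by mtime first, for example (a sketch with a placeholder pattern):

import glob
import os

# placeholder pattern; the question uses "*-*-*.*.*-*.*.*.jpk-force"
paths = glob.glob("*.jpk-force")
paths.sort(key=os.path.getmtime)   # oldest first, newest last
last100paths = paths[-100:]        # the 100 most recently modified files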
You need to change:
filename=open(filename,"rb")
...to something like:
inf = open(filename, "rb")
...
inf.close()
Then, when you're calling os.remove(filename), it will still be the filename from the original loop, not a file object that your code is reassigning to this variable.
Note: rather than doing this explicit opening and closing of files, though, try using the with statement (see this helpful guide).
Check if glob is matching the pattern:
pattern = r"*-*-*.*.*-*.*.*.jpk-force"
filenames = glob.glob(pattern)
if not filenames:
    print 'no files matched ', pattern
    sys.exit(1)
Get an mtime-sorted file list by building a list of tuples, each containing a file name and its mtime:
filenames = [(filename, os.stat(filename)[8]) for filename in filenames]
Sort the list by mtime in descending order:
filenames.sort(key=lambda x: x[1], reverse=True)
The above two lines can be simplified as:
filenames = [filename for filename in sorted(filenames, key=os.path.getmtime, reverse=True)]
The above line can be refactored further, because we can sort in place:
filenames.sort(key=os.path.getmtime, reverse=True)
#!/usr/bin/python
import os
import sys
import glob

os.chdir("C:\AFM_test\jpk_files")
rout = ""
pattern = r"*-*-*.*.*-*.*.*.jpk-force"
filenames = glob.glob(pattern)
if not filenames:
    print 'no files matched ', pattern
    sys.exit(1)
filenames.sort(key=os.path.getmtime, reverse=True)
for filename in filenames[:100]:
    filecontent = open(filename, "rb")
    tout = filecontent.read() + "\r\n"
    filecontent.close()
    rout = rout + tout
    os.remove(filename)
fout = open("output.jpk-force", "wb+")
fout.write(rout)
fout.close()
You didn't check for exceptions.
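For example, a sketch of what such exception handling could look like around the per-file work (Python 2 style, to match the rest of the answer):

for filename in filenames[:100]:
    try:
        filecontent = open(filename, "rb")
        tout = filecontent.read() + "\r\n"
        filecontent.close()
        rout = rout + tout
        os.remove(filename)
    except (IOError, OSError) as e:
        # skip files that cannot be read or removed instead of crashing
        print 'skipping', filename, ':', e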
