multiple search and replace in python - python

I need to find every file named config.xml under a parent folder
and, in each of those files, replace one string with another (from this-is to where-as).

import os

parent_folder_path = 'somepath/parent_folder'
for eachFile in os.listdir(parent_folder_path):
    if eachFile.endswith('.xml'):
        newfilePath = os.path.join(parent_folder_path, eachFile)
        # read the whole file into memory
        with open(newfilePath, 'r') as file:
            xml = file.read()
        xml = xml.replace('thing to replace', 'with content')
        # write the modified text back to the same file
        with open(newfilePath, 'w') as file:
            file.write(xml)
Hope this is what you are looking for.

You want to take a look at os.walk() for recursively traversing a folder and its subfolders.
Then, you can read each line (for line in myfile: ...) and do a replacement (line = line.replace(old, new)) and save the line back to a temporary file (tmp.write(line)), and finally copy the temp file over the original.
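A minimal sketch of the os.walk() plus temporary-file idea described above. The function name replace_in_tree and the demo folder layout are made up for illustration, and the final step uses os.replace() rather than a separate copy, which is a substitution on my part:

```python
import os
import tempfile


def replace_in_tree(root_dir, filename, old, new):
    """Replace `old` with `new` in every file called `filename` under root_dir."""
    processed = []
    for folder, _subdirs, files in os.walk(root_dir):
        if filename not in files:
            continue
        target = os.path.join(folder, filename)
        # Write the modified lines to a temp file in the same folder.
        with open(target, "r") as src, tempfile.NamedTemporaryFile(
            "w", dir=folder, delete=False
        ) as tmp:
            for line in src:
                tmp.write(line.replace(old, new))
        # Swap the finished temp file in place of the original.
        os.replace(tmp.name, target)
        processed.append(target)
    return processed


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as demo_root:
    sub = os.path.join(demo_root, "project_a")
    os.makedirs(sub)
    with open(os.path.join(sub, "config.xml"), "w") as f:
        f.write("<name>this-is</name>\n")
    changed_files = replace_in_tree(demo_root, "config.xml", "this-is", "where-as")
    with open(changed_files[0]) as f:
        demo_result = f.read()
print(demo_result)
```

Note that the temp file is created with restrictive permissions, so the rewritten file may not keep the original file's mode; copy the mode over first if that matters to you.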

Related

Compare directory content to a text file

I am trying to compare the contents of a directory to a text file. I have certain files in the directory and I also want to compare the files to this text file. How do I achieve this?
To get the directory's content (list of filenames):
import os
dir_content = os.listdir(directory_path)
To get a text file's content (line by line in a list):
with open(filename) as f:
    lines = f.readlines()
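Once you have both lists, the comparison itself could be done with sets. This is only a sketch, assuming the text file holds one file name per line; compare_dir_to_list and the demo file names are invented for the example:

```python
import os
import tempfile


def compare_dir_to_list(directory_path, list_file):
    """Return (names only in the directory, names only in the text file)."""
    dir_names = set(os.listdir(directory_path))
    with open(list_file) as f:
        # Strip line endings and skip blank lines.
        listed = {line.strip() for line in f if line.strip()}
    return sorted(dir_names - listed), sorted(listed - dir_names)


# Demo: the folder holds a.txt and b.txt, the list names b.txt and c.txt.
with tempfile.TemporaryDirectory() as folder, tempfile.TemporaryDirectory() as other:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(folder, name), "w").close()
    list_file = os.path.join(other, "expected.txt")
    with open(list_file, "w") as f:
        f.write("b.txt\nc.txt\n")
    only_in_dir, only_in_file = compare_dir_to_list(folder, list_file)
print(only_in_dir, only_in_file)
```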

How to open and read text files in a folder python

I have a folder which has text files in it. I want to be able to give a path to this folder and have Python go through it, open each file, and append its content to a list.
import os

folderpath = "/Users/myname/Downloads/files/"
inputlst = [os.listdir(folderpath)]
filenamelist = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        filenamelist.append(filename)
print(filenamelist)
So far this outputs:
['test1.txt', 'test2.txt', 'test3.txt', 'test4.txt', 'test5.txt', 'test6.txt', 'test7.txt', 'test8.txt', 'test9.txt', 'test10.txt']
I want to have the code take each of these files, open them and put all of its content into a single huge list not just print the file name. Is there any way to do this?
You should use open() for this.
Read the documentation about its advanced options.
Anyway, here is one way you can do it:
import os

folderpath = r"yourfolderpath"
filenamecontent = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        # join folder and name so the file is found from any working directory
        with open(os.path.join(folderpath, filename), 'r') as f:
            filenamecontent.append(f.read())
print(filenamecontent)
If you are using Python 3, you can use:
for filename in filename_list:
    with open(filename, "r") as file_handler:
        data = file_handler.read()
Please mind that each entry in filename_list needs to be the full (either relative or absolute) path to your file.
This way, your file handler is closed automatically when you leave the with block.
More information here: https://docs.python.org/fr/3/library/functions.html#open
On a side note, in order to list files, you might want to have a look at glob and use:
import glob
filename_list = glob.glob("/path/to/files/*.txt")
You can use fileinput.
Code:
import fileinput
import os

folderpath = "your_path_to_directory_where_files_are_stored"
# Full paths of all the files in the folder that end in .txt
file_list = [os.path.join(folderpath, a) for a in os.listdir(folderpath) if a.endswith(".txt")]
get_all_files = fileinput.input(file_list)
with open("alldata.txt", 'a') as writefile:
    for line in get_all_files:
        # normalise the line ending so the last line of one file does not run into the next
        writefile.write(line.rstrip('\n') + '\n')
The above code reads all the data from the .txt files in the specified directory (folderpath) and stores it in alldata.txt. The long list you wanted is therefore stored in a text file; if you do not need the file, you can remove the write step.
Links:
https://docs.python.org/3/library/fileinput.html
https://docs.python.org/3/library/functions.html#open
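If the goal is the in-memory list rather than alldata.txt, the same fileinput idea could be sketched like this. The helper name read_all_lines and the demo file names are assumptions for the example, not part of the answer above:

```python
import fileinput
import os
import tempfile


def read_all_lines(folderpath, suffix=".txt"):
    """Collect every line of every matching file into one list."""
    paths = sorted(
        os.path.join(folderpath, name)
        for name in os.listdir(folderpath)
        if name.endswith(suffix)
    )
    if not paths:
        # An empty list would make fileinput read from standard input.
        return []
    with fileinput.input(files=paths) as lines:
        return [line.rstrip("\n") for line in lines]


# Demo with two .txt files and one file that should be ignored.
with tempfile.TemporaryDirectory() as folder:
    for name, text in (
        ("test1.txt", "first\nsecond\n"),
        ("test2.txt", "third\n"),
        ("notes.md", "ignored\n"),
    ):
        with open(os.path.join(folder, name), "w") as f:
            f.write(text)
    all_lines = read_all_lines(folder)
print(all_lines)
```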

opening and reading all the files in a directory in python - python beginner

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document)
So far I have this code
import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/'+ i), 'r')
allLines = file.readlines()
print(allLines)
At the end I don't get any errors, but it only prints the contents of the last file in my folder as a series of strings. I want to make sure it's reading every file so I can then access the data I want from each one. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?
Also, I get the same result when I use
file = open(os.path.join('results/',i), 'r')
in the 5th line
Please help I'm so lost
Thanks!!
Separate the different functions of the thing you want to do.
Use generators wherever possible, especially if there are a lot of files or large files.
Imports
from pathlib import Path
import sys
Deciding which files to process:
source_dir = Path('results/')
files = source_dir.iterdir()
[Optional] Filter files
For example, if you only need files with extension .ext
files = source_dir.glob('*.ext')
Process files
def process_files(files):
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                # do your thing
                yield line
Save the lines you want to keep
def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
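The pieces above could be wired together roughly as follows. The keep_matching filter and the demo files are hypothetical stand-ins for "do your thing"; everything stays lazy until save_lines pulls the lines through:

```python
import io
import sys
import tempfile
from pathlib import Path


def process_files(files):
    # Yield lines one at a time so only one line is in memory at once.
    for file in files:
        with file.open("r") as file_handle:
            for line in file_handle:
                yield line


def keep_matching(lines, keyword):
    # Hypothetical filter step: keep only the lines containing keyword.
    return (line for line in lines if keyword in line)


def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)


# Demo in a temporary folder, writing to an in-memory buffer instead of a file.
with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp)
    (source_dir / "a.ext").write_text("keep 1\ndrop 1\n")
    (source_dir / "b.ext").write_text("keep 2\n")
    files = sorted(source_dir.glob("*.ext"))
    buffer = io.StringIO()
    save_lines(keep_matching(process_files(files), "keep"), buffer)
    pipeline_output = buffer.getvalue()
print(pipeline_output)
```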
You forgot the indentation on this line: allLines = file.readlines()
You could try this instead:
import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # open each file in turn and collect its content inside the loop
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)
You forgot to indent this line allLines.append(file.read()).
Because it was outside the loop, it ran only once, after the for loop had finished, so it only appended the content of the last file that the file variable still referred to. Also, you should not use readlines() in this way; just use read() instead:
import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # the append is indented, so it runs once per file
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)
This also creates a file containing the contents of all the files you wanted to print.
import os

rootdir = 'C:\\Users\\you\\folder\\'  # your folder
f = open('final_file.txt', 'a')
for root, dirs, files in os.walk(rootdir):
    for filename in files:
        # os.walk yields bare file names, so build the full path first
        full_name = os.path.join(root, filename)
        with open(full_name) as src:
            data = src.read()
        f.write(data + "\n")
f.close()
This is a similar case, with more features: Copying selected lines from files in different directories to another file

Renaming files in folder from a text file

I want to know whether it is possible to rename the files in a folder based on a text file.
To explain:
I have a text file in which each line holds a name, a path, and a checksum.
I would like to rename EVERY photo file (given by the path) to the name on the same line.
Extract from text file:
...
15554615_05_hd.jpg /photos/FRYW-1555-16752.jpg de9da252fa1e36dc0f96a6213c0c73a3
15554615_06_hd.jpg /photos/FRYW-1555-16753.jpg 04de10fa29b2e6210d4f8159b8c3c2a8
...
My /photos folder:
Example:
Rename the file FRYW-1555-16752.jpg to 15554615_05_hd.jpg
My script (just a beginning):
for line in open("myfile.txt"):
    print(line.rstrip('\n'))  # .rstrip('\n') removes the line breaks
Something like this ought to work. Replace the txt with reading from a file and for the file names use something like os.walk
import os
import shutil

txt = """
15554615_05_hd.jpg /photos/FRYW-1555-16752.jpg de9da252fa1e36dc0f96a6213c0c73a3
15554615_06_hd.jpg /photos/FRYW-1555-16753.jpg 04de10fa29b2e6210d4f8159b8c3c2a8
"""
# example names; in practice collect these with something like os.walk
filenames = 'FRYW-1555-16752.jpg', 'FRYW-1555-16753.jpg'

new_names = []
old_names = []
hashes = []
for line in txt.splitlines():
    if not line:
        continue
    new_name, old_name, hsh = line.split()
    new_names.append(new_name)
    old_names.append(old_name)
    hashes.append(hsh)

dump_folder = os.path.expanduser('~/Desktop/dump')  # or some other folder ...
if not os.path.exists(dump_folder):
    os.makedirs(dump_folder)

for old_name, new_name in zip(old_names, new_names):
    if os.path.exists(old_name):
        # copy each existing file into the dump folder under its new name
        dst = os.path.join(dump_folder, new_name)
        shutil.copyfile(old_name, dst)
import os

with open('file.txt') as f:
    for line in f:
        newname, file, checksum = line.split()
        if os.path.exists(file):
            try:
                # keep the file in its folder, only change its name
                os.rename(file, os.path.join(os.path.dirname(file), newname))
            except OSError:
                print("Got a problem with file {}. Failed to rename it to {}.".format(file, newname))
The problem can be solved by:
Looping through all the files with os.listdir(), which returns every file name in a directory; for the current directory, use os.listdir(".").
Then using os.rename() to rename each file: os.rename(old_name, new_name)
Sample code, assuming you're dealing with *.jpg:
import os
added = "NEW"
for image in os.listdir("."):
    if image.endswith(".jpg"):  # only touch the .jpg files
        # insert the extra text just before the 4-character extension
        new_image = image[:len(image)-4] + added + image[len(image)-4:]
        os.rename(image, new_image)
Yes it can be done.
You can divide your problem into sub-problems:
Open the txt file
Use each line from the txt file to identify the image you want to rename and the new name you want to give it
Open the image, copy its content, write it to a new file with the new name, and save the new file
Delete the old file
I am sure there is a faster/better/more efficient way of doing this, but it all comes down to dividing and conquering your problem and its sub-problems.
This can be done in Python using a loop, open() in read/write modes, and the os module to access the file system.
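The sub-problems above might be sketched like this. As a substitution, the copy-then-delete steps are collapsed into a single os.rename() call. The function name rename_from_listing is made up, and treating the listed paths as relative to a base folder is my assumption, not something the question states:

```python
import os
import tempfile


def rename_from_listing(listing_path, base_dir):
    """Rename files described by lines of 'new_name old_path checksum'."""
    renamed = []
    with open(listing_path) as listing:
        for line in listing:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            new_name, old_path, _checksum = parts
            # Assumption: old_path is relative to base_dir, so drop the leading "/".
            source = os.path.join(base_dir, old_path.lstrip("/"))
            if not os.path.isfile(source):
                continue  # nothing to rename for this line
            target = os.path.join(os.path.dirname(source), new_name)
            os.rename(source, target)
            renamed.append((source, target))
    return renamed


# Demo: only the first listed photo exists, so only it gets renamed.
with tempfile.TemporaryDirectory() as base:
    photos = os.path.join(base, "photos")
    os.makedirs(photos)
    open(os.path.join(photos, "FRYW-1555-16752.jpg"), "w").close()
    listing = os.path.join(base, "myfile.txt")
    with open(listing, "w") as f:
        f.write("15554615_05_hd.jpg /photos/FRYW-1555-16752.jpg de9da252fa1e36dc0f96a6213c0c73a3\n")
        f.write("15554615_06_hd.jpg /photos/FRYW-1555-16753.jpg 04de10fa29b2e6210d4f8159b8c3c2a8\n")
    renamed_pairs = rename_from_listing(listing, base)
    photos_after = os.listdir(photos)
print(photos_after)
```

The checksum column is read but not verified here; checking it would need to know which hash the file was produced with.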

Find one file out of many containing a desired string in Python

I have a string like 'apples'. I want to find this string, and I know that it exists in one out of hundreds of files. e.g.
file1
file2
file3
file4
file5
file6
...
file200
All of these files are in the same directory. What is the best way to find which file contains this string using Python, knowing that exactly one file contains it?
I have come up with this:
for file in os.listdir(directory):
    f = open(os.path.join(directory, file))
    for line in f:
        if 'apple' in line:
            print("FOUND")
    f.close()
and this:
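import glob
import subprocess
files = glob.glob(directory + '/file*')  # Popen does not expand wildcards without a shell
grep = subprocess.Popen(['grep', '-m1', 'apple'] + files, stdout=subprocess.PIPE)
found = grep.communicate()[0]
print(found)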
Given that the files are all in the same directory, we just get a current directory listing.
import os

for fname in os.listdir('.'):  # change directory as needed
    if os.path.isfile(fname):  # make sure it's a file, not a directory entry
        with open(fname) as f:  # open file
            for line in f:  # process line by line
                if 'apples' in line:  # search for string
                    print('found string in file %s' % fname)
                    break
This automatically gets the current directory listing, and checks to make sure that any given entry is a file (not a directory).
It then opens the file and reads it line by line (to avoid problems with memory it doesn't read it in all at once) and looks for the target string in each line.
When it finds the target string it prints the name of the file.
Also, since the files are opened using with they are also automatically closed when we are done (or an exception occurs).
For simplicity, this assumes your files are in the current directory:
import os

def whichFile(query):
    for root, dirs, files in os.walk('.'):
        for file in files:
            # os.walk yields bare names, so prepend the folder before opening
            with open(os.path.join(root, file)) as f:
                if query in f.read():
                    return file
for x in os.listdir(path):
    with open(os.path.join(path, x)) as f:
        if 'apples' in f.read():
            # your work
            break
A lazy-evaluation, itertools-based approach:
import os
from itertools import repeat, chain

gen = (file for file in os.listdir("."))
gen = (file for file in gen if os.path.isfile(file) and os.access(file, os.R_OK))
gen = (zip(repeat(file), open(file)) for file in gen)  # pair every line with its file name
gen = chain.from_iterable(gen)
gen = (file for file, line in gen if "apple" in line)
gen = set(gen)
for file in gen:
    print(file)
Open your terminal and run one of these:
Case-insensitive search:
grep -i 'apple' /path/to/files/*
Recursive search (through all subfolders):
grep -r 'apple' /path/to/files
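If you would rather stay in Python, a rough counterpart to the two grep commands combined could look like the sketch below. The function name find_files_containing and the demo files are invented for illustration; undecodable bytes are simply ignored, which is a simplification:

```python
import os
import tempfile


def find_files_containing(root_dir, needle, ignore_case=True):
    """Return the paths under root_dir whose text contains needle."""
    matches = []
    target = needle.lower() if ignore_case else needle
    for folder, _subdirs, files in os.walk(root_dir):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, errors="ignore") as f:
                    for line in f:
                        haystack = line.lower() if ignore_case else line
                        if target in haystack:
                            matches.append(path)
                            break  # one hit per file is enough
            except OSError:
                continue  # unreadable file, skip it
    return sorted(matches)


# Demo: only the file in the subfolder mentions apples.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "sub"))
    with open(os.path.join(demo_root, "file1"), "w") as f:
        f.write("pears\n")
    with open(os.path.join(demo_root, "sub", "file2"), "w") as f:
        f.write("Green APPLES\n")
    found = [
        os.path.relpath(p, demo_root)
        for p in find_files_containing(demo_root, "apple")
    ]
print(found)
```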
