Python: Looping through files in a different directory and scanning data - python

I am having a hard time looping through files in a directory that is different from the directory where the script is located. Ideally, I would also want my script to go through all files that start with sasa. There are a couple of files in the folder such as sasa.1, sasa.2, etc., as well as other files such as doc1.pdf and doc2.pdf.
I use Python Version 2.7 with windows Powershell
Locations of Everything
1) Python Script Location ex: C:Users\user\python_project
2) Main_Directory ex: C:Users\user\Desktop\Data
3) Current_Working_Directory ex: C:Users\user\python_project
Main directory contains 100 folders (folder A, B, C, D etc..)
Each of these folders contains many files including the sasa files of interest.
Attempts at running script
For 1 file the following works:
Script is run the following way: python script1.py
file_path = 'C:Users\user\Desktop\Data\A\sasa.1'
def writing_function(file_path):
    with open(file_path) as file_object:
        lines = file_object.readlines()
    for line in lines:
        print(line)
writing_function(file_path)
However, the following does not work
Script is run the following way: python script1.py A sasa.1
import os
import sys
from os.path import join
dr = sys.argv[1]
file_name = sys.argv[2]
file_path = 'C:Users\user\Desktop\Data'
new_file_path = os.path.join(file_path, dr)
new_file_path2 = os.path.join(new_file_path, file_name)
def writing_function(paths):
    with open(paths) as file_object:
        lines = file_object.readlines()
    for line in lines:
        print(line)
writing_function(new_file_path2)
I get the following error:
    with open(paths) as file_object:
IOError: [Errno 2] No such file or directory: 'C:Users\\user\\Desktop\\A\\sasa.1'
Please note right now I am just working on one file, I want to be able to loop through all of the sasa files in the folder.

It can be something along the lines of:
import os
from os.path import join
def function_exec(file):
    pass  # code to execute on each file
for root, dirs, files in os.walk('path/to/your/files'):  # from your argv[1]
    for f in files:
        filename = join(root, f)
        function_exec(filename)
Avoid naming a variable dir. It is not a keyword, but it shadows the Python built-in function dir(). Try print(dir(os)) to see what the built-in does.
dir_ = sys.argv[1]  # is preferable (needs import sys)
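To restrict the walk to the sasa files from the question, the file name can be checked before the per-file function is called. This is only a minimal sketch: it assumes the files of interest all start with "sasa", and find_prefixed_files is a hypothetical helper name that does not appear in the original posts.

```python
import os


def find_prefixed_files(top_dir, prefix="sasa"):
    """Return sorted full paths of files under top_dir whose names start with prefix."""
    matches = []
    for root, dirs, files in os.walk(top_dir):
        for name in files:
            # keep sasa.1, sasa.2, ... and ignore doc1.pdf and friends
            if name.startswith(prefix):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Each path returned by find_prefixed_files(sys.argv[1]) could then be passed to writing_function from the question.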

No one mentioned glob so far, so:
https://docs.python.org/3/library/glob.html
I think you can solve your problem using its ** magic:
If recursive is true, the pattern “**” will match any files and zero
or more directories and subdirectories. If the pattern is followed by
an os.sep, only directories and subdirectories match.
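As a rough illustration of the "**" pattern described in that quote, the sketch below collects every sasa.* file at any depth. Note that recursive=True needs Python 3.5 or later, so it would not run on the Python 2.7 mentioned in the question. The function name glob_sasa_files is made up for this example.

```python
import glob
import os


def glob_sasa_files(top_dir):
    """Return sorted paths of all sasa.* files at any depth below top_dir."""
    pattern = os.path.join(top_dir, "**", "sasa.*")
    # recursive=True is what lets "**" span zero or more directory levels
    return sorted(glob.glob(pattern, recursive=True))
```

Because "**" also matches zero directories, files sitting directly in top_dir are included as well.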

Also note you can change directory location using
os.chdir(path)

Related

How to move files based on their names with python? [duplicate]

I have a folder with a lot of tutorial links, so I wanted to create a script that reads each file name and, for instance, if the name contains the word "VBA" or "Excel", creates the folder Excel and moves the file into it. The same would happen with files containing the word "python".
The code runs, but nothing happens and the files stay in the same directory. Does anyone have an idea of what I'm doing wrong?
Here is what I have in the folder, all files are links to youtube tutorials or websites:
Please see my code below:
import os
import shutil
os.chdir(r"C:\Users\RMBORGE\Desktop\Useful stuff")
path = r"C:\Users\RMBORGE\Desktop\Useful stuff\Excel"
for f in os.listdir():
    if "vba" in f:
        shutil.move(os.chdir,path)
Try this:
import os
import shutil
path_to_files = 'some path'
files = os.listdir(path_to_files)
for f in files:
    if 'Excel' in f:
        created_folder = os.path.join(path_to_files, 'Excel')
        filepath = os.path.join(path_to_files, f)
        os.makedirs(created_folder, exist_ok=True)
        shutil.move(filepath, created_folder)
NB: You can add more if statements for other keywords, in the same way as for Excel.
Use pathlib's mkdir to create the folders. Put the keywords you want to sort by in the list 'folders'; each keyword is also used as the folder name. It is important to skip directories, because os.listdir() returns folders as well, and moving a folder into itself raises an error.
import os
import shutil
import pathlib
folders = ["vba", "Excel"]
path = "/home/vojta/Desktop/THESIS/"
for f in os.listdir(path):  # list the same directory the paths are built from
    if not os.path.isdir(os.path.join(path, f)):  # skip folders
        for fol in folders:
            if fol in f:
                fol_path = os.path.join(path, fol)
                pathlib.Path(fol_path).mkdir(parents=True, exist_ok=True)
                shutil.move(os.path.join(path, f), os.path.join(fol_path, f))
                break  # the file has been moved; do not test further keywords
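The question mentions names containing "VBA" while the snippets above test for "vba" or "Excel" exactly, so a case-insensitive comparison may be what is needed. The following is a sketch under that assumption; sort_by_keyword is a hypothetical name, and the folder for each match is simply named after the keyword.

```python
import os
import shutil


def sort_by_keyword(src_dir, keywords):
    """Move each file in src_dir into a subfolder named after the first keyword in its name."""
    moved = []
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        if os.path.isdir(src):  # skip folders, including the target folders
            continue
        for keyword in keywords:
            # compare in lower case so "VBA" also matches "vba"
            if keyword.lower() in name.lower():
                target_dir = os.path.join(src_dir, keyword)
                os.makedirs(target_dir, exist_ok=True)
                shutil.move(src, os.path.join(target_dir, name))
                moved.append(name)
                break  # a file can only be moved once
    return sorted(moved)
```

For example, sort_by_keyword(path, ["Excel", "vba", "python"]) would return the names of the files it moved, which makes it easy to check that something actually happened.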

Find a file 1 folder level down

I am trying to get the full path of the places.sqlite file located at '%APPDATA%\Mozilla\Firefox\Profiles\<random_folder>\places.sqlite' using Python's os module. The issue, as you can see, is that <random_folder> has a random name, and there could be multiple folders inside the Profiles folder.
How do I navigate/find the path to the places.sqlite file?
You would ideally want to go through each folder to search for this file. In a terminal, the 'locate file_name' command would do this for you. In Python you can use the following:
import os
db_path = os.path.join(os.getenv('APPDATA'), r'Mozilla\Firefox\Profiles')
def find_file(file_name, path):
    for root_folder, directory, file_names in os.walk(path):
        if file_name in file_names:
            return os.path.join(root_folder, file_name)
print(find_file('places.sqlite', db_path))
os.walk recursively yields the files in every directory under a path. Use it to search for 'places.sqlite' as follows.
import os
path = ""
# os.walk does not expand %APPDATA% by itself; on Windows, os.path.expandvars does
for root, dirs, files in os.walk(os.path.expandvars("%APPDATA%\\Mozilla\\Firefox\\Profiles\\")):
    if "places.sqlite" in files:
        path = os.path.join(root, 'places.sqlite')
        break
Use the os module to list all the directories in %APPDATA%\Mozilla\Firefox\Profiles\, then loop over those directories until you find the places.sqlite file (also using the os module).
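That suggestion could look roughly like the sketch below, which checks only the immediate subfolders instead of walking the whole tree. The helper name find_one_level_down is an assumption made for illustration, and it returns every match because there may be several profile folders.

```python
import os


def find_one_level_down(parent_dir, file_name):
    """Return paths to file_name found directly inside subfolders of parent_dir."""
    found = []
    for entry in sorted(os.listdir(parent_dir)):
        candidate = os.path.join(parent_dir, entry, file_name)
        # isfile is False when entry is not a folder or the file is missing
        if os.path.isfile(candidate):
            found.append(candidate)
    return found
```

On Windows, parent_dir would be something like os.path.join(os.environ["APPDATA"], "Mozilla", "Firefox", "Profiles").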
A glob might be simpler, as in this case one expects the file to be either one level below the Profiles folder or not there at all.
import os
import pathlib
profiles = pathlib.Path(os.environ["APPDATA"]) / "Mozilla" / "Firefox" / "Profiles"
# rglob searches recursively; profiles.glob("*/places.sqlite") would look exactly one level down
if places := list(profiles.rglob("places.sqlite")):  # ":=" needs Python 3.8+
    print(places[0])  # prints the path of the SQLite file
    with places[0].open("rb") as f:  # binary mode, since it is a database file
        ...  # work with the file here

How to write lists in files with python

My data is organized as such:
I have 30 folders. In each of them, 3 subfolders. In each of them, one file.
I would like to write a script that writes, in a text file 1 located in folder 1, the paths to the files located in the subfolders of this folder 1; and so on for every other folder.
The problem is that the script only writes, in each text file, the 3rd file (file in subfolder 3) rather than the files in subfolders 1, 2, 3.
This is what I tried:
import glob
import os
gotofolders = '/path/to/folderslocation/'
foldersname = open('/path/to/foldersname.txt').read().split()
for folders in foldersname:
    foldersdirectory = os.path.join(gotofolders,folders)
    filepaths = glob.glob(os.path.join(foldersdirectory)+'*subfolders/*files')
    for filepath in filepaths:
        savethepaths = os.path.join(foldersdirectory)+'files_path_in_that_folder.txt'
        with open (savethepaths,'w') as f:
            f.write(filepath+'\n')
As said, it almost works, except that each 'files_path_in_that_folder.txt' contains only the 3rd element of the "filepaths" list, rather than all 3 elements.
Thanks!
Okay, I figured it out; I had to add:
with open (savethepaths,'w') as f:
    f.writelines(list("%s\n" %filepath for filepath in filepaths))
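The reason this helps is that opening a file in 'w' mode truncates it, so opening it inside the loop keeps only the last path written. A self-contained sketch of the "open once, write everything" idea follows; write_file_list is a hypothetical helper name, and the "*/*" pattern assumes the layout from the question (files exactly one subfolder below each folder).

```python
import glob
import os


def write_file_list(folder, list_name="files_path_in_that_folder.txt"):
    """Write the paths of all files one subfolder below folder into folder/list_name."""
    filepaths = sorted(
        p for p in glob.glob(os.path.join(folder, "*", "*")) if os.path.isfile(p)
    )
    out_path = os.path.join(folder, list_name)
    # open once, outside any loop, so "w" does not truncate earlier lines
    with open(out_path, "w") as f:
        f.writelines(path + "\n" for path in filepaths)
    return out_path
```

Calling it once per folder name read from foldersname.txt would produce one list file per folder.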
import os
def directory_into_file(_path, file_obj, depth):
    # depth is a string of asterisks, just for nicer printing; it starts as an empty string
    file_obj.write(depth + _path + '\n')
    if os.path.isdir(_path):
        file_list = os.listdir(_path)
        os.chdir(_path)
        for file in file_list:
            directory_into_file(file, file_obj, depth+'*')
        os.chdir("..")
This should work.
_path - the path of the directory,
file_obj - an already opened file object that the function writes to,
depth - an empty string on the first call.
I hope this works; I did not try it myself.

Reading in multiple files in directory using python

I'm trying to open each file in a directory and print its contents, so I have code like this:
import os, sys
def printFiles(dir):
    os.chdir(dir)
    for f in os.listdir(dir):
        myFile = open(f,'r')
        lines = myFile.read()
        print lines
        myFile.close()
printFiles(sys.argv[1])
The program runs, but the problem is that it only prints the contents of one file, probably the last file that it has read. Does this have something to do with the open() function?
Edit: added last line that takes in sys.argv. That's the whole code, and it still only prints the last file.
There is a problem with how the directory and file paths are combined.
Option 1 - chdir:
def printFiles(dir):
    os.chdir(dir)
    for f in os.listdir('.'):
        myFile = open(f,'r')
        # ...
Option 2 - computing full path:
def printFiles(dir):
    # no chdir here
    for f in os.listdir(dir):
        myFile = open(os.path.join(dir, f), 'r')
        # ...
But you are combining both options - that's wrong.
This is why I prefer pathlib.Path - it's much simpler:
from pathlib import Path
def printFiles(dir):
    dir = Path(dir)
    for f in dir.iterdir():
        myFile = f.open()
        # ...
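For completeness, here is one way the pathlib variant might be filled in. It is a sketch rather than the answerer's code: read_files is a hypothetical name, it returns the contents instead of printing them so the result is easy to inspect, and it assumes the files are text in the default encoding.

```python
from pathlib import Path


def read_files(directory):
    """Return a dict mapping each file name in directory to its text content."""
    contents = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():  # subdirectories cannot be opened like files
            contents[entry.name] = entry.read_text()
    return contents
```

Printing every file then becomes a loop over read_files(sys.argv[1]).items(), and no os.chdir call is needed.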
The code itself certainly should print the contents of every file.
However, if you supply a relative path rather than an absolute path, it will not work.
For example, imagine you have the following folder structure:
./a
./a/x.txt
./a/y.txt
./a/a
./a/a/x.txt
If you now run
printFiles('a')
you will only get the contents of x.txt, because os.listdir will be executed from within a, and will list the contents of the internal a/a folder, which only has x.txt.

Rename multiple files in a directory in Python

I'm trying to rename some files in a directory using Python.
Say I have a file called CHEESE_CHEESE_TYPE.*** and want to remove CHEESE_ so my resulting filename would be CHEESE_TYPE
I'm trying to use the os.path.split but it's not working properly. I have also considered using string manipulations, but have not been successful with that either.
Use os.rename(src, dst) to rename or move a file or a directory.
$ ls
cheese_cheese_type.bar cheese_cheese_type.foo
$ python
>>> import os
>>> for filename in os.listdir("."):
...     if filename.startswith("cheese_"):
...         os.rename(filename, filename[7:])
...
>>>
$ ls
cheese_type.bar cheese_type.foo
Here's a script based on your newest comment.
#!/usr/bin/env python
from os import rename, listdir
badprefix = "cheese_"
fnames = listdir('.')
for fname in fnames:
    if fname.startswith(badprefix*2):
        rename(fname, fname.replace(badprefix, '', 1))
The following code should work. It goes through every filename in the current directory and renames it if it contains the pattern CHEESE_CHEESE_. Otherwise the filename is left unchanged.
import os
for fileName in os.listdir("."):
    os.rename(fileName, fileName.replace("CHEESE_CHEESE_", "CHEESE_"))
Assuming you are already in the directory, and that the "first 8 characters" from your comment hold true always. (Although "CHEESE_" is 7 characters... ? If so, change the 8 below to 7)
from glob import glob
from os import rename
for fname in glob('*.prj'):
    rename(fname, fname[8:])
I had the same issue: I wanted to replace the white space in the names of PDF files with a dash (-).
But the files were in multiple sub-directories, so I had to use os.walk().
In your case, for multiple sub-directories, it could be something like this:
import os
for dpath, dnames, fnames in os.walk('/path/to/directory'):
    for f in fnames:
        if f.startswith('cheese_'):
            # join with dpath instead of calling os.chdir, and strip only the first 'cheese_'
            os.rename(os.path.join(dpath, f), os.path.join(dpath, f.replace('cheese_', '', 1)))
Try this:
import os
import shutil
dirpath = '.'  # assumption: the directory that holds the files
for file in os.listdir(dirpath):
    if "_" in file:  # names without an underscore have nothing to strip
        newfile = os.path.join(dirpath, file.split("_",1)[1])
        shutil.move(os.path.join(dirpath,file),newfile)
I'm assuming you don't want to remove the file extension, but you can just do the same split with periods.
This sort of task is a good fit for IPython, which has shell integration (process_filename stands for your own function that computes the new name).
In [1]: files = !ls
In [2]: for f in files:
   ...:     newname = process_filename(f)
   ...:     !mv $f $newname
Note: to store this in a script, use the .ipy extension, and prefix all shell commands with !.
See also: http://ipython.org/ipython-doc/stable/interactive/shell.html
Here is a more general solution:
This code can be used to remove any particular character or set of characters recursively from all filenames within a directory and replace them with any other character, set of characters or no character.
import os
paths = (os.path.join(root, filename)
         for root, _, filenames in os.walk(r'C:\FolderName')  # raw string keeps the backslash literal
         for filename in filenames)
for path in paths:
    folder, name = os.path.split(path)
    # the '#' in the example below will be replaced by '-' in the file name only,
    # so a '#' in a folder name is left alone
    newname = os.path.join(folder, name.replace('#', '-'))
    if newname != path:
        os.rename(path, newname)
It seems that your problem is more in determining the new file name rather than the rename itself (for which you could use the os.rename method).
It is not clear from your question what the pattern is that you want to be renaming. There is nothing wrong with string manipulation. A regular expression may be what you need here.
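As an illustration of the regular-expression route, the sketch below strips one leading CHEESE_ from each name and supports a dry run. The pattern is a guess at what the question intends, strip_prefix_once is a hypothetical helper name, and name collisions are not handled.

```python
import os
import re


def strip_prefix_once(directory, prefix_pattern=r"^CHEESE_", dry_run=True):
    """Remove one leading match of prefix_pattern from each file name in directory."""
    planned = []
    for name in sorted(os.listdir(directory)):
        new_name = re.sub(prefix_pattern, "", name, count=1)
        if new_name == name or not new_name:
            continue  # nothing to strip, or the name would become empty
        planned.append((name, new_name))
        if not dry_run:
            os.rename(os.path.join(directory, name), os.path.join(directory, new_name))
    return planned
```

With dry_run=True the function only returns the planned (old, new) pairs, which gives a chance to review them before calling it again with dry_run=False.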
import os
import string
def rename_files():
    # List all files in the directory
    file_list = os.listdir("/Users/tedfuller/Desktop/prank/")
    print(file_list)
    # Change the current working directory and print out its location
    os.chdir("/Users/tedfuller/Desktop/prank/")
    working_location = os.getcwd()
    print(working_location)
    # Rename all the files in that directory by removing the digits from their names
    for file_name in file_list:
        os.rename(file_name, file_name.translate(str.maketrans("","",string.digits)))
rename_files()
This command will remove the initial "CHEESE_" string from all the files in the current directory, using renamer:
$ renamer --find "/^CHEESE_/" *
I was originally looking for some GUI which would allow renaming using regular expressions and which had a preview of the result before applying changes.
On Linux I have successfully used krename, on Windows Total Commander does renaming with regexes, but I found no decent free equivalent for OSX, so I ended up writing a python script which works recursively and by default only prints the new file names without making any changes. Add the '-w' switch to actually modify the file names.
#!/usr/bin/python
# -*- coding: utf-8 -*-
# print(...) is written with parentheses so the script runs on Python 2.7 and Python 3
import os
import fnmatch
import sys
import shutil
import re
def usage():
    print("""
Usage:
%s <work_dir> <search_regex> <replace_regex> [-w|--write]
By default no changes are made, add '-w' or '--write' as last arg to actually rename files
after you have previewed the result.
""" % (os.path.basename(sys.argv[0])))
def rename_files(directory, search_pattern, replace_pattern, write_changes=False):
    pattern_old = re.compile(search_pattern)
    for path, dirs, files in os.walk(os.path.abspath(directory)):
        for filename in fnmatch.filter(files, "*.*"):
            if pattern_old.findall(filename):
                new_name = pattern_old.sub(replace_pattern, filename)
                filepath_old = os.path.join(path, filename)
                filepath_new = os.path.join(path, new_name)
                # test new_name: the joined path is never empty
                if not new_name:
                    print('Replacement regex {} returns empty value! Skipping'.format(replace_pattern))
                    continue
                print(new_name)
                if write_changes:
                    shutil.move(filepath_old, filepath_new)
            else:
                print('Name [{}] does not match search regex [{}]'.format(filename, search_pattern))
if __name__ == '__main__':
    if len(sys.argv) < 4:
        usage()
        sys.exit(-1)
    work_dir = sys.argv[1]
    search_regex = sys.argv[2]
    replace_regex = sys.argv[3]
    write_changes = (len(sys.argv) > 4) and sys.argv[4].lower() in ['--write', '-w']
    rename_files(work_dir, search_regex, replace_regex, write_changes)
Example use case
I want to flip parts of a file name in the following manner, i.e. move the bit m7-08 to the beginning of the file name:
# Before:
Summary-building-mobile-apps-ionic-framework-angularjs-m7-08.mp4
# After:
m7-08_Summary-building-mobile-apps-ionic-framework-angularjs.mp4
This will perform a dry run, and print the new file names without actually renaming any files:
rename_files_regex.py . "([^\.]+?)-(m\\d+-\\d+)" "\\2_\\1"
This will do the actual renaming (you can use either -w or --write):
rename_files_regex.py . "([^\.]+?)-(m\\d+-\\d+)" "\\2_\\1" --write
You can use the os.system function for simplicity, to have the system shell accomplish the task (mv assumes a Unix-like shell):
import os
os.system('mv old_filename new_filename')
This works for me.
import os
for afile in os.listdir('.'):
    filename, file_extension = os.path.splitext(afile)
    if not file_extension == '.xyz':
        os.rename(afile, filename + '.abc')
What about this :
import re
p = re.compile(r'_')
p.split(filename, 1) #where filename is CHEESE_CHEESE_TYPE.***
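The split above returns a two-element list, with the part before the first underscore and the rest of the name. If only the remainder is needed, plain string methods may be enough. The sketch below uses str.partition instead of re.split, which is a substitution on my part; drop_first_segment is a hypothetical name.

```python
def drop_first_segment(filename, sep="_"):
    """Return the part of filename after the first sep, or filename unchanged if sep is absent."""
    head, found, tail = filename.partition(sep)
    # found is an empty string when sep does not occur in filename
    return tail if found else filename
```

The returned name could then be passed as the destination to os.rename.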
