Finding files in directories in Python

Finding files in directories in Python - python

I've been doing some scripting where I need to access the os to name images (saving every subsequent zoom of the Mandelbrot set upon clicking) by counting all of the current files in the directory and then using %s to name them in the string after calling the below function and then adding an option to delete them all
I realize the below will always grab the absolute path of the file but assuming we're always in the same directory is there not a simplified version to grab the current working directory
def count_files(self):
count = 0
for files in os.listdir(os.path.abspath(__file__))):
if files.endswith(someext):
count += 1
return count
def delete_files(self):
for files in os.listdir(os.path.abspath(__file__))):
if files.endswith(.someext):
os.remove(files)

Since you're doing the .endswith thing, I think the glob module might be of some interest.
The following prints all files in the current working directory with the extension .py. Not only that, it returns only the filename, not the path, as you said you wanted:
import glob
for fn in glob.glob('*.py'): print(fn)
Output:
temp1.py
temp2.py
temp3.py
_clean.py
Edit: re-reading your question, I'm unsure of what you were really asking. If you wanted an easier way to get the current working directory than
os.path.abspath(__file__)
Then yes, os.getcwd()
But os.getcwd() will change if you change the working directory in your script (e.g. via os.chdir(), whereas your method will not.

Using antipathy* it gets a little easier:
from antipathy import Path
def count_files(pattern):
return len(Path(__file__).glob(pattern))
def deletet_files(pattern):
Path(__file__).unlink(pattern)
*Disclosure: I'm the author of antipathy.

You can use os.path.dirname(path) to get the parent directory of the thing path points to.
def count_files(self):
count = 0
for files in os.listdir(os.path.dirname(os.path.abspath(__file__)))):
if files.endswith(someext):
count += 1
return count

Related

How to get path to file while the name of file is not full in python

I'll try to describe the problem in a simple way.
I have a .txt file that I can not know the full name of it which located under constant path
[for example: the full name is: Hello_stack.txt, I only can give to function the part: 'Hello_']
the input is: Path_to_file/ + 'Hello_'
the expected output is: Path_to_file/Hello_stack.txt
How can i do that?
I tried to give a path and check recursively if part of my file name is exist and if so, to return it's path.
this is my implementation: [of course I'd like to get another way if it works]
def get_CS_R2M_METRO_CALLBACK_FILE_PATH():
directory = 'path_of_file'
file_name = directory + 'part_of_file_name'
const_path = Path(file_name)
for path in [p for p in const_path.rglob("*")]:
if path.is_file():
return path
Thanks for help.

You might retrieve the file list in your path and then select from the list based upon your partial file name. Here is a snippet of code to perform that type of function on a Linux machine.
import os
dir = '/home/craig/Python_Programs/GetFile'
files = os.listdir(dir)
print('Files--> ', files)
for i in files:
myfile = 'Hello_'
if (myfile[0:4] == i[0:4]):
print('File(s) like \"Hello_\"-->', i)
When I executed this simple program over a directory/folder that had various files in the directory, here was the output to the terminal.
Una:~/Python_Programs/GetFile$ python3 GetFile.py
Files--> ['Hello_Stack.txt', 'Okay.txt', 'Hi_Stack.txt', 'GetFile.py', 'Hello_Stack.bak']
File(s) like "Hello_"--> Hello_Stack.txt
File(s) like "Hello_"--> Hello_Stack.bak
The literal value for your path would be different on a Windows machine. I hope this might provide you with a method to achieve your goal.
Regards.

Moving files to a folder in python

I have a code that locates files in a folder for me by name and moves them to another set path.
import os, shutil
files = os.listdir("D:/Python/Test")
one = "one"
two = "two"
oney = "fold1"
twoy="fold2"
def findfile(x,y):
for file in files:
if x in file.lower():
while x in file.lower():
src = ('D:/Python/Test/'+''.join(file))
dest = ('D:/Python/Test/'+y)
if not os.path.exists(dest):
os.makedirs(dest)
shutil.move(src,dest)
break
findfile(one,oney)
findfile(two,twoy)
In this case, this program moves all the files in the Test folder to another path depending on the name, let's say one as an example:
If there is a .png named one, it will move it to the fold1 folder. The problem is that my code does not distinguish between types of files and what I would like is that it excludes the folders from the search.
If there is a folder called one, don't move it to the fold1 folder, only move it if it is a folder! The other files if you have to move them.
The files in the folder contain the string one, they are not called exactly that.
I don't know if I have explained myself very well, if something is not clear leave me a comment and I will try to explain it better!
Thanks in advance for your help!

os.path.isdir(path)
os.path.isdir() method in Python is used to check whether the specified path is an existing directory or not. This method follows symbolic link, that means if the specified path is a symbolic link pointing to a directory then the method will return True.
Check with that function before.
Home this helps :)

A way to create files and directories without overwriting

You know how when you download something and the downloads folder contains a file with the same name, instead of overwriting it or throwing an error, the file ends up with a number appended to the end? For example, if I want to download my_file.txt, but it already exists in the target folder, the new file will be named my_file(2).txt. And if I try again, it will be my_file(3).txt.
I was wondering if there is a way in Python 3.x to check that and get a unique name (not necessarily create the file or directory). I'm currently implementing it doing this:
import os
def new_name(name, newseparator='_')
#name can be either a file or directory name
base, extension = os.path.splitext(name)
i = 2
while os.path.exists(name):
name = base + newseparator + str(i) + extension
i += 1
return name
In the example above, running new_file('my_file.txt') would return my_file_2.txt if my_file.txt already exists in the cwd. name can also contain the full or relative path, it will work as well.

I would use PathLib and do something along these lines:
from pathlib import Path
def new_fn(fn, sep='_'):
p=Path(fn)
if p.exists():
if not p.is_file():
raise TypeError
np=p.resolve(strict=True)
parent=str(np.parent)
extens=''.join(np.suffixes) # handle multiple ext such as .tar.gz
base=str(np.name).replace(extens,'')
i=2
nf=parent+base+sep+str(i)+extens
while Path(nf).exists():
i+=1
nf=parent+base+sep+str(i)+extens
return nf
else:
return p.parent.resolve(strict=True) / p
This only handles files as written but the same approach would work with directories (which you added later.) I will leave that as a project for the reader.

Another way of getting a new name would be using the built-in tempfile module:
from pathlib import Path
from tempfile import NamedTemporaryFile
def new_path(path: Path, new_separator='_'):
prefix = str(path.stem) + new_separator
dir = path.parent
suffix = ''.join(path.suffixes)
with NamedTemporaryFile(prefix=prefix, suffix=suffix, delete=False, dir=dir) as f:
return f.name
If you execute this function from within Downloads directory, you will get something like:
>>> new_path(Path('my_file.txt'))
'/home/krassowski/Downloads/my_file_90_lv301.txt'
where the 90_lv301 part was generated internally by the Python's tempfile module.
Note: with the delete=False argument, the function will create (and leave undeleted) an empty file with the new name. If you do not want to have an empty file created that way, just remove the delete=False, however keeping it will prevent anyone else from creating a new file with such name before your next operation (though they could still overwrite it).
Simply put, having delete=False prevents concurrency issues if you (or the end-user) were to run your program twice at the same time.

Find/Remove oldest file in directory

I am trying to delete oldest file in directory when number of files reaches a threshold.
list_of_files = os.listdir('log')
if len([name for name in list_of_files]) == 25:
oldest_file = min(list_of_files, key=os.path.getctime)
os.remove('log/'+oldest_file)
Problem: The issue is in min method. list_of_files does not contain full path, so it is trying to search file in current directory and failing. How can I pass directory name ('log') to min()?

list_of_files = os.listdir('log')
full_path = ["log/{0}".format(x) for x in list_of_files]
if len(list_of_files) == 25:
oldest_file = min(full_path, key=os.path.getctime)
os.remove(oldest_file)

os.listdir will return relative paths - those are ones that are relative to your current/present working directory/context of what your Python script was executed in (you can see that via os.getcwd()).
Now, the os.remove function expects a full path/absolute path - shells/command line interfaces infer this and do it on your behalf - but Python doesn't. You can get that via using os.path.abspath, so you can change your code to be (and since os.listdir returns a list anyway, we don't need to add a list-comp over it to be able to check its length)...:
list_of_files = os.listdir('log')
if len(list_of_files) >= 25:
oldest_file = min(list_of_files, key=os.path.getctime)
os.remove(os.path.abspath(oldest_file))
That keeps it generic as to where it came from - ie, whatever was produced in the result of os.listdir - you don't need to worry about prepending suitable file paths.

I was trying to achieve the same as what you are trying to achieve. and was facing a similar issue related to os.path.abspath()
I am using a windows system with python 3.7
and the issue is that os.path.abspath() gives one folder up location
replace "Yourpath" and with the path of folder in which your file is and code should work fine
import os
import time
oldest_file = sorted([ "Yourpath"+f for f in os.listdir("Yourpath")], key=os.path.getctime)[0]
print (oldest_file)
os.remove(oldest_file)
print ("{0} has been deleted".format(oldest_file))`
There must be some cleaner method to do the same
I'll update when I get it

The glob library gives on the one hand full paths and allows filtering for file patterns. The above solution resulted in the directory itself as the oldest file, which is not that what I wanted. For me, the following is suitable (a blend of glob and the solution of #Ivan Motin)
import glob
sorted(glob.glob("/home/pi/Pictures/*.jpg"), key=os.path.getctime)[0]

using comprehension (sorry, couldn't resist):
oldest_file = sorted([os.path.abspath(f) for f in os.listdir('log') ], key=os.path.getctime)[0]

Find files that have been changed recursively

I am attempting to write a simple script to recursively rip through a directory and check if any of the files have been changed. I only have the traversal so far:
import fnmatch
import os
from optparse import OptionParser
rootPath = os.getcwd()
pattern = '*.js'
for root, dirs, files in os.walk(rootPath):
for filename in files:
print( os.path.join(root, filename))
I have two issues:
1. How do I tell if a file has been modified?
2. How can I check if a directory has been modified? - I need to do this because the folder I wish to traverse is huge. If I can check if the dir has been modified and not recursively rip through an unchanged dir, this would greatly help.
Thanks!

If you are comparing two files between two folders, you can use os.path.getmtime() on both files and compare the results. If they're the same, they haven't been modified. Note that this will work on both files and folders.

The typical fast way to tell if a file has been modified is to use os.path.getmtime(path) (assuming a Linux or similar environment). This will give you the modification timestamp, which you can compare to a stored timestamp to determine if a file has been modified.
getmtime() works on directories too, but it will only tell you whether a file has been added, removed or renamed in the directory; it will not tell you whether a file has been modified inside the directory.

This is my own implementation of what you might be looking for. Mind that, beside timestamps you might want to track files that have been added or deleted too (like i do). If not you can just change the code on line:
if now == before:
here is the code:
# check if any txt file in folder "wd" has been modified (rewritten added or deleted)
def src_dir_modified(wd):
now = []
global before
all_files = glob.glob(os.path.join(wd,'*.txt'))
for infile in all_files:
now.append([infile, os.stat(infile).st_mtime])
if now == before: # compare files and their time stamps
return False
else:
before = now
print 'Source code has been modified.'
return True

If you can admit the use of a command-line tool, you could use rsync instead of re-inventing the wheel. rsync uses file modification time and file size to decide if a file has been changed or not.
rsync --verbose --recursive --dry-run dir1 dir2 should get the differences between files in dir1 and dir2. You can write the output to a log file to act on it.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Finding files in directories in Python - python

Using antipathy* it gets a little easier: from antipathy import Path def count_files(pattern): return len(Path(file).glob(pattern)) def deletet_files(pattern): Path(file).unlink(pattern) *Disclosure: I'm the author of antipathy.

You can use os.path.dirname(path) to get the parent directory of the thing path points to. def count_files(self): count = 0 for files in os.listdir(os.path.dirname(os.path.abspath(file)))): if files.endswith(someext): count += 1 return count

Related

How to get path to file while the name of file is not full in python

Moving files to a folder in python

A way to create files and directories without overwriting

Find/Remove oldest file in directory

Find files that have been changed recursively

Categories

Resources

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Finding files in directories in Python - python

Using antipathy* it gets a little easier: from antipathy import Path def count_files(pattern): return len(Path(__file__).glob(pattern)) def deletet_files(pattern): Path(__file__).unlink(pattern) *Disclosure: I'm the author of antipathy.

You can use os.path.dirname(path) to get the parent directory of the thing path points to. def count_files(self): count = 0 for files in os.listdir(os.path.dirname(os.path.abspath(__file__)))): if files.endswith(someext): count += 1 return count

Related

How to get path to file while the name of file is not full in python

Moving files to a folder in python

A way to create files and directories without overwriting

Find/Remove oldest file in directory

Find files that have been changed recursively

Categories

Resources

Using antipathy* it gets a little easier: from antipathy import Path def count_files(pattern): return len(Path(file).glob(pattern)) def deletet_files(pattern): Path(file).unlink(pattern) *Disclosure: I'm the author of antipathy.

You can use os.path.dirname(path) to get the parent directory of the thing path points to. def count_files(self): count = 0 for files in os.listdir(os.path.dirname(os.path.abspath(file)))): if files.endswith(someext): count += 1 return count