I'm trying to write a script that moves files named in a list I have. I'd like to add some conditions to it, but I'm afraid that's where my Python knowledge fails me. I have a list of names (AAA, BBB, CCC).
For each of those, there are six different files with six different extensions that need to be moved (AAA.1, AAA.2, AAA.3, AAA.4, AAA.5, AAA.6). Those files might be in 3 different folders: say, AAA/AAA, BBB/IOL or BBB/ABC. I want all of those files to be moved to REAL/AAA. The thing is, in the folder AAA/AAA there are some AAAXXX.1 files that I do not want to be moved.
I'm completely lost and new to Python (basically, it's my first week :p).
import os
import shutil
import fnmatch
source = os.listdir(r"\\enterprise\AAA\AAA")
destination = os.listdir(r"\\enterprise\REAL\AAA")
# renamed from "set" so the built-in set type is not shadowed
names = {
    "AAA",
    "BBB",
    "CCC"
}
for file in source:
    for x in names:
        if x in file:
            print(file)
I don't know how could I specify that AAAXXX, BBBXXX and etc shall not be moved.
I don't know how to search multiple folders for files with conditions (if not in folder AAA/AAA, try BBB/IOL, and if not there, BBB/ABC)
I don't know how could I specify that AAAXXX, BBBXXX and etc shall not
be moved.
The simplest way is something like this:
if 'XXX' in file:
    continue # this means skip the rest of the cycle and move to next file
I don't know how to search multiple folders for files with
conditions (if not in folder AAA/AAA, try BBB/IOL, and if not there, BBB/ABC)
The most blunt approach would be
if file_name in os.listdir('first_folder'):
    pass # move from first
elif file_name in os.listdir('second_folder'):
    pass # move from second
# continue adding elif for every folder
else:
    print(f'file {file_name} is not found')
But I'd probably just scan every folder and then move everything matching a given name to the destination. This might not be what you want, though, if you've got duplicate names with different file contents and a particular folder has to take precedence.
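The "scan every folder" idea could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the function name move_named_files and its parameters are made up for illustration, and it assumes the wanted files are named exactly NAME.ext, so AAAXXX.1 is skipped because its stem is AAAXXX rather than AAA. Folders are scanned in the order given, which is one possible way to express precedence.

```python
import os
import shutil

def move_named_files(names, source_dirs, dest_dir):
    """Move files whose stem is exactly one of `names` into dest_dir.

    Folders are scanned in the order given, so an earlier folder wins
    when the same file name appears in more than one of them.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for folder in source_dirs:
        if not os.path.isdir(folder):
            continue  # skip folders that do not exist
        for file_name in sorted(os.listdir(folder)):
            src = os.path.join(folder, file_name)
            stem, _ext = os.path.splitext(file_name)
            if stem not in names or not os.path.isfile(src):
                continue  # e.g. AAAXXX.1 has the stem "AAAXXX", so it is skipped
            target = os.path.join(dest_dir, file_name)
            if os.path.exists(target):
                continue  # already moved from a folder with higher precedence
            shutil.move(src, target)
            moved.append(file_name)
    return moved
```

It would be called with something like move_named_files({"AAA", "BBB", "CCC"}, [first_folder, second_folder, third_folder], destination_folder), where the folder variables hold your real paths.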
Related
I have a code that locates files in a folder for me by name and moves them to another set path.
import os, shutil
files = os.listdir("D:/Python/Test")
one = "one"
two = "two"
oney = "fold1"
twoy = "fold2"
def findfile(x, y):
    for file in files:
        if x in file.lower():
            while x in file.lower():
                src = ('D:/Python/Test/' + ''.join(file))
                dest = ('D:/Python/Test/' + y)
                if not os.path.exists(dest):
                    os.makedirs(dest)
                shutil.move(src, dest)
                break
findfile(one, oney)
findfile(two, twoy)
In this case, this program moves all the files in the Test folder to another path depending on the name, let's say one as an example:
If there is a .png named one, it will move it to the fold1 folder. The problem is that my code does not distinguish between types of files and what I would like is that it excludes the folders from the search.
If there is a folder called one, don't move it to the fold1 folder; only move an entry if it is not a folder! The other files should still be moved.
The files in the folder contain the string one, they are not called exactly that.
I don't know if I have explained myself very well, if something is not clear leave me a comment and I will try to explain it better!
Thanks in advance for your help!
os.path.isdir(path)
os.path.isdir() method in Python is used to check whether the specified path is an existing directory or not. This method follows symbolic link, that means if the specified path is a symbolic link pointing to a directory then the method will return True.
Check each entry with that function before moving it.
Hope this helps :)
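As a rough illustration of that check, here is a minimal sketch. The function name move_matching_files and its parameters are invented for this example and are not from the question's code. Any entry for which os.path.isdir() is true is skipped, so only files are moved.

```python
import os
import shutil

def move_matching_files(folder, needle, dest_name):
    """Move files (not folders) in `folder` whose name contains `needle`."""
    dest = os.path.join(folder, dest_name)
    os.makedirs(dest, exist_ok=True)
    moved = []
    for entry in sorted(os.listdir(folder)):
        src = os.path.join(folder, entry)
        if os.path.isdir(src):
            continue  # folders (including the destination) stay where they are
        if needle in entry.lower():
            shutil.move(src, os.path.join(dest, entry))
            moved.append(entry)
    return moved
```

With the question's layout this might be called as move_matching_files("D:/Python/Test", "one", "fold1").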
I am trying to make a Python script that automatically moves files from my internal drive to any USB drive that is plugged in. However, the destination path is unpredictable because I am not using the same USB drives every time. With Raspbian Buster full version, the best I can do so far is automount into /media/pi/xxxxx, where the xxxxx part is unpredictable. I am trying to make my script account for that. I can get the drive mounting points with
drives = os.listdir("/media/pi/")
but I am worried some will be invalid because of not being unmounted before they're yanked out (I need to run this w/o a monitor or keyboard or VNC or any user input other than replacing USB drives). So I'd think I'd need to do a series of try catch statements perhaps in an if elif elif elif chain, to make sure that the destination is truly valid, but I don't know how to do that w/o hardcoding the names in. The only way I know how to iterate thru a set of names I don't know is
for candidate_drive in drives:
but I don't know how to make it go onto the next candidate drive only if the current one is throwing an exception.
System: Raspberry Pi 4, Raspbian buster full, Python 3.7.
Side note: I am also trying this on Buster lite w/ usbmount, which does have predictable mounting names, but I can't get exfat and ntfs to mount and that is question for another post.
Update: I was thinking about this more and maybe I need a try, except, else statement where the except is pass and the else is break? I am trying it out.
Update2: I rethought my problem and maybe instead of looking for the exception to determine when to try the next possible drive, perhaps I could instead look for a successful transfer and break the loop if so.
import os
import shutil
files_to_send = os.listdir("/home/outgoing_files/")
source_path = "/home/outgoing_files/"
possible_USB_drives = os.listdir("/media/")
for a_possible_drive in possible_USB_drives:
    try:
        destpath = os.path.join("/media/", a_possible_drive)
        for a_file in files_to_send:
            shutil.copy(source_path + a_file, destpath)
    except OSError:
        pass # transfer to possible drive didn't work, try the next one
    else:
        break # Stops the loop bc the transfer worked!
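The same "try each candidate until one works" idea can be wrapped in a function, which makes it easier to see which destination succeeded. This is only a sketch under assumptions: copy_to_first_working_drive and its parameters are names invented here, and it treats any OSError raised by shutil.copy as "this candidate is not usable".

```python
import os
import shutil

def copy_to_first_working_drive(source_dir, mount_root):
    """Copy every file in source_dir to the first candidate that accepts them.

    Returns the destination path that worked, or None if none did.
    """
    files_to_send = [
        name for name in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, name))
    ]
    for candidate in sorted(os.listdir(mount_root)):
        dest = os.path.join(mount_root, candidate)
        try:
            for name in files_to_send:
                shutil.copy(os.path.join(source_dir, name), dest)
        except OSError:
            continue  # this candidate failed, move on to the next one
        return dest  # every copy succeeded, so stop looking
    return None
```

On the Pi, adding an os.path.ismount(dest) check before copying might also help skip stale mount-point directories left behind by a yanked drive; that part is an untested suggestion.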
If you have an unknown number of directories nested inside a directory, you cannot use nested for loops. You need a recursive function. To review all the directories inside a directory, iterate over the list of directories with a function that calls itself on each of them, over and over, until it finds the file.
Let's see an example. You have a path structure like this:
path
    dir0
        dir2
            file3
    dir1
        file2
    file0
    file1
You have no way to know how many for loops are required to iterate over all the elements in this structure. You can run one iteration (a for loop) over all the elements of a single directory, and then do the same for the elements inside those elements. In this structure, dirN (where N is a number) means a directory and fileN means a file.
You can use os.listdir() function to get the contents of a directory:
print(os.listdir("path/"))
returns:
['dir0', 'dir1', 'file0.txt', 'file1.txt']
Using this function on all the directories, you can get all the elements of the structure. You only need to iterate over the directories. Specifically directories, because if you call it on a file:
print(os.listdir("path/file0.txt"))
you get an error:
NotADirectoryError: [WinError 267]
But remember that Python has list comprehensions, which can filter the contents down to just the directories.
String work
If you have a main path, you need to refer to a directory inside it with the full string: "path/dirN". You cannot access a directory by its bare name if it is not in the working directory of the .py script:
print(os.listdir("dir0/"))
gets an error
FileNotFoundError: [WinError 3]
So you always need to prefix the current path with the initial main path. That way you can access all the elements of the structure.
Filtering
I said that you could use a list comprehension to get just the directories of the structure, recursively. Let's take a look at a function:
import os

def searchfile(paths: list, search: str):
    for path in paths:
        print("Actual path: {}".format(path))
        contents = os.listdir(path)
        print("Contents: {}".format(contents))
        # keep only the entries that are directories, as full paths
        dirs = [os.path.join(path, x) for x in contents if os.path.isdir(os.path.join(path, x))]
        print("Directories: {} \n".format(dirs))
        # recurse into the subdirectories of the current path
        searchfile(dirs, search)
In this function we get the contents of the current path with os.listdir() and then filter them with a list comprehension. We then call the function recursively with the directories of the current path: searchfile(dirs,search)
This algorithm can be applied to any file structure, because the paths argument is a list. So you can iterate over directories containing directories, which in turn contain more directories.
If you want to find a specific file, you can use the second argument, search. You can also add a conditional to get the path of the file once it is found:
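# goes inside the for loop of searchfile(), after contents is set; needs "import sys"
if search in contents:
    print("File found! \n")
    # join with the current path, otherwise abspath() resolves against the working directory
    print("{}".format(os.path.abspath(os.path.join(path, search))))
    sys.exit()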
I hope this has helped you.
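For comparison, here is a variant of the same recursive idea that returns the path instead of printing it and exiting. It is a minimal sketch rather than the function above: the name find_file is made up for illustration, and it takes a single root directory instead of a list of paths.

```python
import os

def find_file(root, search):
    """Return the full path of the first file named `search` under root, else None."""
    subdirs = []
    for name in sorted(os.listdir(root)):
        full = os.path.join(root, name)
        if os.path.isdir(full):
            subdirs.append(full)  # remember directories to visit afterwards
        elif name == search:
            return full  # found in the current directory
    for subdir in subdirs:
        found = find_file(subdir, search)  # recursive call on each subdirectory
        if found is not None:
            return found
    return None
```

Returning None when nothing matches lets the caller decide what to do, instead of ending the whole program with sys.exit().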
I cannot figure out the best way to find a specific folder and send files to another specific folder, especially if the users directory is slightly different than what I have coded.
I'm working on a program that has a folder of content to grab from and the user basically picks items and when they're done, it creates a folder full of things including the images they chose. I've gotten it to work (and creating all necessary folders in the user's directory works fine but once it becomes more complex, it doesn't work some of the time) but I would like it to work every time, regardless of the user and where they've placed my program on their computer.
An example of relevant code I currently have, which I'm sure is redundant compared to what I could be using instead:
init python:
    import os
    import shutil
    current_dir = os.path.normpath(os.getcwd() + "../../")
    def grab_content():
        filetocopy = "image%s.png"%image_choice ##(this is a separate variable within the program that determines if it is image1.png, image2.png etc)
        file_path = os.path.join(current_dir, "Program folder", "stuff", "content")
        images_path = os.path.join(file_path, "images")
        new_images_path = os.path.join(current_dir, "My Templates", anothervariable_name, "game", "template", "image_choices")
        try:
            shutil.copy(images_path + "\\" + filetocopy, new_images_path)
        except:
            print("error")
(folders listed in this have been checked if existing and placed if not - but not for the new file path due to that needing to be in a specific place within the main folder)
It either works if I have the files set up just right (for my own machine which defeats the purpose) or it doesn't do anything or I get an error saying the path doesn't exist. I have code prior to this that creates the folders needed but I'm trying to grab images from the folder that belongs to the actual program and put them (only ones I specify) into a new folder I create through the program.
Would I use os.walk? I was looking at all the os code but this is my first time dealing with any of it, so any advice is helpful.
I am trying to write a piece of code that will recursively iterate through the subdirectories of a specific directory and stop only when reaching files with a '.nii' extension, appending these files to a list called images - a form of a breadth first search. Whenever I run this code, however, I keep receiving [Errno 20] Not a directory: '/Volumes/ARLO/ADNI/.DS_Store'
*/Volumes/ARLO/ADNI is the folder I wish to traverse through
*I am doing this in Mac using the Spyder IDE from Anaconda because it is the only way I can use the numpy and nibabel libraries, which will become important later
*I have already checked that this folder directly contains only other folders and not files
#preprocessing all the MCIc files
import os
#import nibabel as nib
#import numpy as np
def pwd():
    cmd = 'pwd'
    os.system(cmd)
    print(os.getcwd())
#Part 1
os.chdir('/Volumes/ARLO')
images = [] #creating an empty list to store MRI images
os.chdir('/Volumes/ARLO/ADNI')
list_sample = [] #just an empty list for an earlier version of
                 #the program
#Part 2
#function to recursively iterate through folder of raw MRI
#images and extract them into a list
#breadth first search
def extract(dir):
    #dir = dir.replace('.DS_Store', '')
    lyst = os.listdir(dir) #DS issue
    for item in lyst:
        if 'nii' not in item: #if item is not a .nii file, if
                              #item is another folder
            newpath = dir + '/' + item
            #os.chdir(newpath) #DS issue
            extract(newpath)
        else: #if item is the desired file type, append it to
              #the list images
            images.append(item)
#Part 3
adni = os.getcwd() #big folder I want to traverse
#print(adni) #adni is a string containing the path to the ADNI
#folder w/ all the images
#print(os.listdir(adni)) this also works, prints the actual list
"""adni = adni + '/' + '005_S_0222'
os.chdir(adni)
print(os.listdir(adni))""" #one iteration of the recursion,
#works
extract(adni)
print(images)
With every iteration, I wish to traverse further into the nested folders by appending the folder name to the growing path, and part 3 of the code works, i.e. I know that a single iteration works. Why does os keep adding the '.DS_Store' part to my directories in the extract() function? How can I correct my code so that the breadth first traversal can work? This folder contains hundreds of MRI images, I cannot do it without automation.
Thank you.
The .DS_Store files are not being created by the os module, but by the Finder (or, I think, sometimes Spotlight). They're where macOS stores things like the view options and icon layout for each directory on your system.
And they've probably always been there. The reason you didn't see them when you looked is that files that start with a . are "hidden by convention" on Unix, including macOS. Finder won't show them unless you ask it to show hidden files; ls won't show them unless you pass the -a flag; etc.
So, that's your core problem:
I have already checked that this folder directly contains only other folders and not files
… is wrong. The folder does contain at least one regular file: .DS_Store.
So, what can you do about that?
You could add special handling for .DS_Store.
But a better solution is probably to just check each file to see if it's a file or directory, by calling os.path.isdir on it.
Or, even better, use os.scandir instead of listdir, which gives you entries with more information than just the name, so you don't need to make extra calls like isdir.
Or, best of all, just throw out this code and use os.walk to recursively visit every file in every directory underneath your top-level directory.
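A minimal sketch of the os.walk approach might look like this. The function name collect_nii_files is made up for illustration, and it assumes the wanted files can be recognised by a name ending in .nii. Because os.walk hands you files and directories separately, .DS_Store is simply a file that fails the extension test and nothing ever tries to list it as a directory.

```python
import os

def collect_nii_files(top):
    """Return full paths of all files under `top` whose name ends in .nii."""
    images = []
    # os.walk yields (directory path, subdirectory names, file names) for every level
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            if filename.endswith('.nii'):
                images.append(os.path.join(dirpath, filename))
    return sorted(images)
```

For the question's layout the call would be something like images = collect_nii_files('/Volumes/ARLO/ADNI').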
I am using Python 3.5 to analyze data contained in csv files. These files are contained in a "figs" directory, which is contained in a case directory, which is contained in an overall data directory, e.g.:
/strm1/serino/DATA/06052009/figs
Or more generally:
/strm1/serino/DATA/case_date_in_MMDDYYYY/figs
The directory I am starting in is '/strm1/serino/DATA/,' and each subdirectory is the month, day, and year of a case I am working with. Each subdirectory contains another subdirectory named 'figs,' and that is the location of each case's csv file. To be exact:
/strm1/serino/DATA/case_date_in_MMDDYYYY/figs/case_date_in_MMDDYYYY.csv
So, I would like to start in my DATA directory and go through its subdirectories to find those that have the MMDDYYYY naming. However, some of the case directories may be named with a state abbreviation at the end, like: '06052009_TX.' Therefore, instead of matching the MMDDYYYY naming exactly, it could be something as simple as verifying that the directory name contains any number 1 through 9.
Once I am in the first subdirectory (the case directory) I would like to move into the 'figs' subdirectory. Once there, I want to access the csv file with the same naming convention as the first subdirectory (the case directory). I will fill existing arrays with the data contained in each csv file.
Basically, my question concerns navigating through multiple subdirectories that match a certain naming convention and ultimately accessing the data file at the "end." I was naively playing around with glob, fnmatch, os.listdir, and os.walk, but I could not get anything close enough to working that I feel would be helpful to include. I am not very familiar with those modules. What I can include is what I am going for:
for dirs in data_dir that contain a number:
    go into this directory
    go into 'figs' directory
    read data from the csv file whose name matches its case directory name (or whose name format matches the case directory name format)
I have come across related questions, but I have not been able to apply their answers in the way that I would like, especially with nested directories. I really appreciate the help, and let me know if I need to clarify anything.
The following should get you going. It uses the datetime.strptime() function to attempt to convert the first eight characters of each folder name into a valid datetime object. If the conversion fails, then you know that the folder name is not in the correct format and can be skipped. It then attempts to parse any CSV file found in the corresponding figs folder:
from datetime import datetime
import glob
import csv
import os
dirpath, dirnames, filenames = next(os.walk('/strm1/serino/DATA'))
for dirname in dirnames:
    if len(dirname) >= 8:
        try:
            dt = datetime.strptime(dirname[:8], '%m%d%Y')
            print(dt, dirname)
            csv_folder = os.path.join(dirpath, dirname)
            for csv_file in glob.glob(os.path.join(csv_folder, 'figs', '*.csv')):
                with open(csv_file, newline='') as f_input:
                    csv_input = csv.reader(f_input)
                    for row in csv_input:
                        print(row)
        except ValueError as e:
            pass
You listed several problems above. Which one are you stuck on? It seems like you already know how to navigate the file storage system using os.path. You may not know of the function os.path.join() which allows you to manually specify a file path relative to a file as such:
os.path.abspath(os.path.join(os.path.dirname(__file__), '../..', 'Data/TrailShelters/'))
To break down the above:
os.path.dirname(__file__) returns the directory that contains the current file. '../..' means: go up two levels in the folder hierarchy. And Data/TrailShelters/ is the directory I wish to navigate to.
How does this apply to your particular case? Well, you will need to make some adaptations, but you can store the path of the parent directory in a variable. Then you can loop over its subdirectories (for example with os.listdir()). For every subdirectory, examine its name and check the part you are interested in. You can simply use something like: if 'TN' in subdirectory_name to determine whether it is a subdirectory you are interested in. If so, build the full path by joining the saved parent path with the subdirectory name. Does that make any sense?
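The steps described above could be sketched roughly like this. It is only an illustration under assumptions: the function name find_case_csvs is invented here, a case directory is taken to be any directory whose name contains a digit, and the CSV file is assumed to have exactly the same name as its case directory plus .csv.

```python
import os

def find_case_csvs(data_dir):
    """Return paths of figs/<case>.csv for every case directory in data_dir."""
    csv_paths = []
    for name in sorted(os.listdir(data_dir)):
        case_dir = os.path.join(data_dir, name)
        if not os.path.isdir(case_dir):
            continue  # ignore plain files in the data directory
        if not any(ch.isdigit() for ch in name):
            continue  # no digit in the name, so not treated as a case directory
        csv_path = os.path.join(case_dir, 'figs', name + '.csv')
        if os.path.isfile(csv_path):
            csv_paths.append(csv_path)
    return csv_paths
```

Each returned path can then be opened with the csv module to fill your arrays.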