I have 5 files and they exist in 5 different locations
I would like to write a piece of code preferably in Python(I am a newbie) and the code should check all these 5 folders and code should check if the file exists, if it does, it should move over all those files from different locations to a single shared drive location. I tried the below URL and used import shutil, this worked but this is for one or many files from the same location to another location. Any pointers, thoughts, suggestions on how I can do this will be greatly appreciated.
https://linuxhint.com/move-file-to-other-directory-python/
Speaking just about principles on how to solve your problem without writing any code:
From the code you linked, you have a solution for copying one file at a time. Moving that code inside a function will let you easily re-use it for many different input files, and even more than one destination file. You can define the file(s) to be copied and the destination folder as arguments.
Python's os library is great for operating system actions like checking if file exists, making the target directory if needed, just to name a few that might be useful. But this is just one tool for getting there, you'll likely see a few different but valid answers.
Try Pathlib ->
from pathlib import Path
import shutil
inp_file_paths = [Path('input_file_path_1'),
Path('input_file_path_2'),
Path('input_file_path_3')]
destination_path = Path('destination_path')
def check(inp_files):
for file in inp_files:
if file.exists() == False:
print(f'file = {file.name} doesn\'t exists please check the path')
inp_files.remove(file)
return inp_files
def move_files(inp_files):
try:
for file in inp_files:
dest_path = destination_path / file.name
shutil.copy(file, dest_path)
except Exception as e:
print(f'Exception {e} occurred while moving the file {file.name}')
inp_file_paths = check(inp_file_paths)
move_files(inp_file_paths)
Related
I had 120 files in my source folder which I need to move to a new directory (destination). The destination is made in the function I wrote, based on the string in the filename. For example, here is the function I used.
path ='/path/to/source'
dropbox='/path/to/dropbox'
files = = [os.path.join(path,i).split('/')[-1] for i in os.listdir(path) if i.startswith("SSE")]
sam_lis =list()
for sam in files:
sam_list =sam.split('_')[5]
sam_lis.append(sam_list)
sam_lis =pd.unique(sam_lis).tolist()
# Using the above list
ID = sam_lis
def filemover(ID,files,dropbox):
"""
Function to move files from the common place to the destination folder
"""
for samples in ID:
for fs in files:
if samples in fs:
desination = dropbox + "/"+ samples + "/raw/"
if not os.path.isdir(desination):
os.makedirs(desination)
for rawfiles in fnmatch.filter(files, pat="*"):
if samples in rawfiles:
shutil.move(os.path.join(path,rawfiles),
os.path.join(desination,rawfiles))
In the function, I am creating the destination folders, based on the ID's derived from the files list. When I tried to run this for the first time it threw me FILE NOT exists error.
However, later when I checked the source all files starting with SSE were missing. In the beginning, the files were there. I want some insights here;
Whether or not os.shutil.move moves the files to somewhere like a temp folder instead of destination folder?
whether or not the os.shutil.move deletes the files from the source in any circumstance?
Is there any way I can test my script to find the potential reasons for missing files?
Any help or suggestions are much appreciated?
It is late but people don't understand the op's question. If you move a file into a non-existing folder, the file seems to become a compressed binary and get lost forever. It has happened to me twice, once in git bash and the other time using shutil.move in Python. I remember the python happens when your shutil.move destination points to a folder instead of to a copy of the full file path.
For example, if you run the code below, a similar situation to what the op described will happen:
src_folder = r'C:/Users/name'
dst_folder = r'C:/Users/name/data_images'
file_names = glob.glob(r'C:/Users/name/*.jpg')
for file in file_names:
file_name = os.path.basename(file)
shutil.move(os.path.join(src_folder, file_name), dst_folder)
Note that dst_folder in the else block is just a folder. It should be dst_folder + file_name. This will cause what the Op described in his question. I find something similar on the link here with a more detailed explanation of what went wrong: File moving mistake with Python
shutil.move does not delete your files, if for any reason your files failed to move to a given location, check the directory where your code is stored, for a '+' folder your files are most likely stored there.
I am trying to access a file from a Box folder as I am working on two different computers. So the file path is pretty much the same except for the username.
I am trying to load a numpy array from a .npy file and I could easily change the path each time, but it would be nice if I could make it universal.
Here is what the line of code looks like on my one computer:
y_pred_walking = np.load('C:/Users/Eric/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
And here is what the line of code looks like on the other computer:
y_pred_walking = 'C:/Users/erapp/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
The only difference is that the username on one computer is Eric and the other is erapp, but is there a way where I can make the line universal to all computers where all computers will have the Box folder?
You could either save the file to a path that doesn't depend on the user: e.g. 'C:/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'
Or you could do some string formatting. One way would be with an environment or configuration variable that indicates which is the relevant user, and then for your load statement:
import os
current_user = os.environ.get("USERNAME") # assuming you're running on the Windows box as the relevant user
# Now load the formatted string. f-strings are better, but this is more obvious since f-strings are still very new to Python
y_pred_walking = 'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy'.format(user=current_user)
Yes, there is a way, at least for the problem as it is right now solution is pretty simple: to use f-strings
user='Eric'
y_pred_walking =np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
or more general
def pred_walking(user):
return np.load(f'C:/Users/{user}/Box/CMU_MBL/Data/Calgary/4_Best_Results/Walking/Knee/bidir_lstm_50_50/predictions/y_pred_test.npy')
so on any machine you just do
y_pred_walking=pred_walking(user)
with defined user before, to receive the result
Simply search the folders recursivly for your file:
filename = 'y_pred_test.npy'
import os
import random
# creates 1000 directories with a 1% chance of having the file as well
for k in range(20):
for i in range(10):
for j in range(5):
os.makedirs(f"./{k}/{i}/{j}")
if random.randint(1,100) == 2:
with open(f"./{k}/{i}/{j}/{filename}","w") as f:
f.write(" ")
# search the directories for your file
found_in = []
# this starts searching in your current folder - you can give it your c:\Users\ instead
for root,dirs,files in os.walk("./"):
if filename in files:
found_in.append(os.path.join(root,filename))
print(*found_in,sep = "\n")
File found in:
./17/3/1/y_pred_test.npy
./3/8/1/y_pred_test.npy
./16/3/4/y_pred_test.npy
./16/5/3/y_pred_test.npy
./14/2/3/y_pred_test.npy
./0/5/4/y_pred_test.npy
./11/9/0/y_pred_test.npy
./9/8/1/y_pred_test.npy
If you get read errors because of missing file/directory permissions you can start directly in the users folder:
# Source: https://stackoverflow.com/a/4028943/7505395
from pathlib import Path
home = str(Path.home())
found_in = []
for root,dirs,files in os.walk(home):
if filename in files:
found_in.append(os.path.join(root,filename))
# use found_in[0] or break as soon as you find first file
You can use the expanduser function in the os.path module to modify a path to start from the home directory of a user
https://docs.python.org/3/library/os.path.html#os.path.expanduser
Cheers everybody,
I need help with something in python 3.6 exactly. So i have structure of data like this:
|main directory
| |subdirectory's(plural)
| | |.wav files
I'm currently working from a directory where main directory is placed so I don't need to specify paths before that. So firstly I wanna iterate over my main directory and find all subdirectorys. Then in each of them I wanna find the .wav files, and when done with processing them I wanna go to next subdirectory and so on until all of them are opened, and all .wav files are processed. Exactly what I wanna do with those .wav files is input them in my program, process them so i can convert them to numpy arrays, and then I convert that numpy array into some other object (working with tensorflow to be exact, and wanna convert to TF object). I wrote about the whole process if anybody has any fast advices on doing that too so why not.
I tried doing it with for loops like:
for subdirectorys in open(data_path, "r"):
for files in subdirectorys:
#doing some processing stuff with the file
The problem is that it always raises error 13, Permission denied showing on that data_path I gave him but when I go to properties there it seems okay and all permissions are fine.
I tried some other ways like with os.open or i replaced for loop with:
with open(data_path, "r") as data:
and it always raises permission denied error.
os.walk works in some way but it's not what I need, and when i tried to modify it id didn't give errors but it also didnt do anything.
Just to say I'm not any pro programmer in python so I may be missing an obvious thing but ehh, I'm here to ask and learn. I also saw a lot of similiar questions but they mainly focus on .txt files and not specificaly in my case so I need to ask it here.
Anyway thanks for help in advance.
Edit: If you want an example for glob (more sane), here it is:
from pathlib import Path
# The pattern "**" means all subdirectories recursively,
# with "*.wav" meaning all files with any name ending in ".wav".
for file in Path(data_path).glob("**/*.wav"):
if not file.is_file(): # Skip directories
continue
with open(file, "w") as f:
# do stuff
For more info see Path.glob() on the documentation. Glob patterns are a useful thing to know.
Previous answer:
Try using either glob or os.walk(). Here is an example for os.walk().
from os import walk, path
# Recursively walk the directory data_path
for root, _, files in walk(data_path):
# files is a list of files in the current root, so iterate them
for file in files:
# Skip the file if it is not *.wav
if not file.endswith(".wav"):
continue
# os.path.join() will create the path for the file
file = path.join(root, files)
# Do what you need with the file
# You can also use block context to open the files like this
with open(file, "w") as f: # "w" means permission to write. If reading, use "r"
# Do stuff
Note that you may be confused about what open() does. It opens a file for reading, writing, and appending. Directories are not files, and therefore cannot be opened.
I suggest that you Google for documentation and do more reading about the functions used. The documentation will help more than I can.
Another good answer explaining in more detail can be seen here.
import glob
import os
main = '/main_wavs'
wavs = [w for w in glob.glob(os.path.join(main, '*/*.wav')) if os.path.isfile(w)]
In terms of permissions on a path A/B/C... A, B and C must all be accessible. For files that means read permission. For directories, it means read and execute permissions (listing contents).
How do I get the data from multiple txt files that placed in a specific folder. I started with this could not fix. It gives an error like 'No such file or directory: '.idea' (??)
(Let's say I have an A folder and in that, there are x.txt, y.txt, z.txt and so on. I am trying to get and print the information from all the files x,y,z)
def find_get(folder):
for file in os.listdir(folder):
f = open(file, 'r')
for data in open(file, 'r'):
print data
find_get('filex')
Thanks.
If you just want to print each line:
import glob
import os
def find_get(path):
for f in glob.glob(os.path.join(path,"*.txt")):
with open(os.path.join(path, f)) as data:
for line in data:
print(line)
glob will find only your .txt files in the specified path.
Your error comes from not joining the path to the filename, unless the file was in the same directory you were running the code from python would not be able to find the file without the full path. Another issue is you seem to have a directory .idea which would also give you an error when trying to open it as a file. This also presumes you actually have permissions to read the files in the directory.
If your files were larger I would avoid reading all into memory and/or storing the full content.
First of all make sure you add the folder name to the file name, so you can find the file relative to where the script is executed.
To do so you want to use os.path.join, which as it's name suggests - joins paths. So, using a generator:
def find_get(folder):
for filename in os.listdir(folder):
relative_file_path = os.path.join(folder, filename)
with open(relative_file_path) as f:
# read() gives the entire data from the file
yield f.read()
# this consumes the generator to a list
files_data = list(find_get('filex'))
See what we got in the list that consumed the generator:
print files_data
It may be more convenient to produce tuples which can be used to construct a dict:
def find_get(folder):
for filename in os.listdir(folder):
relative_file_path = os.path.join(folder, filename)
with open(relative_file_path) as f:
# read() gives the entire data from the file
yield (relative_file_path, f.read(), )
# this consumes the generator to a list
files_data = dict(find_get('filex'))
You will now have a mapping from the file's name to it's content.
Also, take a look at the answer by #Padraic Cunningham . He brought up the glob module which is suitable in this case.
The error you're facing is simple: listdir returns filenames, not full pathnames. To turn them into pathnames you can access from your current working directory, you have to join them to the directory path:
for filename in os.listdir(directory):
pathname = os.path.join(directory, filename)
with open(pathname) as f:
# do stuff
So, in your case, there's a file named .idea in the folder directory, but you're trying to open a file named .idea in the current working directory, and there is no such file.
There are at least four other potential problems with your code that you also need to think about and possibly fix after this one:
You don't handle errors. There are many very common reasons you may not be able to open and read a file--it may be a directory, you may not have read access, it may be exclusively locked, it may have been moved since your listdir, etc. And those aren't logic errors in your code or user errors in specifying the wrong directory, they're part of the normal flow of events, so your code should handle them, not just die. Which means you need a try statement.
You don't do anything with the files but print out every line. Basically, this is like running cat folder/* from the shell. Is that what you want? If not, you have to figure out what you want and write the corresponding code.
You open the same file twice in a row, without closing in between. At best this is wasteful, at worst it will mean your code doesn't run on any system where opens are exclusive by default. (Are there such systems? Unless you know the answer to that is "no", you should assume there are.)
You don't close your files. Sure, the garbage collector will get to them eventually--and if you're using CPython and know how it works, you can even prove the maximum number of open file handles that your code can accumulate is fixed and pretty small. But why rely on that? Just use a with statement, or call close.
However, none of those problems are related to your current error. So, while you have to fix them too, don't expect fixing one of them to make the first problem go away.
Full variant:
import os
def find_get(path):
files = {}
for file in os.listdir(path):
if os.path.isfile(os.path.join(path,file)):
with open(os.path.join(path,file), "r") as data:
files[file] = data.read()
return files
print(find_get("filex"))
Output:
{'1.txt': 'dsad', '2.txt': 'fsdfs'}
After the you could generate one file from that content, etc.
Key-thing:
os.listdir return a list of files without full path, so you need to concatenate initial path with fount item to operate.
there could be ideally used dicts :)
os.listdir return files and folders, so you need to check if list item is really file
You should check if the file is actually file and not a folder, since you can't open folders for reading. Also, you can't just open a relative path file, since it is under a folder, so you should get the correct path with os.path.join. Check below:
import os
def find_get(folder):
for file in os.listdir(folder):
if not os.path.isfile(file):
continue # skip other directories
f = open(os.path.join(folder, file), 'r')
for line in f:
print line
This is my first time using python and I keep running into error 183. The script I created searches the network for all '.py' files and copies them to my backup drive. Please don't laugh at my script as this is my first.
Any clue to what I am doing wrong in the script?
import os
import shutil
import datetime
today = datetime.date.today()
rundate = today.strftime("%Y%m%d")
for root,dirr,filename in os.walk("p:\\"):
for files in filename:
if files.endswith(".py"):
sDir = os.path.join(root, files)
dDir = "B:\\Scripts\\20120124"
modname = rundate + '_' + files
shutil.copy(sDir, dDir)
os.rename(os.path.join(dDir, files), os.path.join(dDir, modname))
print "Renamed %s to %s in %s" % (files, modname, dDir)
I'm guessing you are running the script on windows. According to the list of windows error codes error 183 is ERROR_ALREADY_EXISTS
So I would guess the script is failing because you're attempting to rename a file over an existing file.
Perhaps you are running the script more than once per day? That would result in all the destination files already being there, so the rename is failing when the script is run additional times.
If you specifically want to overwrite the files, then you should probably delete them using os.unlink first.
Given the fact that error 183 is [Error 183] Cannot create a file when that file already exists, you're most likely finding 2 files with the same name in the os.walk() call and after the first one is renamed successfully, the second one will fail to be renamed to the same name so you'll get a file already exists error.
I suggest a try/except around the os.rename() call to treat this case (append a digit after the name or something).
[Yes, i know it's been 7 years since this question was asked but if I got here from a google search maybe others are reaching it too and this answer might help.]
I just encounter the same issue, when you trying to rename a folder with a folder that existed in the same directory has the same name, Python will raise an error.
If you trying to do that in Windows Explorer, it will ask you if you want to merge these two folders. however, Python doesn't have this feature.
Below is my codes to achieve the goal of rename a folder while a same name folder already exist, which is actually merging folders.
import os, shutil
DEST = 'D:/dest/'
SRC = 'D:/src/'
for filename in os.listdir(SRC): # move files from SRC to DEST folder.
try:
shutil.move(SRC + filename, DEST)
# In case the file you're trying to move already has a copy in DEST folder.
except shutil.Error: # shutil.Error: Destination path 'D:/DEST/xxxx.xxx' already exists
pass
# Now delete the SRC folder.
# To delete a folder, you have to empty its files first.
if os.path.exists(SRC):
for i in os.listdir(SRC):
os.remove(os.path.join(SRC, i))
# delete the empty folder
os.rmdir(SRC)