I have a fun little script that I would like to make a copy of itself in a random directory, then run that copy.
I know how to run files with (hacky):
os.system('Filename.py')
And I know how to replicate files with shutil, but I am stuck at the random directory. Maybe I could somehow get a list of all available directories, pick one at random from the list, and then remove that directory from the list?
Thanks,
Itechmatrix
You can get a list of all directories and subdirectories, and shuffle it randomly, as follows:
import os
import random
all_dirs = [x[0] for x in os.walk('/tmp')]
random.shuffle(all_dirs)
for a_dir in all_dirs:
    print(a_dir)
    # do something with each directory, e.g. copy some file there.
You can get a list of directories and then randomly select:
import os
import random
dirs = [d for d in os.listdir('.') if os.path.isdir(d)]
n = random.randrange(len(dirs))
print(dirs[n])
If you're on a Mac, there are a fair number of hidden and restricted directories near the root, so you can run into errors with readability and writability. One way around that is to iterate through the available directories and filter out the unusable ones using the os module.
After that you can use the random.choice function to pick a random directory from that list.
import os, random
writing_dir = []
for directory in os.listdir():
    # os.W_OK checks that the path is writable; isdir skips regular files
    if os.path.isdir(directory) and os.access(directory, os.W_OK):
        writing_dir.append(directory)
path = random.choice(writing_dir)
I'm working on a similar script right now.
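Combining the ideas from these answers, a minimal sketch might look like the following. The name pick_writable_dir and its root argument are made up for illustration, and it assumes you only care about directories the current user can write to.

```python
import os
import random


def pick_writable_dir(root):
    """Return a random writable directory at or below root (hypothetical helper)."""
    candidates = [
        dirpath
        for dirpath, _dirnames, _filenames in os.walk(root)
        if os.access(dirpath, os.W_OK)  # keep only writable directories
    ]
    if not candidates:
        raise RuntimeError(f"no writable directory under {root!r}")
    return random.choice(candidates)
```

If you need to avoid picking the same directory twice, one option is to build the candidate list once, shuffle it with random.shuffle, and pop entries off the end as you use them.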
I have a small problem. I am trying to move 20x500 images in 20 predefined folders. I can make this work with just 500 random images and I have identified the problem; I draw 500 random files, move them and then it tries doing it again but since it doesn't update the random list, it fails when it reaches an image that it thinks is part of the random group but it has already been moved and thus fails. How do I "update" the random list of files so that it doesn't fail because I move stuff? The code is:
import os
import shutil
import random
folders = os.listdir(r'place_where_20_folders_are')
files = os.listdir(r'place_where_images_are')
string=r"string_to_add_to_make_full_path_of_each_file"
folders=[string+s for s in folders]
for folder in folders:
    for fileName in random.sample(files, min(len(files), 500)):
        path = os.path.join(r'place_where_images_are', fileName)
        shutil.move(path, folder)
I think the problem in your code is that the random.sample() method leaves the original files list unchanged. Because of this you have a chance of getting the same filename twice, but as you already moved it before you will have an error.
Instead of using sample you could use this snippet:
files_to_move = [files.pop(random.randrange(0, len(files))) for _ in range(500)]
This will pop (thus removing) 500 random files from the files list and save them in files_to_move. As you repeat this, the files list becomes smaller.
This answer was inspired by this answer to the question Random Sample with remove from List.
This would be used like this:
import os
import shutil
import random
folders = os.listdir(r'place_where_20_folders_are')
files = os.listdir(r'place_where_images_are')
string=r"string_to_add_to_make_full_path_of_each_file"
folders=[string+s for s in folders]
for folder in folders:
    files_to_move = [files.pop(random.randrange(0, len(files))) for _ in range(500)]
    for file_to_move in files_to_move:
        path = os.path.join(r'place_where_images_are', file_to_move)
        shutil.move(path, folder)
I would start by drawing the random sample first, then pass it on to be moved to the new location, and then remove those entries from the list of files with the list's remove() method (or pop them) before the loop starts again.
Hope it helps.
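Another way to avoid drawing the same file twice is to shuffle the list once and cut it into slices. This is only a sketch: split_into_batches is a made-up helper name, and it works on plain lists of names rather than touching the file system.

```python
import random


def split_into_batches(files, n_batches, batch_size):
    """Shuffle a copy of files and cut it into disjoint batches (hypothetical helper)."""
    shuffled = list(files)
    random.shuffle(shuffled)
    # Consecutive slices of one shuffled list can never overlap.
    return [
        shuffled[i * batch_size:(i + 1) * batch_size]
        for i in range(n_batches)
    ]
```

You could then pair the batches with the destination folders, e.g. with zip(folders, split_into_batches(files, 20, 500)), and move each batch into its folder. If there are fewer files than requested, the last batches simply come out short or empty.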
I am trying to write a piece of code that will recursively iterate through the subdirectories of a specific directory and stop only when reaching files with a '.nii' extension, appending these files to a list called images - a form of a breadth first search. Whenever I run this code, however, I keep receiving [Errno 20] Not a directory: '/Volumes/ARLO/ADNI/.DS_Store'
*/Volumes/ARLO/ADNI is the folder I wish to traverse through
*I am doing this in Mac using the Spyder IDE from Anaconda because it is the only way I can use the numpy and nibabel libraries, which will become important later
*I have already checked that this folder directly contains only other folders and not files
#preprocessing all the MCIc files
import os
#import nibabel as nib
#import numpy as np
def pwd():
    cmd = 'pwd'
    os.system(cmd)
    print(os.getcwd())
#Part 1
os.chdir('/Volumes/ARLO')
images = [] #creating an empty list to store MRI images
os.chdir('/Volumes/ARLO/ADNI')
list_sample = [] #just an empty list for an earlier version of
#the program
#Part 2
#function to recursively iterate through folder of raw MRI
#images and extract them into a list
#breadth first search
def extract(dir):
    #dir = dir.replace('.DS_Store', '')
    lyst = os.listdir(dir) #DS issue
    for item in lyst:
        if 'nii' not in item: #if item is not a .nii file, if
                              #item is another folder
            newpath = dir + '/' + item
            #os.chdir(newpath) #DS issue
            extract(newpath)
        else: #if item is the desired file type, append it to
              #the list images
            images.append(item)
#Part 3
adni = os.getcwd() #big folder I want to traverse
#print(adni) #adni is a string containing the path to the ADNI
#folder w/ all the images
#print(os.listdir(adni)) this also works, prints the actual list
"""adni = adni + '/' + '005_S_0222'
os.chdir(adni)
print(os.listdir(adni))""" #one iteration of the recursion,
#works
extract(adni)
print(images)
With every iteration, I wish to traverse further into the nested folders by appending the folder name to the growing path, and part 3 of the code works, i.e. I know that a single iteration works. Why does os keep adding the '.DS_Store' part to my directories in the extract() function? How can I correct my code so that the breadth first traversal can work? This folder contains hundreds of MRI images, I cannot do it without automation.
Thank you.
The .DS_Store files are not being created by the os module, but by the Finder (or, I think, sometimes Spotlight). They're where macOS stores things like the view options and icon layout for each directory on your system.
And they've probably always been there. The reason you didn't see them when you looked is that files that start with a . are "hidden by convention" on Unix, including macOS. Finder won't show them unless you ask it to show hidden files; ls won't show them unless you pass the -a flag; etc.
So, that's your core problem:
I have already checked that this folder directly contains only other folders and not files
… is wrong. The folder does contain at least one regular file; .DS_Store.
So, what can you do about that?
You could add special handling for .DS_Store.
But a better solution is probably to just check each file to see if it's a file or directory, by calling os.path.isdir on it.
Or, even better, use os.scandir instead of listdir, which gives you entries with more information than just the name, so you don't need to make extra calls like isdir.
Or, best of all, just throw out this code and use os.walk to recursively visit every file in every directory underneath your top-level directory.
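For instance, a rough sketch of the os.walk approach could look like this. The function name find_nii_files is invented here, and I'm assuming you want full paths of files whose names end in exactly '.nii' (compressed '.nii.gz' files would need an extra check).

```python
import os


def find_nii_files(top):
    """Collect the full paths of all files ending in .nii below top."""
    images = []
    for dirpath, _dirnames, filenames in os.walk(top):
        # filenames holds only regular files, so .DS_Store is never
        # treated as a directory here; it just fails the suffix test.
        for name in filenames:
            if name.endswith('.nii'):
                images.append(os.path.join(dirpath, name))
    return images
```

Calling find_nii_files('/Volumes/ARLO/ADNI') would then replace the recursive extract() function.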
I have a script that runs on a folder to create contour lines. Since I have roughly 2700 DEM which need to be processed, I need a way using the script to run on all folders within the parent folder saving them to an output folder. I am not sure how to script this but it would be greatly appreciated if I could get some guidance.
The following is the script I currently have which works on a single folder.
import arcpy
from arcpy import env
from arcpy.sa import *
env.workspace = "C:/DATA/ScriptTesting/test"
inRaster = "1km17670"
contourInterval = 5
baseContour = 0
outContours = "C:/DATA/ScriptTesting/test/output/contours5.shp"
arcpy.CheckOutExtension("Spatial")
Contour(inRaster,outContours, contourInterval, baseContour)
You're probably looking for os.walk(), which can recursively walk through all subdirectories of the given directory. You can either use the current working directory, or calculate your own parent folder and start from there, or whatever - but it'll give you the filenames for everything beneath what it starts with. From there, you can make a subroutine to determine whether or not to perform your script on that file.
You can get a list of all directories like this:
import arcpy
from arcpy import env
from arcpy.sa import *
import os
# pass in your root directory here
directories = os.listdir(root_dir)
Then you can iterate over these directories:
for directory in directories:
    # I assume you want the workspace attribute set to the subfolders;
    # os.listdir returns bare names, so join them with root_dir
    env.workspace = os.path.join(root_dir, directory)
    inRaster = "1km17670"
    contourInterval = 5
    baseContour = 0
    # here you need to adjust the output file name if there is a file for every subdir
    outContours = "C:/DATA/ScriptTesting/test/output/contours5.shp"
    arcpy.CheckOutExtension("Spatial")
    Contour(inRaster,outContours, contourInterval, baseContour)
As @a625993 mentioned, os.walk could be useful too if you have recursively nested directories. But as far as I can tell from your question, you have just a single level of subdirectories which directly contain the files, with no further directories below them. That's why listing just the directories underneath your root directory should be enough.
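To keep each run from overwriting the previous output, you could derive the output name from the subfolder name. The sketch below leaves arcpy out entirely and only builds the (workspace, output file) pairs; contour_jobs and the "contours_<folder>.shp" naming scheme are my own assumptions, not anything arcpy prescribes.

```python
import os


def contour_jobs(root_dir, output_dir):
    """Yield (workspace, output_shapefile) pairs, one per subfolder of root_dir."""
    with os.scandir(root_dir) as entries:
        for entry in entries:
            if entry.is_dir():  # skip stray files in the root directory
                out_name = f"contours_{entry.name}.shp"  # hypothetical naming scheme
                yield entry.path, os.path.join(output_dir, out_name)
```

Inside the loop you would set env.workspace to the first element and pass the second one as outContours. Note that if the output folder lives inside root_dir, it will show up as a subfolder too and should be skipped.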
Hello, I want to move or copy many folders from one list of folders to another. I use the glob and shutil libraries for this.
First I create a folder list:
import glob
#paths from source folder
sourcepath='C:/my/store/path/*'
paths = glob.glob(sourcepath)
my_file='10'
selected_path = filter(lambda x: my_file in x, paths)
#paths from destination folder
destpath='C:/my/store/path/*'
paths2 = glob.glob(destpath)
my_file1='20'
selected_path1 = filter(lambda x: my_file1 in x, paths2)
Now I have two lists of paths (selected_path and selected_path1).
I want to move or copy each folder from the first list (selected_path) to the second list (selected_path1).
Finally I tried this code to move the folders, but without success:
import shutil
for I,j in zip(selected_path,selected_path1)
shutil.move(i, j)
But that doesn't work. Any idea how to make my code work?
First, your use of lambda with filter is unnecessary: glob can perform this filtering itself. That is what glob is for, so the extra function calls only add clutter and cost some performance.
Look at this example, similar to yours:
import glob
# Find all .py files
sourcepath= 'C:/my/store/path/*.py'
paths = glob.glob(sourcepath)
# Find files that end with 'codes'
destpath= 'C:/my/store/path/*codes'
paths2 = glob.glob(destpath)
Second, the second glob call may or may not return a list of directories to move your directories/files into. This makes your code dependent on what C:/my/store/path contains. That is, you must guarantee that C:/my/store/path contains only directories and never files, so that glob returns only directories to be used in shutil.move. If the user later adds files (not folders) to C:/my/store/path whose names happen to end with 'codes' and have no extension (unlike, say, codes.txt or codes.py), you'll find those files in the list that glob returns in paths2. Guaranteeing that a directory contains only subdirectories is fragile and not a good idea. You can test for directories with os.path.isdir.
Notice something, you're using lambda with the help of filter to filter out any string that doesn't contain 10 in your first call to filter, something you can achieve with glob itself:
glob.glob('C:/my/store/path/*10*')
Now any file or subdirectory of C:/my/store/path that contains 10 in its name will be collected in the returned list of the glob function.
Third, zip truncates to the shortest iterable in its argument list. In other words, if you would like to move every path in paths to every path in paths2, you need len(paths) == len(paths2) so each file or directory in paths has a directory to be moved to in paths2.
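A tiny illustration of that truncation, with made-up folder names standing in for the real glob results:

```python
sources = ['a10', 'b10', 'c10']   # placeholder names, not real paths
destinations = ['x20', 'y20']

pairs = list(zip(sources, destinations))
# 'c10' has no partner, so zip drops it without raising an error.

if len(sources) != len(destinations):
    print(f"warning: {len(sources)} sources but {len(destinations)} destinations")
```

Checking the lengths up front, as in the last two lines, is one simple way to notice that some folders would be silently skipped.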
Fourth, you missed the colon at the end of the for statement, and in the call to shutil.move you used i instead of I. Python is a case-sensitive language, so uppercase I and lowercase i are different names:
import shutil
for I,j in zip(selected_path,selected_path1) # missing :
    shutil.move(i, j) # i not I
Corrected code:
import shutil
for i, j in zip(selected_path, selected_path1):
    shutil.move(i, j)
Assuming paths2 contains only subdirectories of C:/my/store/path, this is a better way to write your code, though definitely not the best:
import glob
#paths from source folder
sourcepath='C:/my/store/path/*10*'
paths = glob.glob(sourcepath)
#paths from destination folder
destpath='C:/my/store/path/*20*'
paths2 = glob.glob(destpath)
import shutil
for i,j in zip(paths,paths2):
    shutil.move(i, j)
*Still some of the previous issues that I mentioned above apply to this code.
And now that you finished the long marathon of reading this answer, what would you like to do to improve your code? I'll be glad to help if you still find something ambiguous.
Good luck :)
What is the best way to choose a random file from a directory in Python?
Edit: Here is what I am doing:
import os
import random
import dircache
dir = 'some/directory'
filename = random.choice(dircache.listdir(dir))
path = os.path.join(dir, filename)
Is this particularly bad, or is there a particularly better way?
import os, random
random.choice(os.listdir("C:\\")) #change dir name to whatever
Regarding your edited question: first, I assume you know the risks of using a dircache, as well as the fact that it is deprecated since 2.6, and removed in 3.0.
Second of all, I don't see where any race condition exists here. Your dircache object is basically immutable (after directory listing is cached, it is never read again), so no harm in concurrent reads from it.
Other than that, I do not understand why you see any problem with this solution. It is fine.
If you want directories included, use Yuval A's answer. Otherwise:
import os, random
random.choice([x for x in os.listdir("C:\\") if os.path.isfile(os.path.join("C:\\", x))])
The simplest solution is to make use of os.listdir & random.choice methods
random_file=random.choice(os.listdir("Folder_Destination"))
Let's take a look at it step by step :-
1} os.listdir method returns the list containing the name of
entries (files) in the path specified.
2} This list is then passed as a parameter to random.choice method
which returns a random file name from the list.
3} The file name is stored in random_file variable.
Considering a real time application
Here's a sample python code which will move random files from one directory to another
import os, random, shutil
#Prompting user to enter number of files to select randomly along with directory
source=input("Enter the Source Directory : ")
dest=input("Enter the Destination Directory : ")
no_of_files=int(input("Enter The Number of Files To Select : "))
print("%"*25+"{ Details Of Transfer }"+"%"*25)
print("\n\nList of Files Moved to %s :-"%(dest))
#Using for loop to randomly choose multiple files
for i in range(no_of_files):
    #Variable random_file stores the name of the random file chosen
    random_file=random.choice(os.listdir(source))
    print("%d} %s"%(i+1,random_file))
    #os.path.join builds the path with the correct separator for the platform
    source_file=os.path.join(source,random_file)
    dest_file=dest
    #"shutil.move" function moves file from one directory to another
    shutil.move(source_file,dest_file)
print("\n\n"+"$"*33+"[ Files Moved Successfully ]"+"$"*33)
You can check out the whole project on github
Random File Picker
For additional reference on the os.listdir and random.choice methods, you can refer to the tutorialspoint Python tutorial:
os.listdir :- Python listdir() method
random.choice :- Python choice() method
The problem with most of the solutions given is you load all your input into memory which can become a problem for large inputs/hierarchies. Here's a solution adapted from The Perl Cookbook by Tom Christiansen and Nat Torkington. To get a random file anywhere beneath a directory:
#! /usr/bin/env python
import os, random
n=0
random.seed()
for root, dirs, files in os.walk('/tmp/foo'):
    for name in files:
        n += 1
        # keep the n-th file with probability 1/n
        if random.uniform(0, n) < 1:
            rfile=os.path.join(root, name)
print(rfile)
Generalizing a bit makes a handy script:
$ cat /tmp/randy.py
#! /usr/bin/env python
import sys, random
random.seed()
n = 1
for line in sys.stdin:
    if random.uniform(0, n) < 1:
        rline=line
    n += 1
sys.stdout.write(rline)
$ /tmp/randy.py < /usr/share/dict/words
chrysochlore
$ find /tmp/foo -type f | /tmp/randy.py
/tmp/foo/bar
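The same single-pass idea can be wrapped in a reusable function. This is just a sketch; random_item is a name I made up, and it returns None for empty input as an arbitrary choice.

```python
import random


def random_item(iterable):
    """Return one item chosen uniformly at random, reading the input only once."""
    chosen = None
    for count, item in enumerate(iterable, start=1):
        # Replace the current pick with probability 1/count.
        if random.randrange(count) == 0:
            chosen = item
    return chosen
```

Each item ends up selected with probability 1/n overall: item k is picked with probability 1/k and then survives every later step with probability k/(k+1), (k+1)/(k+2), and so on up to (n-1)/n. Only one item is held in memory at a time, so it works on generators such as the one os.walk returns.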
Language agnostic solution:
1) Get the total no. of files in specified directory.
2) Pick a random number from 0 to [total no. of files - 1].
3) Get the list of filenames as a suitably indexed collection or such.
4) Pick the nth element, where n is the random number.
Independent of the language used, you can read all references to the files in a directory into a data structure such as an array (something like 'listFiles'), get the length of the array, calculate a random number in the range 0 to arrayLength-1, and access the file at that index. This should work in any language, not only in Python.
If you don't know before hand what files are there, you will need to get a list, then just pick a random index in the list.
Here's one attempt:
import os
import random
def getRandomFile(path):
    """
    Returns a random filename, chosen among the files of the given path.
    """
    files = os.listdir(path)
    index = random.randrange(0, len(files))
    return files[index]
EDIT: The question now mentions a fear of a "race condition", which I can only assume is the typical problem of files being added/removed while you are in the process of trying to pick a random file.
I don't believe there is a way around that, other than keeping in mind that any I/O operation is inherently "unsafe", i.e. it can fail. So, the algorithm to open a randomly chosen file in a given directory should:
Actually open() the file selected, and handle a failure, since the file might no longer be there
Probably limit itself to a set number of tries, so it doesn't die if the directory is empty or if none of the files are readable
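A rough sketch of those two points follows. The helper name open_random_file and the default of five attempts are arbitrary choices of mine, not something from the question.

```python
import os
import random


def open_random_file(directory, max_tries=5):
    """Try to open a randomly chosen file; return None if every attempt fails."""
    for _ in range(max_tries):
        names = os.listdir(directory)  # re-read each time; the contents may change
        if not names:
            return None
        path = os.path.join(directory, random.choice(names))
        try:
            return open(path, 'rb')
        except OSError:
            continue  # vanished, unreadable, or a directory: try another one
    return None
```

The caller is responsible for closing the returned file object, and still has to handle the None case.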
Python 3 has the pathlib module, which can be used to reason about files and directories in a more object oriented fashion:
from random import choice
from pathlib import Path
path: Path = Path()
# The Path.iterdir method returns a generator, so we must convert it to a list
# before passing it to random.choice, which expects a sequence.
random_path = choice(list(path.iterdir()))
This code doesn't repeat the file names:
import random
def random_files(num, list_):
    file_names = []
    while True:
        ap = random.choice(list_)
        if ap not in file_names:
            file_names.append(ap)
        if len(file_names) == num:
            return file_names
random_200_files = random_files(200, list_of_files)
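For what it's worth, the standard library's random.sample does the same job (picking without repeats) in one call. The file names below are placeholders I generated just so the example runs.

```python
import random

list_of_files = [f"file_{i}.txt" for i in range(1000)]  # placeholder names
random_200_files = random.sample(list_of_files, 200)    # 200 distinct entries
```

Unlike the loop above, which never returns if you ask for more names than the list contains, random.sample raises ValueError in that case.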
For those who come here with the need to pick a large number of files from a larger number of files, and maybe copy or move them in another dir, the proposed approach is of course too slow.
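If you have enough memory, you can read the whole directory listing into a list and then use the random.choices function to select, for example, 17 elements. Note that choices samples with replacement, so the same file can be picked more than once; random.sample picks without repeats.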
from random import choices
from glob import glob
from shutil import copy
file_list = glob([SRC DIR] + '*' + [FILE EXTENSION])
picked_files = choices(file_list, k=17)
Now picked_files is a list of 17 filenames picked at random, which can be copied or moved, even in parallel, for example:
import multiprocessing as mp
from itertools import repeat
from shutil import copy
def copy_files(filename, dest):
    print(f"Working on file: {filename}")
    copy(filename, dest)
with mp.Pool(processes=(mp.cpu_count() - 1) or 1) as p:
    p.starmap(copy_files, zip(picked_files, repeat([DEST PATH])))