python worm how to make it more complex? - python

Please be kind, this is my second post and I hope you all like it.
Here I have made a program that creates directories inside directories,
but I would like a way to make it self-replicate.
Any ideas and help would be greatly appreciated.
Before:
user/scripts
After:
user/scripts/worm1/worm2/worm3
The script is as follows:
import os, sys, string, random
worms_made = 0
stop = 20
patha = ''
pathb = '/'
pathc = ''
def fileworm(worms_made, stop, patha, pathb, pathc):
    filename = (''.join(random.choice(string.ascii_lowercase
        + string.ascii_uppercase + string.digits) for i in range(8)))
    pathc = patha + filename + pathb
    worms_made = worms_made + 1
    os.system("mkdir %s" % filename)
    os.chdir(pathc)
    print "Worms made: %r" % worms_made
    if worms_made == stop:
        print "*Done"
        exit(0)
    elif worms_made != stop:
        pass
    fileworm(worms_made, stop, patha, pathb, pathc)
fileworm(worms_made, stop, patha, pathb, pathc)

To create a variable depth, you could do something like this:
import os
depth = 3
worms = ['worm{}'.format(x) for x in range(1, depth+1)]
path = os.path.join(r'/user/scripts', *worms)
os.makedirs(path)
os.makedirs() will create all the required folders in one call (note that it lives in os, not os.path). You just need to build the full path.
Python has a function to help with creating paths called os.path.join(). This makes sure the correct / or \ is automatically added for the current operating system between each part.
worms is a list containing ["worm1", "worm2", "worm3"]; it is created using a Python feature called a list comprehension. It is passed to the os.path.join() function using *, which means that each element of the list is passed as a separate argument.
I suggest you try adding print worms or print path to see how it works.
The result is that a string like the following is passed to the function to create your folder structure:
/user/scripts/worm1/worm2/worm3
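A minimal, self-contained sketch of the same idea in Python 3 might look like the following. It builds the folders under a temporary directory rather than /user/scripts, and the helper name make_nested and its prefix parameter are assumptions made for illustration only.

```python
import os
import tempfile


def make_nested(base, depth, prefix="worm"):
    """Create base/worm1/worm2/.../worm<depth> and return the deepest path."""
    names = ["{}{}".format(prefix, n) for n in range(1, depth + 1)]
    path = os.path.join(base, *names)
    # os.makedirs creates every missing intermediate directory in one call.
    os.makedirs(path)
    return path


base = tempfile.mkdtemp()  # throwaway location, assumed for this demo
deepest = make_nested(base, 3)
print(os.path.relpath(deepest, base))
```

Changing the depth argument is then the only thing needed to get a deeper or shallower chain.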

Related

Get number of files in directory with pathlib python

I have two directories with CSV files. Both should contain the same number of files, as I am looping over both of them with zip. Therefore I have a check to see whether their lengths are the same. The code looks like this:
from pathlib import Path
import pandas as pd
def check():
    base = Path('home/user/src/log').rglob('*.csv')
    test = Path('home/user/src/log').rglob('*.csv')
    print(list(base))
    if len(list(base)) != len(list(test)):
        print(f"Wrong number of files in {str(base)} and {str(test)}")
        return -1
    for base, test in zip(base, test):
        x = pd.read_csv(base)
        y = pd.read_csv(test)
        print(x)
        print(y)
if __name__ == '__main__':
    check()
list(base) gives the list of files, but it also silently kills the program. So if I have print(list(base)), it will print the files in base and then the program terminates.
str(base) does not work either, but that is because I haven't found a way to print the directory path without the program terminating afterwards. Any tips for getting the length of the list and printing the directory without killing the program?
Note: I know I can use 'os', but I would like to use pathlib if possible.
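rglob returns a generator. Calling list() on the generator consumes all of its items, so any later list(base) or zip(base, test) sees an empty iterator and the loop body never runs.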
You could however convert it to a list initially and then keep working with the list afterwards:
from pathlib import Path
import pandas as pd
def check():
    base_dir = Path('home/user/src/log')
    test_dir = Path('home/user/src/log')
    # Materialise each generator once so the lists can be reused
    base = list(base_dir.rglob('*.csv'))
    test = list(test_dir.rglob('*.csv'))
    print(base)
    if len(base) != len(test):
        print(f"Wrong number of files in {base_dir} and {test_dir}")
        return -1
    # Separate loop names avoid rebinding base and test
    for base_file, test_file in zip(base, test):
        x = pd.read_csv(base_file)
        y = pd.read_csv(test_file)
        print(x)
        print(y)
if __name__ == '__main__':
    check()
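To see the exhaustion on its own, here is a small sketch that runs against a temporary directory. The directory and the two file names are made up for the demonstration; only the rglob behaviour matters.

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())  # temporary directory, assumed for the demo
for name in ("a.csv", "b.csv"):
    (root / name).write_text("x,y\n1,2\n")

matches = root.rglob("*.csv")   # a lazy generator, not a list
first_pass = list(matches)      # consumes every item
second_pass = list(matches)     # nothing left to yield

print(len(first_pass), len(second_pass))  # prints: 2 0

files = sorted(root.rglob("*.csv"))  # materialise once, reuse freely
print(len(files), len(files))        # prints: 2 2
```

Sorting the result is optional, but it gives both directories a predictable order before they are paired up with zip.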

Not sure why path only exists for portion of while-loop

At a certain point in my while loop I run os.listdir on a three item index to generate a list of files and I get "Windows Error 3 - Path doesn't exist" even though I have called this path previously in my script successfully.
I ran an os.path.exists on it after and it says that for the first two loops the directory evaluates as False but on the third it evaluates as True.
I've tried using glob.glob and that also only returns files on the third loop.
The trouble is in the while loop under the "creates Read nodes based on number of shots found" comment.
Any help getting it to evaluate as True from the beginning would be appreciated, thanks!
# Iterating using while loop, this gets every version folder for each shots' plates and stores to a "version" list
while shotIndex < shotAmountTotal:
    nextShot = (shots[shotIndex])
    shotIndex += 1
    verSearchPath = shotSearchPath + '/' + nextShot + '/' + compFolder + '/' + platesFolder
    foundVerList = os.listdir(verSearchPath)
    verListCombined.append(foundVerList)
verListSorted = list(chain.from_iterable(verListCombined))
# this groups the like folder names, splits them at the underscore before the version number and then returns only the highest version number of each group
groupedShotFolders = groupby(verListSorted, key=lambda version: version.rsplit('_', 1)[0])
latestShotVer = [sorted(group, reverse=True)[0] for key, group in groupedShotFolders]
# creates Read nodes based on number of shots found
latestShotAmount = len(latestShotVer)
latestShotIndex = 0
while latestShotIndex < latestShotAmount:
    latestShot = (latestShotVer[latestShotIndex])
    frameListerPath = verSearchPath + '/' + latestShot + '/' + fileExtension + '/'
    print os.path.exists(frameListerPath)
    frameLister = os.listdir(verSearchPath + '/' + latestShot + '/' + fileExtension + '/')
The terminal output I am getting is:
Result: False
E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_010_BG_001_v002/exr/
[]
False
E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_010_FG_001_v003/exr/
[]
True
E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_020_BG_001_v003/exr/
['E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_020_BG_001_v003/exr\\MRS_103_005_020_BG_001_v003.0999.exr', 'E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_020_BG_001_v003/exr\\MRS_103_005_020_BG_001_v003.1000.exr', 'E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_020_BG_001_v003/exr\\MRS_103_005_020_BG_001_v003.1001.exr', 'E:/projects/MBR/shots/103/MRS_103_005_020/2d/plates/MRS_103_005_020_BG_001_v003/exr\\MRS_103_005_020_BG_001_v003.1002.exr']
Well, I'm going to chalk this one up to user error. Firstly I switched to the "for" and "join" methods to clean things up a bit. Next it turns out that the "nextShot" variable in my previous code wasn't iterating properly so it only output the last item of the iteration which is why the first two paths evaluated to False. I fixed this by putting the "latestShot" variable in its place and trimming off the descriptors.
To avoid the double-slashing that happens with "os.listdir", I split the path on backslashes and rejoined it with forward slashes.
Thanks for your thoughts and guidance on this.
for latestShot in latestShotVer:
    nuke.createNode("Read", inpanel = False)
    frameListerPath = os.path.join(shotSearchPath, latestShot[:-12], compFolder, platesFolder, latestShot, fileExtension)
    flsplit = frameListerPath.split('\\')
    fljoin = '/'.join(flsplit)
    frameListerPath = fljoin
    frameLister = os.listdir(frameListerPath)
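As a possible alternative to the split and rejoin step, posixpath.join from the standard library always joins with forward slashes. The sketch below leaves out the Nuke and os.listdir calls so it can run anywhere; the folder values and the helper name frame_dir are assumptions based on the terminal output above, not the real project settings.

```python
import posixpath

# Assumed values for illustration; the real ones come from the Nuke script.
shot_search_path = "E:/projects/MBR/shots/103"
comp_folder = "2d"
plates_folder = "plates"
file_extension = "exr"

latest_versions = [
    "MRS_103_005_010_BG_001_v002",
    "MRS_103_005_020_BG_001_v003",
]


def frame_dir(version_folder):
    """Build the frame directory for one version folder, per loop item."""
    # Drop the trailing "_BG_001_v002"-style descriptor (12 characters)
    # to recover the shot name, as latestShot[:-12] does above.
    shot = version_folder[:-12]
    return posixpath.join(shot_search_path, shot, comp_folder,
                          plates_folder, version_folder, file_extension)


paths = [frame_dir(version) for version in latest_versions]
for path in paths:
    print(path)
```

Because the shot name is derived from each item inside the loop, every path points at its own shot folder rather than at whichever shot an earlier loop happened to finish on.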

Time module and file changes

I need to write a script that does the following
Write a python script to list all of the files and directories in the current directory and all subdirectories that have been modified in the last X minutes.
X should be taken in as a command-line argument.
Check that this argument exists, and exit with a suitable error message if it doesn’t.
X should be an int which is less than or equal to 120. If not, exit with a suitable error message.
For each of these files and directories, list the time of modification, whether it is a file or directory,
and its size.
I have come up with this
#!/usr/bin/python
import os,sys,time
total = len(sys.argv)
if total < 2:
    print "You need to enter a value in minutes"
    sys.exit()
var = int(sys.argv[1])
if var < 1 or var > 120 :
    print "The value has to be between 1 and 120"
    sys.exit()
past = time.time() - var * 60
result = []
dir = os.getcwd()
for p, ds, fs in os.walk(dir):
    for fn in fs:
        filepath = os.path.join(p, fn)
        status = os.stat(filepath).st_mtime
        if os.path.getmtime(filepath) >= past:
            size = os.path.getsize(filepath)
            result.append(filepath)
            created = os.stat(fn).st_mtime
            asciiTime = time.asctime( time.gmtime( created ) )
            print "Files that have changed are %s"%(result)
            print "Size of file is %s"%(size)
So it reports back with something like this
Files that have changed are ['/home/admin/Python/osglob2.py']
Size of file is 729
Files that have changed are ['/home/admin/Python/osglob2.py', '/home/admin/Python/endswith.py']
Size of file is 285
Files that have changed are ['/home/admin/Python/osglob2.py', '/home/admin/Python/endswith.py', '/home/admin/Python/glob3.py']
Size of file is 633
How can I get this to stop repeating the files?
The reason your code builds a list of all the files it's encountered is
result.append(filepath)
and the reason it prints out that whole list every time is
print "Files that have changed are %s"%(result)
So you will need to change one of those lines: either replace the list, rather than appending to it, or (much more sensible IMO) just print out the one latest filename found, rather than the whole list.
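You aren't clearing your result list at the end of each iteration. Try something like result.clear() after your second print statement, inside the for loop so that it runs on every iteration. Note that list.clear() requires Python 3.3 or later; the script above uses Python 2 print statements, where del result[:] does the same job.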
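One way to put these suggestions together is sketched below in Python 3: each entry is reported exactly once, and directories are included as the assignment asks. The function name recently_modified, the tuple layout and the temporary demo directory are assumptions for illustration; argument checking is left out.

```python
import os
import tempfile
import time


def recently_modified(root, minutes, now=None):
    """Return (path, kind, size, mtime) for entries changed in the last `minutes`."""
    now = time.time() if now is None else now
    cutoff = now - minutes * 60
    found = []
    for parent, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(parent, name)  # full path, not just the name
            info = os.stat(path)
            if info.st_mtime >= cutoff:
                kind = "directory" if os.path.isdir(path) else "file"
                found.append((path, kind, info.st_size, info.st_mtime))
    return found


# Demo in a temporary directory (assumed here; the original walks os.getcwd()).
root = tempfile.mkdtemp()
fresh = os.path.join(root, "fresh.txt")
stale = os.path.join(root, "stale.txt")
for path in (fresh, stale):
    with open(path, "w") as handle:
        handle.write("data")
two_hours_ago = time.time() - 2 * 60 * 60
os.utime(stale, (two_hours_ago, two_hours_ago))  # pretend it is old

# One line per entry, so nothing is repeated.
for path, kind, size, mtime in recently_modified(root, 10):
    print(time.ctime(mtime), kind, size, path)
```

Collecting the rows first and printing afterwards keeps the walking logic separate from the output format.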

Recursive printing of file directory in Python

I'm trying to figure out how to print out each item in a directory with proper indentation. The code I have so far is below:
import os
def traverse(pathname,d):
    'prints a given nested directory with proper indentation'
    indent = ''
    for i in range(d):
        indent = indent + ' '
    for item in os.listdir(pathname):
        try:
            newItem = os.path.join(pathname, item)
            traverse(newItem,d+1)
        except:
            print(indent + newItem)
The output that I have prints out all the files in the test directory, but does not print out the folder names. What I get is this:
>>> traverse('test',0)
test/fileA.txt
test/folder1/fileB.txt
test/folder1/fileC.txt
test/folder1/folder11/fileD.txt
test/folder2/fileD.txt
test/folder2/fileE.txt
>>>
What the output should be:
>>> traverse('test',0)
test/fileA.txt
test/folder1
test/folder1/fileB.txt
test/folder1/fileC.txt
test/folder1/folder11
test/folder1/folder11/fileD.txt
test/folder2
test/folder2/fileD.txt
test/folder2/fileE.txt
>>>
Can anyone let me know what I need to be doing with the code to get the folder names to show up? I've tried to print out the pathname, but it just repeats the folder name every time Python prints out a file name since it is in a for loop. A nudge in the right direction would be greatly appreciated!
You need to print the file name whether it's a directory or not, something like:
for item in os.listdir(pathname):
    try:
        newItem = os.path.join(pathname, item)
        print(indent + newItem)
        traverse(newItem,d+1)
    except:
        pass
Though I would rather not use an exception to detect whether it's a directory, so if os.path.isdir is allowed:
for item in os.listdir(pathname):
    newItem = os.path.join(pathname, item)
    print(indent + newItem)
    if (os.path.isdir(newItem)):
        traverse(newItem,d+1)
A recursive directory printing function that does not use os.walk might look something like this:
def traverse(root, depth, indent=''):
    for file_ in os.listdir(root):
        leaf = os.path.join(root, file_)
        print indent, leaf
        if os.path.isdir(leaf) and depth:
            traverse(leaf, depth-1, indent + ' ')
I don't approve of teaching people to avoid built ins to try to drive home a concept. There are plenty of ways to teach recursion without making someone neglect the finer aspects of the language they're learning. That's my opinion though, and apparently not one shared by many professors.
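For comparison, a version built on the os.walk built-in might look roughly like this in Python 3. It is only a sketch: the name tree_lines, the two-space indent and the small test tree are assumptions, and it lists bare names with a trailing separator on folders rather than the full paths shown in the question.

```python
import os
import tempfile


def tree_lines(root, indent="  "):
    """Return one line per entry, indented by its depth below root."""
    lines = []
    for parent, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort also fixes the order os.walk descends in
        rel = os.path.relpath(parent, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth:
            # Folder name, one level shallower than the entries it contains
            lines.append(indent * (depth - 1) + os.path.basename(parent) + os.sep)
        for name in sorted(filenames):
            lines.append(indent * depth + name)
    return lines


# Build a small tree in a temporary directory for the demo.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "folder1", "folder11"))
os.makedirs(os.path.join(root, "folder2"))
for rel in ("fileA.txt", "folder1/fileB.txt",
            "folder1/folder11/fileD.txt", "folder2/fileE.txt"):
    open(os.path.join(root, *rel.split("/")), "w").close()

print("\n".join(tree_lines(root)))
```

Here os.walk does the descending, so the function itself contains no recursion and no exception handling.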

execute python script multiple times

I'm not sure about the best way to do this, but I have a Python script saved as a .py file. The final output of this script is two files, x1.txt and y1.txt.
Basically I want to run this script say 1000 times and each run write my two text files with new names i.e x1.txt + y1.txt then second run x2.txt and y2.txt.
Thinking about this it seems it might be better to start the whole script with something like
runs = xrange(1, 1001)
for i in runs:
    pass  ## run the script
and then finish with something that does
for i in runs:
    filenameA = "x%d.txt" % i
    open(filenameA, "w").write('\n'.join('\t'.join(x for x in g if x) for g in grouper(7, values)))
for i in runs:
    filenameB = "y%d.txt" % i
    open(filenameB, "w").write('\n'.join('\t'.join(x for x in g if x) for g in grouper(7, values)))
Is this really the best way to do it? I bet it's not. Any better ideas?
I know you can import time and write a filename that matches the time, but this would be annoying for processing later.
If your computer has the resources to run these in parallel, you can use multiprocessing to do it. Otherwise use a loop to execute them sequentially.
Your question isn't quite explicit about which part you're stuck with. Do you just need advice about whether you should use a loop? If yes, my answer is above. Or do you also need help with forming the filenames? You can do that part like this:
import sys
def myscript(iteration_number):
    xfile_name = "x%d.txt" % iteration_number
    yfile_name = "y%d.txt" % iteration_number
    with open(xfile_name, "w") as xf:
        with open(yfile_name, "w") as yf:
            pass  # ... whatever your script does goes here
def main(unused_command_line_args):
    for i in xrange(1000):
        myscript(i)
    return 0
if __name__ == '__main__':
    sys.exit(main(sys.argv))
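If the files should be numbered from 1 (x1.txt, y1.txt, and so on) as in the question, a Python 3 variant could look like the sketch below. The helper name write_run, the placeholder file contents and the temporary output directory are assumptions; only three runs are made here to keep the demo small.

```python
import os
import tempfile


def write_run(run_number, out_dir):
    """Write x<N>.txt and y<N>.txt for one run and return both paths."""
    x_path = os.path.join(out_dir, "x{}.txt".format(run_number))
    y_path = os.path.join(out_dir, "y{}.txt".format(run_number))
    with open(x_path, "w") as xf, open(y_path, "w") as yf:
        # Placeholder content; the real script would write its results here.
        xf.write("x data for run {}\n".format(run_number))
        yf.write("y data for run {}\n".format(run_number))
    return x_path, y_path


out_dir = tempfile.mkdtemp()  # assumed output location for the demo
for run in range(1, 4):       # 3 runs here; range(1, 1001) would give 1000
    write_run(run, out_dir)

print(sorted(os.listdir(out_dir)))
```

Passing the run number into the function keeps the naming in one place, so the body of the script does not need to know how many runs there are.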
import subprocess
script_name = 'dummy_file.py'
output_prefix = 'out'
n_iter = 5
for i in range(n_iter):
    output_file = output_prefix + '_' + str(i) + '.txt'
    # Send the child's stdout and stderr to the file without rebinding sys.stdout
    with open(output_file, 'w') as out:
        subprocess.call(['python', script_name], stdout=out, stderr=subprocess.STDOUT)
On running this, you'll get 5 output text files (out_0.txt, ..., out_4.txt)
I'm not sure, but maybe this can help:
Suppose, I want to print 'hello' 10 times, without manually writing it 10 times. For doing this, I can define a function :
# Function for printing hello 10 times:
def func(x):
    x = "hello"
    i = 1
    while i < 10:
        print(x)
        i += 1
    else:
        print(x)
# Call it directly; wrapping the call in print() would also print None
func(1)
