I am new to Python and want an auto-completing variable inside a while loop. Here is a minimal example. Let's assume I have the following files in my folder, each starting with the same letters and an increasing number, but ending in what are effectively random numbers:
a_i=1_404
a_i=2_383
a_i=3_180
I want a while loop like
while n <= 3:
    old_timestep = 'a_i=n_*'
    actual_timestep = 'a_i=(n+1)_*'
    ... (some functions that need the filenames saved in the two variables initialised above)
    n = n+1
So if I start the loop I want it to automatically process all files in my directory. As a result there are two questions:
1) How do I tell Python that I want the filename to be completed automatically (in my example I used the '*')?
2) How do I use a formula inside the filename (in my example the '(n+1)')?
Many thanks in advance!
1) To my knowledge you can't do that automatically. I would store all filenames in the directory in a list and then search through that list:
from os import listdir
from os.path import isfile, join
dir_files = [f for f in listdir('.') if isfile(join('.', f))]
i = 1
while i <= 3:
    # build the prefix for the current step, e.g. "a_i=1_"
    old_timestep = "a_i={}_".format(i)
    for f in dir_files:
        if f.startswith(old_timestep):
            pass  # process file
    i += 1
2) You can use string concatenation
f = open("a_i=" + str(n + 1) + "remainder of filename", 'w')
You can use the glob module to do * expansion:
import glob
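# glob.iglob returns an iterator, so next() yields the first match
# (glob.glob returns a list, which next() cannot consume)
old = next(glob.iglob('a_i={}_*'.format(n)))
actual = next(glob.iglob('a_i={}_*'.format(n + 1)))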
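As a rough sketch of how the glob approach could be wrapped into the loop from the question: the helper names find_step_file and consecutive_pairs are made up for illustration, and the code assumes at most one file per step number. It is one possible arrangement, not the only one.

```python
import glob
import os
import tempfile


def find_step_file(directory, n):
    """Return the first file (in sorted order) matching a_i=<n>_*, or None."""
    # glob.escape protects any special characters in the directory name itself
    pattern = os.path.join(glob.escape(directory), "a_i={}_*".format(n))
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


def consecutive_pairs(directory, first, last):
    """Yield (old, actual) file names for steps first .. last-1."""
    for n in range(first, last):
        old = find_step_file(directory, n)
        actual = find_step_file(directory, n + 1)
        if old is None or actual is None:
            break  # stop at the first missing step
        yield os.path.basename(old), os.path.basename(actual)


if __name__ == "__main__":
    # Demo in a throwaway directory with the file names from the question
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a_i=1_404", "a_i=2_383", "a_i=3_180"):
            open(os.path.join(tmp, name), "w").close()
        for old, actual in consecutive_pairs(tmp, 1, 3):
            print(old, "->", actual)
```

The trailing underscore in the pattern matters: without it, step 1 would also match files for step 10, 11 and so on.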
I found a way to solve 1) myself by first doing a) and then b):
1) a) Copy the files to truncated names, so that only the first x characters are left (shell):
for i in {1..3}; do
    cp a_i=$((i))_* a_i=$((i))
done
b) Loop over the truncated names in Python:
n = 1
while n <= 3:
    old = 'a_i=' + str(n)
    new = 'a_i=' + str(n+1)
    n += 1
The str() call converts the integer n into a string so that it can be concatenated.
Thanks for your input!
I have a scanner that creates a folder of images named like this:
A1.jpg A2.jpg A3.jpg...A24.jpg -> B1.jpg B2.jpg B3.jpg...B24.jpg
There are 16 rows and 24 images per letter row i.e A1 to P24, 384 images total.
I would like to rename them by reversing the order. The first file should take the name of the last and vice versa. Consider first to be A1 (which is also the first created during scanning)
The closest example I can find is in shell but that is not really what I want:
for i in {1..50}; do
    mv "$i.txt" "renamed/$(( 50 - $i + 1 )).txt"
done
Perhaps I need to save the filenames into a list (natsort maybe?) then use those names somehow?
I also thought I could use the image creation time as the scanner always creates the files in the same order with the same names. In saying that, any solutions may not be so useful for others with the same challenge.
What is a sensible approach to this problem?
I don't know if this is the optimal way of doing it, but here it is:
import os
folder_name = "test"
new_folder_name = folder_name + "_new"
# os.listdir returns names in arbitrary order, so sort before reversing
# (note: plain sorting is lexicographic, i.e. A1, A10, A11, ..., not scan order)
file_names = sorted(os.listdir(folder_name))
file_names_new = file_names[::-1]
print(file_names)
print(file_names_new)
os.mkdir(new_folder_name)
for name, new_name in zip(file_names, file_names_new):
    os.rename(folder_name + "/" + name, new_folder_name + "/" + new_name)
os.rmdir(folder_name)
os.rename(new_folder_name, folder_name)
This assumes that your files are saved in the directory "test".
I would store the original list. Then rename all files in the same order (e.g. 1.jpg, 2.jpg etc.). Then I'd rename all of those files into the reverse of the original list.
In that way you will not encounter duplicate file names during the renaming.
You can make use of the pathlib functions rename and iterdir for this. I think it's straightforward how to put that together.
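A minimal sketch of that two-pass idea with pathlib might look like the following. The natural_key helper and the ".tmp_rename" suffix are assumptions made for illustration; the key sorts A2 before A10, so the order matches the scan order described in the question.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that compares runs of digits as integers (A2 before A10)."""
    parts = re.split(r"(\d+)", path.name)
    return [int(part) if part.isdigit() else part for part in parts]


def reverse_names(folder, temp_suffix=".tmp_rename"):
    """Give the first file the last file's name and vice versa, in two passes."""
    files = sorted((p for p in Path(folder).iterdir() if p.is_file()),
                   key=natural_key)
    targets = [p.name for p in reversed(files)]

    # Pass 1: move every file to a temporary name so no target name is taken.
    temps = []
    for p in files:
        temp = p.with_name(p.name + temp_suffix)
        p.rename(temp)
        temps.append(temp)

    # Pass 2: move each temporary file to its final, reversed name.
    for temp, target in zip(temps, targets):
        temp.rename(temp.with_name(target))
```

Calling reverse_names on the scanner folder would then swap A1.jpg with P24.jpg, A2.jpg with P23.jpg, and so on. It is worth trying on a copy of the folder first, since an interruption between the two passes leaves files under their temporary names.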
A solution based on the shutil package (the os package sometimes has permission problems) that works "in place", so as not to waste memory if the folder is huge:
import wizzi_utils as wu
import os
def reverse_names(dir_path: str, temp_file_suffix: str = '.temp_unique_suffix') -> None:
    """
    "in place" solution:
    go over the list from both directions and swap names
    swap needs a temp variable so move first file to target name with 'temp_file_suffix'
    """
    files_full_paths = wu.find_files_in_folder(dir_path=dir_path, file_suffix='', ack=True, tabs=0)
    files_num = len(files_full_paths)
    for i in range(files_num):  # works for even and odd files_num
        j = files_num - i - 1
        if i >= j:  # crossed the middle - done
            break
        file_a, file_b = files_full_paths[i], files_full_paths[j]
        print('replacing {}(idx in dir {}) with {}(idx in dir {}):'.format(
            os.path.basename(file_a), i, os.path.basename(file_b), j))
        temp_file_name = '{}{}'.format(file_b, temp_file_suffix)
        wu.move_file(file_src=file_a, file_dst=temp_file_name, ack=True, tabs=1)
        wu.move_file(file_src=file_b, file_dst=file_a, ack=True, tabs=1)
        wu.move_file(file_src=temp_file_name, file_dst=file_b, ack=True, tabs=1)
    return
def main():
    reverse_names(dir_path='./scanner_files', temp_file_suffix='.temp_unique_suffix')
    return
if __name__ == '__main__':
    main()
found 6 files that ends with in folder "D:\workspace\2021wizzi_utils\temp\StackOverFlow\scanner_files":
['A1.jpg', 'A2.jpg', 'A3.jpg', 'B1.jpg', 'B2.jpg', 'B3.jpg']
replacing A1.jpg(idx in dir 0) with B3.jpg(idx in dir 5):
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A1.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B3.jpg.temp_unique_suffix(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B3.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A1.jpg(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B3.jpg.temp_unique_suffix Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B3.jpg(0B)
replacing A2.jpg(idx in dir 1) with B2.jpg(idx in dir 4):
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A2.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B2.jpg.temp_unique_suffix(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B2.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A2.jpg(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B2.jpg.temp_unique_suffix Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B2.jpg(0B)
replacing A3.jpg(idx in dir 2) with B1.jpg(idx in dir 3):
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A3.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B1.jpg.temp_unique_suffix(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B1.jpg Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/A3.jpg(0B)
D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B1.jpg.temp_unique_suffix Moved to D:/workspace/2021wizzi_utils/temp/StackOverFlow/scanner_files/B1.jpg(0B)
My file structure looks like this:
- Outer folder
- Inner folder 1
- Files...
- Inner folder 2
- Files...
- …
I'm trying to count the total number of files in the whole of Outer folder. os.walk doesn't return any files when I pass it the Outer folder, and as I've only got two layers I've written it manually:
total = 0
folders = ([name for name in os.listdir(Outer_folder)
            if os.path.isdir(os.path.join(Outer_folder, name))])
for folder in folders:
    contents = os.listdir(os.path.join(Outer_folder, folder))
    total += len(contents)
print(total)
Is there a better way to do this? And can I find the number of files in an arbitrarily nested set of folders? I can't see any examples of deeply nested folders on Stack Overflow.
By 'better', I mean some kind of built in function, rather than manually writing something to iterate - e.g. an os.walk that walks the whole tree.
Use pathlib:
The question "Return total number of files in directory and subdirectories" shows how to get just the total number.
pathlib is part of the standard library and is generally preferable to os, because it treats paths as objects with methods, not strings to be sliced.
See also: "Python 3's pathlib Module: Taming the File System".
Use a condition to select only files:
[x.parent for x in f if x.is_file()]
File and subdirectory count in each directory:
from pathlib import Path
import numpy as np
p = Path.cwd() # if you're running in the current dir
# p = Path('path to dir') # otherwise, specify a path
# creates a generator of all the files matching the pattern
f = p.rglob('*')
# optionally, use list(...) to unpack the generator
# f = list(p.rglob('*'))
# counts them
paths, counts = np.unique([x.parent for x in f], return_counts=True)
path_counts = list(zip(paths, counts))
Output:
List of tuples with path and count
[(WindowsPath('E:/PythonProjects/stack_overflow'), 8),
(WindowsPath('E:/PythonProjects/stack_overflow/.ipynb_checkpoints'), 7),
(WindowsPath('E:/PythonProjects/stack_overflow/complete_solutions/data'), 6),
(WindowsPath('E:/PythonProjects/stack_overflow/csv_files'), 3),
(WindowsPath('E:/PythonProjects/stack_overflow/csv_files/.ipynb_checkpoints'), 1),
(WindowsPath('E:/PythonProjects/stack_overflow/data'), 5)]
f = list(p.rglob('*')) unpacks the generator and produces a list of all the paths (files and subdirectories).
One-liner:
Use Path.cwd().rglob('*') or Path('some path').rglob('*')
path_counts = list(zip(*np.unique([x.parent for x in Path.cwd().rglob('*')], return_counts=True)))
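If only the standard library is available, a sketch along these lines gives the total for an arbitrarily nested tree. The function names are illustrative, and collections.Counter stands in for np.unique here.

```python
import os
from collections import Counter
from pathlib import Path


def count_files(root):
    """Total number of regular files anywhere below root (pathlib version)."""
    return sum(1 for p in Path(root).rglob("*") if p.is_file())


def count_files_walk(root):
    """Equivalent total for ordinary trees, using os.walk over every folder."""
    return sum(len(files) for _dirpath, _dirnames, files in os.walk(root))


def files_per_directory(root):
    """Map each directory to the number of files directly inside it."""
    return Counter(p.parent for p in Path(root).rglob("*") if p.is_file())
```

If os.walk appears to return nothing, one common cause is that the path passed to it does not exist or is not a directory: by default os.walk ignores such errors silently and simply yields no entries.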
I suggest using recursion, as in the function below (note that, as written, it counts subfolders rather than files):
import os
def get_folder_count(path):
    folders = os.listdir(path)
    folders = list(filter(lambda a: os.path.isdir(os.path.join(path, a)), folders))
    count = len(folders)
    for i in range(count):
        count += get_folder_count(os.path.join(path, folders[i]))
    return count
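The same recursive pattern can be adapted to count files, which is what the question asks for. This is a sketch using os.scandir; the name count_files_recursive is made up for illustration, and symbolic links to directories are deliberately not followed to avoid loops.

```python
import os


def count_files_recursive(path):
    """Count files in path and, recursively, in all of its subfolders."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # descend into the subfolder and add its file count
                total += count_files_recursive(entry.path)
            elif entry.is_file():
                total += 1
    return total
```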
I'm extremely new to Python (and software programming/development in general). I decided to use the scenario below as my first project. The project includes 5 main personal challenges. Some of the challenges I have been able to complete (although probably not in the most efficient way), and others I'm struggling with. Any feedback you have on my approach and recommendations for improvement are GREATLY appreciated.
Project Scenario = "If I doubled my money each day for 100 days, how much would I end up with at day #100? My starting amount on Day #1 is $1.00"
1.) Challenge 1 - What is the net TOTAL after day 100 - (COMPLETED, I think, please correct me if I'm wrong)
days = 100
compound_rate = 2
print(compound_rate ** days) # 2 raised to the 100th power
#==Result===
1267650600228229401496703205376
2.) Challenge 2 - Print to screen the DAYS in the first column, and corresponding Daily Total in the second column. - (COMPLETED, I think, please correct me if I'm wrong)
compound_rate = 2
days_range = list(range(101))
for x in days_range:
    print(str(x), (compound_rate ** int(x)))
# ===EXAMPLE Results
# 0 1
# 1 2
# 2 4
# 3 8
# 4 16
# 5 32
# 6 64
# 100 1267650600228229401496703205376
3.) Challenge 3 - Write TOTAL result (after the 100 days) to an external txt file - (COMPLETED, I think, please correct me if I'm wrong)
compound_rate = 2
days_range = list(range(101))
hundred_days = (compound_rate ** 100)
textFile = open("calctest.txt", "w")
textFile.write(str(hundred_days))
textFile.close()
#===Result====
string of 1267650600228229401496703205376 --> written to my file 'calctest.txt'
4.) Challenge 4 - Write the Calculated running DAILY Totals to an external txt file. Column 1 will be the Day, and Column 2 will be the Amount. So just like Challenge #2 but to an external file instead of screen
NEED HELP, I can't seem to figure this one out.
5.) Challenge 5 - Somehow plot or chart the Daily Results (based on #4) - NEED GUIDANCE.
I appreciate everyone's feedback as I start on my personal Python journey!
challenge 2
This will work fine, but there's no need to write list(range(101)); you can just write range(101). In fact, there's no need even to create a variable to store that. You can just do this:
for x in range(101):
    print("whatever you want to go here")
challenge 3
Again, this will work fine, but when writing to a file it is normally best to use a with statement. This means that you don't need to close the file at the end, as Python will take care of that. For example:
with open("calctest.txt", "w") as f:
    f.write(str(hundred_days))
challenge 4
Use a for loop as you did in challenge 2. Use "\n" to start a new line. Again, do everything inside a with statement, e.g.
with open("calctest.txt", "w") as f:
    for x in range(101):
        f.write("something here \n")
(this would write a file with 'something here ' written 101 times)
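Putting those pieces together for challenge 4, a sketch could look like this. The function name write_daily_totals and the tab separator are choices made for illustration; any separator would do.

```python
def write_daily_totals(path, days=100, rate=2):
    """Write one 'day<TAB>total' line per day, from day 0 to `days` inclusive."""
    with open(path, "w") as f:
        for day in range(days + 1):
            # write() needs a string, so format both numbers into one line
            f.write("{}\t{}\n".format(day, rate ** day))
```

Calling write_daily_totals("calctest.txt") would then produce the same two columns that challenge 2 prints to the screen.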
challenge 5
There is a python library called matplotlib, which I have never used, but I would suggest that would be where to go to in order to solve this task.
I hope this is of some help :)
You can use what you did in challenge 3 to open and close the output file.
In between, you have to do what you did in challenge 2 to compute the data for each day.
Instead of printing the daily result to the screen, you will have to combine it into a string. After that, you can write that string to the file, exactly like you did in challenge 3.
Challenge One:
This is the correct way.
days = 100
compound_rate = 2
print("Result after 100 days:", compound_rate ** days)
Challenge Two
This is correct. Slightly simplified:
compound_rate = 2
days_range = list(range(101))
for x in days_range:
    print(x, compound_rate ** x)
Challenge Three
This one is correct, and the cast to a string is needed: in Python 3, write() on a text file accepts only strings, so passing the integer directly raises a TypeError. Explicit casts like this are needed whenever the data goes somewhere that expects a string; print() is more forgiving because it converts its arguments for you.
compound_rate = 2
days_range = list(range(101))
hundred_days = (compound_rate ** 100)
textFile = open("calctest.txt", "w")
textFile.write(str(hundred_days))
textFile.close()
Challenge Four
For this challenge, you will want to look into the Python csv module. You can write the data in two columns separated by commas very simply with this module.
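A small sketch of what that could look like with csv.writer. The function name, the output file name and the header labels are assumptions for illustration.

```python
import csv


def write_totals_csv(path, days=100, rate=2):
    """Write a header row and one (day, amount) row per day to a CSV file."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["day", "amount"])
        for day in range(days + 1):
            writer.writerow([day, rate ** day])
```

A file written this way, for example with write_totals_csv("daily_totals.csv"), opens directly in a spreadsheet program, which may also cover part of challenge 5.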
Challenge Five
For this challenge, you will want to look into the python library matplotlib. This library will give you tools to work with the data in a graphical way.
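As a starting point, a sketch like the one below could work, assuming matplotlib is installed. The function names and the output file name are illustrative. A logarithmic y-axis is used because the totals grow from 1 to about 1.3e30, which a linear axis would flatten into a single spike.

```python
def daily_totals(days=100, rate=2):
    """Return (day numbers, totals) for day 0 through `days` inclusive."""
    day_numbers = list(range(days + 1))
    totals = [rate ** day for day in day_numbers]
    return day_numbers, totals


def plot_totals(day_numbers, totals, out_path="daily_totals.png"):
    """Save a line chart of the totals on a logarithmic y-axis."""
    # Imported here so the data helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Convert the large Python integers to floats before plotting
    ax.plot(day_numbers, [float(total) for total in totals])
    ax.set_yscale("log")
    ax.set_xlabel("Day")
    ax.set_ylabel("Total (dollars)")
    fig.savefig(out_path)
    plt.close(fig)
```

Calling plot_totals(*daily_totals()) would then write the chart to daily_totals.png in the current directory.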
Answer for challenge 1 is as follows:
l = []
for a in range(0,100):
    b = 2 ** a
    l.append(b)
print("Total after 100 days", sum(l))
import os, sys
import datetime
import time
#to get the current work directory, we use below os.getcwd()
print(os.getcwd())
#to get the list of files and folders in a path, we use os.listdir
print(os.listdir())
#to know the files inside a folder using path
spath = (r'C:\Users\7char')
l = spath
print(os.listdir(l))
#converting a file format to other, ex: txt to py
path = r'C:\Users\7char'
print(os.listdir(path))
# after looking at the list of files, we choose to rename 'rough.py' to 'rough.txt'
os.chdir(path)
os.rename('rough.py','rough.txt')
#check whether the file has changed to new format
print(os.listdir(path))
#yes now the file is changed to new format
print(os.stat('rough.txt').st_size)
# by using the os.stat function we can see the size of a file (os.stat(file).st_size)
path = r"C:\Users\7char\rough.txt"
# use a name that does not shadow the imported datetime module
mtime = os.path.getmtime(path)
moddatetime = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(mtime))
print("Last Modified Time : ", moddatetime)
#telling files and folders apart by extension using os.path.splitext (folders usually have none)
import os
path = r"C:\Users\7char\rough.txt"
dir(os.path)
files = os.listdir()
for file in files:
    print(os.path.splitext(file))
#moving files from one folder to another (including moving folders of a path, or moving into subfolders)
import os
char_7 = r"C:\Users\7char"
cleardata = r"C:\Users\clearadata"
operating = os.listdir(r"C:\Users\7char")
print(operating)
for i in operating:
    movefrom = os.path.join(char_7,i)
    moveto = os.path.join(cleardata,i)
    print(movefrom,moveto)
    os.rename(movefrom,moveto)
#now moving files to a specified path based on whether the length of the file name is even or odd
import os
origin_path = r"C:\Users\movefilehere"
fivechar_path= r"C:\Users\5char"
sevenchar_path = r"C:\Users\7char"
origin_pathlist = os.listdir(origin_path)
for file_name in origin_pathlist:
    l = len(file_name)
    if l % 2 == 0:
        evenfilepath = os.path.join(origin_path,file_name)
        newevenfilepath = os.path.join(fivechar_path,file_name)
        print(evenfilepath,newevenfilepath)
        os.rename(evenfilepath,newevenfilepath)
    else:
        oddfilepath = os.path.join(origin_path,file_name)
        newoddfilepath = os.path.join(sevenchar_path,file_name)
        print(oddfilepath,newoddfilepath)
        os.rename(oddfilepath,newoddfilepath)
#checking whether a path is a folder using isdir
import os
path = r"C:\Users\7char"
print(os.path.isdir(path))
#how many files of each extension (.py, .txt or any other) are in a folder
import os
from os.path import join, splitext
from glob import glob
from collections import Counter
path = r"C:\Users\7char"
c = Counter([splitext(i)[1][1:] for i in glob(join(path, '*'))])
for ext, count in c.most_common():
    print(ext, count)
#looking at the files and extensions, including the total of extensions.
import os
from os.path import join, splitext
from collections import defaultdict
path = r"C:\Users\7char"
c = defaultdict(int)
files = os.listdir(path)
for filenames in files:
    extension = os.path.splitext(filenames)[-1]
    c[extension]+=1
    print(os.path.splitext(filenames))
print(c,extension)
#getting list from range
list(range(4))
#break and continue statements and else clauses on loops
for n in range(2,10):
    for x in range(2,n):
        if n%x == 0:
            print(n,'equals',x, '*', n//x)
            break
    else:
        # the inner loop finished without finding a factor
        print(n, 'is a prime number')
#Dictionaries
#the dict() constructor builds dictionaries directly from sequences of key-value pairs
dict([('ad', 1212),('dasd', 2323),('grsfd',43324)])
#loop over two or more sequences at the same time, the entries can be paired with the zip() function.
questions = ['name', 'quest', 'favorite color']
answers = ['lancelot', 'the holy grail', 'blue']
for q, a in zip(questions, answers):
    print('What is your {0}? It is {1}.'.format(q, a))
#Using set()
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
for f in sorted(set(basket)):
    print(f)
I'm trying to create a new FITS file out of two older ones using PyFITS.
import pyfits
from sys import stdout
from sys import argv
import time
file1 = argv[1]
file2 = argv[2]
hdu1 = pyfits.open(file1)
hdu2 = pyfits.open(file2)
new0 = hdu1[0]
new1 = hdu1[0]
sci1 = hdu1[0].data
sci2 = hdu2[0].data
for r in range(0, len(sci1)):
    for c in range(0, len(sci1[r])):
        add = sci1[r][c] + sci2[r][c]
        new0.data[r][c] = add
for r in range(0, len(sci1)):
    for c in range(0, len(sci1[r])):
        print "(" + str(r) + ", " + str(c) + ") FirstVal = " + str(sci1[r][c]) + " || SecondVal = " + str(sci2[r][c])
        print "\t New File/Add = " + str(new0.data[r][c])
All it prints out is the first value, i.e. sci1[r][c]. This means that the variable isn't being modified at all. How can I make it modify? I'm very new to using FITS.
What you have done here is make sci1 a reference to new0.data, which means the assignment to new0.data also changes sci1. So the code is modifying the intended variable, but your print loop is printing the same object twice.
If you want a copy instead of a reference, you have to use the object's copy method, in this case sci1 = hdu1[0].data.copy()
This is also not the way you are supposed to use numpy, which pyfits uses to represent its images. Instead of loops, you apply operations to full arrays, which is in most cases easier to read and significantly faster. If you want to add two FITS images represented as numpy arrays in place (assuming new1 refers to the second image, hdu2[0]):
new0.data += new1.data
print new0.data
or if you want to create a new image out of the sum of both inputs:
sum_image = new0.data + new1.data
# put it into an pyfits HDU (primary fits extension)
hdu = pyfits.PrimaryHDU(data=sum_image)
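The difference between a reference and a copy can be seen without any FITS files. In this sketch two small numpy arrays stand in for hdu1[0].data and hdu2[0].data; the values and variable names are made up for illustration.

```python
import numpy as np

first = np.array([[1.0, 2.0], [3.0, 4.0]])       # stands in for hdu1[0].data
second = np.array([[10.0, 20.0], [30.0, 40.0]])  # stands in for hdu2[0].data

alias = first            # same array object: changes to `first` show up here
snapshot = first.copy()  # independent copy of the original values

first += second          # in-place addition over the whole array, no loops

print(alias is first)    # True: both names refer to one array
print(snapshot)          # still holds the original values
print(first)             # holds the element-wise sums
```

This is why the loop in the question printed the summed value as "FirstVal": sci1 and new0.data were two names for the same array.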
I have all filenames of a directory in a list named files. And I want to filter it so only the files with the .php extension remain.
for x in files:
    if x.find(".php") == -1:
        files.remove(x)
But this seems to skip filenames. What can I do about this?
How about a simple list comprehension?
files = [f for f in files if f.endswith('.php')]
Or if you prefer a generator as a result:
files = (f for f in files if f.endswith('.php'))
>>> files = ['a.php', 'b.txt', 'c.html', 'd.php']
>>> [f for f in files if f.endswith('.php')]
['a.php', 'd.php']
Most of the answers provided give list / generator comprehensions, which are probably the way you want to go 90% of the time, especially if you don't want to modify the original list.
However, for those situations where (say for size reasons) you want to modify the original list in place, I generally use the following snippet:
idx = 0
while idx < len(files):
    if files[idx].find(".php") == -1:
        del files[idx]
    else:
        idx += 1
As to why your original code wasn't working: it changes the list as you iterate over it. The "for x in files" implicitly creates an iterator, just as if you'd written "for x in iter(files)", and deleting elements from the list confuses the iterator about what position it is at. For such situations, I generally use the above code or, if it happens a lot in a project, factor it out into a function, e.g.:
def filter_in_place(func, target):
    idx = 0
    while idx < len(target):
        if func(target[idx]):
            idx += 1
        else:
            del target[idx]
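To make the skipping visible, here is a small sketch that runs the problematic loop and the in-place filter side by side on the same sample names. The parameter name keep and the sample list are illustrative.

```python
def filter_in_place(keep, target):
    """Delete from `target` every item for which keep(item) is false."""
    idx = 0
    while idx < len(target):
        if keep(target[idx]):
            idx += 1
        else:
            del target[idx]


files = ["a.php", "b.txt", "c.html", "d.php"]

# Removing while iterating: after "b.txt" is removed, "c.html" slides into
# its slot and the loop's internal position moves past it.
skipping = list(files)
for name in skipping:
    if not name.endswith(".php"):
        skipping.remove(name)
print(skipping)  # ['a.php', 'c.html', 'd.php']

# The index-based version only advances when nothing was deleted.
filter_in_place(lambda name: name.endswith(".php"), files)
print(files)  # ['a.php', 'd.php']
```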
Just stumbled across this old question. Many solutions here will do the job but they ignore a case where filename could be just ".php". I suspect that the question was about how to filter PHP scripts and ".php" may not be a php script. Solution that I propose is as follows:
>>> import os.path
>>> files = ['a.php', 'b.txt', 'c.html', 'd.php', '.php']
>>> [f for f in files if os.path.splitext(f)[1] == ".php"]
['a.php', 'd.php']