Creating a directory using part of a variable name - python

I'm new to Python and I'm trying to create a set of directories, with a file in each one. The input name will be a string starting with a > sign, but I don't want the directory name to contain the >.
I've tried the following:
import os
seq_id = ">seq"
dirname = seq_id[1:]
print(dirname)
if not os.path.isdir('./' + dirname + '/'):
    os.mkdir('./' + dirname + '/')
    print("directory made")
It prints the name, but it does not make the directory when I use the seq_id[1:] slice, and I don't understand why.
I ultimately want to build a function that takes a list of seq_ids from a file (>seq1, >seq2, >seq3, etc.) and creates a directory for each one.
(Working with python3.5)

Note that the directories are created relative to your current working directory, not the location of the script.
If you are in the folder /tmp/foo/, your script is /tmp/bar/myscript.py, and you run it with python ../bar/myscript.py, the directories will be created in /tmp/foo/, not /tmp/bar/.
Moreover, because if not os.path.isdir('./' + dirname + '/'): silently skips the case where the directory already exists and prints nothing, you cannot tell whether the directory was already there.
You could do something like:
import os
strings = ['>foo', '>bar', '>baz']
def make_directories(input_list):
    for string in input_list:
        dirpath = os.path.join('./', string[1:])
        try:
            os.mkdir(dirpath)
        except FileExistsError:
            print('Directory {} already exists'.format(dirpath))
        else:
            print('Directory {} created'.format(dirpath))
make_directories(strings)
It uses os.path.join instead of homemade string concatenation, which is good practice for building paths.
It uses try / except instead of if, which is also good practice: with a check followed by a create, the directory can appear between the two steps.
It prints what happened, so you know what is going on.
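For the longer-term goal in the question (one directory per ID read from a file), a minimal sketch could look like the following. The function name make_seq_dirs and the base_dir parameter are assumptions made for illustration, not part of the original answer. It relies on os.makedirs with exist_ok=True, which is available in Python 3.5.

```python
import os


def make_seq_dirs(seq_ids, base_dir="."):
    """Create one directory per sequence ID and return the paths (hypothetical helper)."""
    created = []
    for seq_id in seq_ids:
        # Drop surrounding whitespace (such as a trailing newline) and the leading '>'.
        name = seq_id.strip().lstrip(">")
        if not name:
            continue  # skip blank lines
        dirpath = os.path.join(base_dir, name)
        # exist_ok=True turns the call into a no-op if the directory already exists.
        os.makedirs(dirpath, exist_ok=True)
        created.append(dirpath)
    return created
```

Because the function accepts any iterable of strings, it can be given a list such as ['>seq1', '>seq2'] or an open file object, in which case it processes the file line by line.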

Related

Python. Create folder if it does not exist. But more complicated [duplicate]

I'm making a python program using the Flask framework, but I have a problem.
I'll explain.
I need to save some images in a directory. So I have to create a directory called Folder, but first I have to check if this directory doesn't already exist.
So I should check if Folder exists, if it doesn't exist I create Folder directory, otherwise I create Folder1 directory.
But in the same way I have to check if Folder1 exists and if it already exists I see for the Folder2 directory and so on ...
After creating the directory I need to always use the same directory to save all the images.
That is, even if I terminate the program, the next time I run it must always save the other images in the directory created.
I ask for your help in doing this, because I don't know how to do it.
I tried to do this, but it doesn't work as it should:
path = "path/to/folder"
def create_directory():
    global path
    if(os.path.isdir(path)==False):
        os.makedirs(path)
    else:
        cont=1
        new_path = path
        while(os.path.isdir(new_path)==True):
            new_path = str(path)+str(cont)
            cont= cont +1
        os.makedirs(new_path)
Hope this code helps:
import os
# define the name of the directory to be created
path = "/root/directory1"
try:
    os.mkdir(path)
except OSError:
    print("Creation of the directory %s failed" % path)
else:
    print("Successfully created the directory %s " % path)
This could use more error handling, but it should get you started on the right path:
import re
from pathlib import Path
def mkdir(path_str):
    path = Path(path_str)
    # this while loop ensures you get the next folder name (the correct number appended to the base folder name)
    while path.exists():
        # the lazy first group leaves all trailing digits to the second group, which may be empty
        s = re.fullmatch(r'(.*?)(\d*)', path.name)
        base, number = s.groups()
        # a name without a trailing number (e.g. Folder) is followed by Folder1
        new_path_str = base + str(int(number or 0) + 1)
        path = path.parent.joinpath(new_path_str)
    try:
        path.mkdir()
        print(f'The directory {path.name} was created!')
    except OSError:
        print(f"Creation of the directory {path} failed")
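The same Folder, Folder1, Folder2, ... search can also be sketched without a regular expression by counting upwards until a free name is found. The helper names next_free_dir and create_next_dir are hypothetical. This sketch only covers picking and creating the next free name. Reusing the same directory on later runs would need the chosen path to be stored somewhere, which is not shown here.

```python
import itertools
from pathlib import Path


def next_free_dir(base):
    """Return base, base1, base2, ... whichever does not exist yet (hypothetical helper)."""
    base = Path(base)
    if not base.exists():
        return base
    # Try base1, base2, ... until a name is free.
    for n in itertools.count(1):
        candidate = base.with_name(base.name + str(n))
        if not candidate.exists():
            return candidate


def create_next_dir(base):
    """Create the next free numbered directory and return its path."""
    path = next_free_dir(base)
    path.mkdir(parents=True)
    return path
```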

Python: Identifying numerically names folders in a folder structure

I have the function below, which walks the root of a given directory, grabs all subdirectories, and places them into a list. This part works, sort of.
The objective is to determine the highest (largest number) numerically named folder.
Assuming that the folder contains only numerically named folders, and does not contain alphanumeric folders or files, I'm good. However, if a file or folder is present that is not numerically named, I encounter issues, because the script seems to be collecting all subdirectories and files and loads everything into the list.
I need to find just those folders whose names are numeric, and ignore anything else.
Example folder structure for c:\Test
\20200202\
\20200109\
\20190308\
\Apples\
\Oranges\
New Document.txt
This works to walk the directory but puts everything in the list, not just the numeric subfolders.
#Example code
import os
from pprint import pprint
files=[]
MAX_DEPTH = 1
folders = ['C:\\Test']
for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        for subdirname in dirs:
            files.append(os.path.join(subdirname))
            #files.append(os.path.join(root, subdirname)) will give full directory
        #print("there are", len(files), "files in", root) will show counts of files per directory
        if root.count(os.sep) - stuff.count(os.sep) == MAX_DEPTH - 1:
            del dirs[:]
pprint(max(files))
Current Result of max(files):
New Document.txt
Desired Output:
20200202
What I have tried so far:
I've tried catching each element before I add it to the list, seeing if the string of the subdirname can be converted to int, and then adding it to the list. This fails to convert the numeric subdirnames to an int, and somehow (I don't know how) the New Document.txt file gets added to the list.
files=[]
MAX_DEPTH = 1
folders = ['C:\\Test']
for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        for subdirname in dirs:
            try:
                subdirname = int(subdirname)
                print("Found subdir named " + subdirname + " type: " + type(subdirname))
                files.append(os.path.join(subdirname))
            except:
                print("Error converting " + str(subdirname) + " to integer")
                pass
            #files.append(os.path.join(root, subdirname)) will give full directory
        #print("there are", len(files), "files in", root) will show counts of files per directory
        if root.count(os.sep) - stuff.count(os.sep) == MAX_DEPTH - 1:
            del dirs[:]
return (input + "/" + max(files))
I've also tried appending everything to the list and then creating a second list (ie, without the try/except) using the below, but I wind up with an empty list. I'm not sure why, and I'm not sure where/how to start looking. Using 'type' on the list before applying the following shows that everything in the list is a str type.
list2 = [x for x in files if isinstance(x,int) and not isinstance(x,bool)]
I'm going to go ahead and answer my own question here:
Changing the method entirely helped, and made it significantly faster and simpler.
import os
# the find_newest_date function looks for the folder with the largest number and assumes that is the newest data
def find_newest_date(input):
    intlistfolders = []
    # f.name yields the folder names only, without the parent path
    list_subfolders_with_paths = [f.name for f in os.scandir(input) if f.is_dir()]
    for x in list_subfolders_with_paths:
        try:
            intval = int(x)
            intlistfolders.append(intval)
        except ValueError:
            # not a numerically named folder, so skip it
            pass
    return (input + "/" + str(max(intlistfolders)))
Explanation:
scandir is 3x faster than walk (directory performance).
scandir also allows the use of f.name to pull out just the folder names, or f.path to get paths.
So, use scandir to load up the list with all the subdirs.
Iterate over the list, and try to convert each value to an integer. I don't know why it wouldn't work in the earlier example, but it works in this case.
The first part of the try statement converts to an integer. If conversion fails, the except clause is run, and 'pass' is essentially a null statement. It does nothing.
Then, finally, join the input directory with the string representation of the maximum numeric value (i.e. the most recently dated folder in this case).
The function is called with:
folder_named_path = find_newest_date("C:\\Test")
or something similar.
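One likely explanation for the failure of the earlier try/except attempt, offered as a reading of the posted code rather than something the author confirmed: the print call concatenates a str with an int, which raises TypeError, and the bare except reports that as a conversion error. In addition, the loop for root, dirs, files in os.walk(...) rebinds the name files to the file names of the current folder, which would explain how New Document.txt ends up in the list. The small function below (try_convert is a hypothetical name) mimics the first effect:

```python
def try_convert(name):
    """Mimic the loop body from the question and report which exception occurs."""
    try:
        number = int(name)
        # int() succeeded, but str + int raises TypeError on the next line.
        return "Found subdir named " + number
    except Exception as exc:
        # A bare except in the original hides which error this was.
        return type(exc).__name__


print(try_convert("20200202"))  # TypeError
print(try_convert("Apples"))    # ValueError
```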
Try matching dirs with a regular expression. num = r"[0-9]+" is your regular expression. Something like re.findall(num, subdirname) returns a list of the substrings that consist of one or more digits; use re.fullmatch(num, subdirname) if the whole name must be numeric.
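A minimal sketch of the regular-expression approach, combined with os.scandir so that files are ignored, might look like this. The function name newest_numeric_folder and the choice to return None for "nothing found" are assumptions for illustration.

```python
import os
import re

NUMERIC_NAME = re.compile(r"[0-9]+")


def newest_numeric_folder(parent):
    """Return the path of the highest numerically named subfolder, or None (hypothetical helper)."""
    numeric = [
        entry.name
        for entry in os.scandir(parent)
        # fullmatch requires the whole name to consist of digits.
        if entry.is_dir() and NUMERIC_NAME.fullmatch(entry.name)
    ]
    if not numeric:
        return None
    # Compare as integers so that '900' does not rank above '20200202'.
    return os.path.join(parent, max(numeric, key=int))
```

With the example structure from the question, newest_numeric_folder("C:\\Test") would be expected to pick the 20200202 folder and ignore Apples, Oranges and New Document.txt.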

python script expected an indent block in if statement

I'm trying to write a basic backup script from one folder to another, and I got it to work - but the directory structure was not being copied over, just the files. I'm trying to copy in the subfolder as well, so that, for example, c:\temp\docs\file.txt goes to d:\temp\docs\file.txt instead of just d:\temp\file.txt
My issue exists in indentation with my if/else statement, but everything looks good to me. What am I doing wrong?
import datetime, time, string, os, shutil
COPY_FROM_LOCATION = 'C:\\xampp\\htdocs\\projects'
folder_date = time.strftime("%Y-%m-%d")
BACKUP_TO_LOCATION = 'D:\\BACKUP\\' + folder_date
#Create a new directory in D:\BACKUP based on today's date so the folder you're trying to copy to actually exists:
if not os.path.exists(BACKUP_TO_LOCATION):
    os.makedirs(BACKUP_TO_LOCATION)
#copy function
def backup(source_folder, target_folder):
    for subdir, dirs, files in os.walk(source_folder):
        if subdir == source_folder :
            new_target_folder = target_folder
        else:
            folder_name = subdir.split("C:\\xampp\\htdocs\\projects\\",1)[-1]
            new_target_folder = target_folder + "\\" + folder_name
        for file in files:
            print "backing up: " + folder_name
            shutil.copy2(os.path.join(subdir, file), new_target_folder)
backup(COPY_FROM_LOCATION,BACKUP_TO_LOCATION)
Here's the error I'm getting:
  File "backup.py", line 15
    new_target_folder = target_folder
    ^
IndentationError: expected an indented block
You're intermixing tabs and spaces.
Use one or the other, not both. Preferably spaces.
This error typically means there is a problem with the indentation. Check that you don't mix tabs and spaces.
You can use https://www.pylint.org/ to detect them or, if it is something simple, paste the code at http://pep8online.com; it will show you what you can improve.
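As a rough way to locate the offending lines without an external tool, the sketch below lists every line whose leading whitespace contains a tab. The function name find_tab_indented_lines is hypothetical. The standard library also ships a tabnanny module (python -m tabnanny backup.py) for a more thorough check.

```python
def find_tab_indented_lines(source):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indentation.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```

It can be applied to a script by reading the file into a string first and passing that string to the function.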
What's up with the space before the colon? I've not seen it done that way before; that appears to be where this script is choking.
Change
if subdir == source_folder :
to
if subdir == source_folder:
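Separately from the indentation error, the stated goal of the question (keeping the subfolder structure in the backup) can be sketched with os.path.relpath instead of splitting on a hard-coded path. This is an illustrative variant written for Python 3, not the asker's script. The returned list of copied paths is an addition made so the result can be inspected.

```python
import os
import shutil


def backup(source_folder, target_folder):
    """Copy every file under source_folder to the same relative path under target_folder."""
    copied = []
    for subdir, dirs, files in os.walk(source_folder):
        # Path of the current folder relative to the source root ('.' for the root itself).
        relative = os.path.relpath(subdir, source_folder)
        new_target_folder = os.path.normpath(os.path.join(target_folder, relative))
        # Create the matching subfolder before copying into it.
        os.makedirs(new_target_folder, exist_ok=True)
        for name in files:
            destination = os.path.join(new_target_folder, name)
            shutil.copy2(os.path.join(subdir, name), destination)
            copied.append(destination)
    return copied
```

If no per-file handling is needed, shutil.copytree from the standard library copies a whole directory tree in one call.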

Beginner Python 3--os.path and WinError2

import os
searchFolder = input('Which folder would you like to search?')
def search(folder):
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            if os.path.getsize(filename) > 1000:
                print(str(os.path.abspath(filename)) + 'is ' + str(os.path.getsize(filename)))
            else:
                continue
search(searchFolder)
This program is meant to ask the user for a string, iterate over the files in that directory, and print the abs path and file size of every item over a certain size. I'm getting a FileNotFoundError: [WinError 2] when I run this code, on any directory. I'm inputting the directory with escaped backslashes. I think this is such a rudimentary error on my part that this is all the info anyone would need but let me know if there's anything else that would be helpful. Thanks!
In the for loop over filenames you pass only the file name, not the complete path. If you write:
if os.path.getsize(foldername+"/"+filename) > 1000:
it works on Linux. Python on Windows also accepts / as a separator, but os.path.join(foldername, filename) is the portable way to build the path. So now you understand why it isn't working: you need to pass the full or relative path of the file, not just its name.
Working code on Linux:
import os
searchFolder = input('Which folder would you like to search? ')
def search(folder):
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            # build the path from the folder being walked, not from the bare file name
            filepath = foldername+"/"+filename
            if os.path.getsize(filepath) > 1000:
                print(str(os.path.abspath(filepath)) + ' is ' + str(os.path.getsize(filepath)))
            else:
                continue
search(searchFolder)
input() returns the string exactly as the user types it, so you don't have to escape backslashes. Just enter it as C:\path\to\my\folder\. It's only when you write Windows paths in your Python source code that you must escape the backslashes or use an r"raw string".
You can use os.path.isdir() to check that python actually accepts the path, and print an error if the path could not be found.
searchFolder = input('Which folder would you like to search?')
if os.path.isdir(searchFolder):
    search(searchFolder)
else:
    print("the folder %s was not found" % searchFolder)
I tested the code and it works fine; I used ./ for my test.
Python accepts both path types:
path = "C:/" # unix
and
path = "C:\\" # windows
For input, try ./, which will search the current working directory (the directory the program is in, if you run it from there).
So, you have two options, relative pathing or absolute pathing.
More on pathing.
Although, as was mentioned, for anything outside of the program's directory you need to correct the line
if os.path.getsize(filename) > 1000:
to
if os.path.getsize(foldername+"/"+filename) > 1000:
Whenever you want to write a path in your source code, just add an r before the string. This is Python's raw string notation, i.e. backslashes are not handled in any special way in a string literal prefixed with r.
So, if you want to add a path to a file called foo in C:\Users\pep\Documents
Just give your path as
my_path = r'C:\Users\pep\Documents\foo'
You don't need to bother escaping any backslashes now.
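Putting the answers above together, a portable version of the search can be sketched with os.path.join, which picks the right separator on both Windows and Linux. The function name find_large_files, the min_size parameter and the choice to return a list instead of printing are assumptions for illustration.

```python
import os


def find_large_files(folder, min_size=1000):
    """Return (absolute path, size) pairs for files under folder larger than min_size bytes."""
    results = []
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            # Join with the folder being walked; a bare file name would be
            # resolved against the current working directory instead.
            full_path = os.path.join(foldername, filename)
            size = os.path.getsize(full_path)
            if size > min_size:
                results.append((os.path.abspath(full_path), size))
    return results
```

A caller could then loop over the result and print each path and size, for example for path, size in find_large_files(searchFolder): print(path, 'is', size).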

How can I scan through a directory in python?

I have a Python script that compares two files to each other and outputs the difference. However, I am not sure what exactly is going on, because when I run the script it gives me this error:
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\api\\API_TEST\\Apis.os\\*.*'
I don't know why it is appending *.* to the end of the path.
This is currently my function:
def CheckFilesLatest(self, previous_path, latest_path):
    for filename in os.listdir(latest_path):
        previous_filename = os.path.join(previous_path, filename)
        latest_filename = os.path.join(latest_path, filename)
        if self.IsValidOspace(latest_filename):
            for os_filename in os.listdir(latest_filename):
                name, ext = os.path.splitext(os_filename)
                if ext == ".os":
                    previous_os_filename = os.path.join(previous_filename, os_filename)
                    latest_os_filename = os.path.join(latest_filename, os_filename)
                    if os.path.isfile(latest_os_filename) == True:
                        # If the file exists in both directories, check if the files are different; otherwise mark the contents of the latest file as added.
                        if os.path.isfile(previous_os_filename) == True:
                            self.GetFeaturesModified(previous_os_filename, latest_os_filename)
                        else:
                            self.GetFeaturesAdded(latest_os_filename)
        else:
            if os.path.isdir(latest_filename):
                self.CheckFilesLatest(previous_filename, latest_filename)
Any thoughts on why it cant scan the directory and look for an os file for example?
It is failing on line:
for os_filename in os.listdir(latest_filename):
The code first gets called from
def main():
    for i in range(6, arg_length, 2):
        component = sys.argv[i]
        package = sys.argv[i+1]
        previous_source_dir = os.path.join(previous_path, component, package)
        latest_source_dir = os.path.join(latest_path, component, package)
        x.CheckFilesLatest(previous_source_dir, latest_source_dir)
        x.CheckFilesPrevious(previous_source_dir, latest_source_dir)
Thank you
os.listdir() requires that its argument be a directory, as you have stated. However, latest_path is being passed in as an argument. Thus, you need to look at the code that actually creates latest_path in order to determine why the '*.*' is being put in. Since you are calling it recursively, first check the original call (the first time). It would appear that your base code that calls CheckFilesLatest() is trying to set up the search command to find all files within the directory 'C:\api\API_TEST\Apis.os'. You would need to split out the file indicator first and then do the check.
If you want to browse a directory recursively, using os.walk would be better and simpler than your complex handling with recursive function calls. Take a look at the docs: http://docs.python.org/2/library/os.html#os.walk
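A minimal sketch of the os.walk approach is shown below. It only collects the .os files in each tree and sorts them into "present in both" and "newly added". The helper names find_os_files and compare_trees are hypothetical, and the IsValidOspace and GetFeatures* methods from the question are not reproduced because their behaviour is not shown.

```python
import os


def find_os_files(root):
    """Return the set of '.os' file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Only entries from filenames are checked, so os.listdir() is never
            # called on something that is not a directory.
            if os.path.splitext(filename)[1] == ".os":
                full_path = os.path.join(dirpath, filename)
                found.add(os.path.relpath(full_path, root))
    return found


def compare_trees(previous_path, latest_path):
    """Split the latest '.os' files into those also present before and those newly added."""
    previous = find_os_files(previous_path)
    latest = find_os_files(latest_path)
    return sorted(latest & previous), sorted(latest - previous)
```

The first list holds candidates for a "modified" check and the second the files to mark as added; the reverse difference (previous minus latest) would give the removed files.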
