How to clean a file path?


Is there a cleaner, shorter, perhaps more elegant or Pythonic way to do this?
Unique constraint: The same code has to work on both Python 3.x and MicroPython. This means that a lot of the nice tools available in Python 3.x cannot be used. Hence the "back to basics" feel to the code.
It attempts to clean up some malformed input, though that's not super important. Also, yes, it has to behave on both Windows and Linux. I tested on Windows and on MicroPython on an RP2040 processor (Raspberry Pi Pico). Works fine.
Warning: If you run this code on your system it will create over twenty directories. Make sure you run it on a junk directory.
import os

def create_path(path:str) -> str:
    """Given a path with a file name, create any missing directories
    Does NOT create files.
    :param path: "some/subdirectory/file.txt"
    :return: The new full path, ready to open the file
    """
    result = os.getcwd()

    # Cleaning up any leading and multiple slashes
    while "//" in path:
        path = path.replace("//", "/")
    if len(path) > 0:
        path = path if path[0] != "/" else path[1:]

    elements = path.split("/")
    for element in elements:
        if element == "":
            break
        else:
            result += "/" + element
            if "." in element:
                break  # It's a file; we are done
            else:
                # It's a directory, does it exist?
                try:
                    os.listdir(result)
                except OSError:
                    # It does not, create it
                    os.mkdir(result)

    # This is necessary to remove a leading double slash
    # when used in MicroPython
    return result.replace("//", "/")

if __name__ == "__main__":
    # Tests
    # WARNING: This will create over twenty directories!
    # Does not create the file, just the path
    print(create_path("/"))
    print(create_path("//"))
    print(create_path("///"))
    print(create_path("file_01.txt"))
    print(create_path("/file_02.txt"))
    print(create_path("//file_03.txt"))
    print(create_path("///file_04.txt"))
    print(create_path("dir_00"))
    print(create_path("/dir_01"))
    print(create_path("//dir_02"))
    print(create_path("dir_03/"))
    print(create_path("dir_04//"))
    print(create_path("dir_05///"))
    print(create_path("/dir_06"))
    print(create_path("/dir_07/"))
    print(create_path("//dir_08/"))
    print(create_path("//dir_09//"))
    print(create_path("dir_10/file_05.txt"))
    print(create_path("/dir_11/file_06.txt"))
    print(create_path("//dir_12/file_07.txt"))
    print(create_path("//dir_13//file_08.txt"))
    print(create_path("//dir_14//file_09.txt/"))
    print(create_path("//dir_15//file_10.txt//"))
    print(create_path("dir_16/dir_116/file_11.txt"))
    print(create_path("/dir_17/dir_117/file_12.txt"))
    print(create_path("//dir_18/dir_118/file_13.txt"))
    print(create_path("//dir_19///dir_119/file_14.txt"))
    print(create_path("//dir_20///dir_120///file_15.txt"))
    print(create_path("//dir_21//dir_121//file_16.txt/"))
    print(create_path("//dir_22//dir_122//file_17.txt/"))
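
One possible shortening, offered only as a sketch: skipping the empty strings that split("/") produces for leading, trailing and repeated slashes removes the need for the separate clean-up loop. The name create_path_short is hypothetical, and the sketch assumes that os.mkdir raises OSError when the directory already exists (it does on CPython; on MicroPython this is an assumption worth checking on the target port). It keeps the "a dot means a file" heuristic from the code above.

```python
import os

def create_path_short(path):
    """Create missing directories for path (relative to the current
    directory) and return the full path. Does NOT create files."""
    result = os.getcwd()
    # split("/") yields "" for leading, trailing and repeated slashes;
    # filtering those out replaces the explicit clean-up loop.
    for element in (e for e in path.split("/") if e):
        # rstrip avoids a double slash when the current directory is "/"
        result = result.rstrip("/") + "/" + element
        if "." in element:
            break  # treated as a file name, same heuristic as above
        try:
            os.mkdir(result)
        except OSError:
            # Assumed to mean "already exists"; note that this also hides
            # other failures such as missing permissions.
            pass
    return result
```

Calling create_path_short("//dir_a///dir_b/file.txt/") from a junk directory should create dir_a/dir_b beneath it and return the path ending in dir_a/dir_b/file.txt. Unlike the original, a failed mkdir is silently ignored, which is a trade-off rather than an improvement.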

Related

Cant execute same script in different folder

I have a pretty simple script that works in one folder but doesn't work in another. I have been at this for a couple of hours. Permissions for both files are the same, the script is the same, and nothing differs on the Python side. I have tried granting full control on the files I am trying to interact with and on the duplicated Python script as well. I tested the duplicate script in C:/ and in Program Files (x86), with no resolution. The script only seems to work from one folder, even though the two copies are identical.
Script I am attempting to copy into a new folder and use:
import os
import sys
import shutil

def parse(p):
    q = p
    return q

#per line in textbox, create element in list
zz = ((parse(sys.argv[1]).replace("'", "")).split("\n"))
zz = list(filter(("").__ne__, zz))
last_element = zz[-1:]
last_element = (last_element[0]).split("[[")
zz = zz[:-1]
zz.append(last_element[0])
last_element = last_element[1]
if last_element == "product_url.txt":
    os.chdir(r"C:\Cactus (2022)\supported_websites\XXX")
else:
    os.chdir(r'C:\Program Files (x86)\Cactus (2022)\supported_websites\XXX\XXX')
a_file = open('%s' % last_element, "w")
for x in zz:
    if x == "":
        pass
    else:
        a_file.write("%s\n" % x)
Calling from C#:
MessageBox.Show(richTextBox4.Text);
panel1.Visible = false;
string task_information;
task_information = richTextBox4.Text + @"[[product_url.txt";
ProcessStartInfo rtInfo = new ProcessStartInfo(@"C://Program Files (x86)//Cactus (2022)//repo//python.exe");
rtInfo.FileName = "C://Program Files (x86)//Cactus (2022)//repo//python.exe";
rtInfo.Arguments = "C://Cactus (2022)//modifytextfilelines.py '" + task_information + "'";
rtInfo.UseShellExecute = false;
rtInfo.CreateNoWindow = true;
Process.Start(rtInfo);
Only way it works:
MessageBox.Show(richTextBox4.Text);
panel1.Visible = false;
string task_information;
task_information = richTextBox4.Text + @"[[product_url.txt";
ProcessStartInfo rtInfo = new ProcessStartInfo(@"C://Program Files (x86)//Cactus (2022)//repo//python.exe");
rtInfo.FileName = "C://Program Files (x86)//Cactus (2022)//repo//python.exe";
rtInfo.Arguments = "C://Cogs//modifytextfilelines.py '" + task_information + "'";
rtInfo.UseShellExecute = false;
rtInfo.CreateNoWindow = true;
Process.Start(rtInfo);
Let me emphasize: the script is exactly the same, copied and pasted from my "Cogs" folder. It doesn't work at all if I copy it to a new folder and change the Arguments line in C# to point at that directory.
Edit:
Did more testing; it seems the problem is the space in "Cactus (2022)" (I replaced the folder name with XXX in the pasted code). I copied my Cogs folder into C:/ and it worked fine. I renamed it to "REPO", then changed that to "RE PO", and it stopped working. So the underlying issue is how the path is written in the Arguments line.
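
The finding in the edit matches how command lines are split: an unquoted path containing a space reaches python.exe as two separate arguments, so the script path has to be wrapped in double quotes inside the Arguments string. The same pitfall can be illustrated from Python. This is a sketch, not the asker's code: run_script and the throwaway echo_arg.py are hypothetical names. Passing the command as a list lets subprocess do the quoting, so a folder such as "RE PO" causes no trouble.

```python
import os
import subprocess
import sys
import tempfile

def run_script(script_path, text):
    """Run script_path with text as its single argument; return its stdout."""
    # Each list element reaches the child process as exactly one argument,
    # so spaces in script_path or text need no manual quoting.
    completed = subprocess.run(
        [sys.executable, script_path, text],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout

if __name__ == "__main__":
    # Demo: a throwaway script inside a folder whose name contains a space.
    with tempfile.TemporaryDirectory() as tmp:
        folder = os.path.join(tmp, "RE PO")
        os.mkdir(folder)
        script = os.path.join(folder, "echo_arg.py")
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.argv[1])\n")
        print(run_script(script, "some line[[product_url.txt"))
```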

How can I make these functions work following OOP Principles?

I have several functions/methods in a class that are kind of connected. I am building a class that mimics terminal commands and links. However, someone told me this is not proper OOP. How can I separate these methods so that they work independently? Methods shouldn't call other methods, correct?
class directory:
    #FILES, LINKS AND DIRECTORIES
    current_path = []
    hold_files = [
        'test1.txt', 'test2.py',
        {'/desktop': ['computer.txt','tfile.doc',
            {'/peace':{
                '/pictures': [
                    'hello.gif',
                    'smile.gif',
                    'run.gif'
                ]},
             '/work':[
                'file1.txt',
                'file2.txt'
             ]
            }]
        }
    ]

    #recursively delete folder (if dot in)
    def delete(itself):
        #if dictionary, call self, else delete
        del itself
        return

    ## HELPER METHODS
    # Join list together to produce new link, basically return the added folder to the link
    def concatenate(self):
        new_link ="".join(current_path)
        return new_link

    #strip slashes and place in list
    def adjust_link(self, paths):
        new_string = ""
        # shorten link, if someone uses cd .., basically go back to previous folder
        if paths == "cd ..":
            current_path.pop()
        #extend link, if someone is cding into another folder, remove /'s and append to separate list
        elif "cd " in paths:
            paths = paths[3:]
            for slash in paths:
                if slash == "/":
                    current.append(new_string)
                    new_string = ""
                else:
                    new_string+=slash
        # This shouldn't be here as OOP must be separated but this calls the other function to concatenate a new link
        stripped = concatenate()
        return stripped

    #returns link
    def link(self, paths):
        address_location = adjust_link(paths)
        return address_location

directory.link("cd desktop/peace")
directory.link("cd pictures")
directory.link("cd ..")
directory.delete()
Thank you.
*Also, this is not a refactoring question. I already asked on Code Review Stack Exchange and they told me to come here. The code does not work.
Edit 2: why won't "directory.link()" work?

"I have a program here and I want to convert it to OOP" is not usually how it's done: "I have a problem I want to solve using OOP" is usually the approach. It looks like you are creating something that will traverse an internal directory structure. So a skeleton might look like:
class DirectoryTraverser:
    def __init__(self, directory_tree):
        self.hold_files = directory_tree
        self.current_path = []

    def... # all your other functions

# then to use it might look like:
# create a directory traversal object with directory tree
dt = DirectoryTraverser(hold_files)
dt.link("cd desktop/peace")
dt.link("cd pictures")
dt.link("cd ..")
dt.delete()
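
To make the skeleton concrete, here is one way the path-handling methods might be filled in. It is a sketch under assumptions, not the asker's intended design: the "/"-joined return format and the handling of "cd .." at the root are choices made for illustration, and delete is left out because its intended behaviour is not specified. It also shows why the original directory.link(...) call fails: the methods were called on the class rather than on an instance, and names such as current_path and concatenate were used without self. Methods calling other methods on self is ordinary OOP.

```python
class DirectoryTraverser:
    """Tracks a current path inside an in-memory directory tree."""

    def __init__(self, directory_tree):
        self.hold_files = directory_tree
        self.current_path = []  # one folder name per element

    def concatenate(self):
        # Join the stored folder names into a single link string.
        return "/" + "/".join(self.current_path)

    def adjust_link(self, command):
        if command == "cd ..":
            # Go back one folder; do nothing when already at the root.
            if self.current_path:
                self.current_path.pop()
        elif command.startswith("cd "):
            # Split on slashes and store each non-empty folder name.
            for part in command[3:].split("/"):
                if part:
                    self.current_path.append(part)
        # Calling another method through self is fine.
        return self.concatenate()

    def link(self, command):
        return self.adjust_link(command)


dt = DirectoryTraverser(directory_tree=[])  # an empty tree, for illustration
print(dt.link("cd desktop/peace"))
print(dt.link("cd pictures"))
print(dt.link("cd .."))
```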

Why my function isn't stopping at "return" line?

I'm trying to write a function that searches the directory tree for the "Searched" directory and returns the path to it. It should stop when the directory is found, but it doesn't. Where is my mistake?
import os

searched = "NodeBook"

def find(Path, searched):
    print("Searching in " + os.path.normpath(Path))
    for filePath in os.listdir(Path):
        if ((filePath == searched) and (os.path.isdir(os.path.join(Path, filePath)))) :
            print("Found")
            print(filePath)
            print(os.path.join(Path, filePath))
            return os.path.join(Path, filePath)
        elif (os.path.isdir(filePath)) :
            find(os.path.join(Path, filePath), searched)

find( "./", searched)
I expect something like this:
Searching in .
Searching in nodeLearning
Searching in nodeParse
Searching in Screeps
Found
NodeBook
But I get:
Searching in .
Searching in nodeLearning
Searching in nodeParse
Searching in Screeps
Found
NodeBook
./Screeps\NodeBook
Searching in testpython
Searching in testReact
Searching in testReact\testreact
It goes through all subdirectories.

You have a few small issues.
Bug 1: You look at isdir(filePath) instead of isdir(os.path.join(Path, filePath)), so the check is made relative to the current working directory rather than the directory being searched. This can cause errors if a file (not a directory) has the same name as a directory in your starting location. For example
/tmp/a <-- dir
/tmp/b <-- dir
/tmp/b/a <-- file
would give an OSError.
Bug 2: You don't stop if you find a match in a recursive call.
You can fix this in a variety of ways; I chose to do it by checking the return value of the recursive call.
Bug 3: I think this may go on forever if it encounters symlinks that form a loop. I didn't fix this, but you should decide how you want to handle it.
I also renamed a few things for clarity.
import os

def find_subdir(base_dir, search):
    print("Searching in " + os.path.normpath(base_dir))
    for name in os.listdir(base_dir):
        path = os.path.join(base_dir, name)
        if not os.path.isdir(path):
            continue
        if name == search:
            return path
        sub_search = find_subdir(path, search)
        if sub_search is not None:
            return sub_search
    return None  # For clarity

result = find_subdir( "./", "NodeBook")
if result is not None:
    print("Found")
    print(result)
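
For Bug 3, one option (a sketch, not part of the fix above) is to let os.walk do the traversal, since with followlinks=False, its default, it does not descend into symlinked directories. The function name find_subdir_no_symlinks is hypothetical. Note that the visiting order differs slightly from the recursive version: every entry of a directory is checked before any of its subdirectories is entered.

```python
import os

def find_subdir_no_symlinks(base_dir, search):
    """Return the path of the first directory named `search`, or None."""
    # followlinks=False means symlinked directories are listed in `dirs`
    # but never entered, so a symlink loop cannot cause endless recursion.
    for root, dirs, _files in os.walk(base_dir, followlinks=False):
        if search in dirs:
            return os.path.join(root, search)
    return None
```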

Here the function is calling itself:
    elif (os.path.isdir(filePath)) :
        find(...)
Okay, but this is happening in a loop, so after this call returns, the loop will continue. You should rethink the logic: maybe you can check the return value and then either return it if it indicates a valid path, or continue looping otherwise.
For example, right now the function returns None when nothing has been found, so you can check if the return value is None:
    ret = find(...)
    if ret is not None:
        return ret
    # continue looping otherwise

How to find an unassigned drive letter on Windows with Python

I needed to find a free drive letter on Windows from a Python script. "Free" means not assigned to any physical or remote device.
I did some research and found a solution here on Stack Overflow (I can't remember the exact link):
# for python 2.7
import string
import win32api

def getfreedriveletter():
    """ Find first free drive letter """
    assigneddrives = win32api.GetLogicalDriveStrings().split('\000')[:-1]
    assigneddrives = [item.rstrip(':\\').lower() for item in assigneddrives]
    for driveletter in list(string.ascii_lowercase[2:]):
        if not driveletter in assigneddrives:
            return driveletter.upper() + ':'
This works fine for all physical drives and connected network drives, but not for currently disconnected drives.
How can I get all used drive letters, including those temporarily not in use?

Creating a child process is relatively expensive, and parsing free-form text output isn't the most reliable technique. You can instead use PyWin32 to call the same API functions that net use calls.
import string
import win32api
import win32wnet
import win32netcon

def get_free_drive():
    drives = set(string.ascii_uppercase[2:])
    for d in win32api.GetLogicalDriveStrings().split(':\\\x00'):
        drives.discard(d)
    # Discard persistent network drives, even if not connected.
    henum = win32wnet.WNetOpenEnum(win32netcon.RESOURCE_REMEMBERED,
        win32netcon.RESOURCETYPE_DISK, 0, None)
    while True:
        result = win32wnet.WNetEnumResource(henum)
        if not result:
            break
        for r in result:
            if len(r.lpLocalName) == 2 and r.lpLocalName[1] == ':':
                drives.discard(r.lpLocalName[0])
    if drives:
        return sorted(drives)[-1] + ':'
Note that this function returns the last available drive letter. It's a common practice to assign mapped and substitute drives (e.g. from net.exe and subst.exe) from the end of the list and local system drives from the beginning.
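
The selection logic itself does not depend on Windows and can be tried in isolation. The following sketch uses a hypothetical helper, pick_free_letter, that takes the assigned and remembered drive names as plain strings (which on a real system would come from the PyWin32 calls above) and picks from the end or the start of the alphabet.

```python
import string

def pick_free_letter(assigned, remembered=(), from_end=True):
    """Return a free drive letter such as 'Z:', or None if none is left.

    `assigned` and `remembered` are iterables of names like 'C:\\' or 'x:'.
    """
    free = set(string.ascii_uppercase[2:])  # skip A and B, as in the answers
    for name in list(assigned) + list(remembered):
        if name:
            free.discard(name[0].upper())
    if not free:
        return None
    # Mapped drives are conventionally taken from the end of the alphabet.
    letter = max(free) if from_end else min(free)
    return letter + ":"


print(pick_free_letter(["C:\\", "D:\\"], remembered=["Z:"]))
```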

I will pass the found letter to an external script that runs the Windows shell command 'subst /d letter', so I must not pass the letter of a currently unmounted drive, as that would remove the network drive mapping.
The only way I found to detect unavailable drives was to parse the output of the shell command 'net use'.
Here is my solution. If you have a better way, please share it with me:
# for python 2.7
import string
import win32api
from subprocess import Popen, PIPE

def _getnetdrives():
    """ As _getfreedriveletter can not find unconnected network drives
    get these drives with shell cmd 'net use' """
    callstr = 'net use'
    phandle = Popen(callstr, stdout=PIPE)
    presult = phandle.communicate()
    stdout = presult[0]
    # _stderr = presult[1]
    networkdriveletters = []
    for line in stdout.split('\n'):
        if ': ' in line:
            networkdriveletters.append(line.split()[1] + '\\')
    return networkdriveletters

def getfreedriveletter():
    """ Find first free drive letter """
    assigneddrives = win32api.GetLogicalDriveStrings().split('\000')[:-1]
    assigneddrives = assigneddrives + _getnetdrives()
    assigneddrives = [item.rstrip(':\\').lower() for item in assigneddrives]
    for driveletter in list(string.ascii_lowercase[2:]): #array starts from 'c' as i dont want a and b drive
        if not driveletter in assigneddrives:
            return driveletter.upper() + ':'

Script/utility to rewrite all svn:externals in repository trunk

Say that one wishes to convert all absolute svn:externals URLs to relative URLs throughout a repository.
Alternatively, if heeding the tip in the svn:externals docs ("You should seriously consider using explicit revision numbers..."), one might find themselves needing to periodically pull new revisions for externals in many places throughout the repository.
What's the best way to programmatically update a large number of svn:externals properties?
My solution is posted below.

Here's my class to extract parts from a single line of an svn:externals property:
from urlparse import urlparse
import re

class SvnExternalsLine:
    '''Consult https://subversion.apache.org/docs/release-notes/1.5.html#externals for parsing algorithm.
    The old svn:externals format consists of:
        <local directory> [revision] <absolute remote URL>

    The NEW svn:externals format consists of:
        [revision] <absolute or relative remote URL> <local directory>

    Therefore, "relative" remote paths always come *after* the local path.
    One complication is the possibility of local paths with spaces.
    We just assume that the remote path cannot have spaces, and treat all other
    tokens (except the revision specifier) as part of the local path.
    '''

    REVISION_ARGUMENT_REGEXP = re.compile("-r(\d+)")

    def __init__(self, original_line):
        self.original_line = original_line

        self.pinned_revision_number = None
        self.repo_url = None
        self.local_pathname_components = []

        for token in self.original_line.split():
            revision_match = self.REVISION_ARGUMENT_REGEXP.match(token)
            if revision_match:
                self.pinned_revision_number = int(revision_match.group(1))
            elif urlparse(token).scheme or any(map(lambda p: token.startswith(p), ["^", "//", "/", "../"])):
                self.repo_url = token
            else:
                self.local_pathname_components.append(token)

    # ---------------------------------------------------------------------
    def constructLine(self):
        '''Reconstruct the externals line in the Subversion 1.5+ format'''

        tokens = []

        # Update the revision specifier if one existed
        if self.pinned_revision_number is not None:
            tokens.append( "-r%d" % (self.pinned_revision_number) )

        tokens.append( self.repo_url )
        tokens.extend( self.local_pathname_components )

        if self.repo_url is None:
            raise Exception("Found a bad externals property: %s; Original definition: %s" % (str(tokens), repr(self.original_line)))

        return " ".join(tokens)
I use the pysvn library to iterate recursively through all of the directories possessing the svn:externals property, then split that property value by newlines, and act upon each line according to the parsed SvnExternalsLine.
The process must be performed on a local checkout of the repository. Here's how pysvn (propget) can be used to retrieve the externals:
client.propget( "svn:externals", base_checkout_path, recurse=True)
Iterate through the return value of this function, and after modifying the property on each directory, call:
client.propset("svn:externals", new_externals_property, path)
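
The rewrite step between propget and propset is plain string manipulation, so it can be sketched without pysvn. The helper below is hypothetical and independent of the class above: it assumes every line is already in the Subversion 1.5+ token order and that the repository root URL is known, and it replaces that root with the ^/ (relative to repository root) notation. The URLs in the usage lines are made up.

```python
def relativize_externals(property_value, repo_root_url):
    """Replace absolute URLs under repo_root_url with ^/ relative URLs.

    Assumes each line already uses the 1.5+ order:
        [revision] <URL> <local directory>
    """
    root = repo_root_url.rstrip("/")
    new_lines = []
    for line in property_value.splitlines():
        rewritten = []
        for token in line.split():
            if token == root or token.startswith(root + "/"):
                # Keep the part after the root; an empty remainder means
                # the token was the repository root itself.
                token = "^" + (token[len(root):] or "/")
            rewritten.append(token)
        new_lines.append(" ".join(rewritten))
    return "\n".join(new_lines)


# Hypothetical property value with one internal and one foreign external.
value = ("-r42 https://svn.example.com/repo/libs/foo vendor/foo\n"
         "https://other.example.org/x bar")
print(relativize_externals(value, "https://svn.example.com/repo/"))
```

Externals that point at other repositories are left untouched, as are tokens that merely share a prefix with the root (for example a sibling repository whose name starts the same way).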
