I've got a set of file directories that I am manipulating with python. However, all I care about is the last two levels of the directory. So if I had
"topdirectory/sub1/subsub1/subsubsub1/target"
"topdirectory/sub1/target"
The necessary returned strings would be
"subsubsub1/target"
and
"sub1/target"
I know Python has a string split method, but how can I tell it to grab only the LAST 2 components separated by delimiters?
Edit: Sorry guys, I should have explained that this is not REALLY a directory/file setup, but a time-series DB that very closely resembles one. I figured it would just be easier to explain that way. The paths are essentially directories/files, but since it is a database, using the OS utilities wouldn't have any effect.
The os.path module contains a split function for this. It returns the dirname and the basename. Run it twice and you have the last two bases.
Obviously, you want some checking that there are two or more bases as well.
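A minimal sketch of the split-twice idea, including the check for fewer than two components. It uses posixpath (the '/'-based implementation behind os.path on POSIX systems) so the delimiter stays '/' on every platform; the function name last_two is made up for illustration.

```python
import posixpath


def last_two(path):
    """Return the last two '/'-separated components of path."""
    head, tail = posixpath.split(path)      # ('a/b', 'c') for 'a/b/c'
    parent = posixpath.basename(head)       # 'b', or '' if there is no parent
    # Fall back to the single component when there is nothing above it.
    return posixpath.join(parent, tail) if parent else tail


deep = last_two("topdirectory/sub1/subsub1/subsubsub1/target")
shallow = last_two("topdirectory/sub1/target")
single = last_two("target")
```

With the two paths from the question this gives "subsubsub1/target" and "sub1/target"; a path with only one component is returned unchanged.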
Try
"topdirectory/sub1/subsub1/subsubsub1/target".rsplit('/',2)[-2:]
This approach works for any string in general.
But as stated in the comments, if you are referring to a system path, I'd rather use the os module as suggested by Sean Perry. Note that the path delimiter can differ between operating systems.
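Since the question asks for strings rather than lists, the rsplit result can be joined back together. A small sketch, where the helper name last_two_components and its sep parameter are assumptions for illustration:

```python
def last_two_components(path, sep="/"):
    """Keep only the last two sep-separated components of path."""
    # rsplit with maxsplit=2 splits from the right at most twice, so the
    # final two list items are always the last two components.
    return sep.join(path.rsplit(sep, 2)[-2:])


deep = last_two_components("topdirectory/sub1/subsub1/subsubsub1/target")
shallow = last_two_components("topdirectory/sub1/target")
```

Because this is plain string handling, it also suits the database-style paths mentioned in the question's edit, where the delimiter is always '/'.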
Related
So I have tried to read the solutions to Python duplicating a forwardslash from my code so it can find the file and most of the questions seem to indicate adding r' solves the problem.
In most of my code this works. But for this file path it is still duplicating all of the forwardslashes. Does anyone know why this would be the case?
I also tried using pathlib.Path to string together my path and it has produced the same result
For privacy I have removed the true file path but it is still replicating the issue. This is in my Jupyter Notebook.
"Raw strings" are the exact same type as regular strings, just a different way of entering them as input. Because their in-memory representation is identical, their "rawness" doesn't persist past the parser and change the way they behave later.
Thus, they look the same as any other string when repr()ed. You'll note that the representation doesn't include the r'...' sigils, only '...'. Since a backslash typed in a raw string (for example r'a\b') is written as '\\' in a non-raw string ('a\\b'), the interpreter is correct to display it that way.
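A short demonstration of the point, using a made-up Windows-style path (the real path from the question was removed by the asker):

```python
# Hypothetical path, entered once as a raw string and once with escapes.
raw = r"C:\Users\example\file.txt"
regular = "C:\\Users\\example\\file.txt"

same_value = raw == regular      # both literals produce the same str object value
shown_by_repr = repr(raw)        # what Jupyter echoes for a bare expression
shown_by_print = str(raw)        # what print() writes

print(shown_by_repr)    # backslashes appear doubled
print(shown_by_print)   # backslashes appear once
```

The doubled backslashes exist only in the repr() display, which is what a notebook shows when a cell ends with a bare expression. The string itself contains single backslashes, so passing it to open() or pathlib.Path works as expected.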
There was an absent file path that needed to be included
Suppose I have two file paths as strings in Python, as an example, let's say they are these two:
C:/Users/testUser/Program/main.py
C:/Users/testUser/Program/data/somefile.txt
Is there a way, using the os module, to generate a relative path based off of the first one? For example, feeding the two above to produce:
data/somefile.txt
I realize this is possible with string manipulation, by splitting off the files at the ends and cutting the first string out of the second, but is there a more robust way, probably using the python os module?
Thanks to MPlanchard in the comment below, here is the full answer:
import os
string1 = "C:/Users/testUser/Program/main.py"
string2 = "C:/Users/testUser/Program/data/somefile.txt"
os.path.relpath(string2, os.path.dirname(string1))
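As an alternative sketch, not part of the original answer, pathlib can do the same thing. PureWindowsPath is used here so the drive-letter paths are parsed the same way on any operating system, and as_posix() returns the result with forward slashes:

```python
from pathlib import PureWindowsPath

string1 = "C:/Users/testUser/Program/main.py"
string2 = "C:/Users/testUser/Program/data/somefile.txt"

# Directory that contains the first file.
base_dir = PureWindowsPath(string1).parent

# relative_to raises ValueError if string2 is not located under base_dir.
relative = PureWindowsPath(string2).relative_to(base_dir).as_posix()
```

Unlike os.path.relpath, relative_to never produces ".." segments, so it only works when the second path lies inside the first file's directory.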
Basically when I have a python file like:
python-code.py
and use:
import (python-code)
the interpreter gives me syntax error.
Any ideas on how to fix it? Are dashes illegal in python file names?
You should check out PEP 8, the Style Guide for Python Code:
Package and Module Names Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
Since module names are mapped to file names, and some file systems are case insensitive and truncate long names, it is important that module names be chosen to be fairly short -- this won't be a problem on Unix, but it may be a problem when the code is transported to older Mac or Windows versions, or DOS.
In other words: rename your file :)
One other thing to note in your code is that import is not a function, so import(python-code) should be import python-code. As some have already mentioned, that is interpreted as "import python minus code", which is not what you intended. If you really need to import a file with a dash in its name, you can do the following:
python_code = __import__('python-code')
But, as also mentioned above, this is not really recommended. You should change the filename if it's something you control.
TLDR
Dashes are not illegal but you should not use them for 3 reasons:
You need special syntax to import files with dashes
Nobody expects a module name with a dash
It's against the recommendations of the Python Style Guide
If you definitely need to import a file name with a dash the special syntax is this:
module_name = __import__('module-name')
Curious about why we need special syntax?
The reason for the special syntax is that when you write import somename you're creating a module object with identifier somename (so you can later use it with e.g. somename.funcname). Of course module-name is not a valid identifier and hence the special syntax that gives a valid one.
You don't get why module-name is not valid identifier?
Don't worry -- I didn't either. Here's a tip to help you: Look at this python line: x=var1-var2. Do you see a subtraction on the right side of the assignment or a variable name with a dash?
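The tip above can be checked directly. This sketch uses str.isidentifier() and the ast module to show how Python treats the dash:

```python
import ast

# A dash makes the name an invalid identifier; an underscore does not.
dash_ok = "python-code".isidentifier()
underscore_ok = "python_code".isidentifier()

# Parse the example line and inspect the right-hand side of the assignment.
assignment = ast.parse("x=var1-var2").body[0]
is_subtraction = (
    isinstance(assignment.value, ast.BinOp)
    and isinstance(assignment.value.op, ast.Sub)
)
```

Here dash_ok is False, underscore_ok is True, and is_subtraction is True: the parser reads var1-var2 as a subtraction of two names, never as one name containing a dash.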
PS
Nothing original in my answer except including what I considered to be the most relevant bits of information from all other answers in one place
The problem is that python-code is not an identifier. The parser sees this as python minus code. Of course this won't do what you're asking. You will need to use a filename that is also a valid python identifier. Try replacing the - with an underscore.
On Python 3 use import_module:
from importlib import import_module
python_code = import_module('python-code')
More generally,
import_module('package.subpackage.module')
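A self-contained sketch of import_module with a dashed file name. The temporary directory, the file contents, and the greet function are made up for the demonstration:

```python
import sys
import tempfile
from importlib import import_module
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a throwaway module whose file name contains a dash.
    Path(tmp, "python-code.py").write_text("def greet():\n    return 'hello'\n")

    # The directory must be on sys.path for the import system to find it.
    sys.path.insert(0, tmp)
    try:
        python_code = import_module("python-code")
    finally:
        sys.path.remove(tmp)

    result = python_code.greet()
```

The module is bound to the valid identifier python_code, which is why this works even though a plain import statement cannot express the dashed name.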
You could probably import it through some __import__ hack, but if you don't already know how, you shouldn't. Python module names should be valid variable names ("identifiers") -- that means if you have a module foo_bar, you can use it from within Python (print foo_bar). You wouldn't be able to do so with a weird name (print foo-bar -> syntax error).
Although proper file naming is the best course, if python-code is not under our control, a hack using __import__ is better than copying, renaming, or otherwise messing around with other authors' code. However, I tried and it didn't work unless I renamed the file adding the .py extension. After looking at the doc to derive how to get a description for .py, I ended up with this:
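import imp  # note: imp is deprecated and was removed in Python 3.12
# Open the file first so the finally clause only runs once the handle exists.
python_code_file = open("python-code")
try:
    # ('.py', 'U', 1) is the (suffix, mode, type) description; 1 is imp.PY_SOURCE.
    python_code = imp.load_module('python_code', python_code_file, './python-code', ('.py', 'U', 1))
finally:
    python_code_file.close()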
It created a new file python-codec on the first run.
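On current Python 3 versions, where imp is no longer available, a similar effect can be sketched with importlib. This is an alternative technique rather than the code from the answer above; the helper name load_source and the demo file contents are assumptions:

```python
import importlib.util
import tempfile
from importlib.machinery import SourceFileLoader
from pathlib import Path


def load_source(module_name, file_path):
    """Load a Python source file that may lack the .py extension."""
    loader = SourceFileLoader(module_name, str(file_path))
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)  # runs the file's code in the new module
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Extension-less script with a dash in its name, as in the answer above.
    script = Path(tmp, "python-code")
    script.write_text("ANSWER = 42\n")
    python_code = load_source("python_code", script)
    answer = python_code.ANSWER
```

Naming the loader class explicitly is what makes the missing extension acceptable, since the import system otherwise picks a loader based on the file suffix.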
I'm trying to get a multi-line comment to use variables in PyYAML but not sure if this is even possible.
So, in YAML, you can assign a variable like:
current_host: &hostname myhost
But it doesn't seem to expand in the following:
test: |
  Hello, this is my string
  which is running on *hostname
Is this at all possible or am I going to have to use Python to parse it?
The anchor (&some_id) and reference (*some_id) mechanism is essentially meant to let different parts of the tree representation of a YAML document share complete nodes. This is necessary, for example, so that one and the same complex item (a sequence/list or a mapping/dict) that occurs twice in a list loads as a single object, instead of as two copies with the same values.
So yes, you need to do the parsing in Python. You could start with the mechanism I provided in this answer and change the test
if node.value and node.value.startswith(self.d['escape'])
to find the escape character in any place in the scalar and take appropriate action.
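A rough sketch of doing the substitution in Python after loading. It does not reproduce the mechanism from the linked answer: the ${name} placeholder syntax is an assumption borrowed from string.Template, not a YAML feature, and the data dict stands in for whatever the YAML loader returns.

```python
from string import Template


def expand(node, variables):
    """Recursively replace ${name} placeholders in every string of node."""
    if isinstance(node, str):
        # safe_substitute leaves unknown placeholders untouched.
        return Template(node).safe_substitute(variables)
    if isinstance(node, dict):
        return {key: expand(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [expand(item, variables) for item in node]
    return node


# Stand-in for the loaded document; the block scalar uses ${current_host}
# instead of *hostname, which YAML does not expand inside scalars.
data = {
    "current_host": "myhost",
    "test": "Hello, this is my string\nwhich is running on ${current_host}\n",
}
expanded = expand(data, {"current_host": data["current_host"]})
```

After this, expanded["test"] ends with "which is running on myhost", while the other values are left as they were.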
You can find the answer here.
Just use a + between lines, and enclose your strings in single quotes.
What is the idiomatic/canonical/best way to get a sibling file (name) in Python 2.7?
That is, if there is file like 'C:\\path\\file.bin' or '/path/file.bin',
how to get 'C:\\path\\anothername.anotherext' or '/path/anothername.anotherext'.
String manipulation, searching for last path separator and replacing the part after that, would of course work, but this seems awfully crude.
In other words, what is idiomatic Python equivalent(s) of these Java snippets:
File sibling = new File(file.getParent(), siblingName);
Or a bit longer for pathname strings:
String siblingPathName = new File(new File(filePathName).getParent(), siblingName).toString();
Note: Java being used above is not relevant to the question.
Note2: If Python 3 has a new way, it would be nice to know that too, even though I'm using Python 2.7 atm.
Use os.path.dirname to get the directory of the file, and then os.path.join to add the sibling's name.
os.path.join(os.path.dirname(f), siblingname)
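A sketch covering both parts of the question. The first function is the os.path approach from the answer, written with posixpath so the example behaves the same on every platform; the second uses pathlib's with_name, which is available in Python 3. The function names are made up for illustration.

```python
import posixpath
from pathlib import PurePosixPath, PureWindowsPath


def sibling(path, sibling_name):
    """os.path-style approach: directory of path joined with the new name."""
    return posixpath.join(posixpath.dirname(path), sibling_name)


def sibling_pathlib(path, sibling_name, flavour=PurePosixPath):
    """Python 3 approach: with_name replaces the final path component."""
    return str(flavour(path).with_name(sibling_name))


posix_result = sibling("/path/file.bin", "anothername.anotherext")
pathlib_result = sibling_pathlib("/path/file.bin", "anothername.anotherext")
windows_result = sibling_pathlib(
    "C:\\path\\file.bin", "anothername.anotherext", flavour=PureWindowsPath
)
```

In real code os.path and pathlib.Path pick the platform's own rules automatically; the explicit flavours are only used here so that both path styles from the question can be shown side by side.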