What is the idiomatic/canonical/best way to get a sibling file (name) in Python 2.7?
That is, if there is a file like 'C:\\path\\file.bin' or '/path/file.bin',
how do I get 'C:\\path\\anothername.anotherext' or '/path/anothername.anotherext'?
String manipulation (searching for the last path separator and replacing the part after it) would of course work, but this seems awfully crude.
In other words, what is the idiomatic Python equivalent of these Java snippets:
File sibling = new File(file.getParent(), siblingName);
Or a bit longer for pathname strings:
String siblingPathName = new File(new File(filePathName).getParent(), siblingName).toString();
Note: Java being used above is not relevant to the question.
Note2: If Python 3 has a new way, it would be nice to know that too, even though I'm using Python 2.7 atm.
Use os.path.dirname to get the directory of the file, and then os.path.join to add the sibling's name.
os.path.join(os.path.dirname(f), siblingname)
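As a minimal sketch of the one-liner above, plus the Python 3 option the question asks about: the standard library's pathlib module (Python 3.4+) has a with_name method that swaps the final path component. The helper name sibling_path and the sample paths are only illustrative assumptions.

```python
import os.path
from pathlib import PurePosixPath, PureWindowsPath


def sibling_path(path, sibling_name):
    """Return sibling_name placed in the same directory as path."""
    return os.path.join(os.path.dirname(path), sibling_name)


# os.path approach; the function itself also works on Python 2.7.
original = os.path.join("path", "file.bin")
sibling = sibling_path(original, "anothername.anotherext")

# pathlib approach (Python 3.4+). The Pure* classes only manipulate
# strings and never touch the file system.
posix_sibling = PurePosixPath("/path/file.bin").with_name("anothername.anotherext")
windows_sibling = PureWindowsPath("C:\\path\\file.bin").with_name("anothername.anotherext")
```

With a concrete pathlib.Path object the same with_name call applies, using the conventions of the platform the script runs on.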
Related
I'm going through the book Dive into Python 3, and in one of its parts there's a .py file containing a function with a triple-quoted string that, as the book explains it, documents the function.
When I go into the python shell and import said file and write
print(humansize.approximate_size.__doc__)
it gives me back the said triple quoted string.
I decided I'd give it a try myself and created a triple-quoted string right under the other one. I saved the file and ran the same code in the Python shell, but only the first one appeared. Why is that?
Do I need to install some separate tool to document code?
Thank you!
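A minimal sketch of the behaviour described above: Python stores a string literal as __doc__ only when it is the first statement in the function body. A second triple-quoted string placed under it is an ordinary expression statement with no effect, so no separate tool is involved. The function below is a hypothetical stand-in, not the book's humansize code.

```python
def approximate_size_demo(size):
    """First string: this becomes the docstring."""
    """Second string: an expression statement that is ignored."""
    return size


doc = approximate_size_demo.__doc__
print(doc)
```

To document more, the extra text would go inside the first triple-quoted string rather than in a second one.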
I have a python program that reads a csv file, makes some changes, then writes to an HTML file. The issue is a block of code where I'm trying to search for a string assigned to one variable, then replace it with another string assigned to another variable. I am able to read a line in the csv file that looks like this:
Link:,www.google.com
And I am successful in writing an html file with the following:
<tr><td>Link:</td><td>www.google.com</td></tr>
Essentially I want to go further with an added step to find www.google.com between the anchor tags and replace it with "GOOGLE".
I've researched 'find and replace' functions built into python, and I came up with the substitution function inside the regular expressions module (re.sub()). This might not be the best way to do it and I'm trying to figure out if there's a better function/module out there I should look into.
for line in file:
    newHTML.write(re.sub(var1,var2,line,flags=re.MULTILINE), end='')
    newHTML.write(re.sub(var3,var4,line,flags=re.MULTILINE), end='')
The error I am receiving is:
newHTML.write(re.sub(var1,var2,line,flags=re.MULTILINE), end='')
TypeError: write() takes no keyword arguments
If I comment out this code, the rest of the program runs fine albeit without finding and replacing these variables.
Perhaps re.sub() doesn't go well with write()?
The error says what the problem is: as @furas commented, write() is not the same as print() and doesn't accept the end='' keyword argument. file.write() doesn't append a newline unless you explicitly include a \n, so it should work if you change the line to:
newHTML.write(re.sub(var1,var2,line,flags=re.MULTILINE))
Also, regex and HTML aren't the best of friends... Your case is simple enough that using regex is fine, but you mentioned looking for a better module to generate HTML. This SO question had some good suggestions in the answers. Notable mentions for creating HTML templates on there were xml.etree, jinja2 (Flask's default engine), and yattag.
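A minimal sketch of the corrected loop, under some assumptions: the patterns and replacement strings stand in for var1 to var4, and an in-memory io.StringIO stands in for the output file. Note that the loop in the question calls write() twice per line, so every line would be written twice; chaining the substitutions and writing once avoids that.

```python
import io
import re

# Hypothetical stand-ins for var1..var4 and for the rows built earlier.
replacements = [
    (re.escape("www.google.com"), "GOOGLE"),
    (re.escape("Link:"), "Site:"),
]
lines = ["<tr><td>Link:</td><td>www.google.com</td></tr>\n"]

new_html = io.StringIO()  # stands in for the open output file
for line in lines:
    # Apply every substitution to the same line before writing it.
    for pattern, replacement in replacements:
        line = re.sub(pattern, replacement, line)
    new_html.write(line)  # write() takes only the string, no end= argument

result = new_html.getvalue()
```

For plain fixed strings like these, line.replace(old, new) would do the same job without a regular expression.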
MyDir = os.getcwd().split(os.sep)[-1]
command = re.search(r"(MyDir)", body).group(1)
etc
Hi guys,
I am trying to have a Python script (on Windows) search my Outlook email body for certain words using regex.
That works fine, coupled with the rest of the script (not shown), but the minute I want it to search for a variable, i.e. MyDir, it does nothing when I fire off an email to myself with the word "documents" in the body ("documents" being the directory the script is located in on this occasion, though the variable should be populated with whatever top-level directory the script is being run from).
I have read that re.escape is a method to consider, and I have copied lots of different variations and examples and adapted them to my scenario, but none have worked. I have also built the regex as a string, still no joy.
Is there anything in my MyDir "variable" that is throwing the regex search off?
I am stumped. It's my first Python script, so I am sure I am doing something wrong, or maybe I can't use os.getcwd().split(os.sep)[-1] inside a regex and have it look at the variable's value instead of the literal string!
Thanks for any help. I have read through similar regex+variable posts on here, but they haven't worked for me.
:)
Try:
command = re.search("(" + re.escape(MyDir) + ")", body).group(1)
You are searching for the string "MyDir", not the value of the variable MyDir. You could use str.format:
command = re.search(r"({})".format(MyDir), body).group(1)
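A minimal sketch contrasting the two patterns. The fixed my_dir value and the email body are hypothetical stand-ins chosen so the example is reproducible; in the real script the value would come from os.getcwd().

```python
import re

my_dir = "documents"  # stand-in for os.getcwd().split(os.sep)[-1]
body = "Please back up the documents folder tonight."  # hypothetical email body

# The pattern r"(MyDir)" looks for the five characters M-y-D-i-r.
literal_match = re.search(r"(MyDir)", body)

# Building the pattern from the variable's value; re.escape protects
# any regex metacharacters the directory name might contain.
variable_match = re.search("(" + re.escape(my_dir) + ")", body)

# re.search returns None when nothing matches, so check before .group().
command = variable_match.group(1) if variable_match else None
```

Calling .group(1) directly on a failed search raises AttributeError, which is why the sketch checks the match object first.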
Basically when I have a python file like:
python-code.py
and use:
import (python-code)
the interpreter gives me a syntax error.
Any ideas on how to fix it? Are dashes illegal in python file names?
You should check out PEP 8, the Style Guide for Python Code:
Package and Module Names
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
Since module names are mapped to file names, and some file systems are case insensitive and truncate long names, it is important that module names be chosen to be fairly short -- this won't be a problem on Unix, but it may be a problem when the code is transported to older Mac or Windows versions, or DOS.
In other words: rename your file :)
One other thing to note in your code is that import is not a function. So import(python-code) should be import python-code which, as some have already mentioned, is interpreted as "import python minus code", not what you intended. If you really need to import a file with a dash in its name, you can do the following:
python_code = __import__('python-code')
But, as also mentioned above, this is not really recommended. You should change the filename if it's something you control.
TLDR
Dashes are not illegal but you should not use them for 3 reasons:
You need special syntax to import files with dashes
Nobody expects a module name with a dash
It's against the recommendations of the Python Style Guide
If you definitely need to import a file name with a dash the special syntax is this:
module_name = __import__('module-name')
Curious about why we need special syntax?
The reason for the special syntax is that when you write import somename you're creating a module object with identifier somename (so you can later use it with e.g. somename.funcname). Of course module-name is not a valid identifier and hence the special syntax that gives a valid one.
You don't get why module-name is not a valid identifier?
Don't worry -- I didn't either. Here's a tip to help you: Look at this python line: x=var1-var2. Do you see a subtraction on the right side of the assignment or a variable name with a dash?
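A small sketch of the point about identifiers, assuming Python 3 (str.isidentifier is not available on Python 2). The names used are only illustrative.

```python
names = ["python_code", "python-code"]
# str.isidentifier reports whether a string could be used as a name.
validity = {name: name.isidentifier() for name in names}

# Compiling the import statement shows the parser rejecting the dash.
try:
    compile("import python-code", "<demo>", "exec")
    import_parses = True
except SyntaxError:
    import_parses = False

# The dash between two names is read as subtraction.
var1, var2 = 5, 3
x = var1-var2
```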
PS
Nothing original in my answer except including what I considered to be the most relevant bits of information from all other answers in one place
The problem is that python-code is not an identifier. The parser sees this as python minus code. Of course this won't do what you're asking. You will need to use a filename that is also a valid python identifier. Try replacing the - with an underscore.
On Python 3 use import_module:
from importlib import import_module
python_code = import_module('python-code')
More generally,
import_module('package.subpackage.module')
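A self-contained sketch of import_module with a dashed file name. It writes a hypothetical python-code.py into a temporary directory so the example can run anywhere; the greet function and its return value are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module file whose name contains a dash.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "python-code.py"), "w") as handle:
    handle.write("def greet():\n    return 'hello from python-code'\n")

# Make the directory importable, then import by string name.
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
python_code = importlib.import_module("python-code")

greeting = python_code.greet()
```

The module is bound to the valid identifier python_code, while sys.modules keeps it under the key 'python-code'.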
You could probably import it through some __import__ hack, but if you don't already know how, you shouldn't. Python module names should be valid variable names ("identifiers") -- that means if you have a module foo_bar, you can use it from within Python (print foo_bar). You wouldn't be able to do so with a weird name (print foo-bar -> syntax error).
Although proper file naming is the best course, if python-code is not under our control, a hack using __import__ is better than copying, renaming, or otherwise messing around with other authors' code. However, I tried and it didn't work unless I renamed the file adding the .py extension. After looking at the doc to derive how to get a description for .py, I ended up with this:
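import imp

# The description tuple is (suffix, mode, type); 1 is imp.PY_SOURCE.
# Open the file before the try block so close() is only called on success.
python_code_file = open("python-code")
try:
    python_code = imp.load_module('python_code', python_code_file, './python-code', ('.py', 'U', 1))
finally:
    python_code_file.close()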
It created a new file python-codec on the first run.
I've got a set of file directories that I am manipulating with python. However, all I care about is the last two levels of the directory. So if I had
"topdirectory/sub1/subsub1/subsubsub1/target"
"topdirectory/sub1/target"
The necessary returned strings would be
"subsubsub1/target"
and
"sub1/target"
I know Python has a string split method, but how can I tell it to grab only the LAST 2 components separated by delimiters?
Edit: Sorry guys, I should have explained that this is not REALLY a directory/file setup, but a time series DB that very closely resembles one. I figured it would just be easier to explain that way. The paths are essentially directories/files, but since it is a database, using the OS utilities wouldn't have any effect.
The os.path module contains a split function for this. It returns the dirname and the basename. Run it twice and you have the last two bases.
Obviously, you want some checking that there are two or more bases as well.
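A minimal sketch of running os.path.split twice, with the check for at least two components. The helper name last_two_components is hypothetical, and the result is joined with "/" because the question's paths use that delimiter.

```python
import os.path


def last_two_components(path):
    """Return the last two components of path joined by '/'."""
    head, last = os.path.split(path)         # (everything before, basename)
    _, second_last = os.path.split(head)     # split the remainder once more
    if not last or not second_last:
        raise ValueError("path has fewer than two components: %r" % path)
    return second_last + "/" + last


deep = last_two_components("topdirectory/sub1/subsub1/subsubsub1/target")
shallow = last_two_components("topdirectory/sub1/target")
```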
Try
"topdirectory/sub1/subsub1/subsubsub1/target".rsplit('/',2)[-2:]
This approach works for any string in general.
But as stated in the comments, if you are working with actual file system paths, I'd rather use the os module as suggested by Sean Perry. Note that the path delimiter can differ between operating systems.
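Since rsplit returns a list while the question asks for strings, a small sketch that joins the pieces back together may help. The helper name last_two and the sep parameter are illustrative assumptions.

```python
def last_two(path, sep="/"):
    """Return the last two sep-delimited components of path as one string."""
    # rsplit with maxsplit=2 splits from the right at most twice;
    # the final two items are the components we want.
    return sep.join(path.rsplit(sep, 2)[-2:])


deep = last_two("topdirectory/sub1/subsub1/subsubsub1/target")
shallow = last_two("topdirectory/sub1/target")
single = last_two("target")
```

A string with only one component comes back unchanged, so a caller that needs exactly two components would have to check for that case.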