Why is r' still duplicating the forward slashes in my code?

I have read the existing answers about Python duplicating a forward slash in a file path, and most of them say that adding r' in front of the string solves the problem.
In most of my code this works, but for this file path it is still duplicating all of the forward slashes. Does anyone know why this would be the case?
I also tried using pathlib.Path to build my path, and it produced the same result.
For privacy I have removed the true file path, but the placeholder path still reproduces the issue. This is in a Jupyter Notebook.

"Raw strings" are the exact same type as regular strings, just a different way of entering them as input. Because their in-memory representation is identical, their "rawness" doesn't persist past the parser and change the way they behave later.
Thus, they look the same when repr()ed as any other string: you'll note that the representation doesn't include the r'...' prefix, only '...'. The way to write a single backslash in a non-raw string literal is '\\', so the interpreter is correct to display each backslash doubled. The string itself still contains only one backslash at each position, which is what print() shows.
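A short sketch of this point, using a made-up path (the question's real path was not shown): the raw and escaped spellings compare equal, and only repr() shows the doubled backslashes.

```python
# Hypothetical path for illustration; the question's real path was not shown.
raw = r'C:\temp\data.csv'
escaped = 'C:\\temp\\data.csv'

# Both literals produce strings with identical contents.
same = (raw == escaped)

# repr() shows Python source syntax, so each backslash is written as \\.
as_repr = repr(raw)

# str() and print() show the actual characters: one backslash each.
as_text = str(raw)

print(same)             # True
print(as_repr)          # 'C:\\temp\\data.csv'
print(as_text)          # C:\temp\data.csv
print(raw.count('\\'))  # 2 real backslash characters in the string
```

In a Jupyter notebook, evaluating a bare expression as the last line of a cell displays its repr(), which is why the doubled backslashes show up there even though the string is fine.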

There was an absent file path that needed to be included

Related

Exporting multiple csv files with dynamic naming

I created about 200 csv files in Python and now need to download them all.
I created the files from a single file using:
g = df.groupby("col")
for n, g in df.groupby('col'):
    g.to_csv(n+'stars'+'.csv')
When I try to use this same statement to export to my machine I get a syntax error and I'm not sure what I'm doing wrong:
g = df.groupby("col")
for n, g in df.groupby('col'):
    g.to_csv('C:\Users\egagne\Downloads\'n+'stars'+'.csv'')
Error:
File "<ipython-input-27-43a5bfe55259>", line 3
g.to_csv('C:\Users\egagne\Downloads\'n+'stars'+'.csv'')
^
SyntaxError: invalid syntax
I'm in Jupyter lab, so I can download each file individually but I really don't want to have to do that.

You're possibly mixing up integers and strings (n may not be a string), and the use of backslashes in string literals is dangerous anyway. Consider using the following:
import os
and, inside the loop,
    f_name = os.path.join('C:\\', 'Users', 'egagne', 'Downloads', str(n) + 'stars.csv')
    g.to_csv(f_name)
with os.path.join taking care of the remaining backslashes for you. The drive root is written as 'C:\\' because os.path.join does not insert a separator after a bare 'C:'.

g.to_csv('C:\Users\egagne\Downloads\'n+'stars'+'.csv'')
needs to be
g.to_csv('C:\\Users\\egagne\\Downloads\\'+n+'stars.csv').
There were two things wrong -- the backslash is an escape character so if you put a ' after it, it will be treated as part of your string instead of a closing quote as you intended it. Using \\ instead of a single \ escapes the escape character so that you can include a backslash in your string.
Also, you did not pair your quotes correctly. n is a variable name but from the syntax highlighting in your question it is clear that it is part of the string. Similarly you can see that stars and .csv are not highlighted as part of a string, and the closing '' should be a red flag that something has gone wrong.
Edit: I addressed what is causing the problem, but Ami Tavory's answer is the right one. Even though you know this is going to run on Windows, it is better practice to use os.path.join() with directory names than to write out a path in a string. str(n) is also the right way to go if you are at all unsure about the type of n.
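The path-building advice in these answers can be sketched as follows. This is only an illustration under a few assumptions: csv_path is a hypothetical helper, the group keys 'A' and 7 are made up, and ntpath (the Windows flavour of os.path in the standard library) is used so the result is the same on any operating system.

```python
import ntpath  # Windows flavour of os.path, importable on any OS


def csv_path(out_dir, group_key):
    """Build the path of '<group_key>stars.csv' inside out_dir (Windows rules)."""
    # str() guards against non-string group keys such as integers.
    return ntpath.join(out_dir, str(group_key) + 'stars.csv')


# Note the explicit root 'C:\\' rather than a bare 'C:'.
downloads = ntpath.join('C:\\', 'Users', 'egagne', 'Downloads')

# Hypothetical group keys; in the question they come from df.groupby('col').
paths = [csv_path(downloads, key) for key in ['A', 7]]

# A bare 'C:' gets no separator after it, so the result is drive-relative.
drive_relative = ntpath.join('C:', 'Users')

for p in paths:
    print(p)
# C:\Users\egagne\Downloads\Astars.csv
# C:\Users\egagne\Downloads\7stars.csv
print(drive_relative)  # C:Users
```

Inside the loop from the question this would be called as g.to_csv(csv_path(downloads, n)). On Windows, os.path is ntpath, so os.path.join behaves the same way there.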

How to open a file in its default program with python

I want to open a file in python 3.5 in its default application, specifically 'screen.txt' in Notepad.
I have searched the internet, and found os.startfile(path) on most of the answers. I tried that with the file's path os.startfile(C:\[directories n stuff]\screen.txt) but it returned an error saying 'unexpected character after line continuation character'. I tried it without the file's path, just the file's name but it still didn't work.
What does this error mean? I have never seen it before.
Please provide a solution for opening a .txt file that works.
EDIT: I am on Windows 7 on a restricted (school) computer.

It's hard to be certain from your question as it stands, but I bet your problem is backslashes.
[EDITED to add:] Or actually maybe it's something simpler. Did you put quotes around your pathname at all? If not, that will certainly not work -- but once you do, you will find that then you need the rest of what I've written below.
In a Windows filesystem, the backslash \ is the standard way to separate directories.
In a Python string literal, the backslash \ is used for putting things into the string that would otherwise be difficult to enter. For instance, if you are writing a single-quoted string and you want a single quote in it, you can do this: 'don\'t'. Or if you want a newline character, you can do this: 'First line.\nSecond line.'
So if you take a Windows pathname and plug it into Python like this:
os.startfile('C:\foo\bar\baz')
then the string actually passed to os.startfile will not contain those backslashes; it will contain a form-feed character (from the \f) and two backspace characters (from the \bs), which is not what you want at all.
You can deal with this in three ways.
You can use forward slashes instead of backslashes. Although Windows prefers backslashes in its user interface, forward slashes work too, and they don't have special meaning in Python string literals.
You can "escape" the backslashes: two backslashes in a row mean an actual backslash. os.startfile('C:\\foo\\bar\\baz')
You can use a "raw string literal". Put an r before the opening single or double quotes. This will make backslashes not get interpreted specially. os.startfile(r'C:\foo\bar\baz')
The last is maybe the nicest, except for one annoying quirk: a backslash immediately before a quote is still special in a raw string literal. It stops the quote from ending the string (and, unlike in 'don\'t', the backslash itself stays in the string), which means you can't end a raw string literal with a single backslash.
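The behaviour described above can be checked directly. The following is a small sketch using the same C:\foo\bar\baz example; the variable names are only for illustration.

```python
# The literal from the answer: \f and \b are escape sequences here.
broken = 'C:\foo\bar\baz'
has_form_feed = '\x0c' in broken            # \f became a form feed
backspaces = broken.count('\x08')           # each \b became a backspace
backslashes_in_broken = broken.count('\\')  # no real backslashes survive

# The three ways of writing the path safely.
forward = 'C:/foo/bar/baz'
escaped = 'C:\\foo\\bar\\baz'
raw = r'C:\foo\bar\baz'

# The raw-string quirk: the backslash before the quote stays in the string.
quirk = r'don\'t'

# One way to get a trailing backslash: append an escaped one.
trailing = r'C:\foo' + '\\'

print(has_form_feed, backspaces, backslashes_in_broken)  # True 2 0
print(escaped == raw)                                    # True
print(forward.replace('/', '\\') == raw)                 # True
print(len(quirk))                                        # 6
print(trailing)                                          # C:\foo\
```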

The recommended way to open a file with the default program is os.startfile. You can do something a bit more manual using os.system or subprocess though:
os.system('start ' + path_to_file)
or
subprocess.Popen('{start} {path}'.format(
    start='start', path=path_to_file), shell=True)
Of course, this won't work cross-platform, but it might be enough for your use case.
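If cross-platform behaviour does matter, one common pattern is to pick the opener by platform. This is a rough sketch, not part of the answer above: opener_command and open_with_default_app are hypothetical helper names, and it assumes the usual open (macOS) and xdg-open (Linux desktop) commands are installed.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the argv list for a non-Windows opener, or None on Windows."""
    if platform.startswith('win'):
        return None               # Windows: use os.startfile instead
    if platform == 'darwin':
        return ['open', path]     # macOS
    return ['xdg-open', path]     # assumed to exist on Linux desktops


def open_with_default_app(path):
    """Open path in its associated application."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)        # only available on Windows
    else:
        # Passing a list avoids shell quoting problems with spaces in path.
        subprocess.run(command, check=True)
```

Calling open_with_default_app('d:/test file.txt') on Windows would then go through os.startfile, as recommended.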

For example I created file "test file.txt" on my drive D: so file path is 'D:/test file.txt'
Now I can open it with associated program with that script:
import os
os.startfile('d:/test file.txt')

Grab the last 2 of a split string, python

I've got a set of file directories that I am manipulating with python. However, all I care about is the last two levels of the directory. So if I had
"topdirectory/sub1/subsub1/subsubsub1/target"
"topdirectory/sub1/target"
The necessary returned strings would be
"subsubsub1/target"
and
"sub1/target"
I know Python has a split string type method, but how can I tell it to only grab the LAST 2 components separated by delimiters?
Edit: Sorry guys, I should have explained that this is not REALLY a directory/file setup, but a timeseries DB that very closely resembles one. I figured it would just be easier to explain that way. The paths are essentially directories/files, but since it is a database, using the OS utilities wouldn't have any effect.

The os.path module contains a split function for this. It returns the dirname and the basename. Run it twice and you have the last two bases.
Obviously, you want some checking that there are two or more bases as well.

Try
"topdirectory/sub1/subsub1/subsubsub1/target".rsplit('/',2)[-2:]
This approach works for any string in general.
But as stated in the comments, if these are real file system paths, I'd rather use the os module as suggested by Sean Perry. Note that the delimiter can differ between operating systems.
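Both suggestions can be sketched as small helpers. The function names are hypothetical, and posixpath (the '/'-separated flavour of os.path) is used in the second one so the result does not depend on the operating system.

```python
import posixpath  # the '/'-separated flavour of os.path


def last_two_rsplit(path, sep='/'):
    """Keep the last two components using plain string splitting."""
    # rsplit with maxsplit=2 yields at most three pieces; keep the last two.
    return sep.join(path.rsplit(sep, 2)[-2:])


def last_two_split(path):
    """Keep the last two components by calling split twice."""
    head, last = posixpath.split(path)
    head, second = posixpath.split(head)
    # posixpath.join ignores an empty first part, so short paths still work.
    return posixpath.join(second, last)


examples = [
    "topdirectory/sub1/subsub1/subsubsub1/target",
    "topdirectory/sub1/target",
    "target",
]
for p in examples:
    print(last_two_rsplit(p), last_two_split(p))
# subsubsub1/target subsubsub1/target
# sub1/target sub1/target
# target target
```

Since the paths in the question are database keys rather than real files, the rsplit version is probably the closer fit; it never touches the file system and lets the separator be chosen explicitly.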

Vexing Python syntax error

I am writing a python script using version 2.7.3. In the script a line is
toolsDir = 'tools/'
When I run this in terminal I get SyntaxError: invalid syntax on the last character in the string 'r'. I've tried renaming the string, using " as opposed to '. If I actually go into python via bash and declare the string in one line and print it I get no error.
I checked the encoding via file -i update.py and I get text/x-python; charset=us-ascii
I have used TextWrangler, nano and LeafPad as the text editors.
I have a feeling it may be something with the encoding of one of the editors. I have had this script run before without any errors.
Any advice would be greatly appreciated.

The string is 'tools/'. toolsDir is a variable. You're free to use different terminology, of course, but you'll end up confusing people trying to help you. The only r in that line is the last character of the variable name, so I assume that's the location of the error.
Most likely you've managed to introduce a non-breaking space (character code 0xA0) instead of an ordinary space. Try deleting the " = " in that line (space, equals sign, space: all three characters) and retyping it.
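A quick way to check for such invisible characters is to scan the source text for anything outside ASCII. This is a minimal sketch; find_non_ascii is a hypothetical helper and the sample line is made up to contain a non-breaking space.

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, code point, name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.split('\n'), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, 'UNNAMED')
                hits.append((line_no, col_no, 'U+%04X' % ord(ch), name))
    return hits


# Made-up sample: a non-breaking space sits between the name and the '='.
sample = "toolsDir\xa0= 'tools/'\n"
for hit in find_non_ascii(sample):
    print(hit)
# (1, 9, 'U+00A0', 'NO-BREAK SPACE')
```

For a real script, the text would be read from the file first, for example with open('update.py', encoding='utf-8').read().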

Try running the code through pylint.
You probably have a syntax error on a nearby line before this one. Try commenting this line out and see if the error moves.
You might have a whitespace error; don't forget that whitespace counts in Python. If you've mixed tabs and spaces anywhere in your file, it can throw the syntax checker off by several lines.
If you copied and pasted lines into this from any other source you may have copied whitespace in that doesn't fit with whichever convention you used.

The error was, of course, a silly one.
In one of my imports I used try: without a matching except or finally clause. pylint did not catch this, and the error message did not point to it.
If someone runs into this in the future: triple-check all the code that comes before the reported line for syntax errors.
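The situation can be reproduced in miniature with compile(). This is a sketch under assumptions: the two source strings are a made-up reconstruction of the script, and first_syntax_error is a hypothetical helper. The interpreter reports the error where it notices the problem, which is typically after the unfinished try block; the exact line and message vary between Python versions.

```python
def first_syntax_error(source, filename='<example>'):
    """Compile source and return the SyntaxError raised, or None if it parses."""
    try:
        compile(source, filename, 'exec')
    except SyntaxError as err:
        return err
    return None


# Made-up reconstruction: a try block with no except/finally clause.
broken = "try:\n    import json\ntoolsDir = 'tools/'\n"
# The same code with the missing clause added.
fixed = (
    "try:\n"
    "    import json\n"
    "except ImportError:\n"
    "    json = None\n"
    "toolsDir = 'tools/'\n"
)

err = first_syntax_error(broken)
print(err.lineno, err.msg)        # reported location and message
print(first_syntax_error(fixed))  # None
```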

How to recognize special eol character when I see it, using Python?

I'm scraping a set of originally pdf files, using Python. Having gotten them to text, I had a lot of trouble getting the line endings out. I couldn't figure out what the line separator was. The trouble is, I still don't know.
It's not a '\n', or, I don't think, '\r\n'. However, I've managed to isolate one of these special characters. I literally have it in memory, and by doing a call to my_str.replace(eol, ''), I can remove all of these characters from one of my files.
So my question is open-ended. I'm a bit lost when it comes to Unicode and such. How can I identify this character in my files without resorting to something ridiculous, like serializing it and then reading it in? Is there a way I can refer to it as a code, perhaps? I can't get Python to yield what it actually IS. All I ever see when I print it, or call unicode(special_eol), is the character in its functional usage as a newline.
Please help! Thanks, and sorry if I'm missing something obvious.

To determine what specific character that is, you can use str.encode('unicode_escape') or repr() to get (in Python 2) an ASCII-printable representation of the character:
>>> print u'☃'.encode('unicode_escape')
\u2603
>>> print repr(u'☃')
u'\u2603'
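For readers on Python 3, a comparable sketch is shown below. The describe helper is hypothetical, and the characters tried here (line separator, form feed) are only guesses at what a PDF extraction might produce; the question does not say which character it actually was.

```python
import unicodedata


def describe(ch):
    """Return the code point and Unicode name (if any) of one character."""
    return 'U+%04X %s' % (ord(ch), unicodedata.name(ch, '<no name>'))


print(describe('\u2603'))                  # U+2603 SNOWMAN
print(describe('\u2028'))                  # U+2028 LINE SEPARATOR
print(describe('\x0c'))                    # U+000C <no name>

# ascii() escapes every non-ASCII character in a whole string.
print(ascii('a\u2028b'))                   # 'a\u2028b'

# The unicode_escape codec from the answer returns bytes in Python 3.
print('\u2603'.encode('unicode_escape'))   # b'\\u2603'
```

Once the code point is known, the character can be written in code as an escape such as '\u2028' and passed to my_str.replace() directly.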
