OS Windows 10 Pro
Versions of xlwings, Excel, and Python (0.9.0, Office 365, Python 3.8.2)
I am new to using xlwings through VBA. I ran the exact code from a tutorial webpage in both VBA and Python, but it gives an error like this:
File "<string>", line 1
import sys, os; sys.path[0:0]=os.path.normcase(os.path.expandvars(r'C:\Users\User\Trial2;C:\Users\User\Trial2\Trial2.zip;C:\Users\User\Anaconda3\')).split(';'); import Trial2;Trial2.main()
SyntaxError: invalid syntax
I used the original VBA code from the tutorial, and the Python code I used is this:
import xlwings as xw
##xw.sub # only required if you want to import it or run it via UDF Server
def main():
    wb = xw.Book.caller()
    wb.sheets[0].range("A1").value = "Hello xlwings!"
##xw.func
def hello(name):
    return "hello {0}".format(name)
if __name__ == "__main__":
    xw.Book("Trial2.xlsm").set_mock_caller()
    main()
I could barely find any clue about this problem, so I'm hoping that someone can give me a solution.
I realize this is a long time after the initial question but I had the same issue and couldn't find an answer anywhere. After playing around (for much longer than I care to admit) I found the problem for me was that my .xlsm/.py file names contained a space. With no other changes, everything worked when I replaced the space with an underscore.
This is a quirk in Python's string literals. Even with raw strings the backslash escapes the quote character, so r"ends in quote\"" is valid. It also means that raw strings can't end in a single backslash: r"ends in slash\" is a syntax error. If you need to end a string with a backslash, you can't use a raw string. "ends in slash\\" is okay.
I'm not sure where the failing string comes from, but you need to change it to
import sys, os; sys.path[0:0]=os.path.normcase(os.path.expandvars('C:\\Users\\User\\Trial2;C:\\Users\\User\\Trial2\\Trial2.zip;C:\\Users\\User\\Anaconda3\\')).split(';'); import Trial2;Trial2.main()
See Python Lexical Analysis
Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character).
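The rule quoted above can be checked without crashing the script by handing candidate literals to compile(). This is only a minimal sketch: the helper name compiles and the shortened C:\Anaconda3\ path are made up for illustration, and the source text is assembled with chr(92) (a single backslash) so that the demo itself does not trip over the rule it demonstrates.

```python
def compiles(source):
    """Return True if `source` is accepted by Python as an expression."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


BS = chr(92)  # one backslash character

# Source text of a raw literal ending in one backslash: r'C:\Anaconda3\'
raw_trailing = "r'C:" + BS + "Anaconda3" + BS + "'"
# Source text of a regular literal with doubled backslashes: 'C:\\Anaconda3\\'
escaped_trailing = "'C:" + 2 * BS + "Anaconda3" + 2 * BS + "'"

print(compiles(raw_trailing))      # False: the final backslash escapes the quote
print(compiles(escaped_trailing))  # True

# One possible workaround: keep the body raw and append the trailing backslash separately.
path = r"C:\Users\User\Anaconda3" + "\\"
print(path)  # C:\Users\User\Anaconda3\
```

The concatenation on the last lines is one way to keep most of a path readable while still ending it in a backslash.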
import os
cwd = os.getcwd()
print("Current working directory: {0}".format(cwd))
# Print the type of the returned object
print("os.getcwd() returns an object of type: {0}".format(type(cwd)))
os.chdir(r"C:\Users\ghph0\AppData\Local\Programs\Python\Python39\Bootcamp\PDFs")
# Print the current working directory
print("Current working directory: {0}".format(os.getcwd()))
Hi all, I was changing my file directory so I could access specific files and was then greeted with this error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
From there I did some research and was told that converting the string to raw would fix the problem. My question is: why do I convert it to raw, what does that do, and why does it turn the file path red (not really important, but I have never seen this before)? Picture below:
https://i.stack.imgur.com/4oHlC.png
Many thanks to anyone that can help.
Backslashes in strings have a specific meaning in Python and are translated by the interpreter. You have surely already encountered "\n". Despite taking two characters to type, that is actually a one-character string meaning "newline". ANY backslashes in a regular string literal are interpreted that way. In your particular case, you used "\U", which is how Python allows typing long Unicode values; it must be followed by exactly eight hex digits. "\U0001F600", for example, is the grinning face emoji.
Because regular expressions often need to use backslashes for other uses, Python introduced the "raw" string. In a raw string, backslashes are not interpreted. So, r"\n" is a two-character string containing a backslash and an "n". This is NOT a newline.
Windows paths often use backslashes, so raw strings are convenient there. As it turns out, nearly every Windows file API will also accept forward slashes, so you can usually use those as well.
As for the colors, that probably means your editor doesn't know how to interpret raw strings.
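The difference between regular and raw literals described in this answer is easy to see by measuring the strings. A small sketch, assuming Python 3 (where every string is Unicode); the variable names are only illustrative:

```python
newline = "\n"   # regular literal: the escape is translated
raw = r"\n"      # raw literal: backslash and "n" are kept as typed

print(len(newline))   # 1
print(len(raw))       # 2
print(raw == "\\n")   # True: a raw "\n" equals an escaped backslash plus "n"

# \U needs exactly eight hex digits and yields a single character.
face = "\U0001F600"
print(len(face))        # 1
print(hex(ord(face)))   # 0x1f600

# In a raw literal, the \U in a path such as C:\Users is left alone.
prefix = r"C:\Users"
print(len(prefix))      # 8
```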
import pandas as pd
import numpy as ny
studentPerfomance = 'C:\Users\Vignesh\Desktop\project\students-performance-in-exams\StudentsPerformance.csv'
Error:
File "<ipython-input-10-056bf84aaa71>", line 1
studentPerfomance = 'C:\Users\Vignesh\Desktop\project\students-performance-in-exams\StudentsPerformance.csv'
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
Use the standard slash / and not the backslash. It is not good practice to use the backslash to separate folders. I do not know why Windows is still using this as the standard way to display paths.
The problem with the backslash is that it starts escape sequences like \n (new line) or \t (tab); in your path, the \U in C:\Users is read as the start of a Unicode escape.
So the solution is to replace all backslashes with a standard slash /.
import pandas as pd
import numpy as ny
studentPerfomance = 'C:/Users/Vignesh/Desktop/project/students-performance-in-exams/StudentsPerformance.csv'
The problem is that you are using a regular string literal for a path that contains backslashes.
Just put r before the string; this turns the regular string literal into a raw string literal:
studentPerfomance = r'C:\Users\Vignesh\Desktop\project\students-performance-in-exams\StudentsPerformance.csv'
or
studentPerfomance = 'C:\\Users\\Vignesh\\Desktop\\project\\students-performance-in-exams\\StudentsPerformance.csv'
In general, there is nothing wrong with what you did. I'm also proud of you for not having any spaces in your path (very unprofessional)! The issue is that the backslashes (\) in your studentPerfomance string are escape characters in Python, so Python tries to read each \ together with the characters after it as an escape sequence. Here the \U in C:\Users starts a Unicode escape that is never completed, hence the error.
That said, Windows uses backslashes in system paths instead of forward slashes like Linux-based operating systems, causing the users extra pain.
The best way to fix this issue is to prefix your string with an r, like so:
studentPerfomance = r'C:\Users\Vignesh\Desktop\project\students-performance-in-exams\StudentsPerformance.csv'
This tells Python to treat the backslashes literally instead of interpreting them as the start of escape sequences.
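The three spellings suggested in the answers above (raw string, doubled backslashes, forward slashes) can be compared directly. The sketch below uses pathlib.PureWindowsPath, which is not mentioned in the answers and is used here only because it applies Windows path rules on any operating system without touching the file system; the variable names are illustrative.

```python
from pathlib import PureWindowsPath

raw = r'C:\Users\Vignesh\Desktop\project\students-performance-in-exams\StudentsPerformance.csv'
doubled = 'C:\\Users\\Vignesh\\Desktop\\project\\students-performance-in-exams\\StudentsPerformance.csv'
forward = 'C:/Users/Vignesh/Desktop/project/students-performance-in-exams/StudentsPerformance.csv'

# The raw and doubled-backslash literals produce the identical string.
print(raw == doubled)  # True

# The forward-slash version is a different string but the same Windows path.
print(raw == forward)                                    # False
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # True
print(PureWindowsPath(forward).name)                     # StudentsPerformance.csv
```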
In the given example: "\info\more info\nName"
how would I turn this into bytes
I tried using unicode-escape but that didn't seem to work :(
data = "\info\more info\nName"
dataV2 = str.encode(data)
FinalData = dataV2.decode('unicode-escape').encode('utf_8')
print(FinalData)
This is where I should get b'\info\more info\nName'
but something unexpected happens and I get DeprecationWarnings in my terminal.
I'm assuming that it's because the backslashes cause an invalid escape sequence, but I need them for this project.
Backslashes before characters indicate an attempt to escape the character that follows to make it into a special character of some sort. You get the DeprecationWarning because Python is (finally) going to make unrecognized escapes an error, rather than silently treating them as a literal backslash followed by the character.
To fix, either double your backslashes (not sure if you intended a newline; if so, don't double the backslash before the n):
data = "\\info\\more info\\nName"
or, if you want all the backslashes to be literal backslashes (the \n shouldn't be a newline), then you can use a raw string by prefixing with r:
data = r"\info\more info\nName"
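which disables backslash escape processing. The one remaining special case is a backslash before the quote character: it keeps the quote from ending the string, and the backslash itself stays in the result.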
Note that if you just let data echo in the interactive interpreter, it will show the backslashes as doubled (because it implicitly uses the repr of the str, which is what you'd type to reproduce it). To avoid that, print the str to see what it would actually look like:
>>> "\\info\\more info\\nName" # repr produced by simply evaluating it, which shows backslashes doubled, but there's really only one each time
'\\info\\more info\\nName'
>>> print("\\info\\more info\\nName") # print shows the "real" contents
\info\more info\nName
>>> print("\\info\\more info\nName") # With new line left in place
\info\more info
Name
>>> print(r"\info\more info\nName") # Same as first option, but raw string means no doubling backslashes
\info\more info\nName
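To get from there to the bytes the question asks for, the raw string can be encoded directly. A minimal sketch, assuming Python 3 and that every backslash (including the one before the n) is meant literally; the name as_bytes is illustrative:

```python
data = r"\info\more info\nName"

# Encoding needs no unicode-escape round trip once the string holds literal backslashes.
as_bytes = data.encode("utf-8")

# The repr of bytes doubles each backslash, just like the repr of str does.
print(as_bytes)  # b'\\info\\more info\\nName'

# Decoding gives back the original text, with single backslashes.
print(as_bytes.decode("utf-8") == data)  # True
print(as_bytes.decode("utf-8"))          # \info\more info\nName
```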
You can escape a backslash with another backslash.
data = "\\info\\more info\nName"
You could also use a raw string for the parts that don't need escapes.
data = r"\info\more info""\nName"
Note that raw strings don't work if the final character is a backslash.
I have a list like this
dis=('a','b','c',100)
I want to write it to a .csv file (plan_to_prod2), but my folder names are integers:
my_df = pd.DataFrame(dis)
my_df.to_csv('E:\23\4\plan_to_prod2.csv')
I am getting an "invalid file name" error even though my file name is correct.
You should use a raw string literal.
A \ followed by digits is interpreted as an octal escape, so \23 and \4 each become a single control character, which is not allowed in a Windows file name. Try print('E:\23\4\plan_to_prod2.csv') and see the output (I would have pasted it here but these characters don't show up when the answer is rendered). You can also see the problem in the error you provided in the comment.
When using raw string:
print(r'E:\23\4\plan_to_prod2.csv')
# E:\23\4\plan_to_prod2.csv
Instead of using a raw string you can also use doubled backslashes, i.e. print('E:\\23\\4\\plan_to_prod2.csv'), but I find using raw strings much easier.
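What happens to the \23 and \4 parts of the path can be made visible with repr() and ord(). A small sketch (variable names are illustrative):

```python
# A backslash followed by up to three octal digits is an octal escape.
folder = "\23"     # octal 23 == decimal 19
subfolder = "\4"   # octal 4  == decimal 4

print(repr(folder))                  # '\x13'
print(repr(subfolder))               # '\x04'
print(ord(folder), ord(subfolder))   # 19 4

# Raw and doubled-backslash spellings both keep the digits as ordinary text.
raw_path = r"E:\23\4\plan_to_prod2.csv"
doubled_path = "E:\\23\\4\\plan_to_prod2.csv"
print(raw_path == doubled_path)      # True
print(raw_path)                      # E:\23\4\plan_to_prod2.csv
```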
The \ character introduces escape sequences, so the \23 and \4 in your path are interpreted as escapes rather than as folder separators.
You should use / or a raw string r'' instead of plain \. Alternatively, you can escape each backslash with an additional \. Choose whichever suits you best.
r'E:\23\4\plan_to_prod2.csv'
'E:\\23\\4\\plan_to_prod2.csv'
'E:/23/4/plan_to_prod2.csv'
I'm having issues reading Unicode text from the shell into Python. I have a test document with the following metadata attribute:
kMDItemAuthors = (
"To\U0304ny\U0308 Sta\U030ark"
)
I see this when I run mdls -name kMDItemAuthors path/to/the/file
I am attempting to get this data into usable form within a Python script. However, I cannot get the Unicode represented text into actual Unicode in Python.
Here's what I am currently doing:
import unicodedata
import subprocess
import os
os.environ['LANG'] = 'en_US.UTF-8'
cmd = 'mdls -name kMDItemAuthors path/to/the/file'
proc = subprocess.Popen(cmd,
shell=True,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
(stdout, stderr) = proc.communicate()
u = unicode(stdout, 'utf8')
a = unicodedata.normalize('NFC', u)
Now, when I print(a), I get the exact same string representation as above. I have tried normalizing with all of the options (NFC, NFD, NFKC, NFKD), all with the same result.
The weirder thing is, when I try this code:
print('To\U0304ny\U0308 Sta\U030ark')
I get the following error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-7: truncated \UXXXXXXXX escape
So, when that sub-string is within the variable, there's no problem, but as a raw string, it creates an issue.
I had felt pretty strong in my understanding of Python and Unicode, but now the shell has broken me. Any help would be greatly appreciated.
PS. I am running all this in Python 2.7.X
You have multiple problems here.
Like all escape sequences, Python only interprets the \U sequence in string literals in your source code. If a file actually has a \ followed by a U in it, Python isn't going to treat that as anything other than a \ and a U, any more than it'll treat a \ followed by an n as a newline. If you want to unescape them manually, you can, by using the unicode_escape codec. (But note that this will not treat your data as UTF-8; non-ASCII bytes are decoded as Latin-1. If you actually have both UTF-8 and \U sequences, you will have to decode it as UTF-8, then encode it with unicode_escape, then decode it back with unicode_escape.)
A Python \U sequence requires 8 digits, not 4. If you only have 4, you have to use \u. So, whatever program generated this string, it can't be parsed with unicode_escape as it stands. You might be able to hack it into shape by some quick-and-dirty workaround like s.replace(r'\U', r'\U0000') or s.replace(r'\U', r'\u'), or you may have to write a simple parser for it.
In your test, you're trying to use \U escapes in a string literal. You can only do that in Unicode string literals, like print(u'To\U0304ny\U0308 Sta\U030ark'). (If you do that, of course, you'll get the previous error again.)
Also, since this appears to be a Mac, you probably shouldn't be doing os.environ['LANG'] = 'en_US.UTF-8'. If Python sees that it's on OS X, it assumes everything is UTF-8. Anything you do to try to force UTF-8 will probably do nothing, and could in theory confuse it so it doesn't notice it's on OS X. Unless you're trying to work around a driver program that intentionally sets the locale to "C" before calling your script, you're usually better off not doing this.
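The "simple parser" option mentioned above could look roughly like the following. This is a sketch only, written for Python 3 rather than the asker's Python 2.7, and it assumes every \U in the mdls output is followed by exactly four hex digits, as in the sample from the question; the helper name decode_short_escapes is hypothetical.

```python
import re
import unicodedata


def decode_short_escapes(text):
    """Replace each backslash-U followed by four hex digits with that code point."""
    pattern = re.compile(r"\\U([0-9a-fA-F]{4})")
    return pattern.sub(lambda m: chr(int(m.group(1), 16)), text)


# Sample value copied from the question; raw so the backslashes stay literal.
raw_output = r"To\U0304ny\U0308 Sta\U030ark"

decoded = decode_short_escapes(raw_output)
# The escapes are combining marks, so NFC merges each with its base letter.
composed = unicodedata.normalize("NFC", decoded)

print(composed)       # Tōnÿ Stårk
print(len(composed))  # 10
```

Normalizing only has a visible effect after the escapes have been turned into real characters, which is why normalizing the untouched mdls output changed nothing.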
As mentioned in the other answers; here is just a slightly more direct code example (Python 2):
>>> s="To\U0304ny\U0308 Sta\U030ark"
>>> s
'To\\U0304ny\\U0308 Sta\\U030ark'
>>> s.replace("\\U","\\u").decode("unicode-escape")
u'To\u0304ny\u0308 Sta\u030ark'
>>> print s.replace("\\U","\\u").decode("unicode-escape")
Tōnÿ Stårk
>>>
\U is for characters outside the BMP, i.e. it takes 8 hex digits. For characters within the BMP use \u.
>>> print u'To\u0304ny\u0308 Sta\u030ark'
Tōnÿ Stårk
3>> print('To\u0304ny\u0308 Sta\u030ark')
Tōnÿ Stårk