Unicode error on python 3 using winsound [duplicate] - python

This question already has answers here:
How should I write a Windows path in a Python string literal?
(5 answers)
Closed last year.
The folder I want to get to is called python and is on my desktop.
I get the following error when I try to change to it:
>>> os.chdir('C:\Users\expoperialed\Desktop\Python')
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

You need to use a raw string, double your slashes or use forward slashes instead:
r'C:\Users\expoperialed\Desktop\Python'
'C:\\Users\\expoperialed\\Desktop\\Python'
'C:/Users/expoperialed/Desktop/Python'
In regular Python strings, the \U character combination signals an extended Unicode codepoint escape.
You can hit any number of other issues, for any of the other recognised escape sequences, such as \a, \t, or \x.
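As a quick check of the three spellings above, here is a minimal sketch (the variable names are illustrative, not from the original answer). It compares the strings and shows what a recognised escape such as \t does to a path like C:\temp:

```python
# Three ways to write the same Windows path as a Python string.
raw = r'C:\Users\expoperialed\Desktop\Python'         # raw string: backslashes kept as-is
doubled = 'C:\\Users\\expoperialed\\Desktop\\Python'  # each backslash escaped
forward = 'C:/Users/expoperialed/Desktop/Python'      # forward slashes need no escaping

print(raw == doubled)                      # same characters either way
print(forward.replace('/', '\\') == raw)   # differs only in the separator

# A recognised escape silently changes the string instead of raising an error:
tricky = 'C:\temp'   # \t becomes a single TAB character
print(len(tricky), len(r'C:\temp'))
```

The last line is the reason the problem is easy to miss: a path containing \t or \a parses without complaint and only fails later, when the file cannot be found.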
Note that as of Python 3.6, unrecognised escape sequences trigger a DeprecationWarning (hidden by default, so you'll have to remove the default filter to see it), and since Python 3.12 they trigger a SyntaxWarning instead. In a future version of Python, such unrecognised escape sequences will cause a SyntaxError; no specific version has been set at this time.
If you want to find issues like these in Python versions 3.6 through 3.11, you can turn the warning into a SyntaxError exception by using the warnings filter error:^invalid escape sequence .*:DeprecationWarning (via a command line switch, environment variable or function call); on 3.12 and later the category to filter on is SyntaxWarning:
Python 3.10.0 (default, Oct 15 2021, 22:25:32) [Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import warnings
>>> '\expoperialed'
'\\expoperialed'
>>> warnings.filterwarnings('default', '^invalid escape sequence .*', DeprecationWarning)
>>> '\expoperialed'
<stdin>:1: DeprecationWarning: invalid escape sequence '\e'
'\\expoperialed'
>>> warnings.filterwarnings('error', '^invalid escape sequence .*', DeprecationWarning)
>>> '\expoperialed'
File "<stdin>", line 1
'\expoperialed'
^^^^^^^^^^^^^^^
SyntaxError: invalid escape sequence '\e'
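The same check can be run from a script instead of the interactive prompt. The following is only a sketch, assuming Python 3.6 or later; has_invalid_escape is a hypothetical helper name, and it accepts either warning category because the category depends on the Python version:

```python
import warnings

def has_invalid_escape(source):
    """Return True if compiling `source` reports an unrecognised escape sequence."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")       # record every warning, even hidden ones
        compile(source, "<snippet>", "exec")  # compiling is what triggers the warning
    # DeprecationWarning on 3.6 to 3.11, SyntaxWarning on 3.12 and later
    return any(
        issubclass(w.category, (DeprecationWarning, SyntaxWarning)) for w in caught
    )

# The source text is passed as a raw string so that this file itself stays clean.
print(has_invalid_escape(r"p = 'C:\expoperialed'"))   # regular literal containing \e
print(has_invalid_escape(r"p = r'C:\expoperialed'"))  # raw literal
```

Compiling the text rather than importing it keeps the check side-effect free, which makes it suitable for scanning files you do not want to execute.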

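This usually happens in Python 3. A common cause is a file path written with single backslashes; you need "\\" instead of "\". As in:
filePath = "C:\\User\\Desktop\\myFile"
In Python 2, a plain (non-unicode) string literal does not treat "\U" as an escape, so this particular path would work with single backslashes there, although escapes such as "\t" are still interpreted.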

f = open('C:\\Users\\Pooja\\Desktop\\trolldata.csv')
Use '\\' in place of '\' in the path in Python 3 and above, and the error will be resolved.

All three syntaxes work well.
Another way is to first assign the path to a variable:
path = r'C:\user\...................' (whatever the path is for you)
and then pass it to os.chdir(path).

I had the same error.
The path fails when "\U" follows "C:" (as in "C:\Users"), because "\U" starts a Unicode escape in a regular string literal; it has nothing to do with the folder itself.
I worked around it by putting the file I wanted to access directly under 'c:\' and using "c:\file_name.png". Be aware that this only hides the problem: that string contains no "\U", but "\f" is itself an escape (a form feed), so the path is still not what it looks like.
In your case there is no need to reinstall Python or to move your project out of "...\Users...". A raw string, doubled backslashes or forward slashes fix the path as it is.

Related

Changing file path and need for raw? [duplicate]

This question already has answers here:
What exactly do "u" and "r" string prefixes do, and what are raw string literals?
(7 answers)
Closed 1 year ago.
import os
cwd = os.getcwd()
print("Current working directory: {0}".format(cwd))
# Print the type of the returned object
print("os.getcwd() returns an object of type: {0}".format(type(cwd)))
os.chdir(r"C:\Users\ghph0\AppData\Local\Programs\Python\Python39\Bootcamp\PDFs")
# Print the current working directory
print("Current working directory: {0}".format(os.getcwd()))
Hi all, I was changing my file directory so I could access specific files and was then greeted with this error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
From there I did some research and was told that converting the string to a raw string would fix the problem. My question is: why do I need to make it raw, what does that do, and why does it turn the file path red in my editor (not really important, but I have never seen this before)? Picture below:
https://i.stack.imgur.com/4oHlC.png
Many thanks to anyone that can help.
Backslashes in strings have a specific meaning in Python and are translated by the interpreter. You have surely already encountered "\n". Despite taking two characters to type, that is actually a one-character string meaning "newline". ANY backslashes in a regular string are interpreted that way. In your particular case, you used "\U", which is the way Python allows typing long Unicode values; it must be followed by exactly eight hex digits. "\U0001F600", for example, is the grinning face emoji.
Because regular expressions often need to use backslashes for other uses, Python introduced the "raw" string. In a raw string, backslashes are not interpreted. So, r"\n" is a two-character string containing a backslash and an "n". This is NOT a newline.
Windows paths often use backslashes, so raw strings are convenient there. As it turns out, nearly every Windows API will also accept forward slashes, so you can use those as well.
As for the colors, that probably means your editor doesn't know how to interpret raw strings.

Why Python (2.7) encode and decode functions failed [duplicate]

I'm really confused. I tried to encode but the error said can't decode....
>>> "你好".encode("utf8")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
I know how to avoid the error with "u" prefix on the string. I'm just wondering why the error is "can't decode" when encode was called. What is Python doing under the hood?
"你好".encode('utf-8')
encode converts a unicode object to a string object. But here you have invoked it on a string object (because you don't have the u). So python has to convert the string to a unicode object first. So it does the equivalent of
"你好".decode().encode('utf-8')
But the decode fails because the string isn't valid ascii. That's why you get a complaint about not being able to decode.
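Python 2 itself cannot be shown running here, but the hidden step can be imitated in Python 3 by doing the ASCII decode explicitly. This is only a sketch of the equivalent behaviour, assuming the byte string holds UTF-8 data as it would in a UTF-8 terminal:

```python
# The bytes that a Python 2 str literal "你好" would contain in a UTF-8 terminal.
raw = "你好".encode("utf-8")

message = None
try:
    # Python 2 effectively does this for str.encode: decode with ASCII first,
    # then encode the result with the codec that was asked for.
    raw.decode("ascii").encode("utf-8")
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The exception is raised by the decode step, before the requested UTF-8 encoding is ever attempted, which is why the error talks about decoding.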
Always encode from unicode to bytes.
In this direction, you get to choose the encoding.
>>> u"你好".encode("utf8")
'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> print _
你好
The other way is to decode from bytes to unicode.
In this direction, you have to know what the encoding is.
>>> bytes = '\xe4\xbd\xa0\xe5\xa5\xbd'
>>> print bytes
你好
>>> bytes.decode('utf-8')
u'\u4f60\u597d'
>>> print _
你好
This point can't be stressed enough. If you want to avoid playing unicode "whack-a-mole", it's important to understand what's happening at the data level. Here it is explained another way:
A unicode object is decoded already, you never want to call decode on it.
A bytestring object is encoded already, you never want to call encode on it.
Now, on seeing .encode on a byte string, Python 2 first tries to implicitly convert it to text (a unicode object). Similarly, on seeing .decode on a unicode string, Python 2 implicitly tries to convert it to bytes (a str object).
These implicit conversions are why you can get UnicodeDecodeError when you've called encode. It's because encoding usually accepts a parameter of type unicode; when receiving a str parameter, there's an implicit decoding into an object of type unicode before re-encoding it with another encoding. This conversion chooses a default 'ascii' decoder†, giving you the decoding error inside an encoder.
In fact, in Python 3 the methods str.decode and bytes.encode don't even exist. Their removal was a [controversial] attempt to avoid this common confusion.
† ...or whatever coding sys.getdefaultencoding() mentions; usually this is 'ascii'
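For comparison, a minimal Python 3 sketch of the two directions, with a check that the wrong-direction methods really are gone (variable names are illustrative):

```python
text = "你好"

data = text.encode("utf-8")       # str -> bytes: you choose the encoding
restored = data.decode("utf-8")   # bytes -> str: you must know the encoding

print(data)
print(restored == text)

# In Python 3 there is nothing to call in the wrong direction:
print(hasattr(str, "decode"), hasattr(bytes, "encode"))
```

Because the types are separate, the mistake from the question raises an AttributeError at the call site in Python 3 instead of a confusing codec error.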
You can try this:
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
Alternatively, add the following line at the top of your .py file:
# -*- coding: utf-8 -*-
If you're using Python < 3, you'll need to tell the interpreter that your string literal is Unicode by prefixing it with a u:
Python 2.7.2 (default, Jan 14 2012, 23:14:09)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> "你好".encode("utf8")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
>>> u"你好".encode("utf8")
'\xe4\xbd\xa0\xe5\xa5\xbd'
Further reading: Unicode HOWTO.
You use u"你好".encode('utf8') to encode a unicode string.
But if you want the text "你好" itself from a byte string, you should decode it instead:
"你好".decode("utf8")
You will get what you want. Maybe you should learn more about encode & decode.
In case you're dealing with Unicode, sometimes instead of encode('utf-8'), you can also try to ignore the special characters, e.g.
"你好".encode('ascii','ignore')
or as something.decode('unicode_escape').encode('ascii','ignore').
Not particularly useful in this example, but it can work better in other scenarios where some special characters cannot be converted.
Alternatively, you can consider replacing particular characters using replace().
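To see what the error handlers actually do, here is a small Python 3 sketch; the sample word "Transação" is borrowed from a later question on this page, and the variable names are illustrative:

```python
text = "Transação"

ignored = text.encode("ascii", "ignore")            # drops what cannot be encoded
replaced = text.encode("ascii", "replace")          # substitutes a question mark
escaped = text.encode("ascii", "backslashreplace")  # keeps a readable escape

print(ignored)
print(replaced)
print(escaped)

# With text that has no ASCII characters at all, 'ignore' leaves nothing behind.
print("你好".encode("ascii", "ignore"))
```

All three handlers lose or alter information, so they suit log output and display better than data that has to be read back later.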
If you are starting the Python interpreter from a shell on Linux or similar systems (BSD, not sure about Mac), you should also check the default encoding for the shell.
Call locale charmap from the shell (not the Python interpreter) and you should see
[user@host dir] $ locale charmap
UTF-8
[user@host dir] $
If this is not the case, and you see something else, e.g.
[user@host dir] $ locale charmap
ANSI_X3.4-1968
[user@host dir] $
Python will (at least in some cases such as in mine) inherit the shell's encoding and will not be able to print (some? all?) unicode characters. Python's own default encoding that you see and control via sys.getdefaultencoding() and sys.setdefaultencoding() is in this case ignored.
If you find that you have this problem, you can fix that by
[user@host dir] $ export LC_CTYPE="en_US.UTF-8"
[user@host dir] $ locale charmap
UTF-8
[user@host dir] $
(Or alternatively choose whichever locale you want instead of en_US.) You can also edit /etc/locale.conf (or whichever file governs the locale definition in your system) to correct this.
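The same information can be inspected from inside the interpreter. A minimal Python 3 sketch, using only the standard locale and sys modules (the variable names are illustrative):

```python
import locale
import sys

# Encoding derived from the locale settings inherited from the shell.
preferred = locale.getpreferredencoding(False)

# Encoding used when printing; may be missing if stdout has been replaced.
stdout_encoding = getattr(sys.stdout, "encoding", None)

print("locale encoding:", preferred)
print("stdout encoding:", stdout_encoding)
print("default str/bytes encoding:", sys.getdefaultencoding())
```

If the first two values name an ASCII-only charset such as ANSI_X3.4-1968, printing non-ASCII text is likely to fail regardless of what the third line reports.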

Why does Python 3 output \xe3, an extra char?

Why does Python add \xe3 in the output of:
>>> b'Transa\xc3\xa7\xc3\xa3o'.decode('utf-8')
'Transaç\xe3o'
Expected value is:
'Transação'
Some more information about my environment:
>>> import sys
>>> print (sys.version)
3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)]
>>> sys.stdout.encoding
'cp437'
This was under Console 2 + Powershell.
You need to use a console or terminal that supports all of the characters that you want to print.
When printing in the interactive console, the characters are encoded to the correct codec for your console, with any character that is not supported using the backslashreplace error handler to keep the output readable rather than throw an exception. This is a feature of the default sys.displayhook() function:
If repr(value) is not encodable to sys.stdout.encoding with sys.stdout.errors error handler (which is probably 'strict'), encode it to sys.stdout.encoding with 'backslashreplace' error handler.
Your console can handle ç but not ã. There are several codecs that include the first character but not the last; you are using IBM codepage 437, but it is by no means the only one.
If you are running Python in the standard Windows console (cmd.exe) then be aware that Python, Unicode and that console do not mix very well. You can install the win-unicode-console package to make Python 3 use the Windows APIs to better output Unicode text; you'll need to make sure you have a font capable of displaying your Unicode text still.
I don't know for certain if that package is compatible with other Windows shells; your mileage may vary.
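The fallback quoted above can be reproduced on any machine by encoding with the question's codec by hand. A sketch, assuming cp437 as reported by sys.stdout.encoding in the question (variable names are illustrative):

```python
text = "Transação"
encoding = "cp437"   # the console encoding reported in the question

# A strict encode fails, because ã has no slot in code page 437.
try:
    text.encode(encoding)
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# What the display hook falls back to: escape only what cannot be encoded.
shown = text.encode(encoding, "backslashreplace").decode(encoding)

print(strict_ok)
print(ascii(shown))   # ascii() keeps this line printable on any console
```

The ç survives because cp437 includes it, while ã is replaced by the literal text \xe3, matching the output in the question.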

SyntaxError when trying to use backslash for Windows file path [duplicate]

This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 7 months ago.
I tried to confirm if a file exists using the following line of code:
os.path.isfile()
But I noticed that when the path is copied and pasted from Windows with backslashes:
os.path.isfile("C:\Users\xxx\Desktop\xxx")
I got a syntax error: (unicode error) etc etc etc.
When forward slash is used:
os.path.isfile("C:/Users/xxx/Desktop/xxx")
It worked.
Can I please ask why this happened? Even if the answer is as simple as "It is a convention."
Backslash is the escape symbol. This should work:
os.path.isfile("C:\\Users\\xxx\\Desktop\\xxx")
This works because you escape the escape symbol, and Python passes it as this literal:
"C:\Users\xxx\Desktop\xxx"
But it's better practice, and it ensures cross-platform compatibility, to collect your path segments (perhaps conditionally, based on the platform) like this and use os.path.join:
path_segments = ['/', 'Users', 'xxx', 'Desktop', 'xxx']
os.path.isfile(os.path.join(*path_segments))
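This should return True in your case, provided the file exists and the current drive is C: (on Windows the leading '/' is resolved against the current drive).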
Because backslashes are escapes in Python. Specifically, you get a Unicode error because the \U escape means "Unicode character here; next 8 characters are a 32-bit hexadecimal codepoint."
If you use a raw string, which treats backslashes as themselves, it should work:
os.path.isfile(r"C:\Users\xxx\Desktop\xxx")
You get the problem with the two-character sequences \x and \U, which are Python escape codes. They tell Python to interpret the data that comes after them in a special way (the former inserts the character given by the next two hex digits, the latter the one given by the next eight). You can get around it by using a "raw" string:
os.path.isfile(r"C:\Users\xxx\Desktop\xxx")
or by using forward slashes (as, IIRC, windows will accept either one).
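As an alternative to os.path.join, the standard pathlib module can show that these spellings describe the same path. This sketch swaps in PureWindowsPath, which is not used in the answers above; it only manipulates the text of the path, so it runs on any operating system and never touches the disk:

```python
from pathlib import PureWindowsPath

raw = PureWindowsPath(r"C:\Users\xxx\Desktop\xxx")     # raw string, backslashes
forward = PureWindowsPath("C:/Users/xxx/Desktop/xxx")  # forward slashes
joined = PureWindowsPath("C:/") / "Users" / "xxx" / "Desktop" / "xxx"  # built from segments

print(raw == forward == joined)   # all three describe the same path
print(str(joined))                # Windows-style separators
print(joined.as_posix())          # forward-slash form
```

Building the path from segments avoids typing any backslashes, so neither \U nor \x can turn into an escape by accident.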
