How to extract features from text data set? [duplicate]

This question already has answers here:
Error "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape" [duplicate]
(10 answers)
Closed 3 years ago.
I am trying to tokenize a text file that I got from a zip folder, but I am facing this error:
My Error
TypeError: expected string or bytes-like object

Add r in front of "C:\Users\killer\Desktop\User1.txt" so that the backslashes are kept as literal characters. Without it, the \U in \Users is interpreted as the start of a Unicode escape:
pd.read_csv(r"C:\Users\killer\Desktop\User1.txt")
Alternatively, escape each backslash manually (\\) or just change \ to /.

Try the following code:
Data = pd.read_csv(r"C:\Users\killer\Desktop\User1.txt", sep=", ")
Just add sep=", " after the file path in the read_csv call.
Put whatever separates the fields between the quotation marks. In most cases the text is separated by a comma ",", but you can open the file in your default text editor to check what separates it.
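
As a rough illustration of what the separator setting does, here is a small sketch that uses the standard-library csv module instead of pandas, on made-up in-memory text rather than the real User1.txt. The sample rows and variable names are assumptions for illustration only.

```python
import csv
import io

# Made-up sample text: fields are separated by a comma followed by a space.
sample_text = "name, age\nalice, 30\nbob, 25\n"

# delimiter="," splits on commas; skipinitialspace=True drops the space
# after each comma, which roughly plays the role of sep=", " in pandas.
reader = csv.reader(io.StringIO(sample_text), delimiter=",", skipinitialspace=True)
rows = list(reader)

for row in rows:
    print(row)
```

If the fields came out glued together in a single column, that would be a hint that the separator does not match the file.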

What you are doing is right, but the path string contains characters that Python cannot decode. The \U at the start of \Users is treated as the beginning of an escape sequence, and the characters after it do not form a valid one. For the file path to be read as written, you have to do one of the following:
A) write it with \\, e.g. "C:\\Users\\killer\\..."
B) write it with /, e.g. "C:/Users/killer/..."
C) put r in front, e.g. r"C:\Users\killer\...", to make it a raw string, i.e. backslashes are kept as plain text and no escape sequences are processed.
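
The three spellings can be compared directly. The sketch below uses the full path from the question and the standard-library PureWindowsPath class, which only manipulates path text and never touches the disk, so it runs on any operating system. The variable names are chosen for illustration.

```python
from pathlib import PureWindowsPath

doubled = "C:\\Users\\killer\\Desktop\\User1.txt"   # A) escaped backslashes
forward = "C:/Users/killer/Desktop/User1.txt"       # B) forward slashes
raw = r"C:\Users\killer\Desktop\User1.txt"          # C) raw string

# A and C produce exactly the same characters.
same_text = (doubled == raw)

# B differs as text, but names the same Windows path.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_text, same_path)
```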

Related

Changing file path and need for raw? [duplicate]

This question already has answers here:
What exactly do "u" and "r" string prefixes do, and what are raw string literals?
(7 answers)
Closed 1 year ago.
import os
cwd = os.getcwd()
print("Current working directory: {0}".format(cwd))
# Print the type of the returned object
print("os.getcwd() returns an object of type: {0}".format(type(cwd)))
os.chdir(r"C:\Users\ghph0\AppData\Local\Programs\Python\Python39\Bootcamp\PDFs")
# Print the current working directory
print("Current working directory: {0}".format(os.getcwd()))
Hi all, I was changing my file directory so I could access specific files and was then greeted with this error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
From there I did some research and was told that converting the string to a raw string would fix the problem. My question is: why do I convert it to raw, what does that do, and why does it turn the file path red (not really important, but I have never seen this before)? Picture below:
https://i.stack.imgur.com/4oHlC.png
Many thanks to anyone that can help.

Backslashes in string literals have a specific meaning in Python and are translated by the interpreter. You have surely already encountered "\n". Despite taking two characters to type, that is actually a one-character string meaning "newline". Python tries to interpret ANY backslash in an ordinary string literal that way. In your particular case, you used "\U", which is the way Python allows typing long Unicode values; it must be followed by exactly eight hex digits. "\U0001F600", for example, is the grinning face emoji.
Because regular expressions often need to use backslashes for other uses, Python introduced the "raw" string. In a raw string, backslashes are not interpreted. So, r"\n" is a two-character string containing a backslash and an "n". This is NOT a newline.
Windows paths often use backslashes, so raw strings are convenient there. As it turns out, almost every Windows API will also accept forward slashes, so you can usually use those as well.
As for the colors, that probably means your editor doesn't know how to interpret raw strings.
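
A minimal sketch of the difference, using only string lengths so nothing depends on how the console renders the characters. Note that \U must be followed by eight hex digits, which is why the path C:\Users fails to parse. The variable names are illustrative.

```python
newline = "\n"          # escape sequence: one newline character
raw_newline = r"\n"     # raw string: a backslash and the letter n
emoji = "\U0001F600"    # eight hex digits after \U: one code point

lengths = (len(newline), len(raw_newline), len(emoji))
print(lengths)
```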

Python Tkinter Display Image [duplicate]

This question already has answers here:
"Unicode Error "unicodeescape" codec can't decode bytes... Cannot open text files in Python 3 [duplicate]
(10 answers)
Closed 1 year ago.
I'm trying to write a simple script to display an image in a window using tkinter.
I've tried to use PIL/Pillow and I've tried using the standard tkinter features but always get the same error when the script tries to read the filepath.
File "c:/Users/Sandip Dhillon/Desktop/stuff/dev_tests/imgtest2.py", line 6
photo=tk.PhotoImage(file="C:\Users\Sandip Dhillon\Pictures\DESlogo1.png")
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
Here is my code,
import tkinter as tk
window=tk.TK()
window.geometery("400x300+200+100")
photo=tk.PhotoImage(file="C:\Users\Sandip Dhillon\Pictures\DESlogo1.png")
l1=tk.Label(text="image")
l1.pack()
l2=tk.Label(image=photo)
l2.pack
window.mainloop()
Thank you!

Backslashes are escape characters in Python strings, so the "\U" in your path is read as the start of a Unicode escape rather than as a backslash followed by "U".
Either:
use forward slashes: tk.PhotoImage(file="C:/Users/Sandip Dhillon/Pictures/DESlogo1.png")
use a raw string: tk.PhotoImage(file=r"C:\Users\Sandip Dhillon\Pictures\DESlogo1.png")
Both string and bytes literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and treat backslashes as literal characters.
double the backslashes: tk.PhotoImage(file="C:\\Users\\Sandip Dhillon\\Pictures\\DESlogo1.png")
Escape sequence: \\: Backslash (\)

python opening a file error in windows 10 [duplicate]

This question already has answers here:
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
What should I do to fix this error?
f = open('C:\Users\BARANLAPTOP\Desktop\test') #my python code
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

This is happening because the backslashes in your file path string are being treated as special characters. To fix this, you need to let Python know they are part of the path. You can do that either by making the string a raw string (put an r before the opening quote) or by escaping each backslash with another backslash, so that every backslash becomes a double backslash.
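
A small sketch of both fixes applied to the path from the question. It also shows a second trap in that path: even without the \U problem, the \t in \test would be read as a tab character. The variable names are illustrative.

```python
# "\t" inside an ordinary literal is a tab, so this is not "Desktop\test".
broken_tail = "Desktop\test"
has_tab = "\t" in broken_tail

# Fix 1: raw string, backslashes are kept as typed.
raw_path = r"C:\Users\BARANLAPTOP\Desktop\test"

# Fix 2: every backslash doubled.
doubled_path = "C:\\Users\\BARANLAPTOP\\Desktop\\test"

print(has_tab, raw_path == doubled_path)
```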

Error "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape" [duplicate]

This question already has answers here:
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 3 years ago.
I'm trying to read a CSV file into Python (Spyder), but I keep getting an error. My code:
import csv
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)
I get the following error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 2-3: truncated \UXXXXXXXX escape
I have tried to replace the \ with \\ or with / and I've tried to put an r before "C.., but all these things didn't work.

This error occurs because you are writing the path as a normal string literal, in which backslashes start escape sequences. You can use one of the three following solutions to fix your problem:
1: Just put r before the opening quote. This makes the literal a raw string:
pandas.read_csv(r"C:\Users\DeePak\Desktop\myac.csv")
2: Use forward slashes instead of backslashes:
pandas.read_csv("C:/Users/DeePak/Desktop/myac.csv")
3: Double every backslash:
pandas.read_csv("C:\\Users\\DeePak\\Desktop\\myac.csv")

The first backslash in your string is being interpreted as a special character. In fact, because it's followed by a "U", it's being interpreted as the start of a Unicode code point.
To fix this, you need to escape the backslashes in the string. The direct way to do this is by doubling the backslashes:
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
If you don't want to escape backslashes in a string, and you don't have any need for escape codes or quotation marks in the string, you can instead use a "raw" string, using "r" just before it, like so:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")

You can just put r in front of the string with your actual path, which denotes a raw string. For example:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")

Consider it as a raw string. Just as a simple answer, add r before your Windows path.
import csv
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)
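
Once the path parses, note that print(data) on a csv.reader shows the reader object itself, not the rows. The sketch below iterates the reader instead. It writes a tiny made-up file into a temporary folder so that it runs anywhere; the file name example.csv and its contents are assumptions for illustration, not the real vektis file.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file; a real Windows path would be written as a raw string.
    example_path = os.path.join(folder, "example.csv")
    with open(example_path, "w", newline="") as handle:
        csv.writer(handle).writerows([["id", "name"], ["1", "Anna"]])

    # Iterate the reader (here via list) to get the actual rows.
    with open(example_path, newline="") as handle:
        rows = list(csv.reader(handle))

print(rows)
```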

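Try writing the file path as "C:\\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener", i.e. with a double backslash after the drive, as opposed to "C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener". This removes the SyntaxError, but it is not a complete fix: \2 and \v later in the path are still read as escape sequences, so it is safer to double every backslash.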
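
A quick check of how Python reads the tail of that path when its backslashes are left single. Only the MIK\2.6\vektis fragment is used here, shortened from the full path for illustration.

```python
# \2 is an octal escape (character code 2) and \v is a vertical tab.
partly_escaped = "MIK\2.6\vektis"
raw_version = r"MIK\2.6\vektis"

print(repr(partly_escaped))
print(repr(raw_version))

changed = partly_escaped != raw_version
```

The two strings differ, so a path written this way would point at a name that does not exist on disk.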

Add r before your string. It converts a normal string to a raw string.

As per String literals:
String literals can be enclosed within single quotes (i.e. '...') or double quotes (i.e. "..."). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings).
The backslash character (i.e. \) is used to escape characters which otherwise will have a special meaning, such as newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter r or R. Such strings are called raw strings and use different rules for backslash escape sequences.
In triple-quoted strings, unescaped newlines and quotes are allowed, except that the three unescaped quotes in a row terminate the string.
Unless an r or R prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
So ideally you need to replace the line:
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
With any one of the following:
Using raw prefix and single quotes (i.e. '...'):
data = open(r'C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener')
Using double quotes (i.e. "...") and escaping backslash character (i.e. \):
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
Using double quotes (i.e. "...") and forwardslash character (i.e. /):
data = open("C:/Users/miche/Documents/school/jaar2/MIK/2.6/vektis_agb_zorgverlener")
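
The "different rules" for raw strings have one consequence worth knowing for folder paths: a raw string cannot end in a single backslash, because a backslash before a quote still keeps that quote from closing the literal. The sketch below checks this with the built-in compile function. The folder path is a shortened, illustrative one.

```python
# In a raw string, \" stays in the string as two characters.
two_chars = r"\""
length = len(two_chars)

# Source text of a raw literal that ends in a backslash: r"C:\Users\killer\"
source = 'r"C:\\Users\\killer\\"'
try:
    compile(source, "<example>", "eval")
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False

# One workaround: append the final backslash as a separate escaped literal.
folder = r"C:\Users\killer" "\\"

print(length, trailing_backslash_ok, folder)
```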

Just putting an r in front works well.
eg:
white = pd.read_csv(r"C:\Users\hydro\a.csv")

It worked for me after neutralizing the '\' by doubling it: f = open('F:\\file.csv')

The double \ should work for Windows, but you still need to take care of the folders you mention in your path. All of them (except the filename) must exist. Otherwise you will get an error.
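
A minimal sketch of that check, assuming the file is being opened for writing. The helper name can_create_file is hypothetical, and a temporary folder stands in for a real drive so the example runs anywhere.

```python
import tempfile
from pathlib import Path

def can_create_file(path_text):
    """Return True if every folder leading to the file already exists."""
    return Path(path_text).parent.is_dir()

with tempfile.TemporaryDirectory() as folder:
    good_path = str(Path(folder) / "file.csv")
    bad_path = str(Path(folder) / "missing_folder" / "file.csv")

    good_ok = can_create_file(good_path)
    bad_ok = can_create_file(bad_path)

    # Opening a file inside a folder that does not exist fails.
    try:
        open(bad_path, "w").close()
        error_name = None
    except FileNotFoundError as error:
        error_name = type(error).__name__

print(good_ok, bad_ok, error_name)
```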

Python 3.4.1 script syntax error, arcpy & [duplicate]

This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 7 months ago.
I am used to working in Python 2.7, so there were some new things, like the print function being different. So excuse my ignorance. I am also pretty new to programming.
So here is my script. I keep getting errors that highlight some commas or spaces, saying there is a
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 16-17: malformed \N character escape
Code:
import arcpy
print("mosaic to new raster starting!")
env.workspace = "F:\GDAL"
arcpy.env.pyramid = "NONE"
arcpy.env.rasterStatistics = "NONE"
arcpy.env.compression = "JPEG 87"
arcpy.env.tileSize = "256 256"
print("Environment set")
RasterInput = "m_3511401_ne_11_1_20130731.jpg;m_3511401_nw_11_1_20130731.jpg;m_3511401_se_11_1_20130731.jpg;m_3511401_sw_11_1_20130731.jpg;"
print("Input set")
arcpy.MosaicToNewRaster_management(RasterInput,"F:\Pro_Projects\NAIP2013\raster.sde","MosaicFile1","","8_BIT_UNSIGNED","","3","LAST","FIRST")
print("mosaic done!")

Backslashes (used by you as Windows path separators) signal escape sequences in Python strings. Double the backslashes or use a raw string literal:
"F:\\Pro_Projects\\NAIP2013\\raster.sde"
or
r"F:\Pro_Projects\NAIP2013\raster.sde"
Windows also accepts forward slashes in paths, avoiding the issue altogether:
"F:/Pro_Projects/NAIP2013/raster.sde"

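The effect of the r prefix on the path from this question can be checked without arcpy by handing the literal's source text to the built-in compile function. This is only a sketch; the helper name literal_compiles is hypothetical.

```python
import warnings

# Source text of the literal as typed: "F:\Pro_Projects\NAIP2013\raster.sde"
normal_source = '"F:\\Pro_Projects\\NAIP2013\\raster.sde"'
# The same literal with an r prefix.
raw_source = "r" + normal_source

def literal_compiles(source):
    """Return True if Python accepts the given string-literal source text."""
    with warnings.catch_warnings():
        # Silence the separate warning about unknown escapes such as \P.
        warnings.simplefilter("ignore")
        try:
            compile(source, "<example>", "eval")
            return True
        except SyntaxError:
            return False

normal_ok = literal_compiles(normal_source)
raw_ok = literal_compiles(raw_source)
print(normal_ok, raw_ok)
```

The ordinary literal is rejected because \N must be followed by a character name in braces, while the raw literal is accepted unchanged.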