Can't escape control character "\r" when extracting file paths - python

I am trying to open each of the following files separately.
"C:\recipe\1,C:\recipe\2,C:\recipe\3,"
I attempt to do this using the following code:
import sys
import os
import re
line = "C:\recipe\1,C:\recipe\2,C:\recipe\3,"
line = line.replace('\\', '\\\\') # tried to escape control chars here
line = line.replace(',', ' ')
print line # should print "C:\recipe\1 C:\recipe\2 C:\recipe\3 "
for word in line.split():
    fo = open(word, "r+")
    # Do file stuff
    fo.close()
print "\nDone\n"
When I run it, it gives me:
fo = open(word, "r+")
IOError: [Errno 13] Permission denied: 'C:'
So it must be a result of the '\r's in the original string not escaping correctly. I tried many other methods of escaping control characters but none of them seem to be working. What am I doing wrong?

Use a raw string:
line = r"C:\recipe\1,C:\recipe\2,C:\recipe\3,"

If for whatever reason you can't use a raw string, you need to escape each backslash by doubling it:
line = "C:\\recipe\\1,C:\\recipe\\2,C:\\recipe\\3,"
print(line.split(','))
Output:
['C:\\recipe\\1', 'C:\\recipe\\2', 'C:\\recipe\\3', '']
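To see where the 'C:' in the error message comes from, here is a minimal sketch, assuming Python 3 (the names cooked, raw and paths are illustrative and not from the question). In a normal string literal, \r is read as a carriage return, and split() with no arguments treats that as whitespace, so the first "word" is just C:.

```python
# Normal literal: "\r" is a carriage return and "\1" is the character chr(1).
cooked = "C:\recipe\1"
# Raw literal: backslashes are kept exactly as typed.
raw = r"C:\recipe\1"

print(repr(cooked))
print(repr(raw))

# split() with no arguments splits on any whitespace, including "\r",
# which is why open() ended up being called with just 'C:'.
print(cooked.split())

line = r"C:\recipe\1,C:\recipe\2,C:\recipe\3,"
# Split on commas and drop the empty string left by the trailing comma.
paths = [p for p in line.split(",") if p]
print(paths)
```

Filtering out the empty entry avoids calling open('') for the trailing comma.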

Related

Python - remove spaces on the right side

I have text files in a folder whose lines end with a trailing space before the CRLF, and I don't know how to remove it. These are the spaces:
33/22-BBB<there is a space to remove>CRLF
import os
root_path = "C:/Users/adm/Desktop/test"
if os.path.exists(root_path):
    files = []
    for name in os.listdir(root_path):
        if os.path.isfile(os.path.join(root_path, name)):
            files.append(os.path.join(root_path, name))
    for ii in files:
        with open(ii) as file:
            for line in file:
                line = line.rstrip()
                if line:
                    print(line)
        file.close()
Does anyone have any idea how to get rid of this?
Those are control characters. Change your open() call to:
with open(ii, "r", errors = "ignore") as file:
or
# Bytes mode
with open(ii, "rb") as file:
or
# '\r\n' is CR LF. See link at bottom
with open(ii, "r", newline='\r\n') as file:
Control characters in ASCII
If it is the CRLF characters you would like to remove from each string, you could use line.rstrip('\r\n'). Note that line.replace('CRLF', '') would only remove the literal text CRLF, not the line ending itself.
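As a rough illustration of what rstrip() does with CRLF-terminated lines, here is a minimal sketch, assuming Python 3. The sample bytes are made up to stand in for one of the files, and newline="" is used only so the \r\n endings reach the loop untranslated.

```python
import io

# Hypothetical sample data standing in for the contents of one text file.
raw = b"33/22-BBB \r\n44/11-CCC\t\r\n\r\n"

# newline="" disables newline translation, so each line still ends in "\r\n".
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline="")

# rstrip() with no arguments removes trailing spaces, tabs, "\r" and "\n".
cleaned = [line.rstrip() for line in stream]
# Drop the lines that are empty after stripping.
cleaned = [line for line in cleaned if line]
print(cleaned)
```

Because rstrip() already removes the line ending together with the trailing space, the cleaned lines need a newline appended again if they are written back to a file.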

Reading csv from FTP folder

I am trying to read a CSV file from an FTP folder:
ftp = FTP('adr')
ftp.login(user='xxxx', passwd = 'xxxxx')
r = StringIO()
ftp.retrbinary('RETR /DataLoadFolder/xxx/xxx/xxx/'+str(file_name),r.write)
r.seek(0)
csvfile1 = csv.reader(r,delimiter=';')
input_file = [list(line) for line in csv.reader(r)]  # <-- error raised here
I get the following error on the last line:
new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
In my CSV file there is white space at the end of each row (after 17.00), and the data starts from the second row.
What does the error mean? Any help would be much appreciated.
The error message is simply asking how you want newlines to be handled; the behaviour differs for historical reasons, and you can read the explanation here.
To solve the issue, specify the newline on StringIO like this:
r = StringIO(newline='')
According to the StringIO documentation, if newline is set to None, newlines are written as \n on all platforms, but universal newline decoding is still performed when reading.
I could partially reproduce and fix this. The error is caused by a line containing a bad end of line. I could reproduce it by adding a line \r \n at the end of an otherwise valid CSV file.
A simple way to fix it is to use a filter to eliminate blank lines and clean end of lines:
def filter_bytes(fd):
    for line in fd:
        line = line.strip()
        if len(line) != 0:
            yield(line + b'\r\n')
Once this is done, your code could become:
ftp = FTP('adr')
ftp.login(user='xxxx', passwd = 'xxxxx')
r = BytesIO()
ftp.retrbinary('RETR /DataLoadFolder/xxx/xxx/xxx/'+str(file_name),r.write)
r.seek(0)
csvfile1 = csv.reader(filter_bytes(r),delimiter=';')
input_file = list(csvfile1)
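The same filtering idea can be sketched for Python 3, where csv.reader expects text rather than bytes. This is only an illustration: the payload below is invented to mimic a valid CSV followed by a stray "\r \n" line, and filter_lines is a hypothetical text-mode variant of filter_bytes.

```python
import csv

def filter_lines(lines):
    """Yield non-blank lines with surrounding whitespace removed."""
    for line in lines:
        line = line.strip()
        if line:
            yield line

# Hypothetical download: two good rows, trailing spaces, then a bad "\r \n" line.
payload = b"a;b;c\r\n1;2;17.00 \r\n\r \n"

# Decode the bytes and split on any kind of line ending before filtering.
text_lines = payload.decode("ascii").splitlines()
rows = list(csv.reader(filter_lines(text_lines), delimiter=";"))
print(rows)
```

With real FTP data, the bytes collected by retrbinary would take the place of payload, and the encoding would need to match the file.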

How to edit a file in python 2.7.10?

I am trying to edit a file as follows in Python 2.7.10 and am running into the error below. Can anyone provide guidance on what the issue is and how to edit files?
import fileinput,re
filename = 'epivers.h'
text_to_search = re.compile("#define EPI_VERSION_STR \"(\d+\.\d+) (TOB) (r(\d+) ASSRT)\"")
replacement_text = "#define EPI_VERSION_STR \"9.130.27.50.1.2.3 (r749679 ASSRT)\""
with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text))
file.close()
Error:-
Traceback (most recent call last):
  File "pythonfiledit.py", line 5, in <module>
    with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
AttributeError: FileInput instance has no attribute '__exit__'
UPDATE:
import fileinput,re
import os
import shutil
import sys
import tempfile
filename = 'epivers.h'
text_to_search = re.compile("#define EPI_VERSION_STR \"(\d+\.\d+) (TOB) (r(\d+) ASSRT)\"")
replacement_text = "#define EPI_VERSION_STR \"9.130.27.50.1.2.3 (r749679 ASSRT)\""
with open(filename) as src, tempfile.NamedTemporaryFile(
        'w', dir=os.path.dirname(filename), delete=False) as dst:
    # Discard first line
    for line in src:
        if text_to_search.search(line):
            # Save the new first line
            line = text_to_search.sub(replacement_text, line)
            dst.write(line + '\n')
        dst.write(line)
# remove old version
os.unlink(filename)
# rename new version
os.rename(dst.name, filename)
I am trying to match the line #define EPI_VERSION_STR "9.130.27 (TOB) (r749679 ASSRT)"
If r is a compiled regular expression and line is a line of text, the way to apply the regex is
r.match(line)
to find a match at the beginning of line, or
r.search(line)
to find a match anywhere. In your particular case, you simply need
line = r.sub(replacement, line)
though in addition, you'll need to add a backslash before the round parentheses in your regex in order to match them literally (except in a few places where you apparently put in grouping parentheses around the \d+ for no particular reason; maybe just take those out).
Your example input string contains three dot-separated numbers, and the replacement string contains seven, so \d+\.\d+ will never match either of those. I'm guessing you want something like \d+(?:\.\d+)+ or perhaps very bluntly [\d.]+ if the periods can be adjacent.
Furthermore, a single backslash in a string will be interpreted by Python, before it gets passed to the regex engine. You'll want to use raw strings around regexes, nearly always. For improved legibility, perhaps also prefer single quotes or triple double quotes over regular double quotes, so you don't have to backslash the double quotes within the regex.
Finally, your usage of fileinput is wrong for Python 2.7: there, FileInput can't be used as a context manager (that support was only added in Python 3.2). Just loop over the lines which fileinput.input() returns.
import fileinput, re, sys
filename = 'epivers.h'
text_to_search = re.compile(r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"')
replacement_text = '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"'
for line in fileinput.input(filename, inplace=True, backup='.bak'):
    # line already ends with a newline, so write it without adding another one
    sys.stdout.write(text_to_search.sub(replacement_text, line))
In your first attempt, line.replace() was a good start, but it doesn't accept a regex argument (and of course, you don't close() a file you opened with with ...). In your second attempt, you are checking whether the line is identical to the regex, which of course it isn't (just like the string "two" isn't equivalent to the numeric constant 2).
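The difference between the two patterns can be checked on a single line without touching any file. This is a minimal sketch, assuming Python 3; the names narrow, pattern and sample are illustrative, and the sample line is the one quoted in the question.

```python
import re

# The line the question wants to match, with its trailing newline.
sample = '#define EPI_VERSION_STR "9.130.27 (TOB) (r749679 ASSRT)"\n'

# \d+\.\d+ allows exactly two numbers, so it cannot cover "9.130.27".
narrow = re.compile(r'#define EPI_VERSION_STR "\d+\.\d+ \(TOB\)')
print(narrow.search(sample))

# \d+(?:\.\d+)+ allows any number of dot-separated parts.
pattern = re.compile(
    r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"')
replacement = '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"'

# sub() replaces only the matched part, so the newline is preserved.
new_line = pattern.sub(replacement, sample)
print(repr(new_line))
```

Testing the pattern on a sample string first makes it easier to separate regex problems from file-handling problems.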
Read the file, use re.sub to substitute, then write the new contents back:
with open(filename) as f:
    text = f.read()
# (?:\.\d+)+ is a non-capturing group; its parentheses must not be escaped
new_text = re.sub(r'#define EPI_VERSION_STR "\d+(?:\.\d+)+ \(TOB\) \(r\d+ ASSRT\)"',
                  '#define EPI_VERSION_STR "9.130.27.50.1.2.3 (r749679 ASSRT)"',
                  text)
with open(filename, 'w') as f:
    f.write(new_text)

how do you open .txt file in python in one line

I'm trying to open a .txt file and am getting confused about which part goes where. I also want the spaces removed when I open the text file in Python. When answering, could you use the file name 'clues'?
My first try is:
def clues():
    file = open("clues.txt", "r+")
    for line in file:
        string = ("clues.txt")
        print (string)
My second try is:
def clues():
    f = open('clues.txt')
    lines = [line.strip('\n') for line in open ('clues.txt')]
The third try is:
def clues():
    f = open("clues.txt", "r")
    print f.read()
    f.close()
Building on @JonKiparsky's answer, it would be safer to use the Python with statement, and to keep the result in a variable:
with open("clues.txt") as f:
    text = f.read().replace(" ", "")
If you want to read the whole file with the spaces removed, f.read() is on the right track: unlike your other attempts, it gives you the whole file as a single string, not one line at a time. But you still need to replace the spaces explicitly. For example:
f.read().replace(' ', '')
Or, if you want to replace all whitespace, not just spaces:
''.join(f.read().split())
This line:
f = open("clues.txt")
will open the file; that is, it returns a file handle that you can read from.
This line:
open("clues.txt").read().replace(" ", "")
will open the file and return its contents, with all spaces removed.
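Putting these pieces together, a small helper could look like the sketch below, assuming Python 3. The function name read_clues and its spaces_only parameter are made up for illustration, and the demo writes a throwaway clues.txt into a temporary directory so that it runs anywhere.

```python
import os
import tempfile

def read_clues(path, spaces_only=True):
    """Return the file's text with spaces (or all whitespace) removed."""
    with open(path) as f:
        text = f.read()
    if spaces_only:
        # Remove only the space character; newlines are kept.
        return text.replace(" ", "")
    # split() with no arguments splits on any whitespace run.
    return "".join(text.split())

# Demo with a temporary file; in the question this would be "clues.txt".
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "clues.txt")
    with open(path, "w") as f:
        f.write("a b\nc d\n")
    no_spaces = read_clues(path)
    no_whitespace = read_clues(path, spaces_only=False)

print(repr(no_spaces))
print(repr(no_whitespace))
```

The with statement closes the file even if reading fails, which the one-line open(...).read() form does not guarantee.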

How to use a regex to search the entire file and determine whether matches were found, rather than reading it line by line?

Instead of reading each and every line, can't we just search for the string in the file and replace it? I am trying, but I have no idea how to do that.
import re
file = open("C:\\path.txt", "r+")
lines = file.readlines()
replaceDone = 0
file.seek(0)
newString = "set:: windows32\n"
for l in lines:
    if re.search("(address_num)", l, re.I) and replaceDone == 0:
        try:
            file.write(l.replace(l, newString))
            replaceDone = 1
        except IOError:
            file.close()
Here's an example you can adapt that replaces every sequence of '(address_num)' with 'set:: windows32' for a file:
import fileinput
import re
for line in fileinput.input('/home/jon/data.txt', inplace=True):
    print re.sub('address_num', 'set:: windows32', line, flags=re.I),
This is not very memory efficient but I guess it is what you are looking for:
import re
text = open(file_path, 'r').read()
open(file_path, 'w').write(re.sub(old_string, new_string, text))
Read the whole file, replace and write back the whole file.
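If you also need to know whether anything matched, re.subn returns the new text together with the number of substitutions. The sketch below assumes Python 3 and that, as in the question's code, only the first matching line should be replaced. The helper name replace_first_matching_line and the sample file contents are hypothetical.

```python
import os
import re
import tempfile

def replace_first_matching_line(path, needle, new_line):
    """Replace the first line containing needle; return True if one was found."""
    with open(path) as f:
        text = f.read()
    # With re.M, ^ and $ match at line boundaries, so this spans one whole line.
    pattern = re.compile(r"^.*" + re.escape(needle) + r".*$", re.I | re.M)
    # A function replacement avoids backslash processing in new_line.
    new_text, count = pattern.subn(lambda m: new_line, text, count=1)
    if count:
        with open(path, "w") as f:
            f.write(new_text)
    return count > 0

# Demo on a throwaway file with two candidate lines.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "path.txt")
    with open(path, "w") as f:
        f.write("header\nADDRESS_NUM = 5\naddress_num = 6\n")
    found = replace_first_matching_line(path, "address_num", "set:: windows32")
    with open(path) as f:
        result = f.read()

print(found)
print(result)
```

Dropping count=1 would replace every matching line instead of just the first.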
