I would like to unescape unicode characters in the source file:
source = open('source.csv', 'r')
target = open('target.csv', 'w')
target.write(source.read().decode('unicode_escape').encode('utf-8'))
But the result file contains extra line breaks. For example, the text
u'\u0417a\u0439\u043c\u044b \u0412ce\u043c \u0436e\u043ba\u044e\u0449\u0438\u043c!\nO\u0434o\u0431\u0440e\u043d\u0438e 98%'
is replaced with
u'Зaймы Вceм жeлaющим!
Oдoбрeниe 98%'
I understand that there is a line break symbol \n in the source text, but I would like to keep it as is, without converting it to an actual line break.
You're almost there:
for line in source:
    line = line.rstrip('\n')
    line = line.decode('unicode_escape').replace(u'\n', u'\\n').encode('utf8')
    target.write(line + '\n')
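The snippet above is Python 2. In Python 3, str has no decode(), so here is a rough sketch of the same idea. It assumes the source text is plain ASCII containing \uXXXX escapes, and the function name unescape_keep_newlines is just made up for illustration:

```python
def unescape_keep_newlines(line):
    # Assumption: the input is ASCII text with backslash escapes such as \u0417.
    # encode() gives bytes, which the unicode_escape codec can decode.
    decoded = line.encode('ascii').decode('unicode_escape')
    # The decoder turned the two characters "\" + "n" into a real line break;
    # turn it back into the literal two-character sequence.
    return decoded.replace('\n', '\\n')


# The raw string below contains literal backslashes, as the file would.
print(unescape_keep_newlines(r'\u0417a\u0439\u043c\u044b!\nO\u0434o\u0431\u0440e\u043d\u0438e 98%'))
```

You would call it once per line (after stripping the real trailing line break) and write the result to a file opened with encoding='utf-8'.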
I am learning Python on an app called SoloLearn. I got this exercise and I cannot see the solution or the comments. I don't need to solve it to continue, but I'd like to know how to do it.
Book Titles: You have been asked to make a special book categorization program, which assigns each book a special code based on its title.
The code is equal to the first letter of the book, followed by the number of characters in the title.
For example, for the book "Harry Potter", the code would be: H12, as it contains 12 characters (including the space).
You are provided a books.txt file, which includes the book titles, each one written on a separate line.
Read the title one by one and output the code for each book on a separate line.
For example, if the books.txt file contains:
Some book
Another book
Your program should output:
S9
A12
Recall the readlines() method, which returns a list containing the lines of the file.
Also, remember that all lines, except the last one, contain a \n at the end, which should not be included in the character count.
I tried:
file = open("books.txt","r")
for line in file:
for i in range(len(file.readlines())):
title = line[0]+str(len(line)-1)
print(titulo)
title = line[0]+str(len(line)-1)
print(title)
file.close
I also tried with range() and readlines() but I don't know how to solve it
This uses readlines():
with open('books.txt') as f:  # Open file
    for line in f.readlines():  # Iterate through lines
        if line[-1] == '\n':  # Check if there is '\n' at end of line
            line = line[:-1]  # If there is, ignore it
        print(line[0], len(line), sep='')  # Output first character and length
But I think splitlines() is easier, as it doesn't have the trailing '\n':
with open('books.txt') as f:  # Open file
    for line in f.read().splitlines():  # Iterate through lines
        # No need to check for trailing '\n'
        print(line[0], len(line), sep='')  # Output first character and length
You can use "with" to handle file opening and closing.
Use rstrip to get rid of '\n'.
with open('books.txt') as f:
    lines = f.readlines()  # use f, the name bound by "with"
for line in lines:
    print(line[0] + str(len(line.rstrip())))
This is the same:
file = open('books.txt')
lines = file.readlines()
for line in lines:
    print(line[0] + str(len(line.rstrip())))
file.close()
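If you want to check the logic without a books.txt file around, the same idea can be put into a small function that takes any list of lines. This is only a sketch: book_codes is a name I made up, and skipping empty lines is my own assumption, since the exercise does not say what to do with them.

```python
def book_codes(lines):
    """Return the code (first letter + title length) for each title line."""
    codes = []
    for line in lines:
        title = line.rstrip('\n')  # drop the trailing line break, if any
        if title:                  # assumption: skip empty lines
            codes.append(title[0] + str(len(title)))
    return codes


# The last line has no '\n', just like the last line of the file.
for code in book_codes(['Some book\n', 'Another book']):
    print(code)
```

With a real file you would pass the open file object (or f.readlines()) instead of the list.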
I am trying to find a line that starts with a specific string and replace the entire line with a new string.
I tried this code
filename = "settings.txt"
for line in fileinput.input(filename, inplace=True):
    print line.replace('BASE_URI =', 'BASE_URI = "http://example.net"')
This does not replace the entire line, only the matching string. What is the best way to replace an entire line that starts with a given string?
You don't need to know what old is; just redefine the entire line:
import sys
import fileinput
for line in fileinput.input([filename], inplace=True):
    if line.strip().startswith('BASE_URI ='):
        line = 'BASE_URI = "http://example.net"\n'
    sys.stdout.write(line)
It looks like you are using Python 2 syntax. Since Python 2 is discontinued, I will solve this in Python 3 syntax.
Suppose you need to replace lines that start with "Hello" with "Not Found". Then you can do this:
with open("settings.txt") as fh:
    lines = fh.read().splitlines()  # lines without the trailing "\n"
newlines = []
for line in lines:
    if not line.startswith("Hello"):
        newlines.append(line)
    else:
        newlines.append("Not Found")
with open("settings.txt", "w") as fh:
    for line in newlines:
        fh.write(line + "\n")
This should do the trick:
def replace_line(source, destination, starts_with, replacement):
    # Open file path
    with open(source) as s_file:
        # Store all file lines in lines
        lines = s_file.readlines()
    # Iterate lines
    for i in range(len(lines)):
        # If a line starts with given string
        if lines[i].startswith(starts_with):
            # Replace whole line and keep its line separator, if it has one
            ending = '\n' if lines[i].endswith('\n') else ''
            lines[i] = replacement + ending
    # Open destination file and write modified lines list into it
    with open(destination, "w") as d_file:
        d_file.writelines(lines)
Call it using these parameters:
replace_line("settings.txt", "settings.txt", 'BASE_URI =', 'BASE_URI = "http://example.net"')
Cheers!
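For what it's worth, the replacement step can also be written so that it works on a list of lines and never touches the file, which makes it easy to try out. A minimal sketch, where replace_lines_starting_with is a made-up name and ignoring leading whitespace is my assumption:

```python
def replace_lines_starting_with(lines, prefix, replacement):
    """Return a new list where every line starting with prefix is replaced."""
    result = []
    for line in lines:
        if line.lstrip().startswith(prefix):
            # Keep the original line ending so the file layout is preserved.
            ending = '\n' if line.endswith('\n') else ''
            result.append(replacement + ending)
        else:
            result.append(line)
    return result


settings = ['DEBUG = True\n', 'BASE_URI = "http://old.example"\n', 'NAME = "x"']
print(replace_lines_starting_with(
    settings, 'BASE_URI =', 'BASE_URI = "http://example.net"'))
```

Read the lines with readlines(), pass them through the function, and write the result back with writelines().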
I am trying to remove a whitespace-only line from my file, but the line is not being removed.
def removeWhiteSpaceLine():
    fp = open("singleDataFile.csv")
    for i, line in enumerate(fp, 1):
        if i == 2:
            line.strip()
    fp.close()
My sample file looks like this (I want to remove the 2nd line, which is whitespace):
Name,Address,Age
John,Melbourne,28
Kati,Brisbane,35
.....
line.strip() does not modify line; it returns a new string.
Calling line.strip() on a line by itself has no effect. You must reassign the result to your variable:
line = line.strip()
However, it looks like you shouldn't use strip anyway:
strip()
Return a copy of the string with leading and trailing characters removed.
To me, it's unclear what you're asking:
1) fp = open("singleDataFile.csv") opens the file in read only mode. If you expect to update the file, that won't work. If you want to modify the file, open it in write mode ("w" or "r+").
2) Maybe you don't want to modify the file but only ignore the second line? In that case, you should add all the lines in a list and ignore that second line.
with open("singleDataFile.csv", "r+") as f:
    content = f.readlines()  # read content
    f.seek(0)  # go back to the beginning of the file
    for i, line in enumerate(content):
        if i != 1:  # the condition could also be if line.strip()
            f.write(line)  # write all lines except second line
    f.truncate()  # end of file
Try it like this. It rewrites the same contents to the same file, except for the blank lines:
with open("singleDataFile.csv", "r") as f:
    lines = f.readlines()
with open("singleDataFile.csv", "w") as f:
    for line in lines:
        if line.strip():
            f.write(line)
Try using the pandas library (here df is a DataFrame and colname is one of its columns):
df['colname'].str.strip()
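To see why the line.strip() test in the answers above works: a line made only of spaces, tabs or a line break strips down to the empty string, which is falsy. A small sketch on an in-memory list (drop_blank_lines is a made-up name, and the sample rows are the ones from the question with blank lines added by me):

```python
def drop_blank_lines(lines):
    """Keep only the lines that contain something other than whitespace."""
    return [line for line in lines if line.strip()]


rows = ['Name,Address,Age\n', '   \n', 'John,Melbourne,28\n', '\n', 'Kati,Brisbane,35\n']
for row in drop_blank_lines(rows):
    print(row, end='')  # the rows already end with a line break
```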
Just a basic question. I know how to read information from a file etc but how would I go about only including the lines that are in between certain lines?
Say I have this :
Information Included in file but before "beginning of text"
" Beginning of text "
information I want
" end of text "
Information included in file but after the "end of text"
Thank you for any help you can give to get me started.
You can read the file in line by line until you reach the start-markerline, then do something with the lines (print them, store them in a list, etc) until you reach the end-markerline.
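with open('myfile.txt') as f:
    line = f.readline()
    # Skip lines until the start marker (or the end of the file) is reached
    while line and line != ' Beginning of text \n':
        line = f.readline()
    line = f.readline()  # move past the start marker itself
    while line and line != ' end of text \n':
        # add code to do something with the line here
        line = f.readline()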
Make sure to exactly match the start- and end-markerlines. In your example they have a leading and trailing blank.
Yet another way to do it is to use the two-argument version of iter():
start = '" Beginning of text "\n'
end = '" end of text "\n'
with open('myfile.txt') as f:
    for line in iter(f.readline, start):
        pass
    for line in iter(f.readline, end):
        print line
see https://docs.python.org/2/library/functions.html#iter for details
I would just read the file line by line and check each line if it matches beginning or end string. The boolean readData then indicates if you are between beginning and end and you can read the actual information to another variable.
# Open the file
f = open('myTextFile.txt')
# Read the first line
line = f.readline()
readData = False
# If the file is not empty keep reading line one at a time
# until the file is empty
while line:
    # Compare without surrounding whitespace or the trailing newline
    text = line.strip()
    # Check if line matches beginning
    if text == "Beginning of text":
        readData = True
    # Check if line matches end
    elif text == "end of text":
        readData = False
    # We are between beginning and end
    elif readData:
        (...)
    line = f.readline()
f.close()
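The flag approach can also be wrapped in a generator, so the caller decides what to do with the lines. This is just a sketch: lines_between is a made-up name, and I am assuming the markers should be compared after stripping surrounding whitespace and that the double quotes from the question are part of the marker lines.

```python
def lines_between(lines, start_marker, end_marker):
    """Yield the lines found after start_marker and before end_marker."""
    inside = False
    for line in lines:
        text = line.strip()  # ignore surrounding whitespace and the line break
        if text == start_marker:
            inside = True
        elif text == end_marker:
            inside = False
        elif inside:
            yield line


sample = [
    'Information included in file but before\n',
    '" Beginning of text "\n',
    'information I want\n',
    '" end of text "\n',
    'Information included in file but after\n',
]
for line in lines_between(sample, '" Beginning of text "', '" end of text "'):
    print(line, end='')
```

A file object can be passed in place of the list, since it is iterated line by line in the same way.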
Hi I already have the search function sorted out:
def searchconfig():
config1 = open("config.php", "r")
b='//cats'
for num, line in enumerate(config1,0):
if b in line:
connum = num + 1
return connum
config1.close()
This will return the line number of //cats. I then need to take the data underneath it, put it in a temporary document, append new data under //cats, and then append the data in the temporary document to the original. How would I do this? I know that I would have to use 'a' instead of 'r' when opening the document, but I do not know how to utilise the line number.
I think, the easiest way would be to read the whole file into a list of strings, work on that list and write it back afterwards.
# Read all lines of the file into a list of strings (without the trailing '\n')
with open("config.php", "r") as file:
    lines = [line.rstrip('\n') for line in file]
# This gets the line number for the first line containing '//cats'
# Note that it will raise a StopIteration exception if no such line exists...
linenum = next(num for (num, line) in enumerate(lines) if '//cats' in line)
# Insert a line after the line containing '//cats'
lines.insert(linenum+1, 'This is a new line...')
# You could also replace the line following '//cats' instead, like
# lines[linenum+1] = 'New line content...'
# Write back the file (in fact this creates a new file with new content)
# Note that you need to append the line delimiter '\n' to every line explicitly
with open("config.php", "w") as file:
    file.writelines(line + '\n' for line in lines)
Using "a" as mode for open would only let you append at the end of the file.
You could use "r+" for a combined read/write mode, but then you could only overwrite parts of the file. There is no simple way to insert new lines in the middle of the file using this mode.
You could do it like this. I am creating a new file in this example as it is usually safer.
with open('my_file.php') as my_php_file:
    add_new_content = ['%sNEWCONTENT' % line if '//cat' in line
                       else line.strip('\n')
                       for line in my_php_file.readlines()]
with open('my_new_file.php', 'w+') as my_new_php_file:
    for line in add_new_content:
        print(line, file=my_new_php_file)
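Note that no temporary document is needed: once the lines are in a list, inserting at the right index does the whole job. Here is a rough sketch of just that step, with insert_after_marker as a made-up name and the assumption that only the first matching line matters:

```python
def insert_after_marker(lines, marker, new_line):
    """Return a copy of lines with new_line inserted after the first line containing marker."""
    result = list(lines)
    for index, line in enumerate(result):
        if marker in line:
            result.insert(index + 1, new_line)
            break  # only the first match is handled
    return result


config = ['<?php\n', '//cats\n', '$a = 1;\n']
print(insert_after_marker(config, '//cats', '$b = 2;\n'))
```

If the marker is not found, the list comes back unchanged, so you can compare lengths to detect that case before writing the file.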