Replace a text in a text file in python [duplicate] - python

This question already has answers here:
How to search and replace text in a file?
(22 answers)
Closed 8 years ago.
I am trying to replace text in a text file (let's just call the file test.txt), here is the example text:
Math is good
Math is hard
Learn good math
be good at math
DO\SOMETHING\NOW
I want something like:
Math is good
Math is hard
Learn good science
be good at science
DO\SOMETHING\NOW
I am trying to use fileinput in the following way:
import fileinput
file = fileinput.input("Path/To/File/text.txt", inplace=True)
for line in file:
    print(line.replace("math", "science"))
The problem is that print() automatically appends "\n", so every line is followed by an extra blank line. I tried sys.stdout.write(line.replace("math", "science")) instead, but that wrote numbers into the text file. How do I do this efficiently so that I get the result I want? Should I just use open, go line by line, check whether the word "math" appears, and replace it? Any help would be greatly appreciated!

You can tell print() not to write a newline by setting the end keyword argument to an empty string:
print(line.replace("math", "science"), end='')
The default value for end is '\n'.
Alternatively you could remove the newline from line:
print(line.rstrip('\n').replace("math", "science"))
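Putting the first suggestion together, here is a minimal sketch of the in-place rewrite. The helper name replace_in_file and the temporary demo file are illustrative assumptions, not part of the original question; the sample text is a shortened version of the asker's file.

```python
import fileinput
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite the file at `path` in place, replacing `old` with `new`."""
    # With inplace=True, anything printed inside the loop is written
    # back into the file that is being read.
    with fileinput.input(path, inplace=True) as lines:
        for line in lines:
            # Each line still carries its own "\n", so print() must not add one.
            print(line.replace(old, new), end="")


# Demonstration on a throwaway file (hypothetical stand-in for test.txt).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("Math is good\nLearn good math\nDO\\SOMETHING\\NOW\n")

replace_in_file(demo_path, "math", "science")

with open(demo_path) as handle:
    result = handle.read()
os.remove(demo_path)
```

Because str.replace is case-sensitive, "Math" is left alone and only the lowercase "math" is changed, which matches the output the asker wanted.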

Related

Removing multiple bits of text in parentheses from a text file using python [duplicate]

This question already has answers here:
How to input a regex in string.replace?
(7 answers)
Closed 2 years ago.
I'm extremely new to coding and python so bear with me.
I want to remove all text that is in parentheses from a text file. There are multiple sets of parentheses with varying numbers of characters inside. From another similar post on here, I found
re.sub(r'\([^()]*\)', '', "sample.txt")
which is supposed to remove characters between () but does absolutely nothing. It runs but I get no error code.
I've also tried
intext = 'C:\\Users\\S--\\PycharmProjects\\pythonProject1\\sample.txt'
outtext = 'C:\\Users\\S--\\PycharmProjects\\pythonProject1\\EDITEDsample.txt'
with open("sample.txt", 'r') as f, open(outtext, 'w') as fo:
    for line in f:
        fo.write(line.replace('\(.*?\)', '').replace('(', " ").replace(')', " "))
which successfully removes the parentheses but nothing in between them.
How do I get the characters between the parentheses out?
EDIT: I was asked for a sample of sample.txt; these are its contents:
Example sentence (first), end of sentence. Example Line (second), end
of sentence (end).
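The function re.sub does not take a filename as a parameter; it takes the text to work on. Your call therefore ran the pattern against the literal string "sample.txt", which contains no parentheses, and the returned string was discarded.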
>>> re.sub(r'\([^()]*\)', '', "123(456)789")
'123789'
As for your second attempt, notice that str.replace does not accept regular expressions, only literal strings.
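Combining both points, one possible approach is to read the file's text, run re.sub on that text, and write the result out. The sketch below is only an illustration: the function names strip_parenthesized and clean_file are assumptions, and the sample string copies the contents the asker posted.

```python
import re

# Matches "(", then any run of characters that are not parentheses, then ")".
PAREN_GROUP = re.compile(r"\([^()]*\)")


def strip_parenthesized(text):
    """Return `text` with every innermost (...) group removed."""
    return PAREN_GROUP.sub("", text)


def clean_file(in_path, out_path):
    """Read in_path, drop parenthesized text, and write the rest to out_path."""
    # Reading the whole file at once lets a group span a line break.
    with open(in_path, "r") as src:
        text = src.read()
    with open(out_path, "w") as dst:
        dst.write(strip_parenthesized(text))


sample = (
    "Example sentence (first), end of sentence. Example Line (second), end\n"
    "of sentence (end).\n"
)
cleaned = strip_parenthesized(sample)
```

Note that the space in front of each removed group is kept, so the cleaned text contains "sentence , end"; a pattern such as r"\s*\([^()]*\)" would swallow that space as well.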

after writing text in csv, opening it doesn't read the '\n' char [duplicate]

This question already has answers here:
Adding a line-terminator in pandas ends up adding another \r
(2 answers)
Closed 2 years ago.
I create a csv file in which I put some song lyrics, using this:
with io.open('songs.csv', 'a+',encoding='utf-8') as file:
    writer = csv.writer(file , dialect='excel')
    writer.writerow(input_row)
where input_row is a list of the form [artist, lyrics].
Now when I open my csv, I notice that everywhere there was '\n' and '\r':
For example:
RAW TEXT:
I went walking in the garden
I was tripping on snakes
And I ain't asking for your loving
I'm just asking what your love is gonna take
Text from the pandas dataframe after reading the csv:
"\r\n\r\r\nI went walking in the garden\r\nI was tripping on snakes\r\nAnd I ain't asking for your loving\r\nI'm just asking what your love is gonna take\r\n\r\n
(By the way, I'm using PyCharm, and in the overview of the dataset those escape characters are not visible, so some words appear joined together.)
I'm cleaning the column using
data['lyrics'] = data['lyrics'].replace(r'\\[n]', ' ',regex = True)
data['lyrics'] = data['lyrics'].replace(r'\\[r]', ' ', regex=True)
but when I print the text, nothing changes.
Am I doing something wrong, or is it not a problem, so that I can simply ignore it?
Apparently pandas has trouble guessing the type of line endings (Unix/Linux: \n, Windows: \r\n). Try the solution suggested in "Adding a line-terminator in pandas ends up adding another \r": pass a file object to read_csv instead of the filename:
with open('songs.csv', 'r',encoding='utf-8') as file:
    df = pandas.read_csv(file)
Try this
data['lyrics'] = data['lyrics'].str.strip()
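Two further details may be relevant here. The csv module documentation recommends opening the file with newline='' so that the writer controls line endings itself; without it, an extra \r can be added on platforms that use \r\n. Also, the pattern r'\\[n]' matches a literal backslash followed by the letter n, not a real newline character. The sketch below uses only the standard library; the temporary file and the "artist" placeholder are assumptions for illustration.

```python
import csv
import os
import re
import tempfile

lyrics = "I went walking in the garden\nI was tripping on snakes\n"

# Throwaway file standing in for songs.csv.
fd, csv_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# newline="" stops Python from translating the line endings csv writes.
with open(csv_path, "a+", encoding="utf-8", newline="") as handle:
    csv.writer(handle, dialect="excel").writerow(["artist", lyrics])

# Read back with the same setting so embedded newlines survive unchanged.
with open(csv_path, "r", encoding="utf-8", newline="") as handle:
    rows = list(csv.reader(handle))
os.remove(csv_path)

artist, stored = rows[0]

# [\r\n]+ matches real carriage-return and line-feed characters.
flattened = re.sub(r"[\r\n]+", " ", stored).strip()
```

The same character class should work on the dataframe column from the question, for example with data['lyrics'].replace(r'[\r\n]+', ' ', regex=True), if the goal is to collapse line breaks into spaces.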

How to change a word if it includes a certain letter [duplicate]

This question already has answers here:
Replacing instances of a character in a string
(17 answers)
Closed 5 years ago.
I tried to replace vowels by adding "il" in front of them using this code, but it didn't work. Help!
line=input("What do you want to say?\n")
line = line.replace('e', 'ile')
line = line.replace('o', 'ilo')
line = line.replace('a', 'ila')
line = line.replace('i', 'ili')
line = line.replace('u', 'ilu')
line = line.replace('y', 'ily')
print (line)
But if you type a long sentence, it stops working correctly.
Could someone please help me?
For example, for the input "Hello world"
it prints:
Hililellililo wililorld
when it should print Hilellilo Wilorld
Try replacing every occurrence of the letters you want with a regular expression, for example:
import re
re.sub(r'[eE]', r'i\g<0>', "Hello World")
You can replace any letters you want by putting them inside the square brackets.
In the replacement, 'i' is the literal character and \g<0> stands for the letter that was matched.
"Hello world".replace('e', 'ie')
But your question is not very clear; maybe you mean something different.
Whenever you do multiple replacements after each other, you always need to be careful with the order in which you do them.
In your case put this replacement first:
line = line.replace('i', 'ili')
Otherwise it replaces the i's in the replacements that have been done before.
When you need to do many replacements it is often better to use an approach that avoids these problems.
One such approach is to use regular expressions, as already proposed. Another is to scan the text from start to end, replace each item as you find it, and continue scanning after the replacement.
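As a sketch of the single-pass idea, a regular expression can visit each vowel exactly once, so text that was just inserted is never rescanned. The function name add_il and the choice of vowels (taken from the question's list, including y) are assumptions.

```python
import re

# One pass over the input: each vowel is matched once, in order.
VOWEL = re.compile(r"[aeiouy]", flags=re.IGNORECASE)


def add_il(text):
    """Insert "il" in front of every vowel without re-scanning insertions."""
    # \g<0> refers to the whole match, i.e. the vowel that was found.
    return VOWEL.sub(r"il\g<0>", text)


converted = add_il("Hello world")
```

For "Hello world" this gives "Hilellilo wilorld". The case of each original letter is preserved, so the capital W expected in the question would only appear if the input itself had one.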

How to avoid new line in readline() function in python 3x? [duplicate]

This question already has answers here:
How to print without a newline or space
(26 answers)
Closed 7 years ago.
I am new to Python programming. I've followed the "Learn Python the Hard Way" book, but it is based on Python 2.x.
def print_one_line(line_number,f):
    print(line_number,f.readline())
Every time this function prints a line, it also prints an extra blank line.
1 gdfgty
2 yrty
3 l
I read in the documentation that if I put a , (comma) after readline()
then it won't print a new \n.
Here is the documentation:
Why are there empty lines between the lines in the file? The
readline() function returns the \n that's in the file at the end of
that line. This means that print's \n is being added to the one
already returned by readline() function. To change this behavior simply add a ,
(comma) at the end of print so that it doesn't print its own \n.
When I run the file with python 2x then it is OK, but when I do it in python 3x then the newline is printed. How to avoid that newline in python 3x?
Since your content already contains the newlines you want, tell the print() function not to add any by using the optional end argument:
def print_one_line(line_number,f):
    print(line_number,f.readline(), end='')
Besides the other approaches, you could also use:
import sys
sys.stdout.write(f.readline())
Works with every Python version to date.
Rather than skipping the newline on output, you can strip it from the input:
print(line_number, f.readline().rstrip('\n'))
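For comparison, here is a small sketch of the end='' approach applied to a whole file. The function print_numbered and its optional out parameter are hypothetical additions so that the output can be captured; io.StringIO stands in for a real file.

```python
import io


def print_numbered(f, out=None):
    """Print each line of f with its line number, without blank lines."""
    for line_number, line in enumerate(f, start=1):
        # The line already ends in "\n", so tell print() not to add another.
        # file=None makes print() fall back to sys.stdout.
        print(line_number, line, end="", file=out)


# Capture the output in memory instead of printing to the terminal.
buffer = io.StringIO()
print_numbered(io.StringIO("gdfgty\nyrty\nl\n"), out=buffer)
captured = buffer.getvalue()
```

In Python 3 print is a function, so the trailing comma from Python 2 no longer suppresses the newline; the end keyword argument takes over that role.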

How to remove the "\n" from the end of my string [duplicate]

This question already has answers here:
How can I remove a trailing newline?
(27 answers)
Closed 8 years ago.
I read a text file into a list, but now every item has "\n" on the end. Obviously you don't see it when you print it, because it just starts a new line. I want to remove it because it is causing me some hassle.
database = open("database.txt", "r")
databaselist = database.readlines()
That's the code I used to read from the file. I am a total noob, so please don't use crazy technical talk, otherwise it will go straight over my head.
"string with or without newline\n".rstrip('\n')
Using rstrip with \n avoids any unwanted side-effect except that it will remove multiple \n at the end, if present.
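Otherwise, you need to use this less elegant function, which removes at most one trailing character (the slice s[-1:] also copes with an empty string, where s[-1] would raise an IndexError):
def rstrip1(s, c):
    return s[:-1] if s[-1:] == c else s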
Use str.rstrip to remove the newline character at the end of each line:
databaselist = [line.rstrip("\n") for line in database.readlines()]
However, I recommend that you make three more changes to your code to improve efficiency:
Remove the call to readlines. Iterating over a file object yields its lines one at a time.
Remove the "r" argument to open since the function defaults to read-mode. This will not improve the speed of your code, but it will make it less redundant.
Most importantly, use a with-statement to open the file. This will ensure that it is closed automatically when you are done.
In all, the new code will look like this:
with open("database.txt") as database:
    databaselist = [line.rstrip("\n") for line in database]
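To see what these options do on concrete data, here is a short sketch. The sample names and the io.StringIO object standing in for database.txt are assumptions for illustration.

```python
import io

raw = "alice\nbob\n\ncarol"  # hypothetical file contents; last line has no "\n"

# Iterating over a file-like object yields one line at a time.
database = io.StringIO(raw)
databaselist = [line.rstrip("\n") for line in database]

# str.splitlines() splits on line boundaries and drops the separators.
split_version = raw.splitlines()

# rstrip("\n") removes every trailing newline, not just one.
all_removed = "value\n\n".rstrip("\n")

# Slicing off a single character only when it is a newline removes one.
text = "value\n\n"
one_removed = text[:-1] if text.endswith("\n") else text
```

Both list-building approaches keep the empty line in the middle as an empty string, which may or may not be wanted depending on how the list is used later.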
