Python putting all words on same line - python

I am trying to remove some words (located on a text file) on another text file. Although my code seems to work, I noticed that it stopped removing words at a certain point in the text file. I then checked the text file, and noticed that Python is writing all the words on the same line, and this line has a limit of characters to it, resulting in the process stopping. How can I circumvent that?
Here is my code:
# text file list to array
with open('function_words.txt') as functionFile:
    functionWords = [word for line in functionFile for word in line.split()]
# delete the word on the text file if it matches one of the array
with open("results.txt", "r+") as newfile:
    newfile.write(' '.join(i for i in toPrint.read().split() if i not in functionWords))
Thanks in advance and please let me know if you need more details.

You need to add the "\n" after you join the string if you want each write to end its own line in the new file. Note the + "\n" below.
with open("results.txt", "r+") as newfile:
    newfile.write(' '.join(i for i in toPrint.read().split() if i not in functionWords) + "\n")
Alternatively, you could build a list of the lines you want to write and write them to newfile using the writelines() method. Something like:
newfile.writelines(my_list_of_lines_to_write)
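One way to keep the original line breaks, sketched under the assumption that the source text is available as a list of lines, is to filter each line separately and hand the result to writelines(). The function name remove_words and the sample data are hypothetical, not taken from the question.

```python
def remove_words(lines, stop_words):
    """Return the lines with every word found in stop_words removed.

    Each returned line ends with a newline, so the line structure
    of the input is preserved.
    """
    stop_words = set(stop_words)  # set lookup is faster than list lookup
    cleaned = []
    for line in lines:
        kept = [word for word in line.split() if word not in stop_words]
        cleaned.append(' '.join(kept) + '\n')
    return cleaned


sample_lines = ["the cat sat on the mat\n", "a dog and a bird\n"]
result = remove_words(sample_lines, ["the", "a", "on", "and"])
# result can then be passed to newfile.writelines(result)
```

Because every element already ends with '\n', writelines() produces one output line per input line.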

Related

Python: Inserting content of entire textdocument after a certain string in 2nd document

I'm pretty new to Python, but I've been trying to get into some programming in my free time. Currently, I'm dealing with the following problem:
I have 2 documents, 1 and 2. Both have text in them.
I want to search document 1 for a specific string. When I locate that string, I want to insert all the content of document 2 in a line after the specific string.
Before insertion:
Document 1 content:
text...
SpecificString
text...
After insertion:
Document 1 content:
text...
SpecificString
Document 2 content
text...
I've been trying different methods, but none of them work; they keep deleting all content from document 1 and replacing it. YouTube and Google haven't yielded any desirable results; maybe I'm just looking in the wrong places.
I tried different things; this is one example:
f1 = '/Users/Win10/Desktop/Pythonprojects/oldfile.txt'
f2 = '/Users/Win10/Desktop/Pythonprojects/newfile.txt'
searchString=str("<\module>")
with open(f1, "r") as moduleinfo, open(f2, "w") as newproject:
    new_contents = newproject.readlines()
    #Now prev_contents is a list of strings and you may add the new line to this list at any position
    if searchString in f1:
        new_contents.insert(0,"\n")
        new_contents.insert(0,moduleinfo)
    #new_file.write("\n".join(new_contents))
The code simply deleted the content of document 1.
You can find interesting answers (How do I write to the middle of a text file while reading its contents?, Can you write to the middle of a file in python?, Adding lines after specific line)
By the way, an interesting way is to iterate over the file in read mode to find the index where the insert must go. Afterwards, overwrite the file with the new content:
a) File2 = File2[:key_index] + File1 + File2[key_index:]
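A minimal sketch of that slicing idea, assuming both documents have already been read into lists of lines. The helper name insert_after and the sample lists are made up for illustration.

```python
def insert_after(lines, key, new_lines):
    """Return a new list with new_lines inserted after the first line equal to key.

    If key is not found, a copy of the original lines is returned unchanged.
    """
    for index, line in enumerate(lines):
        if line.rstrip('\n') == key:
            cut = index + 1  # slice position just after the matching line
            return lines[:cut] + new_lines + lines[cut:]
    return list(lines)


doc1 = ["text...\n", "SpecificString\n", "text...\n"]
doc2 = ["Document 2 content\n"]
merged = insert_after(doc1, "SpecificString", doc2)
```

The merged list can then be written back with writelines() after reopening the target file in "w" mode.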
Another option explained by Adding lines after specific line:
with open(file, "r") as in_file:
    buf = in_file.readlines()
with open(file, "w") as out_file:
    for line in buf:
        if line == "YOUR SEARCH\n":
            line = line + "Include below\n"
        out_file.write(line)
Please tell us your final approach.
KR,
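You have to open the second file in append mode instead of write mode. Write mode ('w') empties the file as soon as it is opened, which is why the content disappears. Append mode ('a') keeps the existing content and adds text to the end of the file. Be aware that on some systems writes in append mode always go to the end of the file regardless of the current seek position, so append mode alone will not insert text in the middle.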
You can enter append mode by replacing the 'w' with 'a'.
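A small experiment, assuming a throwaway temporary file rather than the real documents, shows how the two modes treat existing content:

```python
import os
import tempfile

# A throwaway file so the experiment does not touch real data.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as f:
    f.write("original\n")

# 'a' keeps the existing content and adds to the end.
with open(path, "a") as f:
    f.write("appended\n")
with open(path) as f:
    after_append = f.read()

# 'w' empties the file as soon as it is opened.
with open(path, "w") as f:
    f.write("replaced\n")
with open(path) as f:
    after_write = f.read()

os.remove(path)
```

After the append step the file holds both lines; after reopening with "w" only the newly written line remains.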
Thanks for your input, it put me on the right track. I ended up going with the following:
f2 = '/Users/Win10/Desktop/Pythonprojects/newfile.txt'
f1 = '/Users/Win10/Desktop/Pythonprojects/oldfile.txt'
with open(f2) as file:
    original = file.read()
with open(f1) as input:
    myinsert = input.read()
newfile = original.replace("</Module>", "</Module>\n"+myinsert)
with open(f2, "w") as replaced:
    replaced.write(newfile)
The text from oldfile is inserted into newfile on a new line, under the "</Module>" string. I'll follow up if I find better solutions. Again, thank you for your answers.

How to import and print separate txt file in Spyder?

I am trying to import several text files into my Spyder file, which I want to add to a list later on.
why does
test1 = open("test1.txt")
result in test1 as "TextIOWrapper"? How would I bring the content over into the python file?
Thanks in advance
You need to read the lines into your list after opening the file. For example:
with open('test1.txt') as f:
    test1 = f.readlines()
The above code reads the contents of your text file into the list test1. However, each element keeps its trailing newline character '\n'.
To avoid this, strip it from every line:
with open('test1.txt') as f:
    test1 = [line.rstrip('\n') for line in f]
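Since the question mentions several text files that should end up in a list, here is a rough sketch of that step. The helper read_lines, the file names and the sample contents are assumptions; the files are created in a temporary folder only so the example runs on its own.

```python
import os
import shutil
import tempfile


def read_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    with open(path) as f:
        return [line.rstrip('\n') for line in f]


# Create two small sample files in a temporary directory (hypothetical data).
folder = tempfile.mkdtemp()
paths = []
for name, text in [("test1.txt", "a\nb\n"), ("test2.txt", "c\n")]:
    path = os.path.join(folder, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

# One list of lines per file, collected into a single list.
contents = [read_lines(path) for path in paths]

shutil.rmtree(folder)  # clean up the sample files
```

With real files, only read_lines and the final list comprehension are needed.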

Checking if string is in text file is not working

I am writing in Python 3.6 and am having trouble making my code match strings in a short text document. This is a simple example of the exact logic that is breaking my bigger program:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
print(file.read().splitlines())
if 'bah' not in file.read().splitlines():
    print("fail")
with the text document formatted like so:
bah
gah
fah
dah
mah
and it is indeed printing out fail each time I run this. Am I using the incorrect method of reading the data from the text document?
The issue is that you call print(file.read().splitlines()) first. That exhausts the file, so the next call to file.read().splitlines() returns an empty list.
A better way to "grep" your pattern would be to iterate on the file lines instead of reading it fully. So if you find the string early in the file, you save time:
with open(PATH, 'r') as f:
    for line in f:
        if line.rstrip()=="bah":
            break
    else:
        # else is reached when no break is called from the for loop: fail
        print("fail")
The small catch here is not to forget to call line.rstrip(), because iterating over the file yields each line with its line terminator. Also, if a line in your file has trailing whitespace, this code will still match the word (use strip() if you want to match even with leading blanks).
If you want to match a lot of words, consider creating a set of lines:
lines = {line.rstrip() for line in f}
so that each "in lines" test will be a lot faster.
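A short sketch of the set approach. Here io.StringIO stands in for the opened text file, and the extra word 'zah' is added purely to show a miss; both are assumptions for the demo.

```python
import io

# io.StringIO stands in for the opened text file here (an assumption for the demo).
f = io.StringIO("bah\ngah\nfah\ndah\nmah\n")
lines = {line.rstrip() for line in f}

words = ['bah', 'dah', 'gah', 'fah', 'mah', 'zah']
# Each membership test against the set takes roughly constant time.
missing = [word for word in words if word not in lines]
```

The file is read once, and every later lookup reuses the set instead of rereading the file.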
Try it:
PATH = "C:\\Users\\JoshLaptop\\PycharmProjects\\practice\\commented.txt"
file = open(PATH, 'r')
words = file.read().splitlines()
print(words)
if 'bah' not in words:
    print("fail")
You can't read the file two times.
When you do print(file.read().splitlines()), the file is read and the next call to this function will return nothing because you are already at the end of file.
PATH = "your_file"
file = open(PATH, 'r')
words = ['bah', 'dah', 'gah', "fah", 'mah']
if 'bah' not in (file.read().splitlines()) :
    print("fail")
As you can see, the output is no longer 'fail'. Call file.read().splitlines() only once in your code, or save its result in a variable; otherwise you get the 'fail' message.
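The behaviour described in these answers can be reproduced without a real file. This sketch uses io.StringIO as a stand-in for the opened file and also shows that seek(0) rewinds the position if a second read is really needed.

```python
import io

# io.StringIO behaves like an opened text file for this purpose.
file = io.StringIO("bah\ngah\nfah\n")

first = file.read().splitlines()   # reads everything, position is now at the end
second = file.read().splitlines()  # nothing left to read

file.seek(0)                       # rewind to the start
third = file.read().splitlines()   # the content is available again
```

Storing the first result in a variable is usually simpler than rewinding and reading again.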

Python: How to write a list of strings on separate lines but without a blank line

EDIT: See bottom of post for the entire code
I am new to this forum and I have an issue that I would be grateful for any help solving.
Situation and goal:
- I have a list of strings. Each string is one word, like this: ['WORD', 'LINKS', 'QUOTE' ...] and so on.
- I would like to write this list of words (strings) on separate lines in a new text file.
- One would think the way to do this would be by appending the '\n' to every item in the list, but when I do that, I get a blank line between every list item. WHY?
Please have a look at this simple function:
def write_new_file(input_list):
    with open('TEKST\\TEKST_ny.txt', mode='wt') as output_file:
        for linje in input_list:
            output_file.write(linje + '\n')
This produces a file that looks like this:
WORD

LINKS

QUOTE
If I remove the '\n', then the file looks like this:
WORDLINKSQUOTE
Instead, the file should look like this:
WORD
LINKS
QUOTE
I am obviously doing something wrong, but after a lot of experimenting and reading around the web, I can't seem to get it right.
Any help would be deeply appreciated, thank you!
Response to link to thread about write() vs. writelines():
Writelines() doesn't fix this by itself, it produces the same result as write() without the '\n'. Unless I add a newline to every list item before passing it to the writelines(). But then we're back at the first option and the blank lines...
I tried to use one of the answers in the linked thread, using '\n'.join() and then write(), but I still get the blank lines.
It comes down to this: For some reason, I get two newlines for every '\n', no matter how I use it. I am .strip()'ing the list items of newline characters to be sure, and without the nl everything is just one massive block of texts anyway.
On using another editor: I tried open the txt-file in windows notepad and in notepad++. Any reason why these programs wouldn't display it correctly?
EDIT: This is the entire code. Sorry for the Norwegian naming. The purpose of the program is to read and clean up a text file and return the words first as a list and ultimately as a new file with each word on a new line. The text file is a list of Scrabble-words, so it's rather big (9 mb or something). PS: I don't advocate Scrabble-cheating, this is just a programming exercise :)
def renskriv(opprinnelig_ord):
    nytt_ord = ''
    for bokstav in opprinnelig_ord:
        if bokstav.isupper() == True:
            nytt_ord = nytt_ord + bokstav
    return nytt_ord
def skriv_ny_fil(ny_liste):
    with open('NSF\\NSF_ny.txt', 'w') as f:
        for linje in ny_liste:
            f.write(linje + '\n')
def behandle_kildefil():
    innfil = open('NSF\\NSF_full.txt', 'r')
    f = innfil.read()
    kildeliste = f.split()
    ny_liste = []
    for item in kildeliste:
        nytt_ord = renskriv(item)
        nytt_ord = nytt_ord.strip('\n')
        ny_liste.append(nytt_ord)
    skriv_ny_fil(ny_liste)
    innfil.close()
def main():
    behandle_kildefil()
if __name__ == '__main__':
    main()
I think there must be some '\n' among your lines; try skipping empty lines.
I suggest this code:
def write_new_file(input_list):
    with open('TEKST\\TEKST_ny.txt', 'w') as output_file:
        for linje in input_list:
            if not linje.startswith('\n'):
                output_file.write(linje.strip() + '\n')
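Another possible source of blank lines, offered as a guess rather than a diagnosis: if a cleaning step returns an empty string for some items, writing item + '\n' for each of them yields empty lines. A sketch that drops empty strings before joining follows; the helper name lines_to_text is hypothetical.

```python
def lines_to_text(words):
    """Join the non-empty words with newlines, ending with one final newline."""
    # strip() removes stray whitespace and newlines; empty results are dropped.
    kept = [word.strip() for word in words if word.strip()]
    return '\n'.join(kept) + '\n' if kept else ''


text = lines_to_text(['WORD', '', 'LINKS', '\n', 'QUOTE'])
# text can then be written with a single output_file.write(text)
```

Building the whole text first also means the file is written with one write() call.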
You've said in the comments that Python is writing two carriage return ('\r') characters for each line feed ('\n') character you write. It's a bit bizarre that each line feed is being replaced with two carriage returns, but newline translation is a feature of opening a file in text mode (normally the translation would be to something more useful, such as '\r\n' on Windows). If you instead open your file in binary mode, this translation is not done and the file should display as you wish in Notepad++. NB: using binary mode may cause problems if you need characters outside the ASCII range. ASCII is basically just Latin letters (no accents), digits and a few symbols.
For Python 2, try:
filename = "somefile.txt"
with open(filename, mode="wb") as outfile:
    outfile.write("first line")
    outfile.write("\n")
    outfile.write("second line")
Python 3 will be a bit more tricky. Each string literal you wish to write must be prefixed with a b (for bytes). Each string you don't have immediate access to, or don't wish to change to a bytes literal, must be encoded using its encode() method, e.g.
filename = "somefile.txt"
with open(filename, mode="wb") as outfile:
    outfile.write(b"first line")
    outfile.write(b"\n")
    some_text = "second line"
    outfile.write(some_text.encode())
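In Python 3 there is also a way to stay in text mode: the newline parameter of open(). Passing newline="\n" when writing turns off the translation of '\n', so non-ASCII text can still be written as ordinary strings. The sketch below uses a temporary file and made-up sample lines; whether it cures the doubled line endings in this particular case is untested.

```python
import os
import tempfile

fd, filename = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# newline="\n" tells Python to write '\n' as-is instead of translating it
# to the platform's default line separator.
with open(filename, mode="w", newline="\n", encoding="utf-8") as outfile:
    outfile.write("first line\n")
    outfile.write("andre linje æøå\n")

# Read the raw bytes back to inspect the actual line endings.
with open(filename, "rb") as infile:
    raw = infile.read()

os.remove(filename)
```

Inspecting the raw bytes is a quick way to check which line endings really ended up in the file.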

Edit and save file

I need to edit my file and save it so that I can use it for another program. First I need to put "," between every word and add a word at the end of every line.
In order to put "," between every word, I used this code:
for line in open('myfile','r+') :
    for word in line.split():
        new = ",".join(map(str,word))
        print new
I'm not too sure how to overwrite the original file or maybe create a new output file for the edited version. I tried something like this:
with open('myfile','r+') as f:
    for line in f:
        for word in line.split():
            new = ",".join(map(str,word))
            f.write(new)
The output is not what I wanted (it differs from what print new shows).
Second, I need to add a word at the end of every line. So, I tried this:
source = open('myfile','r')
output = open('out','a')
output.write(source.read().replace("\n", "yes\n"))
The code to add a new word works perfectly. But I was thinking there should be an easier way to open a file, do both edits in one go and save it, and I'm not too sure how. I've spent a tremendous amount of time trying to figure out how to overwrite the file, and it's about time I asked for help.
Here you go:
source = open('myfile', 'r')
output = open('out','w')
output.write('yes\n'.join(','.join(line.split()) for line in source.read().split('\n')))
One-liner:
open('out', 'w').write('yes\n'.join(','.join(line.split()) for line in open('myfile', 'r').read().split('\n')))
Or more legibly:
source = open('myfile', 'r')
processed_lines = []
for line in source:
    # split() drops the trailing newline, so add it back along with the extra word
    line = ','.join(line.split()) + 'yes\n'
    processed_lines.append(line)
source.close()
output = open('out', 'w')
output.write(''.join(processed_lines))
output.close()
EDIT
Apparently I misread everything, lol.
#It looks like you are writing the word yes to all of the lines, then splitting
#each word into letters and listing those words' letters on their own line?
source = open('myfile','r')
output = open('out','w')
for line in source:
    for word in line.split():
        new = ",".join(word)
        print >>output, new
    print >>output, 'y,e,s'
How big is this file?
Maybe you could create a temporary list containing everything from the file you want to edit, with each element representing one line.
Editing a list of strings is pretty simple.
After your changes you can just open your file again with
writable = open('configuration', 'w')
and then write each changed line to the file with
writable.write(currentLine + '\n')
Hope that helps, even a little bit. ;)
For the first problem, you could read all the lines in f before overwriting f, assuming f is opened in 'r+' mode. Append all the results into a string, then execute:
f.seek(0) # reset file pointer back to start of file
f.write(new) # new should contain all concatenated lines
f.truncate() # get rid of any extra stuff from the old file
f.close()
For the second problem, the solution is similar: Read the entire file, make your edits, call f.seek(0), write the contents, f.truncate() and f.close().
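Put together, the read, seek, write and truncate steps might look like the sketch below. It works on a temporary file with made-up contents, and it assumes the added word should be separated by a comma like the other words; neither detail comes from the question.

```python
import os
import tempfile

# Sample file standing in for 'myfile' (hypothetical contents).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as f:
    f.write("one two three\nfour five\n")

with open(path, "r+") as f:
    lines = f.read().splitlines()
    # Commas between words, then the extra word at the end of each line.
    new = ''.join(','.join(line.split()) + ',yes\n' for line in lines)
    f.seek(0)      # back to the start of the file
    f.write(new)   # overwrite with the edited text
    f.truncate()   # drop anything left over from the old content

with open(path) as f:
    edited = f.read()
os.remove(path)
```

Both edits happen in memory, so the file is opened once and rewritten in a single pass.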
