How to split words in each line? - python

I wrote some code to get each word in a line of a text file. Please see below.
fname = input('Enter your file name:')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
word=[]
for word in fhand:
    word = word.split()
    print(word)
I ran this code and got what I wanted. But when I run just print(word) afterwards, it only shows the words of the last line. I thought I hadn't defined word at the beginning, so I added word = [], but the results are the same.

In your loop,
for word in fhand:
you reassign the variable 'word' every time it loops. Since you made 'word' a list, you need to append each line's words to it instead of reassigning it. Also, using the name 'word' twice, once for the list and once as the for loop variable, means the loop overwrites the list on every iteration. Try something like:
word = []
fname = input('Enter your file name:')
with open(fname, 'r') as file:
    for line in file:
        word.append(line.split())
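Note that append adds each line's words as one nested list. If a single flat list of words is wanted, extend may fit better. A minimal sketch of the difference, assuming a made-up two-line input held in io.StringIO (which iterates line by line like an open file); the names nested and flat are illustrative only:

```python
import io

# Stand-in for an open file; iterating over it yields one line at a time.
fhand = io.StringIO("the quick brown\nfox jumps\n")

nested = []  # one inner list of words per line
flat = []    # every word in a single list

for line in fhand:
    words = line.split()
    nested.append(words)  # append adds the whole list as one element
    flat.extend(words)    # extend adds each word individually

print(nested)
print(flat)
```

Which one to use depends on whether the line boundaries still matter after reading.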

Related

read words from file, line by line and concatenate to paragraph

I have a really long list of words that are on each line. How do I make a program that takes in all that and print them all side by side?
I tried making the word an element of a list, but I don't know how to proceed.
Here's the code I've tried so far:
def convert(lst):
    return([i for item in lst for i in item.split()])
lst = [''' -The list of words come here- ''']
print(convert(lst))
If you already have the words in a list, you can use the join() function to concatenate them. See https://docs.python.org/3/library/stdtypes.html#str.join
words = open('your_file.txt').read().split()  # split() drops the newline that ends each line
separator = ' '
print(separator.join(words))
Another, slightly more cumbersome method is to print the words with the built-in print() function, replacing the newline that print() normally appends with a space. Each line read from the file still ends with its own newline, so strip it first.
words = open('your_file.txt').readlines()
for word in words:
    print(word.strip(), end=' ')
Try this; example.txt just has a list of words, one per line.
with open("example.txt", "r") as a_file:
    sentence = ""
    for line in a_file:
        stripped_line = line.strip()
        sentence = sentence + f"{stripped_line} "
print(sentence)
If your input file is really large and you can't fit it all in memory, you can read the words lazily and write them to disk instead of holding the whole output in memory.
# create a generator that yields each individual line
lines = (l for l in open('words'))
with open("output", "w+") as writer:
    # read the file line by line to avoid memory issues
    while True:
        try:
            line = next(lines)
            # add to the paragraph in the out file
            writer.write(line.replace('\n', ' '))
        except StopIteration:
            break
You can check the working example here: https://replit.com/#bluebrown/readwritewords#main.py
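An open file object is already a lazy iterator over its lines, so the generator and the while True / next() loop are not strictly needed. A minimal sketch of the same idea with a plain for loop; the function name lines_to_paragraph and the temporary file paths are hypothetical, chosen only for this demonstration:

```python
import os
import tempfile


def lines_to_paragraph(src_path, dst_path):
    """Write each line of src to dst, followed by a space instead of a newline.

    Only one line is held in memory at a time.
    """
    with open(src_path) as reader, open(dst_path, "w") as writer:
        for line in reader:
            writer.write(line.rstrip("\n") + " ")


# Small demonstration using temporary files.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "words")
dst = os.path.join(tmp_dir, "output")
with open(src, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

lines_to_paragraph(src, dst)
with open(dst) as f:
    result = f.read()
print(repr(result))
```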

does for keyword understand that a string is an iteration?

Question :
" Open the file romeo.txt and read it line by line. For each line, split the line into a list of words using the split() method. The program should build a list of words. For each word on each line check to see if the word is already in the list and if not append it to the list. When the program completes, sort and print the resulting words in alphabetical order."
Code:
fname = input("Enter file name: ")
fh = open(fname)
hh = list()
for sen in fh:
    sen=sen.split()
    for element in sen:
        if element not in hh:
            hh.append(element)
            hh.sort()
print(hh)
I want to make sure that I understood the code. First we took the file name and opened the file, then we created an empty list. Then we split each line into a list of words, checked whether each element of sen was already in the list we created, appended it if not, and finally printed the list.
Also, I have a question about the for keyword: does it understand that each word in the file is one iteration even before the line is split?
Python str.split() documentation: https://docs.python.org/3/library/stdtypes.html#str.split
fname = input("Enter file name: ") #-- user enters name of a file
fh = open(fname) #-------------------- open the file
hh = list() #------------------------- create an empty list
for sen in fh: #---------------------- loop through lines in the file
    sen=sen.split() #----------------- split the line into words
    for element in sen: #------------- loop through words in the line
        if element not in hh: #------- if word is not in the list of unique words
            hh.append(element) #------ add the word to the list
            hh.sort() #--------------- organize the list
print(hh) #--------------------------- print the list of unique words
hh will be a list of all the unique words in the file.
The best way to work with files in Python is to use Context Managers.
Python Context Manager documentation: https://docs.python.org/3/library/contextlib.html
You should probably use:
filename = input("Enter file name: ")
unique_words = list()
with open(filename, "r") as file: # 'with' Context Manager
    for line in file:
        line = line.split()
        for word in line:
            if word not in unique_words:
                unique_words.append(word)
unique_words.sort() # sort once, after all lines have been read
print(unique_words)
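On the second part of the question: the for statement knows nothing about words. It simply asks the object for its items one at a time. A file object yields lines, a string yields single characters, and the list returned by split() yields words. A small sketch, using io.StringIO as a stand-in for an open file (the sample text is made up):

```python
import io

fh = io.StringIO("But soft what light\nthrough yonder window\n")

# Iterating over a file object yields whole lines, newline included.
lines = [line for line in fh]

# Iterating over a string yields single characters.
chars = [ch for ch in "soft"]

# Only after split() does iteration yield words.
words = [w for w in lines[0].split()]

print(lines)
print(chars)
print(words)
```

So in the loops above, each iteration of the outer loop is one line, and the words only appear once split() has turned that line into a list.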

How to create a list that contains the first word of every line in a file (Python 3)

The file is called "emotion_words" which I want the first word of each line for.
I want to use a nested for loop, but I am not sure how.
Would I do this
emotions=open("emotion_words.txt","r+")
content = emotions.read()
for line in content.split(' ',1):
And add an append function before the second for loop?
with open("emotion_words.txt","r+") as f:
    for line in f:
        first_word_in_line = line.split(" ")[0]
fileref = open ("emotion_words.txt","r")
line = fileref.readlines()
emotions = []
for words in line:
    word = words.split()
    emotions.append(word[0])
print (emotions)
If I understand your question correctly, this should work for you:
words = []
emotions = open("emotion_words.txt", "r+")
for l in emotions:
    first_word = l.split()[0]
    words.append(first_word)
After that you have your words in a 'words' list.
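One caveat: on a blank line, split() with no argument returns an empty list, so indexing it with [0] raises IndexError. A minimal sketch that skips such lines, assuming that is the desired behavior; the helper name first_words and the sample contents are hypothetical:

```python
import io


def first_words(lines):
    """Return the first word of every non-blank line."""
    result = []
    for line in lines:
        parts = line.split()
        if parts:  # skip blank or whitespace-only lines
            result.append(parts[0])
    return result


# io.StringIO stands in for open("emotion_words.txt"); the contents are made up.
sample = io.StringIO("happy joyful glad\n\nsad gloomy\n   \nangry furious\n")
emotions = first_words(sample)
print(emotions)
```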

How to open a file in python, read the comments ("#"), find a word after the comments and select the word after it?

I have a function that loops through a file that looks like this:
"#" XDI/1.0 XDAC/1.4 Athena/0.9.25
"#" Column.4: pre_edge
Content
That is to say, after the "#" there is a comment. My function aims to read each line and, if it starts with a specific word, select what comes after the ":".
For example, given these two lines, I would like to read through them and, if a line starts with "#" and contains the word "Column.4", store the word "pre_edge".
An example of my current approach follows:
with open(file, "r") as f:
    for line in f:
        if line.startswith ('#'):
            word = line.split(" Column.4:")[1]
        else:
            print("n")
I think my trouble is specifically this: after finding a line that starts with "#", how can I parse or search through it and save its content if it contains the desired word?
If the # comment contains the string Column.4: as stated above, you could parse it this way.
with open(filepath) as f:
    for line in f:
        if line.startswith('#'):
            # Here you process comment lines
            if 'Column.4' in line:
                first, remainder = line.split('Column.4: ')
                # remainder contains everything after '# Column.4: '
                # So if you want to get the first word ->
                word = remainder.split()[0]
        else:
            # Here you can process lines that are not comments
            pass
Note
Also, it is good practice to use a for line in f: statement instead of f.readlines() (as mentioned in other answers), because this way you don't load all lines into memory but process them one by one.
You should start by reading the file into a list and then work through that instead:
file = 'test.txt' #<- call file whatever you want
with open(file, "r") as f:
    txt = f.readlines()
    for line in txt:
        if line.startswith ('"#"'):
            word = line.split(" Column.4: ")
            try:
                print(word[1])
            except IndexError:
                print(word)
        else:
            print("n")
Output:
>>> ['"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n']
>>> pre_edge
I used try and except because the first line also starts with "#" and can't be split with your current logic.
Also, as a side note, in the question you have the file with lines starting as "#" with the quotation marks so the startswith() function was altered as such.
with open('stuff.txt', 'r+') as f:
    data = f.readlines()
    for line in data:
        words = line.split()
        if words and ('#' in words[0]) and ("Column.4:" in words):
            print(words[-1])
            # pre_edge
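If more than one Column.N entry is of interest, the header lines can be collected into a dictionary in one pass. A minimal sketch using str.partition, which splits on the first ":" only; it assumes the lines start with a bare # (no quotation marks) and look like "# Column.4: pre_edge". The function name parse_columns is hypothetical:

```python
import io


def parse_columns(lines):
    """Map header keys such as 'Column.4' to the first word after the colon."""
    columns = {}
    for line in lines:
        if not line.startswith("#"):
            continue  # not a comment line
        key, sep, value = line[1:].partition(":")
        key = key.strip()
        # sep is empty when the comment contains no colon at all
        if sep and key.startswith("Column.") and value.split():
            columns[key] = value.split()[0]
    return columns


# io.StringIO stands in for the real file; contents follow the question's example.
sample = io.StringIO(
    "# XDI/1.0 XDAC/1.4 Athena/0.9.25\n"
    "# Column.4: pre_edge\n"
    "Content\n"
)
columns = parse_columns(sample)
print(columns.get("Column.4"))
```

Using columns.get() returns None instead of raising an error when the requested column is missing from the header.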

Why doesn't this code check the validity of a word in a text file?

Hey guys, I'm having a bit of trouble checking the validity of a word in Scrabble. My code just goes to the "else" statement, so I don't think it even checks the dictionary (which is a separate text file). Can anyone correct my errors?
def dictionary_check(user_word):
    dictionary = open("dictionary.txt", "r")
    for line in dictionary:
        line = line.split()
        if user_word.upper() == line:
            print("word is valid")
            return
    print("never heard of this word")
myTiles = ["B","S","N","O","E","U","T"]
user_word = "BEN"
dictionary_check(user_word)
In your code you have:
line = line.split()
if user_word.upper() == line:
line.split() outputs a list of words, not a single word. If user_word is a string, your if statement will never be True. Instead, you should be iterating over all the words and not just the lines that they are on:
def dictionary_check(user_word):
    dictionary = open("dictionary.txt", "r")
    for line in dictionary:
        line = line.split()
        for word in line:
            if user_word.upper() == word: #if the word is found, say it's found
                print("word is valid")
                return #end the function because it doesn't need to keep going
    print("never heard of this word") #it went through all the words and it wasn't found
Of course, there are much faster ways to do this task than to open a file each time you need to check. You can load all of the words into memory at the start of your file, so each check for a word being in the dictionary is done in constant time:
#loading words into memory
dictionary = set()
with open("dictionary.txt", 'r') as f:
    for line in f: #iterate over the file, not over the (still empty) set
        for word in line.split():
            dictionary.add(word)
Now you can check if any word is in the dictionary with:
if word in dictionary:
    ...
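Putting the two pieces together, a minimal sketch of the load-once, check-many approach. The function names load_dictionary and is_valid are hypothetical, and upper-casing the dictionary words on load is an assumption made so that the check is case-insensitive:

```python
import io


def load_dictionary(lines):
    """Collect every whitespace-separated word into a set, upper-cased."""
    words = set()
    for line in lines:
        words.update(word.upper() for word in line.split())
    return words


def is_valid(user_word, dictionary):
    """Return True if user_word is in the dictionary, ignoring case."""
    return user_word.upper() in dictionary


# io.StringIO stands in for open("dictionary.txt"); the word list is made up.
dictionary = load_dictionary(io.StringIO("BEN\nBUS\nTONE\n"))
print(is_valid("ben", dictionary))
print(is_valid("zzz", dictionary))
```

Membership tests on a set take constant time on average, so the file only needs to be read once no matter how many words are checked.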
