While loop runs differently after the first loop - python

I'm doing this exercise from Think Python and wanted to go a little further: I'm trying to modify the exercise function (avoid) so that it prompts the user repeatedly and counts how many words in a text file (fin) contain the letters the user entered (avoidprompt). It works the first time, but after it prompts for input again it always returns 0 words.
I suspect I'm misunderstanding how to use the while loop in this context, since it works the first time but not after that. Is there a better way?
fin = open('[location of text file here]')
line = fin.readline()
word = line.strip()
def avoid(word, forbidden):
    for letter in word:
        if letter in forbidden:
            return False
    return True
def avoidprompt():
    while(True):
        n = 0
        forbidden = input('gimmie some letters n Ill tell u how many words have em. \n')
        for line in fin:
            if avoid(line, forbidden) == False:
                n = n+1
        print('\n There are ' + str(n) + " words with those letters. \n")

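When you open a file and loop over it with for line in file, the loop consumes the entire file, so a second loop over the same file object has nothing left to read.
There are two easy solutions:
1) Go back to the start of the file in each iteration of your while(True) loop, by doing fin.seek(0)
2) Just store the file contents in a list, by replacing the first line of your script with fin = open('file.txt').readlines() (and dropping the fin.readline() line after it, since a list has no readline method)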
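To see the effect in isolation, here is a minimal sketch. It uses io.StringIO as a stand-in for the opened file so that it runs without a file on disk; the sample words and the helper name count_lines are made up for illustration.

```python
import io

# io.StringIO stands in for the opened text file; the words are made up.
fin = io.StringIO("apple\nbanana\ncherry\n")

def count_lines(f):
    """Count the lines that iterating over f still yields."""
    return sum(1 for _ in f)

first_pass = count_lines(fin)     # reads every line
second_pass = count_lines(fin)    # nothing left: the position is at the end

fin.seek(0)                       # option 1: rewind to the start
after_seek = count_lines(fin)

fin.seek(0)
words = fin.readlines()           # option 2: keep the lines in a list
list_pass_one = sum(1 for _ in words)
list_pass_two = sum(1 for _ in words)   # a list can be looped over again

print(first_pass, second_pass, after_seek, list_pass_one, list_pass_two)
```

The second pass counts nothing, which is the same thing that makes the question's loop report 0 words on every prompt after the first.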

I believe you need to do something along these lines:
def avoidprompt():
    while(True):
        n = 0
        fin.seek(0)
        forbidden = input('gimmie some letters n Ill tell u how many words have em. \n')
        for line in fin:
            if avoid(line, forbidden) == False:
                n = n+1
        print('\n There are ' + str(n) + " words with those letters. \n")
seek moves the file's read position to a given offset in the open file, and seek(0) moves it to the very beginning. Since you already iterated through the file once, the position is at the end and needs to be brought back to the top of the file in order to reread the words.
See this other Stack Overflow question for more details.
Hope this helps! You used the loop just fine
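As a small illustration of what seek works with, the sketch below writes a throwaway file (the file contents and variable names are invented for the example) and shows that positions are offsets reported by tell(), not line numbers.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

with open(path) as fin:
    first = fin.readline()        # reads the first line
    after_first = fin.tell()      # position just past the first line
    rest = fin.read()             # consume the remainder of the file
    fin.seek(0)                   # back to the very beginning
    again = fin.readline()        # the first line once more
    fin.seek(after_first)         # a position obtained from tell() also works
    second = fin.readline()

os.remove(path)
print(repr(first), repr(again), repr(second))
```

Note that tell() is used here together with readline(); Python disables tell() while a for line in fin loop is in progress.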

Related

Using random letters to pair into words from a txt.file in Python

I was hoping to get some guidance on my assignment. I'm completely new to programming and have no experience whatsoever.
Assignment: We are given a .txt file containing a number of real words, and my assignment is to write a program that asks the user to input a set of 4 to 9 random letters. The program should run these letters through the .txt file and find all the words that can be created from the letters the user put in. It then has to write out all the words it finds containing all the letters the user typed in, printed on one line in alphabetical order. Afterwards it has to print out, as a list, how many words contain all 9 letters the user has put in.
If my explanation is messy, please tell me and I will try to explain it better. I'm not looking for anyone to solve the assignment for me; I just want some proper guidance on what I can do to proceed to the next step, because right now it feels like a dead end.
This is my code so far:
import os
chars = []
nine = input("Nians bokstäver: ")  # prompt: "The nine's letters: "
lenght = len(nine)
#print(lenght) # This will be removed later
while (lenght < 4) or (lenght > 9):
    lenght = len(nine)
    print("Fel antal, försök igen!")  # "Wrong number, try again!"
    nine = input("Nians bokstäver: ")
if (lenght >= 4) and (lenght <= 9):
    lenght = len(nine)
    chars.append(nine)
file = open("/Users/************/Desktop/Programmering 1/Inlämningsuppgift/ny_svenskaord.txt", "r")
while file:
    line = file.readline()
    print(line)
    if line == "":
        break
Thank you,
Artin
It sounds like you are building an anagram solver. There are a few ways you can go about this. Since it is a homework assignment I won't give any actual answers, only some routes to look into.
The pythonic way to do it is using the built-in string method str.rfind(), called on one word from your target file. Python Docs
The most robust way would be to use regex to find the characters and sequence, but that involves learning an entirely new language. Python Docs
The easiest way for a programming beginner is also the ugliest, but it works: use string slices and loops to iterate over each letter in the input, grab a single line from the file using file.readline(), then split it into its characters and compare the two. It might look something like
...
def Anagram(letters, filename):
    hit_list = [0] # unfortunate naming: a counter at index [0], followed by the words that matched the input
    with open(filename, 'r') as target: # opens the file at the file path given in filename
        word = target.readline() # reads one line of the file
        while word: # while word is non-empty (readline() returns '' at end of file), loop!
            hits = 0 # add one for every letter that is the same as in the word
            for letter in letters: # for each letter in the input
                '''
                Comparison code goes here
                '''
            if hits == len(word): # if you have the same amount of hits as letters in the word it's an anagram
                hit_list[0] = hit_list[0] + 1 # adds one to the counter
                hit_list.append(word) # adds the word at the end of the list
            word = target.readline() # reads a new line from the file
    return hit_list # returns the list you made
The above example has huge holes and problems but would work in a very narrow set of circumstances. Some of the issues: it never checks that the input actually consists of letters, that the file contains letters, that the file uses one word per line, that there are no line-ending characters, etc. These are things to think about when writing or reading other people's code.
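The checks listed above can be sketched without touching the comparison logic itself. The following is only an illustration under assumptions: the helper names clean_letters and clean_words are made up, the 4 to 9 letter limit is taken from the assignment text, and the sample words are invented.

```python
def clean_letters(raw, min_len=4, max_len=9):
    """Return the letters in lower case, or None if the input is unusable."""
    letters = raw.strip().lower()
    if not letters.isalpha():                 # rejects digits, spaces, punctuation, ''
        return None
    if not (min_len <= len(letters) <= max_len):
        return None
    return letters

def clean_words(lines):
    """Yield one lower-case word per line, without line endings."""
    for line in lines:
        word = line.strip().lower()           # removes '\n' and '\r\n'
        if word.isalpha():                    # skips blank and non-letter lines
            yield word

print(clean_letters("  Bokstav\n"))
print(clean_letters("ab1"))
print(list(clean_words(["hus\n", "\n", "BIL\r\n", "42\n"])))
```

str.isalpha() also accepts letters such as å, ä and ö, so it should suit a Swedish word list.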
Have fun with the project Artin!

I am struggling with reading specific words and lines from a text file in python

I want my code to be able to find what the user has asked for and print the 5 following lines. For example, if the user entered "james" into the system, I want it to find that name in the text file and read the 5 lines below it. Is this even possible? All I have found while looking through the internet is how to read specific lines.
So, you want to read a .txt file and you want to read, let's say the word James and the 5 lines after it.
Our example text file is as follows:
Hello, this is line one
The word James is on this line
Hopefully, this line will be found later,
and this line,
and so on...
are we at 5 lines yet?
ah, here we are, the 5th line away from the word James
Hopefully, this should not be found
Let's think through what we have to do.
What We Have to Do
Open the text file
Find the line where the word 'James' is
Find the next 5 lines
Save it to a variable
Print it
Solution
Let's just call our text file info.txt. You can call it whatever you want.
To start, we must open the file and save it to a variable:
file = open('info.txt', 'r') # The 'r' allows us to read it
Then, we must save the data from it to another variable, we shall do it as a list:
file_data = file.readlines()
Now, we iterate (loop through) the lines with a for loop; we must save the index of the line that 'James' is on to another variable:
index = 'Not set yet'
for x in range(len(file_data)):
    if 'James' in file_data[x]:
        index = x
        break
if index == 'Not set yet':
    print('The word "James" is not in the text file.')
As you can see, it iterates through the list and checks for the word 'James'. If it finds it, it breaks the loop. If the index variable is still equal to what it was originally set to, the word 'James' was not found, and the remaining steps should be skipped because index is not a valid list position.
Next, we should find the five lines next and save it to another variable:
five_lines = [file_data[index]]
for x in range(5):
    try:
        five_lines.append(file_data[index + x + 1])
    except IndexError:
        print(f'There are not five full lines after the word James. {x + 1} have been recorded.')
        break
Finally, we shall print all of these:
for i in five_lines:
    print(i, end='')
Done!
Final Code
file = open('info.txt', 'r') # The 'r' allows us to read it
file_data = file.readlines()
index = 'Not set yet'
for x in range(len(file_data)):
    if 'James' in file_data[x]:
        index = x
        break
if index == 'Not set yet':
    print('The word "James" is not in the text file.')
else: # only look for the following lines when 'James' was found
    five_lines = [file_data[index]]
    for x in range(5):
        try:
            five_lines.append(file_data[index + x + 1])
        except IndexError:
            print(f'There are not five full lines after the word James. {x + 1} have been recorded.')
            break
    for i in five_lines:
        print(i, end='')
I hope that I have been helpful.
Yeah, sure. Say the keyword you're searching for ("james") is keywrd and Xlines is the number of lines after a match you want to return
def readTextAndSearch1(txtFile, keywrd, Xlines):
    with open(txtFile, 'r') as f: #Note, txtFile is a full path & filename
        allLines = f.readlines() #Send all the lines into a list
    #with automatically closes txt file at exit
    temp = [] #Dim it here so you've "something" to return in event of no match
    for iLine in range(0, len(allLines)):
        if keywrd in allLines[iLine]:
            #Found keyword in this line, want the next X lines for returning
            maxScan = min(len(allLines) - iLine, Xlines + 1) #Use this to avoid trying to address beyond end of text file.
            for iiLine in range(1, maxScan):
                temp.append(allLines[iLine + iiLine])
            break #On assumption of only one entry of keywrd in the file, can break out of "for iLine" loop
    return temp
Then by calling readTextAndSearch1() with appropriate parameters, you'll get a list back that you can print at your leisure. I'd take the return as follows:
rtn1 = readTextAndSearch1("C:\\Docs\\Test.txt", "Jimmy", 6)
if rtn1: #This checks whether Jimmy was within Test.txt
    #Jimmy was in Test.txt
    print(rtn1)
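A shorter variant of the same idea relies on list slicing, which never raises when the slice runs past the end of the list. This is only a sketch: the function name lines_after and the sample lines are made up, and it returns None when the keyword is missing so that case can be told apart from a match on the last line.

```python
def lines_after(lines, keyword, count):
    """Return up to `count` lines following the first line that contains keyword."""
    for i, line in enumerate(lines):
        if keyword in line:
            # Slicing stops quietly at the end of the list.
            return lines[i + 1 : i + 1 + count]
    return None  # keyword not found anywhere

sample = [
    "Hello, this is line one\n",
    "The word James is on this line\n",
    "first\n",
    "second\n",
    "third\n",
]

found = lines_after(sample, "James", 5)
missing = lines_after(sample, "Jimmy", 5)
print(found)
print(missing)
```

With a real file, lines would come from f.readlines() inside a with open(...) block, as in the answers above.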

Output of shuffling a string is printed on 2 lines of output

I've decided to practice solving anagrams, something I'm very bad at. I got the 1000 most common words of the English language, filtered out those under 5 letters and over 9, and then wrote a simple script:
import random
from random import randint
words = []
file = 'new_words.txt'
with open(file) as f:
    for line in f:
        words.append(line)
while True:
    i = randint(0, (len(words)-1))
    question_list = list(words[i])
    random.shuffle(question_list)
    print(f"{''.join(question_list)}")
    print(f'Please solve the anagram. Type "exit" to quit of "next" to pass:\n')
    answer = input()
    if answer == 'exit':
        break
    elif answer == 'pass':
        pass
    elif answer == words[i]:
        print('Correct!')
    elif answer != words[i]:
        print('Epic fail...\n\n')
Now for some reason the output of the line print(f"{''.join(question_list)}") is printed over 2 lines like so:
o
nzcieger
Which is an anagram for 'recognize'. It also prints a random number of letters per line:
ly
erla
Sometimes the whole anagram is printed properly.
I can't figure out what's causing this.
EDIT: Link to my filtered file with words
You have to strip the newlines out of the word.
Basically, your text file is formatted as one word per line, with a '\n' character at the end of each line. When you call random.shuffle(question_list), you are shuffling the characters of the word along with the newline character, so the newline is also shuffled! When you print out the 'shuffled' word, Python prints the word with the newline wherever it landed, so you get the word randomly split between two lines.
To solve this, you can strip the newline with str.strip() before converting the word to a list (for example when reading the file), or just add this below question_list = list(words[i]):
for character in question_list:
    if character == '\n':
        question_list.remove(character)
You can add this below question_list = list(words[i]) to remove the newline character read from the file with each line:
question_list.remove('\n')
Since the newline character occurs only once in the list, we don't need to iterate over the list. It will just remove the first occurrence of '\n' from the list.
Or
del question_list[-1]
because we know the exact index of '\n'. Both versions assume every line, including the last one in the file, ends with a newline; otherwise remove raises a ValueError and del drops a real letter.
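Another option is to strip the newline once, when the words are read, so that every later step sees clean words. A minimal sketch, assuming the file yields lines like the made-up raw_lines list below:

```python
import random

# Stand-in for what iterating over the file yields; the words are examples.
raw_lines = ["recognize\n", "really\n"]

# Strip the line ending once, while building the word list.
words = [line.strip() for line in raw_lines]

word = words[0]
question_list = list(word)
random.shuffle(question_list)
scrambled = "".join(question_list)
print(scrambled)
```

Stripping at read time has a side benefit for the script in the question: the comparison answer == words[i] can only succeed once words[i] no longer ends in '\n'.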

Slice variable from specified letter to specified letter in line that varies in length

New to the site so I apologize if I format this incorrectly.
So I'm searching a file for lines containing
Server[x] ip.ip.ip.ip response=235ms accepted....
where x can be any number greater than or equal to 0, then storing that information in a variable named line.
I'm then printing this content to a tkinter GUI and it's way too much information for the window.
To resolve this I thought I would slice the information down with a return line[15:30] in the function, but the info that I want from these lines does not always fall between 15 and 30.
To resolve this I tried to make a loop with
return line[cnt1:cnt2]
that advances cnt1 and cnt2 until cnt1 meets "S" and cnt2 meets the "a" from accepted.
The problem is that I'm new to Python and I can't get the loop to work.
def serverlist(count):
    try:
        with open("file.txt", "r") as f:
            searchlines = f.readlines()
        if 'f' in locals():
            for i, line in enumerate(reversed(searchlines)):
                cnt = 90
                if "Server["+str(count)+"]" in line:
                    if line[cnt] == "t":
                        cnt += 1
                    return line[29:cnt]
    except WindowsError as fileerror:
        print(fileerror)
I did a reversed on the line reading because the lines I am looking for repeats over and over every couple of minutes in the text file.
Originally I wanted to scan from the bottom and stop when it got to server[0] but this loop wasn't working for me either.
I gave up and started just running serverlist(count) and specifying the server number I was looking for instead of just running serverlist().
Hopefully when I understand the problem with my original loop I can fix this.
End goal here:
file.txt has multiple lines with
<timestamp/date> Server[x] ip.ip.ip.ip response=<time> accepted <unneeded garbage>
I want to cut just the Server[x] and the response time out of that line and show it somewhere else using a variable.
The line can range from Server[0] to Server[999] and the same response times are checked every few minutes so I need to avoid duplicates and only get the latest entries at the bottom of the log.
I'm sorry this is lengthy and confusing.
EDIT:
Here is what I keep thinking should work but it doesn't:
def serverlist():
    ips = []
    cnt = 0
    with open("file.txt", "r") as f:
        for line in reversed(f.readlines()):
            while cnt >= 0:
                if "Server["+str(cnt)+"]" in line:
                    ips.append(line.split()) # split on spaces
                    cnt += 1
    return ips
My test log file has server[4] through server[0]. I would think that the above would read from the bottom of the file, print server[4] line, then server[3] line, etc and stop when it hits 0. In theory this would keep it from reading every line in the file(runs faster) and it would give me only the latest data. BUT when I run this with while cnt >=0 it gets stuck in a loop and runs forever. If I run it with any other value like 1 or 2 then it returns a blank list []. I assume I am misunderstanding how this would work.
Here is my first approach:
def serverlist(count):
    with open("file.txt", "r") as f:
        for line in f.readlines():
            if "Server[" + str(count) + "]" in line:
                return line.split()[1] # split on spaces
    return False
print(serverlist(30))
# ip.ip.ip.ip
print(serverlist(";-)"))
# False
You can change the index in line.split()[1] to get the specific space separated string of the line.
Edit: Sure, just remove the if condition to get all IPs:
def serverlist():
    ips = []
    with open("file.txt", "r") as f:
        for line in f.readlines():
            if line.strip().startswith("Server["):
                ips.append(line.split()[1]) # split on spaces
    return ips
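For the stated end goal (only the Server[x] number and the response time, newest entry per server), a regular expression avoids fixed slice positions altogether. The sketch below is an assumption-laden illustration: the pattern follows the line layout described in the question, while the function name latest_responses and the sample log lines are invented.

```python
import re

# Assumes the layout from the question:
#   <timestamp/date> Server[<number>] <ip> response=<time> accepted ...
LINE_RE = re.compile(r"Server\[(\d+)\]\s+\S+\s+response=(\S+)")

def latest_responses(lines):
    """Map server number -> most recent response time, scanning from the end."""
    latest = {}
    for line in reversed(lines):
        match = LINE_RE.search(line)
        if match is None:
            continue                      # not a server line
        server = int(match.group(1))
        if server not in latest:          # first hit from the bottom is the newest
            latest[server] = match.group(2)
    return latest

sample_log = [
    "2020-01-01 10:00 Server[0] 10.0.0.1 response=235ms accepted x\n",
    "2020-01-01 10:00 Server[1] 10.0.0.2 response=120ms accepted y\n",
    "some unrelated line\n",
    "2020-01-01 10:05 Server[0] 10.0.0.1 response=180ms accepted x\n",
    "2020-01-01 10:05 Server[1] 10.0.0.2 response=140ms accepted y\n",
]

for server, response in sorted(latest_responses(sample_log).items()):
    print(f"Server[{server}] {response}")
```

As for the loop in the question's edit: while cnt >= 0 can never end, because cnt starts at 0 and is only ever increased.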

Python Regex: User Inputs Multiple Search Terms

What my code is supposed to do is take in user-input search terms, then iterate through a tcp dump file and find every instance of each term by packet. The src IP acts as the header to each packet in my output.
I am having an issue with fileIn seemingly being erased once it has been iterated through for the first term. So when the program goes to look at the second search term, it can't find anything. Here is what I have:
import re
searchTerms = []
fileIn = open('ascii_dump.txt', 'r')
while True:
    userTerm = input("Enter the search terms (End to stop): ")
    if userTerm == 'End':
        break
    else:
        searchTerms.append(userTerm)
ipPattern = re.compile(r'((?:\d{1,3}\.){3}\d{1,3})')
x = 0
while True:
    print("Search Term is:", searchTerms[x])
    for line in fileIn:
        ipMatch = ipPattern.search(line)
        userPattern = re.compile(searchTerms[x])
        userMatch = userPattern.search(line)
        if ipMatch is not None:
            print(ipMatch.group())
        if userMatch is not None:
            print(userMatch.group())
    x += 1
    if x >= len(searchTerms):
        break
This happens because the file object is an iterator, and it is consumed in the first pass through the for loop.
On the second pass through the while loop, the body of for line in fileIn never runs, since the iterator fileIn has already been exhausted.
A quick fix is to do this:
lines = open('ascii_dump.txt', 'r').readlines()
then in your for loop, change the for line in fileIn to:
for line in lines:
Having said this, you should rewrite your code to do all regex matches in a single pass using the regex or operator.
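A single-pass version along those lines could look like the sketch below. It is only an illustration: the function name search_once and the sample packet lines are made up, and re.escape is used on the assumption that the search terms are plain text rather than regular expressions.

```python
import re

def search_once(lines, terms):
    """Scan the lines a single time and collect (source IP, matched term) pairs."""
    if not terms:
        return []                          # an empty alternation would match everywhere
    ip_pattern = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")
    # Join the terms with the regex "or" operator; re.escape treats each
    # term as literal text (drop it if the terms are meant to be patterns).
    term_pattern = re.compile("|".join(re.escape(t) for t in terms))
    hits = []
    current_ip = None
    for line in lines:
        ip_match = ip_pattern.search(line)
        if ip_match is not None:
            current_ip = ip_match.group()  # first IP on the line acts as the header
        for term_match in term_pattern.finditer(line):
            hits.append((current_ip, term_match.group()))
    return hits

sample = [
    "10.0.0.1 > 10.0.0.2: GET /index.html\n",
    "Host: example.com\n",
    "10.0.0.3 > 10.0.0.2: POST /login\n",
]
print(search_once(sample, ["GET", "POST", "Host"]))
```

Because every line is visited once, no rewinding or second copy of the file is needed, however many terms the user enters.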
You need to "rewind" the file after the for line in fileIn loop:
...
    fileIn.seek(0)
    x += 1
