Importing random words from a file without duplicates in Python

I'm attempting to create a program which selects 10 words from a text file that contains 10 or more words. When importing these 10 words from the text file, I must not import the same word twice! Currently I'm using a list for this, but the same words still seem to appear. I have some knowledge of sets and know they cannot hold the same value twice. As of now I'm clueless about how to solve this; any help would be much appreciated. THANKS!
Please find the relevant code below! (P.S. FileSelection is basically an open-file dialog.)
def GameStage03_E():
    global WordList
    if WrdCount >= 10:
        WordList = []
        for n in range(0,10):
            FileLines = open(FileSelection).read().splitlines()
            RandWrd = random.choice(FileLines)
            WordList.append(RandWrd)
        SelectButton.destroy()
        GameStage01Button.destroy()
        GameStage04_E()
    elif WrdCount <= 10:
        tkinter.messagebox.showinfo("ERROR", " Insufficient Amount Of Words Within Your Text File! ")

Make WordList a set:
WordList = set()
Then update that set instead of appending:
WordList.update(set([RandWrd]))
Of course WordList would be a bad name for a set.
There are a few other problems though:
Don't use uppercase names for variables and functions (follow PEP8)
What happens if you draw the same word twice in your loop? There is no guarantee that WordList will contain 10 items after the loop completes, if words may appear multiple times.
The latter might be addressed by changing your loop to:
while len(WordList) < 10:
    FileLines = open(FileSelection).read().splitlines()
    RandWrd = random.choice(FileLines)
    WordList.update(set([RandWrd]))
You would have to account for the case that there don't exist 10 distinct words after all, though.
Even then the loop would still be quite inefficient as you might draw the same word over and over and over again with random.choice(FileLines). But maybe you can base something useful off of that.
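One way to avoid redrawing the same word is to sample without replacement. The following is a minimal sketch, not the asker's actual program: pick_unique_words and its how_many parameter are hypothetical names, and it assumes the file holds one word per line. It uses random.sample from the standard library, which never returns the same list position twice, and removes repeated lines first so that a word listed twice in the file cannot be drawn twice either.

```python
import random


def pick_unique_words(lines, how_many=10):
    """Return `how_many` distinct words chosen at random from `lines`.

    `pick_unique_words` is a hypothetical helper, not part of the original code.
    """
    # dict.fromkeys drops repeated lines while keeping the file order;
    # empty lines are skipped.
    unique_words = list(
        dict.fromkeys(line.strip() for line in lines if line.strip())
    )
    if len(unique_words) < how_many:
        raise ValueError("not enough distinct words in the file")
    # random.sample draws without replacement, so no word can repeat.
    return random.sample(unique_words, how_many)
```

With the question's variables this might be called as pick_unique_words(open(FileSelection), 10), and the ValueError could be caught to show the "insufficient words" message box.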

Not sure I understand you right, but ehehe,
line 3: "if WrdCount" ... where did you give WrdCount a value?
Maybe you intend something along the lines below?
wordset = set()  # note: {} would create an empty dict, not a set
while len(wordset) < 10:
    # do some work to update the set here
    # when end-of-file: break
    break  # placeholder so the skeleton runs; replace with the real work

Related

Return a random set of words from system dictionary that meet specific criteria

I'd like to help my young child broaden his vocabulary. The plan is to parse the dictionary (in this case, the one on macOS), append the words to a list, and if those list items meet specific criteria, add them to another list... perhaps a little messier than it needs to be!
I'd like to see just five randomly chosen words be printed. I've managed most of it already but get an error when trying to pick a random item to show...
IndexError: list index out of range
And the code thus far...
import random
word_file = "/usr/share/dict/words"
WORDS = open(word_file).read().splitlines()
for x in WORDS:
    myRawList = []
    myRawListWithCount = []
    # filters out words that don't start with 'a'
    if x.startswith("a"):
        myRawList.append(x)
    # word len. cannot exceed 5
    for y in myRawList:
        if (len(y)) <= 5:
            myRawListWithCount.append(y)
# the line that causes an error. Simpler/shorter lists for names etc seem to work OK with the command.
print(random.choice(myRawListWithCount))
Try moving the creation of your lists out of the loop. As written, both lists are reset to empty on every iteration, so after the loop they only reflect the last word in the file, and random.choice fails on an empty list.
myRawList = []
myRawListWithCount = []
for x in WORDS:
    # filters out words that don't start with 'a'
    if x.startswith("a"):
        myRawList.append(x)
    # word len. cannot exceed 5
    for y in myRawList:
        if (len(y)) <= 5:
            myRawListWithCount.append(y)
# the line that caused the error in the question
print(random.choice(myRawListWithCount))
You set up a new empty list at each step in the loop.
Also the way you generate the second list is extremely inefficient (you read again all found words for each new word).
myRawList = []
myRawListWithCount = []
for x in WORDS:
    # filters out words that don't start with 'a'
    if x.startswith("a"):
        myRawList.append(x)
        # word len. cannot exceed 5
        if (len(x)) <= 5:
            myRawListWithCount.append(x)
print(random.choice(myRawListWithCount))
Example output: amino
Another possible optimization: since a dictionary file is sorted, you could break out of the loop as soon as you have passed the block of words starting with a (you would then need a separate loop to create the second list).
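The early-exit idea could be sketched with itertools as follows. This is only an illustration: words_with_prefix is a hypothetical helper, and it assumes the input is sorted so that all words with the given prefix sit next to each other. If the real words file interleaves capitalised entries with lowercase ones, this would return only the first contiguous run, so check the file's ordering before relying on it.

```python
from itertools import dropwhile, takewhile


def words_with_prefix(sorted_words, prefix):
    """Return the first contiguous run of words starting with `prefix`.

    Assumes `sorted_words` is ordered so that matching words are adjacent.
    """
    # Skip everything before the first matching word...
    from_first_match = dropwhile(
        lambda w: not w.startswith(prefix), sorted_words
    )
    # ...then stop reading at the first word that no longer matches.
    return list(takewhile(lambda w: w.startswith(prefix), from_first_match))
```

The length filter (five letters or fewer) can then be applied to the much shorter result instead of to the whole dictionary.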
I may have misunderstood your goal. If you just need five words chosen at random from the file, why not use a list comprehension to build up a list of all words from the file that meet your criteria, then use random.sample to pull out a sample of five words?
import random
word_file = "/usr/share/dict/words"
with open(word_file, "r") as f:
    print(random.sample([word for line in f.readlines()
                         for word in [line.strip()]
                         if word.startswith("a") and len(word) <= 5],
                        5))

Counting occurrences of words in a file

Just a preface, I have read--far too many--of the posts here about the same topic, and none of them quite cover the specific guidelines I'm under. I'm supposed to create an algorithm that counts the occurrence of each word in a text file, and display each as such:
"The: 4
Jump: 2
Fox: 6".
The requirement is that we only use the skills we learned in our beginner Python class, which means we cannot use dictionaries, counters, sets or lists (basically anything that would help shorten our code, tbh). I'm not the best at Python, so I've been struggling... pretty hard, to say the least. The closest I've gotten was scrabbling my old notes together from my previous class and finding some demo code that I reformatted.
wordsinlist = "words.txt"
word=input("Enter word to be searched:")
count = 0
with open("words.txt", 'r') as wordlist:
    for line in wordlist:
        words = line.split()
        for i in words:
            if(i==word):
                count=count+1
print("Occurrences of the word:")
print(count)
The issue with this is that I need my code to display all of the words and their occurrences at once, with no search input. There's definitely a way to do this, but I'm not the sharpest tool in the shed, and I've been going at it for like 5 hours now haha.
It definitely needs to look a little closer to this:
#Output
The: 112
History: 29
Learning: 25
Any help or hints are much appreciated! Thank you in advance! I know it's a dumb question, these online classes are really frustrating.
Without lists (or something similar) I think this is impossible... probably you're allowed to use lists, that is basic Python!
If you need to count the occurrences of all words, you don't need to read them in with input, right?
So this is one simple solution:
# assumes words.txt holds one word per line
count = 0
with open("words.txt", 'r') as fp:
    lines = fp.readlines()
lines_1 = [element.strip() for element in lines]
lines_2 = list(set(lines_1))
for w in lines_2:
    for l in lines_1:
        if(l==w):
            count=count+1
    print("Occurrences of {} : {}".format(w,count))
    count = 0
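If the course rules really do exclude storing lists, dicts and sets, the bookkeeping can be kept in plain strings. The sketch below is one possible reading of that constraint, not a known-accepted answer: count_words is a hypothetical name, and it assumes that looping over str.split() is allowed, since the demo code in the question already does so. It remembers the words already reported in a space-delimited string and counts each new word by scanning the text again.

```python
def count_words(text):
    """Return one 'word: count' line per distinct word, in first-seen order.

    Only strings and loops are used to store state (no list/dict/set variables).
    """
    seen = " "     # space-delimited record of words already reported
    report = ""
    for word in text.split():
        if " " + word + " " in seen:
            continue              # this word has been counted already
        count = 0
        for other in text.split():
            if other == word:
                count += 1
        seen += word + " "
        report += word + ": " + str(count) + "\n"
    return report
```

Reading the file with open("words.txt").read() and printing count_words of the result would give output in the "The: 112" style shown in the question. The comparison is case-sensitive and punctuation stays attached to words, so "The" and "the" are counted separately unless the text is normalised first.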

Deciphering script in Python issue

Cheers, I am looking for help with my small Python project. The task says that the program has to be able to decipher a "monoalphabetic substitution cipher", given a complete database of words that will definitely appear (at least once) in the ciphered text.
I have tried to create such a database of the words that will be ciphered:
lst_sample = []
n = int(input('Number of words in database: '))
for i in range(n):
    x = input()
    lst_sample.append(x)
The way I am trying to "decipher" is to look at the structure of each word: I assign numbers to the letters in the order in which they first appear in the word (e.g. feed = 0112 and hood = 0112 are the same, because both consist of three different letters in the same arrangement). I am using the helper function pattern() for this:
def pattern(word):
    nextNum = 0
    letterNums = {}
    wordPattern = []
    for letter in word:
        # give each new letter the next unused number
        if letter not in letterNums:
            letterNums[letter] = str(nextNum)
            nextNum += 1
        wordPattern.append(letterNums[letter])
    return ''.join(wordPattern)
Right after that, I build a database of ciphered words:
lst_en = []
q = input('Insert ciphered words: ')
if q == '':
    print(lst_en)
else:
    lst_en.append(q)
With both databases I could finally write the deciphering step:
for i in lst_en:
    for q in lst_sample:
        x = q
        word = i
        if pattern(x) == pattern(word):
            print(x)
            print(word)
            print()
If the words in lst_sample have different lengths (e.g. food, car, yellow), there is no problem matching the decrypted words. Even when they have the same length, I can tell them apart by their different structure (e.g. puff, sort).
The main problem, which I am not able to solve, arises when words have the same length and the same structure (e.g. jane, word).
I have no idea how to solve this problem while keeping the script architecture described above. Is there any way it could be solved using another if statement or something similar? Is there any way to solve it using the information that the words in lst_sample will certainly be in the ciphered text?
Thanks for all help!
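One possible direction, offered as a sketch rather than a confirmed solution: because a monoalphabetic cipher uses a single letter-to-letter key for the whole text, a tentative match between a ciphered word and a sample word can be rejected when it contradicts letters already fixed by another match. The helpers extend_key and find_key below are hypothetical names; the search is a plain backtracking over the two word lists from the question, and the equal-pattern test is implied by requiring the key to be one-to-one.

```python
def extend_key(key, cipher_word, plain_word):
    """Return a copy of `key` extended with cipher->plain letter pairs,
    or None if this pairing contradicts the key built so far."""
    if len(cipher_word) != len(plain_word):
        return None
    new_key = dict(key)
    used_plain = set(new_key.values())
    for c, p in zip(cipher_word, plain_word):
        if c in new_key:
            if new_key[c] != p:    # cipher letter already stands for another letter
                return None
        elif p in used_plain:      # plain letter already produced by another cipher letter
            return None
        else:
            new_key[c] = p
            used_plain.add(p)
    return new_key


def find_key(sample_words, cipher_words, key=None):
    """Match every sample word to some ciphered word under one consistent key.

    Returns the first consistent key found, or None if there is none.
    """
    key = {} if key is None else key
    if not sample_words:
        return key
    first, rest = sample_words[0], sample_words[1:]
    for cipher_word in cipher_words:
        new_key = extend_key(key, cipher_word, first)
        if new_key is not None:
            result = find_key(rest, cipher_words, new_key)
            if result is not None:
                return result      # every remaining word fitted as well
    return None                    # dead end: caller tries its next candidate
```

This would be called as find_key(lst_sample, lst_en). Two words such as jane and word can only be told apart if some other word shares letters with them; when nothing constrains the choice, several keys remain consistent and the sketch simply returns the first one it finds.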

Getting count of certain word in txt file in Python?

I'm trying to get the number of times a word occurs in a certain txt file.
I've tried this but it's not working due to "AttributeError: 'list' object has no attribute 'split'":
words = 0
for wordcount in textfile.readlines().split(":"):
    if wordcount == event.getPlayer().getName():
        words += 1
Is there any easier or less complicated way to do this?
Here's my text file:
b2:PlayerName:Location{world=CraftWorld{name=world},x=224.23016231506807,y=71.0,z=190.2291303186236,pitch=31.349741,yaw=-333.30002}
What I want is to search for "PlayerName", which is the player's name, and if the player has 5 entries (that is, if the word "PlayerName" has been written to the file five times), it should add 5 to words.
P.S. I'm not sure if this is good for security, because it's a multiplayer game, so there could be many nicknames starting with "PlayerName", such as "PlayerName1337" or whatever. Will this cause a problem?
This should work:
words = 0
for wordcount in textfile.read().split(":"):
    if wordcount == event.getPlayer().getName():
        words += 1
Here's the difference: .readlines() produces a list and .read() produces a string that you can split into list.
Better approach that won't count wrong things:
words = 0
for line in textfile.readlines():
    # I assume that player name position is fixed
    word = line.split(':')[1]
    if word == event.getPlayer().getName():
        words += 1
And yes, there is a security concern if there are players with the same names or with : in their names.
The problem with equal names is that your code doesn't know which player a line belongs to.
If there is a colon in a player's name, your code will also split it.
I urge you to assign some sort of unique immutable identifier to every player and to use a database instead of text files; it will handle all this stuff for you.
There is an even easier way if you want to count multiple names at once: use Counter from the collections module.
from collections import Counter
# take the second colon-separated field (the player name) of every line;
# a whole list from line.split(':') is unhashable and cannot be counted
counter = Counter(line.split(':')[1] for line in textfile.readlines())
Counter will behave like a dict, so you will count all the names at once and, if you need to, you can efficiently look up the count for more than one name.
At the moment your script counts only one name at a time per loop.
You can access the count like so:
counter[event.getPlayer().getName()]
I bet you will eventually want to count more than one name. If you do, you should avoid reading the text file more than once.
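A minimal, self-contained sketch of the Counter approach, assuming (as the answers above do) that the player name is always the second colon-separated field. count_player_names is a hypothetical helper, and the sample lines in the usage note are made up for illustration.

```python
from collections import Counter


def count_player_names(lines):
    """Count how often each player name appears as the second field."""
    names = Counter()
    for line in lines:
        # maxsplit=2 leaves the Location{...} part in one piece
        fields = line.split(":", 2)
        if len(fields) >= 2:       # skip lines that have no name field
            names[fields[1]] += 1
    return names
```

For example, count_player_names(textfile)[event.getPlayer().getName()] would give the count for the current player. Because whole fields are compared, "PlayerName1337" is counted separately from "PlayerName", and a name that never appears yields 0 rather than an error. A name that itself contains a colon would still be split incorrectly.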
You can find how many times a word occurs in a string with count (note that this counts substring matches, so a name such as "PlayerName1337" would be counted too):
words = textfile.read().count('PlayerName')

can anyone help me with my spell check code? [closed]

This is what I have; the comments describe what I'm trying to do.
There are words put in a text file where some words are spelt wrong, and test text files as well, which are to be used to spell check.
e.g. >>> spellCheck("test1.txt")
{'exercsie': 1, 'finised': 1}
from string import ascii_uppercase, ascii_lowercase
def spellCheck(textFileName):
    # Use the open method to open the words file.
    # Read the list of words into a list named wordsList
    # Close the file
    file=open("words.txt","r")
    wordsList = file.readlines()
    file.close()
    # Open the file whos name was provided as the textFileName variable
    # Read the text from the file into a list called wordsToCheck
    # Close the file
    file=open(textFileName, "r")
    wordsToCheck = file.readlines()
    file.close()
    for i in range(0,len(wordsList)): wordsList[i]=wordsList[i].replace("\n","")
    for i in range(0,len(wordsToCheck)): wordsToCheck[i]=wordsToCheck[i].replace("\n","")
    # The next line creates the dictionary
    # This dictionary will have the word that has been spelt wrong as the key and the number of times it has been spelt wrong as the value
    spellingErrors = dict(wordsList)
    # Loop through the wordsToCheck list
    # Change the current word into lower case
    # If the current word does not exist in the wordsList then
    # Check if the word already exists in the spellingErrors dictionary
    # If it does not exist than add it to the dictionary with the initial value of 1.
    # If it does exist in the dictionary then increase the value by 1
    # Return the dictionary
    char_low = ascii_lowercase
    char_up = ascii_uppercase
    for char in wordsToCheck[0]:
        if char in wordsToCheck[0] in char_up:
            result.append(char_low)
    for i in wordsToCheck[0]:
        if wordsToCheck[0] not in wordsList:
            if wordsToCheck[0] in dict(wordsList):
                dict(wordsList) + 1
            elif wordsToCheck[0] not in dict(wordsList):
                dict(wordsList) + wordsToCheck[0]
                dict(wordsList) + 1
    return dict(wordsList)
My code returns an error:
Traceback (most recent call last):
File "", line 1, in
spellCheck("test1.txt")
File "J:\python\SpellCheck(1).py", line 36, in spellCheck
spellingErrors = dict(wordsList)
ValueError: dictionary update sequence element #0 has length 5; 2 is required
So can anyone help me?
I applied PEP-8 and rewrote unpythonic code.
import collections
def spell_check(text_file_name):
    # dictionary for word counting
    spelling_errors = collections.defaultdict(int)
    # put all possible words in a set
    with open("words.txt") as words_file:
        word_pool = {word.strip().lower() for word in words_file}
    # check words
    with open(text_file_name) as text_file:
        for word in (word.strip().lower() for word in text_file):
            if not word in word_pool:
                spelling_errors[word] += 1
    return spelling_errors
You might want to read about the with statement and defaultdict.
Your code with the ascii_uppercase and ascii_lowercase screams: Read the tutorial and learn the basics. That code is a collection of "I don't know what I'm doing but I do it anyway.".
Some more explanations concerning your old code:
You use
char_low = ascii_lowercase
There is no need for char_low because you never manipulate that value. Just use the original ascii_lowercase. Then there is the following part of your code:
for char in wordsToCheck[0]:
    if char in wordsToCheck[0] in char_up:
        result.append(char_low)
I'm not quite sure what you are trying to do here. It seems that you want to convert the words in the list to lower case. In fact, if that code ran (which it doesn't), you would append the whole lower case alphabet to result for every upper case character of the word in the list. Nevertheless, you don't use result in the later code, so no harm is done. It would be easy to add a print wordsToCheck[0] before the loop or a print char in the loop to see what happens there.
The last part of the code is just a mess. You access just the first word in each list - maybe because you don't know what that list looks like. That is coding by trial and error. Try coding by knowledge instead.
You don't really know what a dict does and how to use it. I could explain it here but there is this wonderful tutorial at www.python.org that you might want to read first, especially the chapter dealing with dictionaries. If you study those explanations and still don't understand it feel free to come back with a new question concerning this.
I used a defaultdict instead of a standard dictionary because it makes life easier here. If you define spelling_errors as a plain dict instead, part of my code would have to change to
if not word in word_pool:
    if not word in spelling_errors:
        spelling_errors[word] = 1
    else:
        spelling_errors[word] += 1
BTW, the code I wrote runs for me without any problems. I get a dictionary with the missing words (lower case) as keys and a count of that word as the corresponding value.
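As for the ValueError in the traceback: dict() given a list expects every element to be a key/value pair, and a five-letter word is a sequence of five items, not two. The snippet below is a small illustration with made-up sample words; it is not taken from the asker's files.

```python
words = ["hello", "world"]

# dict() tries to unpack each element into (key, value); "hello" has 5 items.
try:
    dict(words)
except ValueError as exc:
    message = str(exc)

# A list of 2-item pairs is what the constructor actually accepts.
pairs = dict([("exercsie", 1), ("finised", 1)])

# For counting, start from an empty dict and fill it while looping.
spelling_errors = {}
for word in ["finised", "exercsie", "finised"]:
    spelling_errors[word] = spelling_errors.get(word, 0) + 1
```

So the dictionary of errors has to start out empty and be filled inside the loop, which is what both the defaultdict version and the plain dict variant above do.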
