Storing words from a text file [closed] - python

Closed 7 years ago.
I'm new to Python and am wondering whether there is a way to take one word from an external file of 10 words and store it separately.
I'm making a word memory game: the user is shown a list of words, which is removed after a certain amount of time. The words then appear again with one word changed, and the user has to guess which word has been replaced.
The words are chosen randomly from an external file of 10 words: 9 are displayed first, and 1 is stored as the substitute word.
Does anyone have any ideas?

I have used the unix dictionary here, you can take whichever you want. More resources here:
import random
from copy import copy
''' Word game '''
with open('/usr/share/dict/words','r') as w:
    words = w.read().splitlines()
numWords = 10
# pick 10 distinct words: the first is the hidden substitute, the other 9 are shown
allWords = [words[i] for i in random.sample(range(len(words)),numWords)]
hiddenWord = allWords[0]
displayWords = allWords[1:]
print displayWords
choice = str((raw_input ('Ready? [y]es\n')))
choice = choice.strip()
if choice == 'y':
    # randrange excludes the upper bound; randint(0, len(...)) could index past the end
    indexToRemove = random.randrange(len(displayWords))
    displayWordsNew = copy(displayWords)
    random.shuffle(displayWordsNew)
    displayWordsNew[indexToRemove] = hiddenWord
    print displayWordsNew
    word = str(raw_input ('Which is the different word\n'))
    if word == displayWordsNew[indexToRemove]:
        print "You got it right"
        print displayWords
        print displayWordsNew
    else:
        print "Oops, you got it wrong, but it's a difficult game! The correct word was"
        print displayWordsNew[indexToRemove]
Results:
["Lena's", 'Galsworthy', 'filliped', 'cadenza', 'telecasts', 'scrutinize', "candidate's", "kayak's", 'workman']
Ready?
y
["Lena's", 'workman', 'scrutinize', 'filliped', 'Latino', 'telecasts', "candidate's", 'cadenza', 'Galsworthy']
Which is the different word
telecasts
Oops, you got it wrong, but it's a difficult game! The correct word was
Latino
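The same round can be sketched without any keyboard input, which makes the logic easier to test. This is only a sketch for Python 3, not the answer's own code: the function name make_round, its rng parameter, and the sample word list are assumptions made for illustration.

```python
import random


def make_round(words, rng=random):
    """Build one round: return (shown, changed, substitute, replaced)."""
    if len(words) < 2:
        raise ValueError("need at least two words")
    pool = list(words)
    rng.shuffle(pool)
    # One word is held back as the substitute; the rest are shown first.
    substitute, shown = pool[0], pool[1:]
    changed = list(shown)
    rng.shuffle(changed)
    # randrange never returns len(changed), so the index is always valid.
    index = rng.randrange(len(changed))
    replaced = changed[index]
    changed[index] = substitute
    return shown, changed, substitute, replaced


# Hypothetical 10-word list standing in for the contents of the external file.
words = ["fish", "meat", "word", "place", "dog",
         "cat", "tree", "lamp", "road", "ship"]
shown, changed, substitute, replaced = make_round(words, random.Random(0))
print(shown)
print(changed)
print("new word:", substitute, "/ removed word:", replaced)
```

Passing a seeded random.Random instance makes a round repeatable; leaving rng at its default uses the module-level generator.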

If your input file has one word per line, just do this:
>>> open("C:/Work/TEXT.txt").read()
'FISH\nMEAT\nWORD\nPLACE\nDOG\n'
Then split the string into a list:
>>> open("C:/Work/TEXT.txt").read().split('\n')
['FISH', 'MEAT', 'WORD', 'PLACE', 'DOG', '']
Oh, and strip the newline at the end:
>>> open("C:/Work/TEXT.txt").read().strip().split('\n')
['FISH', 'MEAT', 'WORD', 'PLACE', 'DOG']
To replace a word, use random.choice over the range of list indices:
>>> import random
>>> listOfWords = open("C:/Work/TEXT.txt").read().strip().split('\n')
>>> listOfWords
['FISH', 'MEAT', 'WORD', 'PLACE', 'DOG']
>>> random.choice(range(len(listOfWords)))
3
>>> listOfWords[random.choice(range(len(listOfWords)))] = 'NEW_WORD'
>>> listOfWords
['FISH', 'MEAT', 'NEW_WORD', 'PLACE', 'DOG']
And if you want to shuffle the new list:
>>> random.shuffle(listOfWords)
>>> listOfWords
['PLACE', 'NEW_WORD', 'FISH', 'DOG', 'MEAT']
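The interactive session above never closes the file it opens. A minimal Python 3 sketch of the same read step as a reusable function follows; the name load_words and the choice of UTF-8 encoding are assumptions, and the function expects one word per line.

```python
def load_words(path):
    """Return the non-empty lines of a text file as a list of words."""
    # The with block closes the file even if reading fails.
    with open(path, encoding="utf-8") as handle:
        # strip() drops the trailing newline; blank lines are skipped.
        return [line.strip() for line in handle if line.strip()]
```

Skipping blank lines also covers the trailing empty string that split('\n') produced in the session above.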

I'm new to python and am wondering is there a way to take one word
from an external file of 10 words and store it individually.
There are a lot of ways to store and reference variables in a file.
If you don't mind a little typing, just store the variables in a .py file (remember to use proper python syntax):
# myconfig.py:
var_a = 'Word1'
var_b = 'Word2'
var_c = 'Word3'
# etc.
Then use the file itself as a module:
from myconfig import *
(This will let you reference all the variables defined in myconfig.py.)
If you only want to reference individual variables, import just the ones you want:
from myconfig import var_a, var_b
(This will let you reference var_a and var_b, but nothing else.)

You should try this:
with open("file.txt", mode="r") as foo:
    text = foo.read()
If the words are on different lines:
words = text.splitlines()
Or if the words are separated by spaces:
words = text.split()

Related

Python: split a text into individual English sentences; retain the punctuation [closed]

Closed 2 years ago.
I am trying to write a function that takes a string/text as an argument and returns a list of the sentences in the text. Sentence terminators (., ?, !) should not be removed.
I don't want it to split on abbreviations (Dr., Kg., Mr., Mrs.; e.g. "Dr. Jones").
Should I make a dictionary of all abbreviations?
Given input:
input = "I think Dr. Jones is busy now. Can you visit some other day? I was really surprised!"
Expected output:
output=['I think Dr. Jones is busy now.','Can you visit some other day?','I was really surprised!']
What I've tried:
# performing something like this:
output = input.split('.')
# will produce
'''
['I think Dr', ' Jones is busy now', ' Can you visit some other day? I was really surprised!']
'''
# where as doing
output = input.split(' ')
# will produce
'''
['I', 'think', 'Dr.', 'Jones', 'is', 'busy', 'now.', 'Can', 'you', 'visit', 'some', 'other', 'day?', 'I', 'was', 'really', 'surprised!']
'''
The basic assumption is that the input text is not anomalously punctuated!
A clumsy way of achieving it is as follows:
abbr = {'Dr.', 'Mr.', 'Mrs.', 'Ms.'}
sentence_ender = ['.', '?', '!']
s = "I think Dr. Jones is busy now. Can you visit some other day? I was really surprised!"
def containsAny(wrd, charList):
    # any() returns True as soon as one character of charList is found in wrd;
    # we are essentially testing whether the word contains a sentence-ending char
    return any(c in wrd for c in charList)
def separate_sentences(string):
    sentences = [] # will be a list of all complete sentences
    temp = [] # will be a list of all words in current sentence
    for wrd in string.split(' '): # the input string is split on spaces
        temp.append(wrd) # append current word to temp
        # If the word is not an abbreviation
        # yet contains any of the sentence delimiters,
        # make a 'space separated' sentence and clear temp
        if wrd not in abbr and containsAny(wrd, sentence_ender):
            sentences.append(' '.join(temp)) # combine words currently in temp
            temp = [] # clear temp, for next sentence
    return sentences
print(separate_sentences(s))
Should produce:
['I think Dr. Jones is busy now.', 'Can you visit some other day?', 'I was really surprised!']
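A regex-based variation on the same idea is sketched below for Python 3. It is not the answer's method: it splits only where a terminator is followed by whitespace, then rejoins any piece that ends in a known abbreviation. The names split_sentences and ABBREVIATIONS are assumptions, and the abbreviation set is only a starting point.

```python
import re

# Hypothetical starter set; extend it for the abbreviations in your text.
ABBREVIATIONS = {"Dr.", "Mr.", "Mrs.", "Ms."}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences, keeping the terminating punctuation."""
    # The lookbehind keeps ., ? or ! attached to the preceding piece.
    pieces = re.split(r"(?<=[.?!])\s+", text.strip())
    sentences = []
    buffer = ""
    for piece in pieces:
        if not piece:
            continue
        buffer = f"{buffer} {piece}" if buffer else piece
        # A piece ending in an abbreviation is not a full sentence yet.
        if buffer.split()[-1] not in abbreviations:
            sentences.append(buffer)
            buffer = ""
    if buffer:
        sentences.append(buffer)
    return sentences


s = "I think Dr. Jones is busy now. Can you visit some other day? I was really surprised!"
print(split_sentences(s))
```

Unlike the loop above, this version does not split inside tokens such as 3.14 and it keeps trailing text that has no terminator.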

Capitalize only certain specific words in a returned string? [closed]

Closed 2 years ago.
In my code I have this line, which returns a string:
return item.title().lower()
item is an object with a title, and we return that title string in all lowercase.
How can I do something like
return item.title().lower()
but, if the words (Maxine, Flora, Lindsey) are in that title, keep them capitalized and make all the other words lowercase?
I can use an if statement, but I'm not really sure how to capitalize only specific words, like
names = ("Maxine", "Mrs. Lichtenstein", "string3")
if any(s in item.title() for s in names):
    return ???
Would something like that work? And what could I return?
The following should work (assuming these words either never occur with a lowercase first character (e.g. maxine), or they do and you want them capitalized):
def format(s):
    s=s.lower()
    for i in ('maxine', 'flora', 'lindsey'):
        if i in s:
            # restore the capital on the first occurrence of the name
            s=s[:s.find(i)]+i[0].upper()+i[1:]+s[s.find(i)+len(i):]
    return s
Example:
item.title='My name is Maxine i LIVE IN Flora and I LOVE Lindsey'
>>> format(item.title)
'my name is Maxine i live in Flora and i love Lindsey'
You're on the right track to use an if statement to check the word against a list of things to keep. Try something like this:
def lowercase_some(words, exclude=()):
    # List of processed words
    new_words = []
    for word in words:
        if word.lower() in exclude:
            # If the word is in our list of ones to exclude, don't convert
            new_words.append(word)
        else:
            new_words.append(word.lower())
    return new_words
>>> lowercase_some(['TEST', 'WORDS', 'MEXICO'], ['mexico'])
['test', 'words', 'MEXICO']
This can be done in a very Pythonic way with a list comprehension:
def lowercase_some(words, exclude=()):
    return [word.lower() if word.lower() not in exclude else word for word in words]
You can use this:
names = ['Maxine', 'Flora', 'Lindsey']
# note: this only matches when the whole title is exactly one of the names
if item.title() in names:
    return item.title()
else:
    return item.title().lower()
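For names that appear anywhere inside a longer title, one possible approach is to lowercase everything and then restore the listed names with a word-boundary regex. The Python 3 sketch below is an assumption-laden illustration, not any answerer's code: the function name lowercase_except and its keep parameter are invented here.

```python
import re


def lowercase_except(title, keep=("Maxine", "Flora", "Lindsey")):
    """Lowercase title, then restore the canonical spelling of names in keep."""
    lowered = title.lower()
    # Map each lowercased name back to the spelling we want to keep.
    canonical = {name.lower(): name for name in keep}
    if not canonical:
        return lowered
    # \b stops "maxine" from matching inside a longer word.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in canonical) + r")\b"
    )
    return pattern.sub(lambda match: canonical[match.group(0)], lowered)


print(lowercase_except("My name is Maxine i LIVE IN Flora and I LOVE Lindsey"))
```

Because the match runs on the lowercased text, every occurrence of a name is restored regardless of how it was originally cased.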

How do I count the number of words that occur at least once in a string? [closed]

Closed 3 years ago.
I'm using Python 3.7. I have an array of unique words ...
words = ["abc", "cat", "dog"]
Then I have other strings, which may or may not contain one or more instances of these words. How do I count how many of these words occur at least once in each string? For example, if I have
s = "bbb abc abc lll dog"
Given the above array, words, the result of counting unique words in "s" should be 2, because "abc" occurs at least once, and "dog" occurs at least once. Similarly,
s2 = "CATTL DOG mmm"
would only contain 1 unique word, "dog". The other words don't occur in the array "words".
A quick way would be to intersect the two sets and take the length of the result:
len(set(words).intersection(s.split(" ")))
A set comprehension is a good choice here
words = ['abc', 'cat', 'dog']
s = 'bbb abc abc lll dog'
ss = {w for w in s.split() if w in words}
ss
> {'abc', 'dog'}
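Neither snippet above handles the "CATTL DOG mmm" example from the question, where "DOG" is expected to match "dog". A minimal Python 3 sketch of a case-insensitive count follows; the function name count_known_words is an assumption.

```python
def count_known_words(text, words):
    """Count how many entries of words occur at least once in text."""
    # Compare in lowercase so "DOG" matches "dog".
    known = {word.lower() for word in words}
    present = set(text.lower().split())
    return len(known & present)


words = ["abc", "cat", "dog"]
print(count_known_words("bbb abc abc lll dog", words))
print(count_known_words("CATTL DOG mmm", words))
```

Matching is on whole whitespace-separated tokens, so "CATTL" does not count as "cat".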

How to choose a word that is in brackets from the sentence? [closed]

Closed 4 years ago.
How can I extract [Andrey] and [21] from info?
info = "my name is [Andrey] and I am [21] years old"
result = ["[Andrey]", "[21]"]
I am sure other ways would be better, but I tried this and it worked.
If you want to extract the characters inside [] without knowing their position, you can use this method:
Run a for loop through the string.
If you find the character [, append all the following characters to a string until you find ].
You can add these strings to a list to collect the results together. Here is the code:
info = "my name is [Andrey] and I am [21] years old"
s=[] #list to collect searched result
s1="" #elements of s
for i in range(len(info)):
    if info[i]=="[":
        while info[i+1] != "]":
            s1 += info[i+1]
            i=i+1
        s.append(s1)
        #make s1 empty to search for another string inside []
        s1=""
print s
Output will be:
['Andrey', '21']
You may choose the regex method.
Or, for your use case here, first convert your string to a list and then choose the items by index, for example with a list comprehension:
>>> info = "my name is [Andrey] and I am [21] years old"
>>> lst = info.split()
>>> lst
['my', 'name', 'is', '[Andrey]', 'and', 'I', 'am', '[21]', 'years', 'old']
>>> print([ lst[index] for index in [3,7] ])
['[Andrey]', '[21]']
Another way is to choose by index with the help of itemgetter:
>>> from operator import itemgetter
>>> print(itemgetter(3,7)(lst))
('[Andrey]', '[21]')
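The regex method mentioned above is not spelled out, so here is one possible Python 3 sketch. The function name bracketed and its keep_brackets flag are assumptions; the pattern matches an opening bracket, any run of characters other than a closing bracket, and then the closing bracket.

```python
import re


def bracketed(text, keep_brackets=True):
    """Return every [...] group in text, with or without the brackets."""
    if keep_brackets:
        pattern = r"\[[^\]]*\]"
    else:
        # With a capture group, findall returns only the captured part.
        pattern = r"\[([^\]]*)\]"
    return re.findall(pattern, text)


info = "my name is [Andrey] and I am [21] years old"
print(bracketed(info))
print(bracketed(info, keep_brackets=False))
```

Unlike picking items by index, this does not depend on where the bracketed words sit in the sentence.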

Counting the number of unique words [duplicate]

This question already has answers here:
Counting the number of unique words in a document with Python
(8 answers)
Closed 9 years ago.
I want to count unique words in a text, but I want to make sure that words followed by special characters aren't treated differently, and that the evaluation is case-insensitive.
Take this example
text = "There is one handsome boy. The boy has now grown up. He is no longer a boy now."
print len(set(w.lower() for w in text.split()))
The result would be 16, but I expect it to return 14. The problem is that 'boy.' and 'boy' are evaluated differently, because of the punctuation.
import re
print len(re.findall(r'\w+', text))
Using a regular expression makes this very simple. The line above counts every word, including repeats. To count unique words, make sure that all the characters are in lowercase, and then pass the result to set to ensure that there are no duplicate items.
print len(set(re.findall(r'\w+', text.lower())))
You can use a regex here:
In [65]: text = "There is one handsome boy. The boy has now grown up. He is no longer a boy now."
In [66]: import re
In [68]: set(m.group(0).lower() for m in re.finditer(r"\w+",text))
Out[68]:
set(['grown',
'boy',
'he',
'now',
'longer',
'no',
'is',
'there',
'up',
'one',
'a',
'the',
'has',
'handsome'])
I think that you have the right idea of using the Python built-in set type.
I think that it can be done if you first remove the punctuation with replace:
text = "There is one handsome boy. The boy has now grown up. He is no longer a boy now."
punc_char= ",.?!'"
for letter in text:
    if letter == '"' or letter in punc_char:
        text= text.replace(letter, '')
# lowercase before splitting so the count is case-insensitive
text= set(text.lower().split())
len(text)
That should work for you. If you need to strip any other punctuation marks, you can easily add them to punc_char and they will be filtered out.
Abraham J.
First, you need to get a list of words. You can use a regex as eandersson suggested (lowercasing the text first so that the count is case-insensitive):
import re
words = re.findall(r'\w+', text.lower())
Now, you want to get the number of unique entries. There are a couple of ways to do this. One way would be to iterate through the words list and use a dictionary to keep track of the number of times you have seen each word:
cwords = {}
for word in words:
    try:
        cwords[word] += 1
    except KeyError:
        cwords[word] = 1
Now, finally, you can get the number of unique words with
len(cwords)
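The try/except counting loop above can also be written with collections.Counter from the standard library. This is a minimal Python 3 sketch; the function name count_unique_words is an assumption.

```python
import re
from collections import Counter


def count_unique_words(text):
    """Return (number of unique words, per-word counts), ignoring case."""
    # \w+ skips punctuation, so "boy." and "boy" are the same word.
    counts = Counter(re.findall(r"\w+", text.lower()))
    return len(counts), counts


text = "There is one handsome boy. The boy has now grown up. He is no longer a boy now."
unique, counts = count_unique_words(text)
print(unique)
print(counts.most_common(3))
```

Note that \w+ also splits on apostrophes, so a word such as "Lena's" would be counted as "lena" and "s".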
