How to print unique words from an inputted string - python

I have some code that I intend to print out all the unique words in a string that will be inputted by a user:
str1 = input("Please enter a sentence: ")
print("The words in that sentence are: ", str1.split())
unique = set(str1)
print("Here are the unique words in that sentence: ",unique)
I can get it to print out the unique letters, but not the unique words.

str.split(' ') takes a string and returns a list of the pieces between each space (' ').
set(foo) takes a collection foo and returns a set containing only the distinct elements of foo. Calling set(str1) iterates over the string character by character, which is why you get unique letters.
What you want is this: unique_words = set(str1.split(' '))
If you omit the separator, split() splits on any run of whitespace. I passed ' ' to show that you can supply your own value to this method.
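As a minimal sketch of the difference between split() and split(' ') when collecting unique words (the helper name unique_words and the sample sentence are illustrative, not from the question):

```python
def unique_words(sentence):
    """Return the set of distinct words in sentence."""
    # split() with no argument splits on any run of whitespace,
    # so repeated spaces do not produce empty strings.
    return set(sentence.split())


sample = "the cat  saw the dog"   # note the double space

print(sorted(unique_words(sample)))     # ['cat', 'dog', 'saw', 'the']
print(sorted(set(sample.split(' '))))   # ['', 'cat', 'dog', 'saw', 'the']
```

With an explicit ' ' separator, the double space yields an empty string that ends up in the set, so split() is usually the safer choice for free-form user input.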

If you only want the words that occur exactly once, you can use collections.Counter:
from collections import Counter
str1 = input("Please enter a sentence: ")
words = str1.split(' ')
c = Counter(words)
unique = [w for w in words if c[w] == 1]
print("Unique words: ", unique)

Another way of doing it, which keeps the words in their original order:
user_input = input("Input: ").split(' ')
unique = []
for word in user_input:
    if word not in unique:
        unique.append(word)
print(unique)
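The same order-preserving result can probably be had without an explicit loop. This sketch assumes Python 3.7 or later, where dict keys keep insertion order; the function name unique_in_order is made up for illustration:

```python
def unique_in_order(sentence):
    """Return each distinct word once, in order of first appearance."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order.
    return list(dict.fromkeys(sentence.split()))


print(unique_in_order("to be or not to be"))  # ['to', 'be', 'or', 'not']
```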

Related

how to replace words of a sentence with the given words

I want to replace some words in a sentence using a given list of words and their replacements.
The first line of input is the number of word pairs, followed by each word with its replacement, and finally the sentence to change.
Any word of the sentence that appears in the list should be replaced; every other word is printed unchanged.
For example:
User entry:
5
hello salam
goodbye khodafez
say goftan
we ma
you shoma
we say goodbye to you tonight
output:
ma goftan khodafez to shoma tonight
I wrote this code; the part I cannot get right is finding each word and replacing it:
n=int(input())
words=[]
trans=[]
dict1={}
for i in range(0,n):
    word_trans=input()
    word_trans = word_trans.split()
    words.append(word_trans[0])
    trans.append(word_trans[1])
for i in range(0,n):
    dict2={words[i]:trans[i]}
    dict1.update(dict2)
sentence=input()
sentence1=sentence.split()
for i in sentence1:
    if i==dict1(keys):
        print(dict1(key))
    else:
        print(i)
Here is one solution with a few minor changes: it fixes your use of the translation dictionary and accumulates the translated words before printing the resulting sentence on one line.
n = int(input("Count of replacement words:"))
words = []
trans = []
dict1 = {}
for i in range(n):
    word_trans = input("Enter old new:")
    word_trans = word_trans.split()
    words.append(word_trans[0])
    trans.append(word_trans[1])
for i in range(n):
    dict1[words[i]] = trans[i]
output = []
sentence = input("Enter sentence:")
for word in sentence.split():
    if word in dict1:
        output.append(dict1[word])
    else:
        output.append(word)
print(" ".join(output))
Here is a more compact version of that code with less state:
n = int(input("Count of replacement words: "))
word_map = {}
for _ in range(n):
    words = input("Enter old new: ").split()
    word_map[words[0]] = words[1]
sentence = input("Enter sentence: ")
output = [word_map.get(word, word) for word in sentence.split()]
print("Result:", " ".join(output))
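To try the lookup logic without typing input each time, it can be pulled into functions. This is only a sketch: the names translate and parse_pairs are hypothetical, and the sample pairs are the ones from the question.

```python
def parse_pairs(lines):
    """Build the mapping from lines of the form 'old new'."""
    word_map = {}
    for line in lines:
        old, new = line.split()
        word_map[old] = new
    return word_map


def translate(sentence, word_map):
    """Replace every word found in word_map; leave other words unchanged."""
    # dict.get(word, word) falls back to the word itself when there is no entry.
    return " ".join(word_map.get(word, word) for word in sentence.split())


pairs = ["hello salam", "goodbye khodafez", "say goftan", "we ma", "you shoma"]
print(translate("we say goodbye to you tonight", parse_pairs(pairs)))
# ma goftan khodafez to shoma tonight
```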

Count Non-Substring Overlapping

Write a program that lets the user input a string. The program should end by printing:
a. How many words are repeated.
For example: 'This is Jake and Jake is 24 years old'
The console must print '4', because 'is' and 'Jake' are the repeated words and each occurs twice.
b. The sentence with all repeated words removed: 'This and 24 years old'
c. Which repeated words were removed.
The user can type whatever they want; 'This is Jake and Jake is 24 years old' is just an example. The hardest part is how to check for repeated words without also matching substrings of other words.
Does this work?
Here is what I am doing.
First I read the user input, then split the string into a list of words on spaces. Then I count the occurrences of each word; if the count is greater than 1, I add the word to a dictionary, where the key is the word and the value is the number of times it appears in the string.
After printing the repeated words, I remove from the string every word that appears in the dictionary.
Note: this code could be improved a lot, but I deliberately wrote it this way to make it easier to understand. You should not use it as-is in a production system.
string = input("Enter your string : ")
items = {}
words = string.split(" ")
for word in words:
    wordCount = words.count(word)
    if(wordCount > 1):
        items[word] = wordCount
print("There are {0} repeated words".format(len(items)))
updateString = ""
for item in items:
    updateString =string.replace(item,"")
print(updateString)
print(items)
Updated, so that only whole words surrounded by spaces are replaced and the replacements accumulate in string:
string = input("Enter your string : ")
items = {}
words = string.split(" ")
for word in words:
    wordCount = words.count(word)
    if(wordCount > 1):
        items[word] = wordCount
print("There are {0} repeated words".format(len(items)))
for item in items:
    string = string.replace(" {0} ".format(item)," ")
print(string)
print(items)
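Replacing " word " only works when the repeated word has a space on both sides, so a repeated word at the very start or end of the input is missed. A possible alternative is to work on the word list instead of the string. The sketch below assumes that the expected '4' means the total number of occurrences of the repeated words; split_repeated is a made-up name.

```python
from collections import Counter


def split_repeated(sentence):
    """Return (kept_words, repeated_counts) for a whitespace-separated sentence."""
    words = sentence.split()
    counts = Counter(words)
    # Words seen more than once, mapped to how often they occur.
    repeated = {word: n for word, n in counts.items() if n > 1}
    # Whole words are compared, so "is" never matches inside "This".
    kept = [word for word in words if word not in repeated]
    return kept, repeated


kept, repeated = split_repeated("This is Jake and Jake is 24 years old")
print(sum(repeated.values()))   # 4
print(" ".join(kept))           # This and 24 years old
print(sorted(repeated))         # ['Jake', 'is']
```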

I need to make it so the sentence and numbers that are outputted are stored into a file

The code I have made:
asks the user for a sentence,
makes it lower case so it is not case sensitive
splits the sentence into separate words so each word can have a number assigned to it according to its position
How can I add a part to my code that saves the sentence inputted by
the user as a file, along with the numbers that get assigned to each
word?
Here is my code:
sentence = input("Please enter a sentence")
sentence = sentence.lower()
sentence = sentence.split()
positions = [sentence.index (x) +1 for x in sentence]
print(sentence)
print(positions)
In Python 2, use raw_input so that the input is always read as a string (in Python 3, input already does this). You do not need to store the positions; you can get them from the enumerate function instead. Then write to the file like so:
sentence = raw_input("Please enter a sentence: ")
sentence = sentence.lower()
sentence = sentence.split()
with open('filename.txt', 'w') as f:
    f.writelines(["%d-%s\n" % (i+1, x) for (i, x) in enumerate(sentence)])
sentence = input("Please enter a sentence")
sentence = sentence.lower()
sentence = sentence.split()
wordPositionDict = {}
for i, x in enumerate(sentence):
    # setdefault stores the new list in the dict; get(x, []) would discard it
    wordPositionDict.setdefault(x, []).append(i+1)
print(wordPositionDict)
This appends every position of each word to a dict. After iterating over all the words in the sentence, you have one dict whose keys are the words and whose values are the lists of positions.
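Putting the pieces together for Python 3, here is a rough sketch that keeps the question's numbering (each word gets the 1-based position of its first occurrence) and saves both lines to a file. The function names, the file name sentence.txt, and the two-line file layout are assumptions, not something the question specifies.

```python
def first_positions(sentence):
    """Return (words, positions); each word gets the 1-based position of its first occurrence."""
    words = sentence.lower().split()
    first_seen = {}
    for position, word in enumerate(words, start=1):
        first_seen.setdefault(word, position)   # keeps only the first position
    return words, [first_seen[word] for word in words]


def save_positions(sentence, path):
    """Write the words on one line and their positions on the next."""
    words, positions = first_positions(sentence)
    # The with block closes the file even if a write fails.
    with open(path, "w") as handle:
        handle.write(" ".join(words) + "\n")
        handle.write(" ".join(str(p) for p in positions) + "\n")


# Example call (hypothetical file name):
# save_positions(input("Please enter a sentence: "), "sentence.txt")
```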

Replacing and Storing

So, here is what I got:
def getSentence():
    sentence = input("What is your sentence? ").upper()
    if sentence == "":
        print("You haven't entered a sentence. Please re-enter a sentence.")
        getSentence()
    elif sentence.isdigit():
        print("You have entered numbers. Please re-enter a sentence.")
        getSentence()
    else:
        import string
        for c in string.punctuation:
            sentence = sentence.replace(c,"")
        return sentence
def list(sentence):
    words = []
    for word in sentence.split():
        if not word in words:
            words.append(word)
    print(words)
def replace(words,sentence):
    position = []
    for word in sentence:
        if word == words[word]:
            position.append(i+1)
            print(position)
sentence = getSentence()
list = list(sentence)
replace = replace(words,sentence)
I have only managed to get this far. My full intention is to take the sentence, separate it into words, and change each word into a number, e.g.
words = ["Hello","world","world","said","hello"]
And make it so that each word has a number:
So let's say that "hello" has the value of 1, the sentence would be '1 world world said 1'
And if world was 2, it would be '1 2 2 said 1'
Finally, if "said" was 3, it would be '1 2 2 3 1'
Any help would be greatly appreciated. I will then develop this code so that the sentence and so on is stored in a file using file.write() and file.read() etc.
Thanks
If you just want the position at which each word first occurs, you can do
positions = map(words.index,words)
Also, NEVER use built-in function names for your variables or functions. And never give a variable the same name as one of your functions (replace = replace(...)): functions are objects, so the assignment rebinds the name and the function is lost.
Edit: In Python 3 you must convert the iterator that map returns to a list
positions = list(map(words.index, words))
Or use a list comprehension
positions = [words.index(w) for w in words]
Does it matter in what order the words are turned into numbers? Are Hello and hello two words or one? Why not something like:
import string
sentence = input() # user input here
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# strip out punctuation (str.translate returns a new string)
replacements = {word: str(idx) for idx, word in enumerate(set(sentence.split()))}
# builds e.g. {"hello": "0", "world": "1", "said": "2"}; set order is arbitrary
result = ' '.join(replacements.get(word, word) for word in sentence.split())
# join back with the replacements
Another idea (although I don't think it's better than the rest) is to use a dictionary:
dictionary = dict()
for word in words:
    if word not in dictionary:
        dictionary[word] = len(dictionary)+1
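Extended into a complete function, the dictionary idea might look like the sketch below. It assumes that "Hello" and "hello" should count as the same word, which the question's example suggests but does not state; encode_words is a hypothetical name.

```python
def encode_words(words):
    """Replace each word with a number assigned in order of first appearance."""
    ids = {}
    for word in words:
        key = word.lower()            # assumption: case is ignored
        if key not in ids:
            ids[key] = len(ids) + 1   # next unused number, starting at 1
    return [ids[word.lower()] for word in words]


print(encode_words(["Hello", "world", "world", "said", "hello"]))  # [1, 2, 2, 3, 1]
```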
Also, in your code, when you call getSentence inside getSentence, you should return its return value; otherwise the outer call returns None:
    if sentence == "":
        print("You haven't entered a sentence. Please re-enter a sentence.")
        return getSentence()
    elif sentence.isdigit():
        print("You have entered numbers. Please re-enter a sentence.")
        return getSentence()
    else:
        ...
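A loop is another way to re-ask without recursion. The following is a sketch rather than a drop-in replacement: clean_sentence and get_sentence are hypothetical names, and the read parameter exists only so the function can be exercised without a keyboard.

```python
import string


def clean_sentence(raw):
    """Return raw upper-cased without punctuation, or None if it is not acceptable."""
    if raw == "" or raw.isdigit():
        return None
    return raw.upper().translate(str.maketrans("", "", string.punctuation))


def get_sentence(read=input):
    """Keep asking until clean_sentence accepts the answer."""
    while True:
        sentence = clean_sentence(read("What is your sentence? "))
        if sentence is not None:
            return sentence
        print("Please enter a sentence made of words.")
```

Because the loop returns only once a valid sentence has been read, there is no path on which the function falls through and returns None.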

Write a function filter_long_words() that takes a list of words and an integer n and returns the list of words that are longer than n

Whenever I run this code it just gives me a blank list, and I am wondering what I am doing wrong. I am trying to print a list of the words that are longer than n. When I run the updated code, it only prints the first word from the list of words that I enter.
def filterlongword(string,number):
    for i in range(len(string)):
        listwords = []
        if len(string[i]) > number:
            listwords.append(string[i])
        return listwords
def main():
    words = input("Please input the list of words: ")
    integer = eval(input("Please input an integer: "))
    words1 = filterlongword(words,integer)
    print("The list of words greater than the integer is",words1)
main()
Initialize listwords before the loop
Return listwords after the loop
Split the input string into a list of words
def filterlongword(string,number):
    listwords = []
    for i in range(len(string)):
        if len(string[i]) > number:
            listwords.append(string[i])
    return listwords
And a nicer version using a list comprehension:
def filterlongword(string,number):
    return [word for word in string if len(word) > number]
To split the input string into a list of words, use
words = input("Please input the list of words: ").split()
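Even better would be just the following (in Python 3, filter returns an iterator, so wrap it in list to get a list back):
def filterlongword(string,number):
    return list(filter(lambda word:len(word)>number, string))
    # or: return [w for w in string if len(w) > number]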
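A small sketch comparing the two spellings on sample data (the word list is invented for illustration). Note that in Python 3 a bare filter(...) object prints as something like <filter object at 0x...> rather than as the words themselves.

```python
def filter_long_words(words, n):
    """Return the words that are strictly longer than n characters."""
    # In Python 3, filter() returns a lazy iterator; list() materialises it.
    return list(filter(lambda word: len(word) > n, words))


words = "a bb ccc dd eeee f ggggg".split()
print(filter_long_words(words, 3))         # ['eeee', 'ggggg']
print([w for w in words if len(w) > 3])    # ['eeee', 'ggggg']
```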
def listing(guess, number):
    new_list = []
    for i in range(len(guess)):
        if len(guess[i]) > number:
            new_list.append(guess[i])
    print (new_list)
list1 = input("take input: ")
words = list1.split(",")   # renamed from list so the built-in is not shadowed
def main():
    integer = input()
    integer1 = int(integer)
    listing(words, integer1)
main()
Try this code. It uses a delimiter (a comma) to turn your input into a list.
Your main problem is passing words as a single string rather than an iterable of strings. The secondary problem is the missing .split call, and with it the choice of separator between words. Here is my version.
I made longwords return a generator because in actual use one does not necessarily need the long words as a list; the output formatting gives an example of this.
def longwords(wordlist, length):
    return (word for word in wordlist if len(word) >= length)
def main():
    words = input("Enter words, separated by spaces: ").split()
    length = int(input("Minimum length of words to keep: "))
    print("Words of length {} or more are {}.".format(length,
          ', '.join(longwords(words, length))))
main()
This results in, for instance
Enter words, separated by spaces: a bb ccc dd eeee f ggggg
Minimum length of words to keep: 3
Words of length 3 or more are ccc, eeee, ggggg.
Maybe you can shorten the code to the following, reading the words and the number from a single line with the number last:
def filter_long_words():
    # e.g. "fig apple banana 4": the last item is the number n
    parts = input("Give words and a number: ").split()
    n = int(parts[-1])
    return [w for w in parts[:-1] if len(w) > n]
print(filter_long_words())
def filter_long_words(words, number):
    l=[]
    split = words.split(",")
    for i in split:
        if len(i) > number:
            l.append(i)
    return l
w = input("enter word:")
n = int(input("Enter number"))
print(filter_long_words(w, n))
Try this:
def filter_long_words(n, words):
    list_of_words=[]
    a= words.split(" ")
    for x in a:
        if len(x)>n:
            list_of_words.append(x)
    return list_of_words
# listOfWords is the list of input words
listOfInt = []
for i in range(len(listOfWords)):
    listOfInt.append(len(listOfWords[i]))
print('List of word length: ',listOfInt)
def filter_long_words(lst,n):
    a=[]
    for i in lst:
        if n<len(i):
            a.append(i)
    return a
To filter the list of words:
def filter_long_words(lst,n):
    return [word for word in lst if len(word)>n]
