Upper case first letter of each word in a phrase - python

Ok, I'm trying to figure out how to make an inputted phrase such as this in Python ...
Self contained underwater breathing apparatus
output this...
SCUBA
Which would be the first letter of each word. Does this have something to do with indexing, and maybe the .upper() method?

This is the pythonic way to do it:
output = "".join(item[0].upper() for item in input.split())
# SCUBA
There you go. Short and easy to understand.
Later edit:
If you have delimiters other than spaces, you can extract the words with a regex, like this:
import re
input = "self-contained underwater breathing apparatus"
output = "".join(item[0].upper() for item in re.findall(r"\w+", input))
# SCUBA
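If you need this in more than one place, the regex variant above could be wrapped in a small function. This is only a sketch; the name acronym is made up for illustration.

```python
import re

def acronym(phrase):
    """Return the upper-cased first letter of every word in phrase."""
    # \w+ matches runs of word characters, so hyphens and other
    # punctuation act as delimiters just like spaces do.
    return "".join(word[0].upper() for word in re.findall(r"\w+", phrase))

print(acronym("self-contained underwater breathing apparatus"))  # SCUBA
```

An empty phrase simply yields an empty string, because re.findall returns no words to index into.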

Here's the quickest way to get it done
input = "Self contained underwater breathing apparatus"
output = ""
for i in input.upper().split():
    output += i[0]

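# Here is my attempt, brief and potent!
from functools import reduce  # reduce is not a builtin in Python 3
phrase = 'Self contained underwater breathing apparatus'  # avoids shadowing the builtin str
reduce(lambda x, y: x + y[0].upper(), phrase.split(), '')
#=> SCUBA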

Pythonic Idioms
Use a generator expression over str.split().
Optimize the loop by hoisting upper() out of it, so that it is called once on the joined result.
Implementation:
input = 'Self contained underwater breathing apparatus'
output = ''.join(word[0] for word in input.split()).upper()

Another way
input = 'Self contained underwater breathing apparatus'
output = ''.join(item[0].capitalize() for item in input.split())

def myfunction(string):
    return (''.join(map(str, [s[0] for s in string.upper().split()])))
myfunction("Self contained underwater breathing apparatus")
Returns SCUBA

s = "Self contained underwater breathing apparatus"
for item in s.split():
    print(item[0].upper())  # prints one letter per line

Some list comprehension love:
"".join([word[0].upper() for word in sentence.split()])

Another way, which may be easier for total beginners to understand:
acronym = input('Please give what names you want acronymized: ')
acro = acronym.split()  # acro is now a list of the words
for word in acro:
    print(word[0].upper(), end='')  # end='' keeps the letters on one line instead of one per line
print()  # ends the line so that the next prompt starts on a fresh line

I believe you can get it done this way as well.
def get_first_letters(s):
    return ''.join(map(lambda x: x[0].upper(), s.split()))

Why is no one using regex? In JavaScript I would use a regex so that I don't need a loop; please find a Python example below.
import re
input = "Self-contained underwater & breathing apparatus google.com"
output = ''.join(re.findall(r"\b(\w)", input.upper()))
print(output)
# SCUBAGC
Please note that the above regex, /\b(\w)/g in JavaScript notation, uses the word boundary \b and the word character \w, so it matches the first word character after each boundary between a word character and a non-word character. For example, "&" is not matched, the "c" in ".com" is matched, and both "s" and "c" are matched in "self-contained".
Alternative regexes using lookahead and lookbehind:
/(?!a\s)\b[\w]|&/g excludes the word "a" when it is followed by whitespace, and includes "&".
/(?<=\s)[\w&]|^./g matches any word character or "&" that follows whitespace, plus the first character of the string. This prevents matching the "c" in ".com" but also prevents matching the "c" in "self-contained".
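For reference, here is a rough Python translation of the two alternative patterns above. The /g flag corresponds to re.findall; the sample sentence and the variable names are my own, chosen only to show where the patterns differ.

```python
import re

text = "a self-contained unit & a pump google.com"

# Pattern 1: word-initial characters, skipping a standalone "a", plus "&".
pattern_1 = r"(?!a\s)\b\w|&"
# Pattern 2: the first character of the string, plus any word
# character or "&" that directly follows whitespace.
pattern_2 = r"(?<=\s)[\w&]|^."

letters_1 = "".join(re.findall(pattern_1, text)).upper()
letters_2 = "".join(re.findall(pattern_2, text)).upper()

print(letters_1)  # SCU&PGC
print(letters_2)  # ASU&APG
```

Pattern 1 still picks up the "c" in "self-contained" and in ".com"; pattern 2 drops both but keeps the standalone "a".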

Related

Matching a string if it contains all words of a list in python

I have a number of long strings and I want to match those that contain all words of a given list.
keywords=['special','dreams']
search_string1="This is something that manifests especially in dreams"
search_string2="This is something that manifests in special cases in dreams"
I want only search_string2 matched. So far I have this code:
if all(x in search_text for x in keywords):
    print("matched")
The problem is that it will also match search_string1. Obviously I need to include some regex matching that uses \w or \b, but I can't figure out how to include a regex in the if all statement.
Can anyone help?
You can use a regex to do this, but I prefer plain Python.
A string can be split into a list of words with split() (and join() turns a list back into a string). The expression word in list_of_words then tells you whether the word is in the list.
keywords = ['special', 'dreams']
found = True
for word in keywords:
    if word not in search_string1.split():
        found = False
This might not be the best idea, but we could check whether one set is a subset of another:
keywords = ['special', 'dreams']
strs = [
    "This is something that manifests especially in dreams",
    "This is something that manifests in special cases in dreams"
]
_keywords = set(keywords)
for s in strs:
    s_set = set(s.split())
    if _keywords.issubset(s_set):
        print(f"Matched: {s}")
Axe319's comment works and is closest to my original question of how to solve the problem using regex. To quote the solution again:
all(re.search(fr'\b{x}\b', search_text) for x in keywords)
Thanks to everyone!
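A slightly more defensive version of that one-liner, as a sketch: re.escape guards against keywords that contain regex metacharacters. The function name contains_all_words is my own.

```python
import re

def contains_all_words(text, keywords):
    """Return True if every keyword occurs in text as a whole word."""
    return all(
        re.search(rf"\b{re.escape(word)}\b", text) is not None
        for word in keywords
    )

keywords = ["special", "dreams"]
search_string1 = "This is something that manifests especially in dreams"
search_string2 = "This is something that manifests in special cases in dreams"

print(contains_all_words(search_string1, keywords))  # False
print(contains_all_words(search_string2, keywords))  # True
```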

Find the word from the list given and replace the words so found

My question is pretty simple, but I haven't been able to find a proper solution.
Given below is my program:
given_list = ["Terms","I","want","to","remove","from","input_string"]
input_string = input("Enter String:")
if any(x in input_string for x in given_list):
    #Find the detected word
    #Not in bool format
    a = input_string.replace(detected_word,"")
    print("Some Task",a)
Here, given_list contains the terms I want to exclude from the input_string.
Now, the problem I am facing is that any() produces a bool result, whereas I need the word that any() detected so that I can replace it with a blank and then perform some task.
Edit: any() function is not required at all, look for useful solutions below.
Iterate over given_list and replace them:
for i in given_list:
    input_string = input_string.replace(i, "")
print("Some Task", input_string)
No need to detect at all:
for w in given_list:
    input_string = input_string.replace(w, "")
str.replace will not do anything if the word is not there, and the substring test needed for the detection has to scan the string anyway.
The problem with finding each word and replacing it is that Python has to iterate over the whole string repeatedly. Another problem is that you will match substrings where you don't want to. For example, "to" is in the exclude list, so you'd end up changing "tomato" to "ma".
It seems you want to replace whole words. Parsing is a whole new subject, but let's simplify. I'm going to assume everything is lowercase with no punctuation, although that can be improved later. Let's use input_string.split() to iterate over whole words.
We want to replace some words with nothing, so let's iterate over the words of input_string and filter out the ones we don't want, using the builtin function filter.
exclude_list = ["terms","i","want","to","remove","from","input_string"]
input_string = "one terms two i three want to remove"
keepers = filter(lambda w: w not in exclude_list, input_string.lower().split())
output_string = ' '.join(keepers)
print(output_string)
# one two three
Note that we create an iterator that allows us to go through the whole input string just once. And instead of replacing words, we just basically skip the ones we don't want by having the iterator not return them.
Since filter requires a function for the boolean check on whether to include or exclude each word, we had to define one. I used "lambda" syntax to do that. You could just replace it with
def keep(word):
    return word not in exclude_list
keepers = filter(keep, input_string.split())
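If you want to keep the original casing and still avoid the substring pitfall described above ("tomato" becoming "ma"), a regex with word boundaries is one option. A sketch only: remove_words is a hypothetical helper name, and collapsing the leftover whitespace is my own choice.

```python
import re

def remove_words(text, words):
    """Remove whole-word occurrences of each entry in words from text."""
    # \b anchors each alternative at word boundaries, so "to" no longer
    # matches inside "tomato". re.escape protects special characters.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    stripped = re.sub(pattern, "", text)
    # Collapse the gaps left behind by the removed words.
    return " ".join(stripped.split())

print(remove_words("I want to eat a tomato", ["I", "want", "to"]))  # eat a tomato
```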
To answer your question about any, use an assignment expression (Python 3.8+).
if any((word := x) in input_string for x in given_list):
    # match captured in variable word
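On Python versions before 3.8, or if an assignment inside any() feels too clever, next() with a generator expression gives the same first match. A sketch that reuses given_list from the question; first_match and the sample input are made up for illustration.

```python
given_list = ["Terms", "I", "want", "to", "remove", "from", "input_string"]

def first_match(text, candidates):
    """Return the first candidate that occurs in text, or None."""
    return next((c for c in candidates if c in text), None)

input_string = "please remove this"
word = first_match(input_string, given_list)
if word is not None:
    # word holds the detected term, so it can be replaced directly.
    print("Some Task", input_string.replace(word, ""))
```

Like the any() version, this is a substring test, so it stops at the first term found anywhere in the string.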

Derive words from string based on key words

I have a string (text_string) from which I want to find words based on my so called key_words. I want to store the result in a list called expected_output.
The expected output is always the word after the keyword (the number of spaces between the keyword and the output word doesn't matter). The expected_output word is then all characters until the next space.
Please see the example below:
text_string = "happy yes_no!?. why coding without paus happy yes"
key_words = ["happy","coding"]
expected_output = ['yes_no!?.', 'without', 'yes']
expected_output explanation:
yes_no!?. (since it comes after happy. All signs are included until the next space.)
without (since it comes after coding. The number of spaces surrounding the word doesn't matter)
yes (since it comes after happy)
You can solve it using a regex, for example:
import re
expected_output = re.findall(r'(?:{0})\s+?([^\s]+)'.format('|'.join(key_words)), text_string)
Explanation
(?:{0}) takes your key_words list and builds a non-capturing group that matches any of the words in it.
\s+? matches the whitespace after a keyword (lazily), up to the next character that isn't a space.
([^\s]+) captures the text right after the keyword, up to the next whitespace.
Note: if you run this many times, e.g. inside a loop, you should compile the pattern once with re.compile beforehand to improve performance.
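A sketch of what precompiling might look like. re.escape is an addition of mine, in case a keyword contains regex metacharacters; KEYWORD_PATTERN and words_after are hypothetical names.

```python
import re

key_words = ["happy", "coding"]

# Compile once, then reuse the pattern for every string.
KEYWORD_PATTERN = re.compile(
    r"(?:{0})\s+?([^\s]+)".format("|".join(re.escape(k) for k in key_words))
)

def words_after(text):
    """Return the word that follows each keyword occurrence in text."""
    return KEYWORD_PATTERN.findall(text)

print(words_after("happy yes_no!?. why coding without paus happy yes"))
# ['yes_no!?.', 'without', 'yes']
```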
We will use Python's re module to split your string on whitespace.
Then, the idea is to go over each word, and look if that word is part of your keywords. If yes, we set take_it to True, so that next time the loop is processed, the word will be added to taken which stores all the words you're looking for.
import re
def find_next_words(text, keywords):
    take_it = False
    taken = []
    for word in re.split(r'\s+', text):
        if take_it:
            taken.append(word)
        take_it = word in keywords
    return taken
print(find_next_words("happy yes_no!?. why coding without paus happy yes", ["happy", "coding"]))
results in ['yes_no!?.', 'without', 'yes']

Python 3 - How to capitalize first letter of every sentence when translating from morse code

I am trying to translate morse code into words and sentences and it all works fine... except for one thing. My entire output is lowercased and I want to be able to capitalize every first letter of every sentence.
This is my current code:
text = input()
if is_morse(text):
    lst = text.split(" ")
    text = ""
    for e in lst:
        text += TO_TEXT[e].lower()
    print(text)
Each element in the split list is a single character (in Morse), NOT a word. TO_TEXT is a dictionary. Does anyone have an easy solution to this? I am a beginner in programming and Python, by the way, so I might not understand some solutions...
Maintain a flag telling you whether or not this is the first letter of a new sentence. Use that to decide whether the letter should be upper-case.
text = input()
if is_morse(text):
    lst = text.split(" ")
    text = ""
    first_letter = True
    for e in lst:
        if first_letter:
            this_letter = TO_TEXT[e].upper()
        else:
            this_letter = TO_TEXT[e].lower()
        # Period heralds a new sentence.
        first_letter = this_letter == "."
        text += this_letter
    print(text)
From what I can understand of your code, you could use Python's str.title() method.
For a stricter result, you can use the capwords() function from the string module.
This is what the Python docs say about capwords:
Split the argument into words using str.split(), capitalize each word using str.capitalize(), and join the capitalized words using str.join(). If the optional second argument sep is absent or None, runs of whitespace characters are replaced by a single space and leading and trailing whitespace are removed, otherwise sep is used to split and join the words.
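Note that both title() and capwords() capitalize every word, not just the first word of each sentence. If only sentence starts should be capitalized, the already translated lowercase text could be post-processed with a regex instead. This is a sketch under the assumption that sentences end with ".", "!" or "?" followed by whitespace; capitalize_sentences is a hypothetical name.

```python
import re

def capitalize_sentences(text):
    """Upper-case the first letter of the text and of each new sentence."""
    # Group 1 is either the start of the string or sentence-ending
    # punctuation plus whitespace; group 2 is the letter to capitalize.
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )

print(capitalize_sentences("hello there. how are you? fine."))
# Hello there. How are you? Fine.
```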

How do I calculate the number of times a word occurs in a sentence?

So I've been learning Python for some months now and was wondering how I would go about writing a function that will count the number of times a word occurs in a sentence. I would appreciate if someone could please give me a step-by-step method for doing this.
Quick answer:
def count_occurrences(word, sentence):
    return sentence.lower().split().count(word)
'some string'.split() will split the string on whitespace (spaces, tabs and linefeeds) into a list of word-ish things. Then ['some', 'string'].count(item) returns the number of times item occurs in the list.
That doesn't handle removing punctuation. You could do that using string.maketrans and str.translate (this is the Python 2 API).
# Make collection of chars to keep (don't translate them)
import string
keep = string.lowercase + string.digits + string.whitespace
table = string.maketrans(keep, keep)
delete = ''.join(set(string.printable) - set(keep))
def count_occurrences(word, sentence):
    return sentence.lower().translate(table, delete).split().count(word)
The key here is that we've constructed the string delete so that it contains all the ascii characters except letters, numbers and spaces. Then str.translate in this case takes a translation table that doesn't change the string, but also a string of chars to strip out.
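The snippet above relies on Python 2 names (string.lowercase, string.maketrans and the two-argument form of str.translate). A possible Python 3 equivalent, as a sketch: str.maketrans called with three arguments builds a table that deletes every character in its third argument.

```python
import string

# Characters to keep; everything else printable gets deleted.
keep = string.ascii_lowercase + string.digits + string.whitespace
delete = "".join(set(string.printable) - set(keep))

# No character mappings, only deletions.
table = str.maketrans("", "", delete)

def count_occurrences(word, sentence):
    """Count whole-word occurrences of word, ignoring case and punctuation."""
    return sentence.lower().translate(table).split().count(word)

print(count_occurrences("sentence", "This sentence is a simple sentence."))  # 2
```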
wilberforce has the quick, correct answer, and I'll give the long winded 'how to get to that conclusion' answer.
First, here are some tools to get you started, and some questions you need to ask yourself.
You need to read the section on Sequence Types, in the python docs, because it is your best friend for solving this problem. Seriously, read it. Once you have read that, you should have some ideas. For example you can take a long string and break it up using the split() function. To be explicit:
mystring = "This sentence is a simple sentence."
result = mystring.split()
print(result)
print("The total number of words is: " + str(len(result)))
print("The word 'sentence' occurs: " + str(result.count("sentence")))
This takes the input string, splits it on any whitespace, and gives you:
['This', 'sentence', 'is', 'a', 'simple', 'sentence.']
The total number of words is: 6
The word 'sentence' occurs: 1
Now note here that you do have the period still at the end of the second 'sentence'. This is a problem because 'sentence' is not the same as 'sentence.'. If you are going to go over your list and count words, you need to make sure that the strings are identical. You may need to find and remove some punctuation.
A naive approach to this might be:
no_period_string = mystring.replace(".", " ")
print(no_period_string)
This gets me a period-less sentence:
"This sentence is a simple sentence"
You also need to decide whether your input is going to be just a single sentence, or maybe a paragraph of text. If you have many sentences in your input, you might want to find a way to break them up into individual sentences by finding the periods (or question marks, exclamation marks, or other punctuation that ends a sentence). Once you find out where in the string the 'sentence terminator' is, you could split up the string at that point, or something like that.
You should give this a try yourself - hopefully I've peppered in enough hints to get you to look at some specific functions in the documentation.
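Putting those hints together, one possible end result looks like the sketch below. It uses re.findall to pull out the words (which drops the punctuation) and collections.Counter to count them. This is one way to do it, not necessarily what the answer above had in mind; word_counts is a made-up name.

```python
import re
from collections import Counter

def word_counts(text):
    """Count each word in text, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    return Counter(words)

counts = word_counts("This sentence is a simple sentence.")
print(counts["sentence"])    # 2
print(sum(counts.values()))  # 6
```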
Simplest way:
def count_occurrences(word, sentence):
    return sentence.count(word)
text=input("Enter your sentence:")
print("'the' appears", text.count("the"),"times")
simplest way to do it
The problem with using the count() method is that it does not always give the correct number of occurrences when matches overlap, for example:
print('banana'.count('ana'))
output
1
but 'ana' occurs twice in 'banana'
To solve this issue, I used:
def total_occurrence(string,word):
    count = 0
    tempsting = string
    while(word in tempsting):
        count +=1
        tempsting = tempsting[tempsting.index(word)+1:]
    return count
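The same idea can be written without building a new string on every step, by passing a start index to str.find. A sketch; count_overlapping is a hypothetical name, and returning 0 for an empty search string is my own choice.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    if not sub:
        return 0  # an empty search string would otherwise match everywhere
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume one character after the last hit so overlaps are seen.
        start = text.find(sub, start + 1)
    return count

print(count_overlapping("banana", "ana"))  # 2
```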
You can do it like this:
def countWord(word):
    numWord = 0
    for i in range(1, len(word)-1):
        if word[i-1:i+3] == 'word':
            numWord += 1
    print('Number of times "word" occurs is:', numWord)
Then calling the function on a string:
countWord('wordetcetcetcetcetcetcetcword')
will print: Number of times "word" occurs is: 2
def check_Search_WordCount(mySearchStr, mySentence):
    len_mySentence = len(mySentence)
    len_Sentence_without_Find_Word = len(mySentence.replace(mySearchStr,""))
    len_Remaining_Sentence = len_mySentence - len_Sentence_without_Find_Word
    count = len_Remaining_Sentence/len(mySearchStr)
    return (int(count))
I assume that you only know about Python strings and for loops.
def count_occurences(s,word):
    count = 0
    for i in range(len(s)):
        if s[i:i+len(word)] == word:
            count += 1
    return count
mystring = "This sentence is a simple sentence."
myword = "sentence"
print(count_occurences(mystring,myword))
explanation:
s[i:i+len(word)]: slices the string s to extract a substring with the same length as word (the argument)
count += 1: increases the counter whenever the slice matches.
