Python: how to get a word in a string via a specified character?

a = "I only want the $1000"
print(get_word_containing(a, "$"))
output: $1000
How can I get the whole word from a string, given a character that appears in that word, so that it works as shown above?

Your function can be as simple as follows:
def get_word_containing(string, char):
    words = [word for word in string.split() if char in word]
    return words
string = "I only want the $1000"
print(get_word_containing(string, "$"))
Output:
['$1000']

I will modify @Biplob's function a little to print the words instead:
def get_word_containing(myStr, char):
    for x in myStr.split():
        if char in x:
            print(x)
mystring = "I only want the $1000"
get_word_containing(mystring, "$")

import re
a = "I only want the $1000"
words = re.findall(r"[$]\w+", a)
print(words)
The code above gives you a list of all the words in your string that start with $.

import re
def get_word_containing(myStr, char):
    # re.escape lets special characters such as "$" match literally
    return re.findall(re.escape(char) + r"\w+", myStr)
mystring = "I only $200 want the $1000"
output = get_word_containing(mystring, "$")
print(output)
This gives us ['$200', '$1000'].
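The regex answers above only find words in which the character is followed by word characters, so a word such as "1000$" would be missed. A minimal sketch of a variant that matches the character anywhere inside a whitespace-delimited word; the function name find_words_containing is hypothetical and not taken from the answers, and it assumes that "word" means a run of non-whitespace characters:

```python
import re

def find_words_containing(text, char):
    # re.escape makes characters such as "$" or "." match literally.
    # \S* on both sides extends the match to the whole whitespace-delimited word.
    pattern = r"\S*" + re.escape(char) + r"\S*"
    return re.findall(pattern, text)

print(find_words_containing("I only $200 want the 1000$", "$"))
```

Because the match is built from \S*, punctuation attached to a word (for example a trailing comma) is returned as part of it.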

Related

Python String adjust

Hello, I think I should use a method like .isupper() in a loop, or string[i+1], to find the characters to lowercase, but I don't know how to do that.
input in function -> "ThisIsMyChar"
expected -> "This is my char"
I've done it with regex. It could be done with less code, but my aim was readability:
import re
def split_by_upper(input_string):
    pattern = r'[A-Z][a-z]*'
    matches = re.findall(pattern, input_string)
    if matches:
        output = matches[0]
        for word in matches[1:]:
            output += ' ' + word[0].lower() + word[1:]
        return output
    else:
        return input_string
print(split_by_upper("ThisIsMyChar"))
Output:
This is my char
You could use re.findall and str.lower:
>>> import re
>>> s = 'ThisIsMyChar'
>>> ' '.join(w.lower() if i >= 1 else w for i, w in enumerate(re.findall('.[^A-Z]*', s)))
'This is my char'
You should first try it yourself. If you get stuck, you can do something like this:
# parse the input string
def parse(text):
    result = text[0]
    for i in range(1, len(text)):
        ch = text[i]
        if ch.isupper():
            result += " "
        result += ch.lower()
    return result
# input string
text = "ThisIsMyChar"
print(parse(text))
First, loop over the string and check for uppercase characters. When you find one, add a space before it, lowercase it, and append it to the new string. The rest is explained in the comments in the code itself.
def AddSpaceInTitleCaseString(string):
    NewStr = ""
    # Check the input string character by character for uppercase letters.
    for i in string:
        # If one is found, add it to NewStr with a leading space, in lowercase.
        if i.isupper():
            NewStr += f" {i.lower()}"
        else:
            # Otherwise just add it as is.
            NewStr += i
    # Before returning NewStr, remove all leading and trailing spaces from it.
    # As shown in your question, I'm assuming that you want the first letter of your new sentence
    # to be in uppercase, so just use the 'capitalize' method for it.
    return NewStr.strip().capitalize()
# Test.
MyStr = AddSpaceInTitleCaseString("ThisIsMyChar")
print(MyStr)
# Output: "This is my char"
Hope it helped :)
Here is a concise regex solution:
import re
capital_letter_pattern = re.compile(r'(?!^)[A-Z]')
def add_spaces(string):
    return capital_letter_pattern.sub(lambda match: ' ' + match[0].lower(), string)
if __name__ == '__main__':
    print(add_spaces('ThisIsMyChar'))
The pattern searches for capital letters ([A-Z]), and (?!^) is a negative lookahead that excludes the first character of the input: (?!foo) means "don't match here if foo follows", and ^ is "start of line", so (?!^) means "don't match at the start of the line".
The .sub(...) method of a pattern is usually used like pattern.sub('new text', 'my input string that I want changed'). You can also use a function in place of 'new text', in which case the function is called with the match object as an argument, and the value returned by the function is used as the replacement string.
The expression capital_letter_pattern.sub(lambda match: ' ' + match[0].lower(), string) replaces all matches (all capital letters except one at the start of the line), using a lambda function to add a space before the letter and make it lowercase. match[0] means "the entirety of the matched text", which in this case is the capital letter.
You can split it via Regex using r"(?<!^)(?=[A-Z])" pattern:
import re
txt = 'ThisIsMyChar'
c = re.compile(r"(?<!^)(?=[A-Z])")
first, *rest = map(str.lower, c.split(txt))
print(f'{first.title()} {" ".join(rest)}')
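Pattern explanation:
(?<!^) checks that the position is not at the beginning of the string.
(?=[A-Z]) checks that a capital letter follows.
Note: these are zero-width lookaround assertions, not groups. They consume no characters, so the split removes nothing from the text.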
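A small sketch of what the split itself returns before the pieces are lowercased. It assumes Python 3.7 or later, since earlier versions of re.split did not split on empty matches; the names boundary and parts are illustrative only:

```python
import re

# Zero-width split points: not at the start, and immediately before a capital.
boundary = re.compile(r"(?<!^)(?=[A-Z])")

parts = boundary.split("ThisIsMyChar")
print(parts)  # ['This', 'Is', 'My', 'Char']
```

No letters are lost in the result because the pattern matches positions rather than characters.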

Adding a space between string words

Write a function named string_processing that takes a list of
strings as input and returns an all-lowercase string with no
punctuation. There should be a space between each word. You do not
have to check for edge cases.
Here is my code:
import string
def string_processing(string_list):
    str1 = ""
    for word in string_list:
        str1 += ''.join(x for x in word if x not in string.punctuation)
    return str1
string_processing(['hello,', 'world!'])
string_processing(['test...', 'me....', 'please'])
My output:
'helloworld'
'testmeplease'
Expected output:
'hello world'
'test me please'
How do I add a space just between the words?
You just need to keep all the words separate and then join them later with a space between them:
import string
def string_processing(string_list):
    ret = []
    for word in string_list:
        ret.append(''.join(x for x in word if x not in string.punctuation))
    return ' '.join(ret)
print(string_processing(['hello,', 'world!']))
print(string_processing(['test...', 'me....', 'please']))
Output:
hello world
test me please
Using regex, remove every non-letter and then join with a space:
import re
def string_processing(string_list):
    return ' '.join(re.sub(r'[^a-zA-Z]', '', word) for word in string_list)
print(string_processing(['hello,', 'world!']))
print(string_processing(['test...', 'me....', 'please']))
Gives:
hello world
test me please
Try:
import string
def string_processing(string_list):
    str1 = ""
    for word in string_list:
        st = ''.join(x for x in word if x not in string.punctuation)
        str1 += f"{st} "  # <-------- here
    return str1.rstrip()  # <------- here
string_processing(['hello,', 'world!'])
string_processing(['test...', 'me....', 'please'])
Using regex:
import re
li = ['hello...,', 'world!']
# join with a space first so that words without punctuation do not run together
st = " ".join(re.findall(r'\w+', " ".join(li)))
The following code could help.
import string
def string_processing(string_list):
    for i, word in enumerate(string_list):
        string_list[i] = word.translate(str.maketrans('', '', string.punctuation)).lower()
    str1 = " ".join(string_list)
    return str1
string_processing(['hello,', 'world!'])
string_processing(['test...', 'me....', 'please'])
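The translate-based function above overwrites the entries of the list that is passed in. If the caller's list should stay untouched, the same idea can be sketched without mutation. The names string_processing_pure and _STRIP_PUNCT are hypothetical, and the sketch assumes only ASCII punctuation needs removing:

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans('', '', string.punctuation)

def string_processing_pure(string_list):
    # Build new strings instead of overwriting entries of string_list.
    cleaned = (word.translate(_STRIP_PUNCT).lower() for word in string_list)
    return ' '.join(cleaned)

words = ['Hello,', 'World!']
print(string_processing_pure(words))  # hello world
print(words)                          # ['Hello,', 'World!'] (unchanged)
```

Building the table once at module level also avoids recreating it for every word.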
We can use the re library to insert a space before each capital letter except the first:
import re
string = 'HelloWorld'
print(re.sub(r'(?<!^)([A-Z])', r' \1', string))
Output:
Hello World

Why does str.capitalize() not work as I expect?

Please, let me know if I'm not providing enough information. The goal of the program is to capitalize the first letter of every sentence.
usr_str = input()
def fix_capitalization(usr_str):
    list_of_sentences = usr_str.split(".")
    list_of_sentences.pop() #remove last element: ""
    new_str = ''
    for sentence in list_of_sentences:
        new_str += sentence.capitalize() + "."
    return new_str
print(fix_capitalization(usr_str))
For instance, if I input "hi. hello. hey." I expect it to output "Hi. Hello. Hey." but instead, it outputs "Hi. hello. hey."
An alternative would be to build a list of strings then concatenate them:
def fix_capitalization(usr_str):
    list_of_sentences = usr_str.split(".")
    output = []
    for sentence in list_of_sentences:
        new_sentence = sentence.strip().capitalize()
        # If empty, don't bother
        if new_sentence:
            output.append(new_sentence)
    # Finally, join everything
    return ". ".join(output) + "."
You've entered the sentences with spaces between them. When you split the string at the '.' character, those spaces remain at the start of each piece. I checked the elements of the list after the split, and the result was this:
['hi', ' hello', ' hey', '']
Since capitalize() only uppercases the first character of a string, and that character is a space here, the letter after it stays lowercase.
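The effect of the leading space can be checked directly. A short sketch using the sample input from the question; it relies only on the standard str.split, str.strip and str.capitalize methods:

```python
parts = "hi. hello. hey.".split(".")
print(parts)  # ['hi', ' hello', ' hey', '']

# capitalize() only touches the first character, which here is a space.
print(repr(" hello".capitalize()))          # ' hello'

# Stripping the whitespace first lets capitalize() reach the letter.
print(repr(" hello".strip().capitalize()))  # 'Hello'
```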

Python: Find the longest word in a string

I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.
For example: string = "Hello I like cookies". My program should then return "Cookies" and the length 7.
Now the thing is that I am not allowed to use any function from the class String for a full score, and for a full score I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem) and the solution shouldn't have too many for and while statements. The strings contains only letters and blanks and words are separated by one single blank.
Any suggestions? I'm lost i.e. I don't have any code.
Thanks.
EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.
EDIT2: Okay, with your help I think I'm onto something...
def longestword(x):
    alist = []
    length = 0
    for letter in x:
        if letter != " ":
            length += 1
        else:
            alist.append(length)
            length = 0
    return alist
But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed or are lists part of strings?
You can try to use regular expressions:
import re
string = "Hello I like cookies"
word_pattern = r"\w+"
regex = re.compile(word_pattern)
words_found = regex.findall(string)
if words_found:
    longest_word = max(words_found, key=len)
    print(longest_word)
Finding a max in one pass is easy:
current_max = 0
for v in values:
    if v > current_max:
        current_max = v
But in your case, you need to find the words. Remember this quote (attributed to J. Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Besides using regular expressions, you can simply check that the word has letters. A first approach is to go through the string and detect the start or end of words:
import string
current_word = ''
current_longest = ''
for c in mystring:
    if c in string.ascii_letters:
        current_word += c
    else:
        if len(current_word) > len(current_longest):
            current_longest = current_word
        current_word = ''
else:
    # the loop has finished: check the last word, which has no separator after it
    if len(current_word) > len(current_longest):
        current_longest = current_word
A final way is to split the words in a generator and find the max of what it yields (here I used the max function):
import string
def split_words(mystring):
    current = []
    for c in mystring:
        if c in string.ascii_letters:
            current.append(c)
        elif current:
            yield ''.join(current)
            current = []  # start collecting the next word
    if current:  # the last word has no separator after it
        yield ''.join(current)
max(split_words(mystring), key=len)
Just search for groups of non-whitespace characters, then find the maximum by length:
longest = len(max(re.findall(r'\S+',string), key = len))
For Python 3. If two words in the sentence have the same length, it returns the one that appears first.
def findMaximum(word):
    li = word.split()
    op = []
    for i in li:
        op.append(len(i))
    l = op.index(max(op))
    print(li[l])
findMaximum(input("Enter your word:"))
It's quite simple:
def long_word(s):
    n = max(s.split(), key=len)  # key=len compares by length, not alphabetically
    return n
In [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'
I found an error in a previously provided solution; here's the correction:
import string
def longestWord(text):
    current_word = ''
    current_longest = ''
    for c in text:
        if c in string.ascii_letters:
            current_word += c
        else:
            if len(current_word) > len(current_longest):
                current_longest = current_word
            current_word = ''
    if len(current_word) > len(current_longest):
        current_longest = current_word
    return current_longest
I can imagine a few different alternatives. Regular expressions can probably do much of the word splitting you need. This could be a simple option if you understand regexes.
An alternative is to treat the string as a list, iterate over it while keeping track of your index, and look at each character to see whether you're ending a word. Then you just need to keep the longest word (largest index difference) and you have your answer.
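The index-tracking idea can be sketched as follows. This is only one possible reading of it: the function name longest_word_length is hypothetical, and the sketch assumes, as the question states, that the string contains only letters and single blanks between words:

```python
def longest_word_length(sentence):
    longest = 0
    start = 0  # index where the current word begins
    for i, ch in enumerate(sentence):
        if ch == " ":
            # A blank ends the current word; its length is an index difference.
            longest = max(longest, i - start)
            start = i + 1  # the next word starts after the blank
    # The last word has no trailing blank, so measure it separately.
    return max(longest, len(sentence) - start)

print(longest_word_length("Hello I like cookies"))  # 7
```

The string is traversed once and no string methods are called, which matches the constraints given in the question.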
Regular expressions seem to be your best bet. First use re to split the sentence:
>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)
\S+ matches each run of non-whitespace characters, and findall puts the matches in a list:
>>> string
['Hello', 'I', 'like', 'cookies']
Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:
>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']
This method uses only one for loop, doesn't use any methods of the String class, and accesses each character strictly once. You may have to modify it depending on what characters count as part of a word.
s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s + ' ':
    if c == ' ':
        if len(word) > maxLen:
            maxLen = len(word)  # remember the best length so far
            maxWord = word
        word = ''
    else:
        word += c
print("Longest word:", maxWord)
print("Length:", len(maxWord))
Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.
I do not want to solve your exercise for you, but here are a few pointers:
Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
My proposal ...
import re
def longer_word(sentence):
    word_list = re.findall(r"\w+", sentence)
    word_list.sort(key=len, reverse=True)
    longer_word = word_list[0]
    print("The longer word is '" + longer_word + "' with a size of", len(longer_word), "characters.")
longer_word("Hello I like cookies")
import re
def longest_word(sen):
    res = re.findall(r"\w+", sen)
    n = max(res, key=lambda x: len(x))
    return n
print(longest_word("Hey!! there, How is it going????"))
Output : there
Here I have used a regex for the problem. re.findall returns all the words in the string as a list, which is stored in "res".
No call to split() is needed; the regex does the work.
findall finds all non-overlapping matches of a pattern in a string. Here the pattern is \w+, which tells the regex engine to match runs of word characters, without any spaces.
Variable "n" holds the longest word from the given string, which is now free of any undesired characters.
The lambda expression passed as key makes max() compare the words by len().
The list "res" no longer contains non-word characters like %, &, ! etc., as this session shows:
>>> # import regular expressions for the problem
>>> import re
>>> # initialize a sentence
>>> sen = "fun&!! time zone"
>>> # res holds all the words found, as a list
>>> res = re.findall(r"\w+", sen)
>>> res
['fun', 'time', 'zone']
>>> max(res)
'zone'
>>> # Here we get "zone" instead of "time" because, without a key,
>>> # max() compares strings alphabetically and "zone" sorts last.
>>> max(res, key=lambda x: len(x))
'time'
Here we get "time" because the key makes max() compare by length; "time" and "zone" are both 4 characters long, and max() returns the first of the tied items.
list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
    listLen.append(len(i))
print(list1[listLen.index(max(listLen))])
Output - Independence

String reverse in Python

Write a simple program that reads a line from the keyboard and outputs the same line where
every word is reversed. A word is defined as a continuous sequence of alphanumeric characters
or hyphen (‘-’). For instance, if the input is
“Can you help me!”
the output should be
“naC uoy pleh em!”
I just tried the following code, but there is a problem with it:
print("Enter the string:")
str1 = input()
print(' '.join((str1[::-1]).split(' ')[::-1]))
It prints "naC uoy pleh !em". Look at the exclamation mark (!); that is the problem. Can anybody help me?
The easiest is probably to use the re module to split the string:
import re
# split on anything that is not alphanumeric or a hyphen, keeping the separators
pattern = re.compile(r'([^\w-])')
string = input('Enter the string: ')
print(''.join(x[::-1] for x in pattern.split(string)))
When run, you get:
Enter the string: Can you help me!
naC uoy pleh em!
You could use re.sub() to find each word and reverse it:
In [8]: import re
In [9]: s = "Can you help me!"
In [10]: re.sub(r'[-\w]+', lambda w:w.group()[::-1], s)
Out[10]: 'naC uoy pleh em!'
My answer, more verbose though. It handles more than one punctuation mark at the end as well as punctuation marks within the sentence.
import string
import re
valid_punctuation = re.escape(string.punctuation.replace('-', ''))
word_pattern = re.compile(r'([\w-]+)([' + valid_punctuation + r']*)$')
# Reverses a word, ignoring punctuation at the end.
# Assumes a single word (i.e. no spaces).
def word_reverse(w):
    m = word_pattern.match(w)
    if m is None:  # e.g. a token that does not start with a word character
        return w
    return m.group(1)[::-1] + m.group(2)
def sentence_reverse(s):
    return ' '.join(word_reverse(w) for w in re.split(r'\s+', s))
str1 = input('Enter the sentence: ')
print(sentence_reverse(str1))
Simple solution without using the re module:
print('Enter the string:')
string = input()
line = word = ''
for char in string:
    if char.isalnum() or char == '-':
        word = char + word
    else:
        if word:
            line += word
            word = ''
        line += char
print(line + word)
You can do this:
print("Enter the string:")
str1 = input()
print(' '.join(str1[::-1].split(' ')[::-1]))
or this:
print(' '.join([w[::-1] for w in str1.split(' ')]))
Note that both variants reverse trailing punctuation together with the word, so "me!" becomes "!em".
