Split string but replace with another string and get list - python

I am trying to split a string, but the matched part should be replaced with another string and the result returned as a list. It's hard to explain, so here is an example:
I have string in variable a:
a = "Hello World!"
I want a list such that:
a.split("Hello").replace("Hey") == ["Hey"," World!"]
That is, I want to split a string and put another string in place of the split element in the list. So if a is
a = "Hello World! Hello Everybody"
and I use something like a.split("Hello").replace("Hey") , then the output should be:
a = ["Hey"," World! ","Hey"," Everybody"]
How can I achieve this?

From your examples it sounds a lot like you want to replace all occurrences of Hello with Hey and then split on spaces.
What you are currently doing can't work, because replace needs two arguments and it's a method of strings, not lists. When you split your string, you get a list.
>>> a = "Hello World!"
>>> a = a.replace("Hello", "Hey")
>>> a
'Hey World!'
>>> a.split(" ")
['Hey', 'World!']

x = "HelloWorldHelloYou!"
y = x.replace("Hello", "\nHey\n").lstrip("\n").split("\n")
print(y) # ['Hey', 'World', 'Hey', 'You!']
This is a rather brute-force approach; you can replace \n with any character you're not expecting to find in your string (or even something like XXXXX). The lstrip removes the leading \n if your string starts with Hello.
Alternatively, there's regex :)
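As a rough sketch of the regex route mentioned above: re.split keeps the delimiter in its result when the pattern is wrapped in a capturing group, so each kept delimiter can then be swapped for the new string. The name split_and_replace is made up for illustration, and dropping empty pieces is an assumption about what the asker wants.

```python
import re

def split_and_replace(text, old, new):
    # Hypothetical helper: the capturing group makes re.split keep each `old`
    # match as its own list element instead of discarding it.
    parts = re.split(f"({re.escape(old)})", text)
    # Swap every kept delimiter for `new`; drop empty strings that appear
    # when `text` starts or ends with `old` (an assumption, see above).
    return [new if part == old else part for part in parts if part]

print(split_and_replace("Hello World! Hello Everybody", "Hello", "Hey"))
# ['Hey', ' World! ', 'Hey', ' Everybody']
```

re.escape is used so that characters in old that are special in a regex are matched literally.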

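This function can do it:
def replace_split(s, old, new):
    # Pair each block with `new`, flatten, then drop the extra trailing `new`.
    # Note: an empty string stays at the front if s starts with `old`.
    return sum([[blk, new] for blk in s.split(old)], [])[:-1]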

It wasn't clear whether you wanted to split by space or by uppercase letter.
import re
#Replace all 'Hello' with 'Hey'
a = 'HelloWorldHelloEverybody'
a = a.replace('Hello', 'Hey')
#This will separate the string by uppercase character
re.findall('[A-Z][^A-Z]*', a) #['Hey', 'World' ,'Hey' ,'Everybody']

You can do this with iteration:
a = a.split(' ')
for word in a:
    if word == 'Hello':
        a[a.index(word)] = 'Hey'
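The same idea can also be written without modifying the list while looping over it. This is only a sketch that assumes, as the loop above does, that words are separated by single spaces; the variable name words is made up here.

```python
a = "Hello World! Hello Everybody"

# Build a new list, substituting 'Hey' for every word equal to 'Hello'
words = ["Hey" if word == "Hello" else word for word in a.split(" ")]
print(words)  # ['Hey', 'World!', 'Hey', 'Everybody']
```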

Related

convert word to letter

Code:
mapping = {'hello':'a', 'world':'b'}
string = 'helloworld'
out = ' '.join(mapping.get(s,s) for s in string.split())
print(out)
What I want to happen is that string = 'helloworld' gets printed as ab
What I get as the output is 'helloworld'
The reason is that there is no space between hello and world in the string, but I don't want a space between them. Can anyone help?
A crude solution in this case would be to simply replace according to the mapping.
def replace(string, mapping):
    for k, v in mapping.items():
        string = string.replace(k, v)
    return string
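One caveat with replacing key by key is that an earlier replacement can be picked up by a later one (for example, a mapping of a to b and b to c). A possible single-pass variant, sketched here with re.sub and a hypothetical helper name replace_all, looks every match up in the mapping exactly once:

```python
import re

def replace_all(text, mapping):
    # Hypothetical helper; an empty mapping means there is nothing to replace
    if not mapping:
        return text
    # Try longer keys first so a long key wins over a shorter key that is its prefix
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is replaced once, in a single left-to-right pass
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(replace_all("helloworld", {"hello": "a", "world": "b"}))  # ab
```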
You get the output helloworld because string.split() in your code returns ['helloworld'], a one-element list holding the string you started with. The split does nothing because split, by default, splits a string on whitespace, of which there is none in your string.
>>> 'helloworld'.split()
['helloworld']
If you change the following line, you will get a b as your output.
string = 'hello world'
I am going to assume that when you say you "don't want a space in between them", you mean that you don't want a space between a and b. To achieve that, change the following line:
out = ''.join(mapping.get(s,s) for s in string.split())
This will output ab

How can I replace the first occurrence of a character in every word?

Say I have this string:
hello #jon i am ##here or ###there and want some#thing in '#here"
# ^ ^^ ^^^ ^ ^
And I want to remove the first # on every word, so that I end up having a final string like this:
hello jon i am #here or ##there and want something in 'here
# ^ ^ ^^ ^ ^
Just for clarification, "#" characters always appear together in every word, but can be in the beginning of the word or between other characters.
I managed to remove the "#" character if it occurs just once by using a variation of the regex I found in Delete substring when it occurs once, but not when twice in a row in python, which uses a negative lookahead and negative lookbehind:
#(?!#)(?<!##)
See the output:
>>> s = "hello #jon i am ##here or ###there and want some#thing in '#here"
>>> re.sub(r'#(?!#)(?<!##)', '', s)
"hello jon i am ##here or ###there and want something in 'here"
So the next step is to replace the "#" when it occurs more than once. This is easy by doing s.replace('##', '#') to remove the "#" from wherever it occurs again.
However, I wonder: is there a way to do this replacement in one shot?
I would do a regex replacement on the following pattern:
#(#*)
And then just replace with the first capture group, which is all continuous # symbols, minus one.
This should capture the first # of every run of # symbols, whether that run is at the beginning, middle, or end of a word.
import re
inp = "hello #jon i am ##here or ###there and want some#thing in '#here"
out = re.sub(r"#(#*)", '\\1', inp)
print(out)
This prints:
hello jon i am #here or ##there and want something in 'here
How about using replace('#', '', 1) in a generator expression?
string = 'hello #jon i am ##here or ###there and want some#thing in "#here"'
result = ' '.join(s.replace('#', '', 1) for s in string.split(' '))
# output: hello jon i am #here or ##there and want something in "here"
The int value of 1 is the optional count argument.
str.replace(old, new[, count])
Return a copy of the string with all
occurrences of substring old replaced by new. If the optional argument
count is given, only the first count occurrences are replaced.
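A small illustration of how the count argument changes the result; the variable names here are made up for the example:

```python
word = "###there"

first_only = word.replace("#", "", 1)   # remove only the first '#'
first_two = word.replace("#", "", 2)    # remove the first two
everything = word.replace("#", "")      # no count: remove all of them

print(first_only, first_two, everything)  # ##there #there there
```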
You can use re.sub like this:
import re
s = "hello #jon i am ##here or ###there and want some#thing in '#here"
s = re.sub(r'#(\w)', r'\1', s)
print(s)
That will result in:
hello jon i am #here or ##there and want something in 'here
And here is a proof of concept:
>>> import re
>>> s = "hello #jon i am ##here or ###there and want some#thing in '#here"
>>> re.sub(r'#(\w)', r'\1', s)
"hello jon i am #here or ##there and want something in 'here"
>>>
I was pondering cases such as a word where only the last character is # and you don't want to remove it, or where only specific starting characters are allowed, and came up with this:
>>> ' '.join([s_.replace('#', '', 1) if s_[0] in ["'", "#"] else s_ for s_ in s.split()])
"hello jon i am #here or ##there and want some#thing in 'here"
Or, suppose you want to replace # only if it is in first n characters
>>> ' '.join([s_.replace('#', '', 1) if s_.find('#') in range(2) else s_ for s_ in s.split()])
"hello jon i am #here or ##there and want some#thing in 'here"
You can try this pattern, which matches a # that is not preceded by another #:
(?<!#)#
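For reference, here is one way the pattern (?<!#)# might be applied with re.sub, using the sample string from the question. The negative lookbehind only lets the first # of each run match, so that is the one that gets removed.

```python
import re

s = "hello #jon i am ##here or ###there and want some#thing in '#here"

# Remove every '#' that is not immediately preceded by another '#'
result = re.sub(r"(?<!#)#", "", s)
print(result)
# hello jon i am #here or ##there and want something in 'here
```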
# Python3 program to remove the # from String
def ExceptAtTheRate(string):
    # Split the String based on the space
    arrOfStr = string.split()
    # String to store the resultant String
    res = ""
    # Traverse the words and
    # remove the first # From every word.
    for a in arrOfStr:
        if a[0] == '#':
            res += a[1:] + " "
        else:
            res += a + " "
    return res

# Driver code
string = "hello #jon i am ##here or ###there and want some#thing in '#here"
print(ExceptAtTheRate(string))
Output:

Given a string pattern with a variable, how to match and find variable string using python?

pattern = "world! {} "
text = "hello world! this is python"
Given the pattern and text above, how do I produce a function that would take pattern as the first argument and text as the second argument and output the word 'this'?
eg.
find_variable(pattern, text) ==> returns 'this' because 'this'
You may use this function, which uses str.format to build a regex with a single capture group:
>>> import re
>>> pattern = "world! {} "
>>> text = "hello world! this is python"
>>> def find_variable(pattern, text):
...     return re.findall(pattern.format(r'(\S+)'), text)[0]
...
>>> print(find_variable(pattern, text))
this
PS: You may want to add some sanity checks inside your function to validate string format and a successful findall.
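One possible shape for those sanity checks is sketched below. The name find_variable_safe, the choice to raise ValueError for a malformed pattern, and returning None when nothing matches are all assumptions made for illustration, not part of the answer above.

```python
import re

def find_variable_safe(pattern, text):
    # Hypothetical variant: require exactly one '{}' placeholder
    if pattern.count("{}") != 1:
        raise ValueError("pattern must contain exactly one '{}' placeholder")
    before, after = pattern.split("{}")
    # Escape the literal parts so regex metacharacters in them match literally
    regex = re.escape(before) + r"(\S+)" + re.escape(after)
    match = re.search(regex, text)
    # Return None instead of raising IndexError when there is no match
    return match.group(1) if match else None

print(find_variable_safe("world! {} ", "hello world! this is python"))  # this
print(find_variable_safe("world! {} ", "nothing to see here"))          # None
```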
Not a one liner like anubhava's but using basic python knowledge:
pattern = "world!"
text = "hello world! this is python"

def find_variable(pattern, text):
    new_text = text.split(' ')
    for x in range(len(new_text)):
        # Return the word that follows the one equal to pattern
        if new_text[x] == pattern:
            return new_text[x + 1]

print(find_variable(pattern, text))

Converting regex whitespace characters from list into string

So I want to convert regex whitespace tokens into a string. For example:
list1 = ["Hello","\s","my","\s","name","\s","is"]
And I want to convert it to a string like
"Hello my name is"
Can anyone please help.
But what if there were also characters such as
"\t"
How would I do this?
list = ["Hello","\s","my","\s","name","\s","is"]
str1 = ''.join(list).replace("\s"," ")
Output :
>>> str1
'Hello my name is'
Update :
If you have something like this list1 = ["Hello","\s","my","\s","name","\t","is"] then you can use multiple replace
>>> str1 = ''.join(list).replace("\s"," ").replace("\t"," ")
>>> str1
'Hello my name is'
or if it's only \t
str1 = ''.join(list).replace("\t","anystring")
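If more tokens need handling, a lookup table may scale better than chaining replace calls. The sketch below assumes the list holds literal backslash sequences (written as raw strings) and uses a made-up escapes dictionary; unlike the replace chain above, it maps \t to a real tab rather than a space.

```python
tokens = ["Hello", r"\s", "my", r"\s", "name", r"\t", "is"]

# Hypothetical mapping from literal escape tokens to the characters they stand for
escapes = {r"\s": " ", r"\t": "\t"}

# Tokens missing from the mapping are kept unchanged
result = "".join(escapes.get(tok, tok) for tok in tokens)
print(repr(result))  # 'Hello my name\tis'
```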
I would highly recommend using the join string function mentioned in one of the earlier answers, as it is less verbose. However, if you absolutely needed to use regex in order to complete the task, here's the answer:
import re
list1 = ["Hello","\s","my","\s","name","\s","is"]
list_str = ''.join(list1)
updated_str = re.split('\\\s', list_str)
updated_str = ' '.join(updated_str)
print(updated_str)
Output is:
'Hello my name is'
To use raw string notation, replace the re.split line with the one below:
updated_str = re.split(r'\\s', list_str)
Both will have the same output result.
You don't even need regular expressions for that:
s = ' '.join([item for item in list if item != r'\s'])
Please note that list is a poor name for a variable in Python, as it shadows the built-in list type.

How to remove line break in array

While running code I got an output,
vivek
Hello World!
There is a blank line between "vivek" and "Hello World!", but I want the output without that extra line break:
vivek
Hello World
like above
# Hello World program in Python
arr =['vivek\n','singh\n']
arr[0].replace('\n','')
print arr[0]
print "Hello World!"
Just replace the line
arr[0].replace('\n','')
by
arr[0] = arr[0].replace('\n', '')
because str.replace only returns a modified copy and does not modify the original. See the str.replace documentation.
Other suggestions
You could also use str.strip to remove surrounding whitespace. A neat way to remove all surrounding whitespace for every element of a list is
yourlist = [strelement.strip() for strelement in yourlist]
This is called a list comprehension.
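Applied to the list from the question, that could look like the sketch below. It also shows rstrip('\n') as an alternative for when only the trailing newline should go and other surrounding whitespace should stay; the variable names are made up for the example.

```python
arr = ["vivek\n", "singh\n"]

# strip() removes all leading and trailing whitespace, including '\n'
stripped = [item.strip() for item in arr]
print(stripped)  # ['vivek', 'singh']

# rstrip('\n') removes only trailing newlines and keeps other whitespace
padded = ["  vivek \n"]
no_newline = [item.rstrip("\n") for item in padded]
print(no_newline)  # ['  vivek ']
```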
You might also want to use print as a function instead of a statement. So you use print("whatever") instead of print "whatever". The print function works with Python 2 and Python 3, whereas the statement works only in Python 2.
Then you might want to take a look at http://pep8online.com/ and https://www.python.org/dev/peps/pep-0008/
Because arr[0] is a string, str.replace just returns an updated copy of the string and does not change the original.
You can also remove using Regular Expression:
>>> import re
>>> result = re.sub(u"\u005cn", r"", "vivek\n Hello World!")
>>> result
'vivek Hello World!'
You can also do split and join that will remove the new line from the list:
arr = ['vivek\n', 'singh\n']
arr = ''.join(arr).split('\n')
print(arr[0] + " Hello World!")
or
print(arr[0])
print(" Hello World!")
The question is about removing the new line from the strings in the list. split and join is the most pythonic way available to do so.
You can use regular expressions to filter out unwanted characters from your string.
import re
s = "hello\nworld"
print(re.sub(r'\n', ' ', s))
will print: hello world
So you just need to do
print(re.sub(r'\n', '', arr[0]))
