Converting regex whitespace characters from list into string - python

I want to convert regex-style whitespace tokens in a list into a string. For example, given
list1 = ["Hello","\s","my","\s","name","\s","is"]
I want to produce a string like
"Hello my name is"
Can anyone please help?
Also, if the list contained tokens such as
"\t"
how would I handle those?

list1 = ["Hello","\s","my","\s","name","\s","is"]
str1 = ''.join(list1).replace("\s"," ")
Output:
>>> str1
'Hello my name is'
Update:
If you have something like list1 = ["Hello","\s","my","\s","name","\t","is"], you can chain multiple replace calls:
>>> str1 = ''.join(list1).replace("\s"," ").replace("\t"," ")
>>> str1
'Hello my name is'
Or, if it's only \t:
str1 = ''.join(list1).replace("\t","anystring")
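The chained replace calls can be generalized with a lookup table. This is a minimal sketch, assuming the list items are the literal two-character tokens (a backslash followed by a letter), written here as raw strings. The names TOKEN_MAP and join_tokens are hypothetical, and mapping the tab token to a real tab rather than a space is an assumption.

```python
# Hypothetical mapping from literal escape-like tokens to the text they stand for.
TOKEN_MAP = {
    r"\s": " ",   # backslash + "s" -> one space
    r"\t": "\t",  # backslash + "t" -> a real tab (use " " here if a space is preferred)
}


def join_tokens(items, token_map=TOKEN_MAP):
    """Join items into one string, translating any item found in token_map."""
    return "".join(token_map.get(item, item) for item in items)


list1 = ["Hello", r"\s", "my", r"\s", "name", r"\s", "is"]
print(join_tokens(list1))  # Hello my name is
```

Because each list item is translated as a whole, text that merely contains a backslash inside a longer word is left alone, unlike with str.replace on the joined string.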

I would highly recommend the join-and-replace approach mentioned in one of the earlier answers, as it is less verbose. However, if you absolutely need to use a regex for this task, here is one way:
import re
list1 = ["Hello","\s","my","\s","name","\s","is"]
list_str = ''.join(list1)
updated_str = re.split('\\\s', list_str)
updated_str = ' '.join(updated_str)
print(updated_str)
Output:
Hello my name is
To use raw string notation, replace the re.split line with the one below:
updated_str = re.split(r'\\s', list_str)
Both produce the same result.
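The split-then-join steps can also be collapsed into a single re.sub call. This is a sketch under the same assumption as above, namely that the list holds the literal two-character token (backslash plus "s"); the variable name joined is illustrative.

```python
import re

list1 = ["Hello", r"\s", "my", r"\s", "name", r"\s", "is"]
joined = "".join(list1)  # the tokens are still literal backslash + "s" here

# r"\\s" is the regex for a literal backslash followed by the letter "s".
updated_str = re.sub(r"\\s", " ", joined)
print(updated_str)  # Hello my name is
```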

You don't even need regular expressions for that:
s = ' '.join([item for item in list1 if item != '\s'])
Note that list is a poor name for a variable in Python because it shadows the built-in list type, which is why list1 is used here.

Related

Remove String between two characters for all occurrences

I am looking for help on string manipulation in Python 3.
Input String
s = "ID bigint,FIRST_NM string,LAST_NM string,FILLER1 string"
Desired Output
s = "ID,FIRST_NM,LAST_NM,FILLER1"
Basically, the objective is to remove anything between space and comma at all occurrences in the input string.
Any help is much appreciated
Using a simple regex:
import re
s = "ID bigint,FIRST_NM string,LAST_NM string,FILLER1 string"
res = re.sub(r'\s\w+', '', s)
print(res)
# output ID,FIRST_NM,LAST_NM,FILLER1
You can also use a regex with a lookahead:
import re
s = "ID bigint,FIRST_NM string,LAST_NM string,FILLER1 string"
s = ','.join(re.findall(r'\w+(?= \w+)', s))
print(s)
Output:
ID,FIRST_NM,LAST_NM,FILLER1
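For comparison, the same result can be reached without a regex. This is a minimal sketch, assuming every column is written as "NAME TYPE" and that neither part contains a comma; the variable names are illustrative.

```python
s = "ID bigint,FIRST_NM string,LAST_NM string,FILLER1 string"

# Split into "NAME TYPE" columns, then keep only the text before the first space.
names = [column.split(" ", 1)[0] for column in s.split(",")]
result = ",".join(names)
print(result)  # ID,FIRST_NM,LAST_NM,FILLER1
```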

How to replace/delete a string in python

How can I replace or delete part of a string, like this?
string = '{DDBF1F} this is my string {DEBC1F}'
# The code between the braces is random; I only know it is made up of 6 characters
The output should be
this is my string
I tried this. I know it doesn't work, but I tried :3
string = '{DDBF1F} Hello {DEBC1F}'
string.replace(f'{%s%s%s%s%s%s}', 'abc')
print(string)
Use the re library to perform a regex replace, like this:
import re
text = '{DDBF1F} Hello {DEBC1F}'
result = re.sub(r"(\s?\{[A-F0-9]{6}\}\s?)", "", text)
print(result)
If the length of the strings within the brackets is fixed, you can use slicing to get the inner substring:
>>> string = '{DDBF1F} this is my string {DEBC1F}'
>>> string[8:-8]
' this is my string '
(string[9:-9] if you want to remove the surrounding spaces)
If hardcoding the indexes feels bad, they can be derived using str.index (if you can be certain that the string will not contain an embedded '}'):
>>> start = string.index('}')
>>> start
7
>>> end = string.index('{', start)
>>> end
27
>>> string[start+1:end]
' this is my string '
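This code works:
string = '{DDBF1F} this is my string {DEBC1F}'
st = string.split(' ')
new_str = ''
for i in st:
    # Skip tokens that look like {CODE}
    if i.startswith('{') and i.endswith('}'):
        pass
    else:
        new_str = new_str + " " + i
print(new_str.strip())  # strip() drops the leading space added by the loop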
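The regex idea from the earlier answer can be wrapped in a small helper that also tidies the leftover whitespace. This is a sketch, assuming a code is exactly six characters from 0-9 and A-F inside braces, as in the example strings; CODE_PATTERN and strip_codes are hypothetical names.

```python
import re

# Assumption: a code is exactly six characters from 0-9 / A-F inside braces.
CODE_PATTERN = re.compile(r"\{[0-9A-F]{6}\}")


def strip_codes(text):
    """Remove every {XXXXXX} code, then collapse the leftover whitespace."""
    without_codes = CODE_PATTERN.sub("", text)
    return " ".join(without_codes.split())


print(strip_codes("{DDBF1F} this is my string {DEBC1F}"))  # this is my string
```

Unlike the slicing approach, this does not depend on the codes sitting at the very start and end of the string.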

Slice string at last digit in Python

So I have strings with a date somewhere in the middle, like 111_Joe_Smith_2010_Assessment and I want to truncate them such that they become something like 111_Joe_Smith_2010. The code that I thought would work is
reverseString = currentString[::-1]
stripper = re.search('\d', reverseString)
But for some reason this doesn't always give me the right result. Most of the time it does, but every now and then, it will output a string that looks like 111_Joe_Smith_2010_A.
If anyone knows what's wrong with this, it would be super helpful!
You can use re.sub with the $ anchor to match and remove letters and underscores at the end of the string:
import re
d = ['111_Joe_Smith_2010_Assessment', '111_Bob_Smith_2010_Test_assessment']
new_s = [re.sub('[a-zA-Z_]+$', '', i) for i in d]
Output:
['111_Joe_Smith_2010', '111_Bob_Smith_2010']
You could strip non-digit characters from the end of the string using re.sub like this:
>>> import re
>>> re.sub(r'\D+$', '', '111_Joe_Smith_2010_Assessment')
'111_Joe_Smith_2010'
For your input format you could also do it with a simple loop:
>>> s = '111_Joe_Smith_2010_Assessment'
>>> i = len(s) - 1
>>> while not s[i].isdigit():
...     i -= 1
...
>>> s[:i+1]
'111_Joe_Smith_2010'
You can use the following approach:
def clean_names():
    names = ['111_Joe_Smith_2010_Assessment', '111_Bob_Smith_2010_Test_assessment']
    for name in names:
        # Drop trailing characters until the name ends in a digit
        while not name[-1].isdigit():
            name = name[:-1]
        print(name)
clean_names()
Here is another solution using rstrip() to remove trailing letters and underscores, which I consider a pretty smart alternative to re.sub() as used in other answers:
import string
s = '111_Joe_Smith_2010_Assessment'
new_s = s.rstrip(f'{string.ascii_letters}_') # For Python 3.6+
new_s = s.rstrip(string.ascii_letters+'_') # For other Python versions
print(new_s) # 111_Joe_Smith_2010
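The loop-based answers assume the string contains at least one digit; otherwise they raise an IndexError. A greedy regex match sidesteps that case. This is a minimal sketch; truncate_after_last_digit is a hypothetical name, and returning the input unchanged when there is no digit is an assumption about the desired behavior.

```python
import re


def truncate_after_last_digit(text):
    """Return text up to and including its last digit (unchanged if no digit)."""
    # Greedy ".*" consumes as much as it can, so "\d" lands on the last digit.
    match = re.match(r".*\d", text, flags=re.DOTALL)
    return match.group(0) if match else text


print(truncate_after_last_digit("111_Joe_Smith_2010_Assessment"))  # 111_Joe_Smith_2010
```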

Re format a python string after splitting

I have a string like "week32_Aug_24_2016".
I want to change it to something like "week32_2016_Aug_24".
I have tried this:
str = "week32_Aug_24_2016"
wk = str.split('_')
newstr = wk[0]+" "+wk[3]+" "+wk[1]+" "+wk[2]
My expected output is "week32 2016 Aug 24".
I already get that, but I want to know whether there is a better way to do this. If I have a long string that splits into 10 values, this approach becomes very long. Is there a better way to rearrange the split values? Thanks.
You can make use of both str.split() and str.join(), simply like this:
string = "week32_Aug_24_2016"
order = (0, 3, 1, 2)
parts = string.split('_')
new_string = ' '.join(parts[i] for i in order)
Also, note that I renamed your variable str to string to avoid shadowing the built-in str class.
string = "week32_Aug_24_2016"
wk = string.split('_')
i = 0
newstr = ""
while i < len(wk):
    newstr = newstr + wk[i] + " "
    i = i + 1
print(newstr)
Something like that. If you want to move the last part into the middle, that can be done as well.
>>> import re
>>> string = "week32_Aug_24_2016"
>>> re.sub(r'_(.*)_(.*)_(.*)', r' \3 \1 \2', string)
'week32 2016 Aug 24'
You can make use of str.split() and format() as well:
string = "week32_Aug_24_2016"
wk = string.split('_')
newstr = "{} {} {} {}".format(wk[0], wk[3], wk[1], wk[2])
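For strings with many parts, listing every index becomes tedious. If the goal is always "move the last part to just after the first", list slicing handles any length. This is a sketch under that assumption; move_last_after_first and its parameters are hypothetical names.

```python
def move_last_after_first(text, sep="_", joiner=" "):
    """Split text on sep and move the final part to directly after the first."""
    parts = text.split(sep)
    if len(parts) < 3:
        # Nothing to rearrange with fewer than three parts.
        return joiner.join(parts)
    reordered = parts[:1] + parts[-1:] + parts[1:-1]
    return joiner.join(reordered)


print(move_last_after_first("week32_Aug_24_2016"))  # week32 2016 Aug 24
```

Passing joiner="_" keeps the underscores instead, giving week32_2016_Aug_24.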

Insert a string before a substring of a string

I want to insert some text before a substring in a string.
For example:
str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
I want:
str = "thisissomeXXtextthatiwrote"
Assuming substr can only appear once in str, how can I achieve this result? Is there some simple way to do this?
my_str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
idx = my_str.index(substr)
my_str = my_str[:idx] + inserttxt + my_str[idx:]
PS: avoid using built-in names (such as str in your case) as variable names, because doing so shadows the built-in.
Why not use replace?
my_str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
my_str.replace(substr, inserttxt + substr)
# 'thisissomeXXtextthatiwrote'
Use str.split(substr) to split str into ['thisissome', 'thatiwrote']. Since you want to insert text before the substring, join the pieces with inserttxt + substr, which is "XXtext" here.
So the final solution is:
>>> (inserttxt+substr).join(str.split(substr))
'thisissomeXXtextthatiwrote'
If you want to append text after the substring instead, use the following (with appendtxt = "XX"):
>>> (substr+appendtxt).join(str.split(substr))
'thisissometextXXthatiwrote'
With respect to the question (where my_str is the variable), the correct form is:
(inserttxt+substr).join(my_str.split(substr))
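The answers above behave differently when substr is missing or appears more than once: str.index raises ValueError, while replace and split/join touch every occurrence. Here is a minimal sketch that inserts only before the first occurrence and leaves the string unchanged when there is no match; insert_before is a hypothetical helper name, and the "unchanged on no match" behavior is an assumption.

```python
def insert_before(text, substr, insert_txt):
    """Insert insert_txt before the first occurrence of substr in text."""
    idx = text.find(substr)  # find() returns -1 instead of raising
    if idx == -1:
        return text
    return text[:idx] + insert_txt + text[idx:]


print(insert_before("thisissometextthatiwrote", "text", "XX"))
# thisissomeXXtextthatiwrote
```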
